
Truncating text while maintaining HTML

4 Posts   1148 Views


Community Member, 49 Posts

11 September 2012 at 7:22am


I've built a news section for my site where the landing page is a summary of the latest 10 news items, with a "read more" link below each.

I'm using $Content.FirstParagraph to truncate my content before the read more.

The key issue is what happens if, for some reason, the first paragraph sits within a div:

<div><p>Hello World!</p></div>

The opening div tag will come through in my summary, but not the closing div tag - resulting in all sorts of styling headaches.

I'm wondering if anyone has tackled this issue before? What's the best way to show a summary of your content while retaining HTML opening and closing tags? Is this possible?


Community Member, 49 Posts

23 October 2012 at 11:57am

I didn't have much luck finding a Sapphire-specific answer to this question, so I went searching for a PHP function to help me with this and came across this post:

Here is my end result. I'm still curious as to whether there is an easier way to achieve this?

	/**
	 * Get a bbcode parsed summary of the blog entry
	 */
	function ParagraphSummary(){
		if(self::$allow_wysiwyg_editing) {
			return $this->printTruncated(800, $this->obj('Content')->FirstParagraph('html'));
		} else {
			$parser = new BBCodeParser($this->Content);
			$html = new HTMLText('Content');
			// Feed the parsed BBCode into the field; without this $html stays empty.
			$html->setValue($parser->parse());
			return $this->printTruncated(800, $html->FirstParagraph('html'));
		}
	}

	/**
	 * Responsible for closing any tags which have been left open in the summary...
	 */
	public function printTruncated($maxLength, $html, $isUtf8=true) {
		$printedLength = 0;
		$position = 0;
		$tags = array();
		$string = '';

		// For UTF-8, we need to count multibyte sequences as one character.
		$re = $isUtf8
			? '{</?([a-z]+)[^>]*>|&#?[a-zA-Z0-9]+;|[\x80-\xFF][\x80-\xBF]*}'
			: '{</?([a-z]+)[^>]*>|&#?[a-zA-Z0-9]+;}';

		while ($printedLength < $maxLength && preg_match($re, $html, $match, PREG_OFFSET_CAPTURE, $position)) {
			list($tag, $tagPosition) = $match[0];

			// Print text leading up to the tag.
			$str = substr($html, $position, $tagPosition - $position);
			if ($printedLength + strlen($str) > $maxLength) {
				$string .= substr($str, 0, $maxLength - $printedLength);
				$printedLength = $maxLength;
				break; // limit reached; stop before appending the full text
			}

			$string .= $str;
			$printedLength += strlen($str);
			if ($printedLength >= $maxLength) break;

			if ($tag[0] == '&' || ord($tag) >= 0x80) {
				// Pass the entity or UTF-8 multibyte sequence through unchanged.
				$string .= $tag;
				$printedLength++; // counts as a single character
			} else {
				// Handle the tag.
				$tagName = $match[1][0];
				if ($tag[1] == '/') {
					// This is a closing tag.
					$openingTag = array_pop($tags);
					assert($openingTag == $tagName); // check that tags are properly nested.
					$string .= $tag;
				} else if ($tag[strlen($tag) - 2] == '/') {
					// Self-closing tag.
					$string .= $tag;
				} else {
					// Opening tag.
					$string .= $tag;
					$tags[] = $tagName;
				}
			}

			// Continue after the tag.
			$position = $tagPosition + strlen($tag);
		}

		// Print any remaining text.
		if ($printedLength < $maxLength && $position < strlen($html)) {
			$string .= substr($html, $position, $maxLength - $printedLength);
		}

		// Close any open tags.
		while (!empty($tags)) {
			$string .= sprintf('</%s>', array_pop($tags));
		}

		return $string;
	}
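
For anyone who only needs the tag-closing part and not the length limit, the stack idea used in printTruncated can be pulled out on its own. This is a minimal sketch in plain PHP, not anything from Sapphire: closeOpenTags is a made-up name, the list of void elements is deliberately short, and it assumes reasonably well-nested input.

```php
<?php
// Hypothetical helper (not part of SilverStripe/Sapphire): append closing
// tags for any elements left open in an HTML fragment.
function closeOpenTags($html) {
	// Void elements never take a closing tag (short, non-exhaustive list).
	$void = array('br', 'hr', 'img', 'input', 'meta', 'link');
	$open = array();

	// Group 1: "/" for closing tags, group 2: tag name, group 3: "/" for self-closing tags.
	preg_match_all('{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>}', $html, $matches, PREG_SET_ORDER);

	foreach ($matches as $m) {
		$name = strtolower($m[2]);
		if ($m[3] === '/' || in_array($name, $void)) {
			continue; // self-closing or void: nothing to track
		}
		if ($m[1] === '/') {
			// Closing tag: pop the stack only if it matches the innermost open tag.
			if (end($open) === $name) {
				array_pop($open);
			}
		} else {
			// Opening tag: remember it until its closer is seen.
			$open[] = $name;
		}
	}

	// Close whatever is still open, innermost first.
	while (!empty($open)) {
		$html .= '</' . array_pop($open) . '>';
	}
	return $html;
}

echo closeOpenTags('<div><p>Hello World!</p>');
// prints: <div><p>Hello World!</p></div>
```

Running the FirstParagraph output through something like this would put back the missing closing div from my original example, though it does nothing about length.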


Community Member, 22 Posts

23 October 2012 at 11:57pm

I had the same problem and never solved it. Fortunately, I didn't need to but if I did then a custom function would've been required.
You could use PHP strip_tags to selectively remove things like <div> while leaving <strong> for example, then truncate the cleaned up version.
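
A rough sketch of the strip_tags idea, in plain PHP and untested inside SilverStripe itself. keepInlineTags is a made-up name and the default list of allowed tags is just an assumption about what you'd want to keep:

```php
<?php
// Hypothetical helper: drop block-level wrappers such as <div> and <p>
// while keeping a whitelist of inline tags.
function keepInlineTags($html, $allowed = '<strong><em><a>') {
	// The second argument to strip_tags() lists the tags to leave in place.
	return strip_tags($html, $allowed);
}

$html = '<div class="intro"><p>Hello <strong>World</strong>!</p></div>';
echo keepInlineTags($html);
// prints: Hello <strong>World</strong>!
```

Note that a plain substr() on the result could still cut through the middle of a <strong>, so the truncation step itself would still need to be tag-aware.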

What I never worked out was how to do this custom function without modifying core code, so would be interested to know if this is possible by extending a class.


Forum Moderator, 1166 Posts

24 October 2012 at 5:34am

I've never tested this but I think you should be able to create a decorator for the Text class, something like this:

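class ExtendedText extends Extension {

	function ParagraphSummary() {
		// Assumption (untested): $this->owner is the Text field being extended
		$data = $this->owner->getValue();
		// ...truncate $data and close any open tags here...
		return $data;
	}
}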

Then in config: Object::add_extension('Text', 'ExtendedText');