Content Summary in valid HTML (LimitWordCount)


torleif2

Community Member, 6 Posts

24 November 2009 at 5:45pm

Here's an HTML-safe LimitWordCount() function. I often want to limit the number of words in a content block but retain the formatting.

e.g.:
Content = <p>this <br/>paragraph would normally get cut off</p>
getSummaryHTML(2) returns: <p>this <br/>paragraph</p>

As written, the function goes in the model (it would be nice to put it in the controller, but that doesn't work for some reason). It would be easy to move it into your own HTMLText subclass.

Code (in Page.php):

	/*
	* Limits Content to $limit words while keeping the HTML valid:
	* tags still open at the cut-off point are closed again.
	*/
	public function getSummaryHTML($limit = 100) {
		$tags = '(p|h[1-6]|q|b|i|strong|em)';
		$m = 0;
		$ellipsis = '';
		$returnArray = array();
		$html = array(); // stack of currently open tag names
		// split into tags, single spaces and words, keeping the delimiters
		$chars = preg_split('/(<[^>]*[^\/]>| )/i', $this->Content, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
		foreach ($chars as $element) {
			if (preg_match('/^<' . $tags . '\b[^>]*>$/', $element, $matches)) {
				// found start tag: convert <p class=""> to p (\b keeps <br> from matching b)
				array_push($html, $matches[1]);
				array_push($returnArray, $element);
			} else if (preg_match('/^<\/' . $tags . '\s*>$/', $element, $matches)) {
				// found end tag: pop only if it closes the innermost open tag (ie: <p>etc</p>)
				if (end($html) === $matches[1]) array_pop($html);
				array_push($returnArray, $element);
			} else {
				// anything else (text, or a tag not listed above) counts as one word
				if ($element == ' ') continue;
				// stop before adding a word beyond the limit
				if ($m >= $limit) {
					$ellipsis = '&hellip;';
					break;
				}
				array_push($returnArray, $element);
				$m++;
			}
		}
		// convert remaining start tags to end tags, innermost first
		$tmpr = '';
		foreach (array_reverse($html) as $element) {
			$tmpr .= '</' . $element . '>';
		}
		// implode() takes the separator first; the reversed order fails on PHP 8
		return implode(' ', $returnArray) . $ellipsis . $tmpr;
	}

This limits Content to 100 words by default and appends closing tags for any elements left open at the cut-off. If the Content is longer than the limit, it adds an ellipsis at the end.
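For trying the idea outside SilverStripe, here is a minimal standalone sketch of the same technique: count words, track open tags on a stack, and close whatever is still open at the cut-off. The function name summariseHtml() and the short list of void elements are assumptions made for illustration; they are not part of SilverStripe or of the code above. Unlike the method above, it keeps the original whitespace instead of re-joining tokens with spaces.

```php
<?php
// Hypothetical standalone helper, not part of SilverStripe.
// Keeps at most $limit words of $html and closes tags left open at the cut.
function summariseHtml($html, $limit = 100) {
    $void = array('br', 'hr', 'img', 'input'); // assumed list: elements without a closing tag
    $open = array();  // stack of open tag names, innermost last
    $out = '';
    $words = 0;
    $ellipsis = '';
    // Split into tags, runs of whitespace and words, keeping every piece.
    $tokens = preg_split('/(<[^>]+>|\s+)/', $html, -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
    foreach ($tokens as $token) {
        if (preg_match('/^<\/([a-z][a-z0-9]*)\s*>$/i', $token, $m)) {
            // End tag: pop the stack only if it closes the innermost open tag.
            if (end($open) === strtolower($m[1])) {
                array_pop($open);
            }
        } elseif (preg_match('/^<([a-z][a-z0-9]*)\b[^>]*>$/i', $token, $m)) {
            // Start tag: remember it unless it is void or self-closing.
            $name = strtolower($m[1]);
            if (!in_array($name, $void) && substr($token, -2) !== '/>') {
                $open[] = $name;
            }
        } elseif (trim($token) !== '') {
            // A word: stop before exceeding the limit.
            if ($words >= $limit) {
                $ellipsis = '&hellip;';
                break;
            }
            $words++;
        }
        $out .= $token;
    }
    // Close whatever is still open, innermost first.
    $closing = '';
    foreach (array_reverse($open) as $name) {
        $closing .= '</' . $name . '>';
    }
    return rtrim($out) . $ellipsis . $closing;
}

echo summariseHtml('<p>this <br/>paragraph would normally get cut off</p>', 2), "\n";
// prints: <p>this <br/>paragraph&hellip;</p>
```

Closing tags in reverse order matters once elements are nested: <p>one <b>two three</b></p> cut at two words should end in </b></p>, not </p></b>.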

If you find any improvements, or this is already somewhere in HTMLText, let me know.

Benedikt

Community Member, 16 Posts

17 December 2009 at 12:27am

There is a ticket for this at: http://open.silverstripe.org/ticket/4048

I just reopened it and also linked to your workaround.

Thanks.

torleif2

Community Member, 6 Posts

23 November 2010 at 4:37pm

This version also handles <pre> elements. I noticed a bug while trying to integrate summarized text with GeSHi (the Generic Syntax Highlighter).

   public function getSummaryHTML($limit = 150) {
      $tags = '(h[1-6]|q|b|i|strong|em|pre|p)';
      $m = 0;
      $ellipsis = '';
      $returnArray = array();
      $html = array(); // stack of currently open tag names
      // split into tags, single spaces and words, keeping the delimiters
      $chars = preg_split('/(<[^>]*[^\/]>| )/i', $this->Content, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
      foreach ($chars as $element) {
         if (preg_match('/^<' . $tags . '\b[^>]*>$/', $element, $matches)) {
            // found start tag: convert <p class=""> to p
            array_push($html, $matches[1]);
            array_push($returnArray, $element);
         } else if (preg_match('/^<\/' . $tags . '\s*>$/', $element, $matches)) {
            // found end tag: pop only if it closes the innermost open tag (ie: <p>etc</p>)
            if (end($html) === $matches[1]) array_pop($html);
            array_push($returnArray, $element);
         } else {
            // anything else (text, or a tag not listed above) counts as one word
            if ($element == ' ') continue;
            // stop before adding a word beyond the limit
            if ($m >= $limit) {
               $ellipsis = '&hellip; <small><i>(text truncated)</i></small>';
               break;
            }
            array_push($returnArray, $element);
            $m++;
         }
      }
      // convert remaining start tags to end tags, innermost first
      $tmpr = '';
      foreach (array_reverse($html) as $element) {
         $tmpr .= '</' . $element . '>';
      }
      // return an HTMLText object rather than a plain string
      $v = new HTMLText();
      $v->setValue(implode(' ', $returnArray) . $ellipsis . $tmpr);
      return $v;
   }
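
For anyone wondering why <pre> needs special care in a tag-matching regex: a pattern of the form (p|...)(.*) happily matches <pre> as a p tag, because the alternation stops at the first alternative that works and (.*) swallows the rest of the name. The sketch below uses simplified patterns written for illustration (they are not the exact patterns from the posts) to show the effect and one possible guard, a \b word boundary after the tag name.

```php
<?php
// Simplified patterns for illustration only (not the exact ones from the posts).
$loose  = '/^<(p|pre)(.*)>$/';     // "p" is tried first and (.*) swallows "re"
$strict = '/^<(p|pre)\b[^>]*>$/';  // \b forces the tag name to end where the capture ends

preg_match($loose, '<pre>', $a);
preg_match($strict, '<pre>', $b);
preg_match($strict, '<p class="intro">', $c);

echo $a[1], "\n"; // p
echo $b[1], "\n"; // pre
echo $c[1], "\n"; // p
```

Listing pre before p in the alternation also fixes the <pre> case. The word boundary has the extra benefit of stopping <br> from being captured as b.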