Content Summary in valid HTML (LimitWordCount)


torleif2

Community Member, 4 Posts

24 November 2009 at 5:45pm

Here's an HTML-safe LimitWordCount() function. Often I want to limit the number of words in a content block, but retain the formatting.

eg:
Content = <p>this <br/>paragraph would normally get cut off</p>
getSummaryHTML (2) returns: <p>this <br/>paragraph</p>

For now this function goes in the model (it would be nicer in the controller, but for some reason that doesn't work). It would also be easy to put inside your own extended HTMLText class.

Code (in Page.php):

   /*
   * Limits Content to a number of words, but tries to keep the HTML valid:
   * tags that are still open at the cut are closed again at the end.
   */
   public function getSummaryHTML($limit = 100) {
      $tags = 'p|h1|h2|h3|h4|h5|h6|q|b|i|strong|em';
      $words = 0;
      $ellipsis = '';
      $output = array();
      $open = array(); // names of tags opened but not yet closed
      // split into tags and words; self-closing tags such as <br/> stay attached to the following word
      $tokens = preg_split('/(<[^>]*[^\/]>| )/i', $this->Content, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
      foreach ($tokens as $token) {
         if ($token == ' ') continue;
         if (preg_match('/^<(' . $tags . ')\b[^>]*>$/', $token, $matches)) {
            // found start tag; \b stops <br> from being tracked as <b>
            array_push($open, $matches[1]); // convert <p class=""> to p
            array_push($output, $token);
         } else if (preg_match('/^<\/(' . $tags . ')\s*>$/', $token, $matches)) {
            // found end tag; pop only if it closes the most recently opened tag
            if (end($open) === $matches[1]) array_pop($open);
            array_push($output, $token);
         } else {
            // a word (or an untracked tag, which counts as a word)
            if ($words >= $limit) {
               // limit reached and there is more text: stop before adding this word
               $ellipsis = '&hellip;';
               break;
            }
            array_push($output, $token);
            $words++;
         }
      }
      // convert the remaining start tags to end tags, innermost first
      $closing = '';
      foreach (array_reverse($open) as $tag) {
         $closing .= '</' . $tag . '>';
      }
      // separator comes first; implode($array, $separator) no longer works in PHP 8
      return implode(' ', $output) . $ellipsis . $closing;
   }

This will limit Content to 100 words by default, and close any tags left open by the cut. If the Content is longer than the limit, it adds an ellipsis on the end.
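
To see why the example at the top keeps `<br/>` attached to `paragraph`, it can help to look at the split step on its own. The sketch below reuses the pattern and flags from getSummaryHTML() on the example content; the variable names are only for illustration.

```php
<?php
// Same pattern and flags as in getSummaryHTML().
$content = '<p>this <br/>paragraph would normally get cut off</p>';
$tokens = preg_split(
    '/(<[^>]*[^\/]>| )/i',
    $content,
    -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);

// Drop the single-space delimiters, as the loop in getSummaryHTML() does.
$tokens = array_values(array_filter($tokens, function ($t) {
    return $t !== ' ';
}));

print_r($tokens);
// Tokens, in order:
// <p>, this, <br/>paragraph, would, normally, get, cut, off, </p>
```

Because the pattern refuses tags that end in `/>`, a self-closing tag is never split off as its own token. It travels with the next word and is counted as part of it.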

If you find any improvements, or this is already somewhere in HTMLText, let me know.

Benedikt

Community Member, 16 Posts

17 December 2009 at 12:27am

There is a ticket for this at: http://open.silverstripe.org/ticket/4048

I just reopened it and also linked to your workaround.

Thanks.

torleif2

Community Member, 4 Posts

23 November 2010 at 4:37pm

This version also handles <pre> elements. I noticed a bug while trying to integrate summarized text with GeSHi (the Generic Syntax Highlighter):

   public function getSummaryHTML($limit = 150) {
      // pre is listed before p so that <pre> is not read as <p>
      $tags = 'h1|h2|h3|h4|h5|h6|q|b|i|strong|em|pre|p';
      $words = 0;
      $ellipsis = '';
      $output = array();
      $open = array(); // names of tags opened but not yet closed
      // split into tags and words; self-closing tags such as <br/> stay attached to the following word
      $tokens = preg_split('/(<[^>]*[^\/]>| )/i', $this->Content, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
      foreach ($tokens as $token) {
         if ($token == ' ') continue;
         if (preg_match('/^<(' . $tags . ')\b[^>]*>$/', $token, $matches)) {
            // found start tag; \b stops <br> from being tracked as <b>
            array_push($open, $matches[1]); // convert <p class=""> to p
            array_push($output, $token);
         } else if (preg_match('/^<\/(' . $tags . ')\s*>$/', $token, $matches)) {
            // found end tag; pop only if it closes the most recently opened tag
            if (end($open) === $matches[1]) array_pop($open);
            array_push($output, $token);
         } else {
            // a word (or an untracked tag, which counts as a word)
            if ($words >= $limit) {
               // limit reached and there is more text: stop before adding this word
               $ellipsis = '&hellip; <small><i>(text truncated)</i></small>';
               break;
            }
            array_push($output, $token);
            $words++;
         }
      }
      // convert the remaining start tags to end tags, innermost first
      $closing = '';
      foreach (array_reverse($open) as $tag) {
         $closing .= '</' . $tag . '>';
      }
      // return an HTMLText field so the template does not escape the markup
      $v = new HTMLText();
      $v->setValue(implode(' ', $output) . $ellipsis . $closing);
      return $v;
   }
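
The order of the tag alternation matters here: with `p` tried before `pre`, the trailing `(.*)` lets `<pre>` match as `p`, which is why `pre` is listed ahead of `p` above. The sketch below compares cut-down versions of the pattern. `matched_tag()` is a hypothetical helper for illustration only. The word-boundary variant is an assumption, one possible way to get the same result without relying on order.

```php
<?php
// Hypothetical helper: returns the captured tag name, or null when nothing matches.
function matched_tag($pattern, $token) {
    return preg_match($pattern, $token, $m) ? $m[1] : null;
}

$pFirst   = '/^<(p|b)(.*)>$/';          // cut-down form of the first version
$preFirst = '/^<(b|pre|p)(.*)>$/';      // "pre" tried before "p"
$bounded  = '/^<(p|b|pre)\b[^>]*>$/';   // assumption: word boundary after the name

var_dump(matched_tag($pFirst, '<pre>'));          // string(1) "p"   (wrong tag)
var_dump(matched_tag($preFirst, '<pre>'));        // string(3) "pre"
var_dump(matched_tag($preFirst, '<br>'));         // string(1) "b"   (still wrong)
var_dump(matched_tag($bounded, '<pre>'));         // string(3) "pre"
var_dump(matched_tag($bounded, '<br>'));          // NULL
var_dump(matched_tag($bounded, '<p class="x">')); // string(1) "p"
```

Reordering fixes `<pre>`, but a plain `<br>` is still tracked as `b` and would get a stray `</b>` appended. The boundary check avoids both cases.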