Migrating a Site to Silverstripe

What you need to know when migrating your existing site to SilverStripe.

Moderators: martimiz, Sean, biapar, Willr, Ingo, swaiba, simon_w

  • suntrop
    Community Member
    131 Posts

    StaticSiteImporter previews all HTML code

    Hi there.
    I want to turn a static website (about 150 pages) into an SS site. I installed SS to test the StaticSiteImporter (latest trunk), and it works, but I don't know whether it behaves as it should.

    When I check "Preview the content that will be extracted" I get many pages listed with the extracted content.
    But the content looks like:

    <?xml version="1.0" encoding="utf-8"?>
    <!-- ra -->
    <!DOCTYPE html ....
    all HTML code

    } catch(err) {}
    //]]>
    </script><!-- InstanceEnd -->
    </body>
    </html>

    As I understand this feature, it should show me only the extracted content, right?

    This is what I defined in mysite/_config.php:

    StaticImporter::set_url("www.example.com/");
    StaticImporter::set_allowed_extensions(array('php','html','jpg','pdf'));
    StaticImporter::set_rules(
          array(
             // Default rules for all other URLs
             'conditions' => array(),
             'fields' => array(
                'Title' => array(
                   'xpath' => array(
                      '//h1'
                   ),
                   'exclusive' => 1
                ),
                'Hierarchy' => array(
                   'xpath' => array(
                      '//h2[contains(@class, "location")]/a/@href',
                   ),
                   'exclusive' => 1
                ),
                'Content' => array(
                   'xpath' => '//div[contains(@id, "content")]',
                   'includeMatchedTag' => 0
                )
             ),
             'exclusive' => 1
          )
       );

    Targeted website: http://bit.ly/b5yrHV

    Is the XPath wrong, or can I go ahead with the import anyway?

  • suntrop
    Community Member
    131 Posts

    Re: StaticSiteImporter previews all HTML code

    After consulting various forums, it seems the XPath expressions (all 50 ;)) aren't the problem.

    What could be the problem? Why do I get the whole DOM displayed in the very, very tiny textarea?
    In which _config.php do I have to put the StaticImporter::set_rules() call?

    I'd appreciate any help I can get with this!

  • suntrop
    Community Member
    131 Posts

    Re: StaticSiteImporter previews all HTML code

    I made some changes, reverted them, and made others, but nothing works.
    So I clicked to insert everything into the database, but unfortunately that doesn't work either.
    After one page it exits with a "website error" message. The site tree has a new item, ImportedFiles > Folder > Filename, but its content is empty. The page name is almost correct.

  • Bambii7
    Community Member
    254 Posts

    Re: StaticSiteImporter previews all HTML code

    Hmmmm, I haven't used it before, though I've read about it. For 150 pages I'd sit down for a few hours and copy and paste them all, if only to guarantee the formatting of the pages.
    I think you'll need to target the content div or container. From memory, there was a way of doing this. Otherwise it has no way of knowing what is content and what is a sidebar or menu element.
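
One way to narrow the problem down is to check, outside the importer, whether the XPath expressions from the first post match only the content container. The following is a minimal sketch using PHP's built-in DOMDocument and DOMXPath classes (the dom extension). The sample markup is a hypothetical stand-in for a page of the target site, not its real HTML, and the inner-HTML loop is only an assumption about what 'includeMatchedTag' => 0 is meant to produce. It does not use or reproduce the StaticSiteImporter API.

```php
<?php
// Hypothetical sample page; replace with the HTML of a real page from the site.
$html = '<html><body>'
      . '<h1>Page title</h1>'
      . '<div id="sidebar"><p>Menu</p></div>'
      . '<div id="content"><p>First paragraph.</p><p>Second paragraph.</p></div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The same expressions as in the set_rules() call from the first post.
$title   = $xpath->query('//h1');
$content = $xpath->query('//div[contains(@id, "content")]');

echo "Title matches: ", $title->length, "\n";
echo "Content matches: ", $content->length, "\n";

// Build the inner HTML of each matched node, i.e. the children
// without the matched <div> wrapper itself.
$inner = '';
foreach ($content as $node) {
    foreach ($node->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}
echo $inner, "\n";
```

If the expressions match the expected nodes here but the importer preview still shows the whole document, the rules are probably not being applied at all, which would point at where or how set_rules() is called rather than at the XPath itself.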
