Each week we're interviewing someone who has made a major contribution to our upcoming release of SilverStripe CMS, version 2.4.
This week we're discussing a hotly desired feature, hierarchical URLs, with the person behind the feature, Andrew Short. Based in Australia, Andrew is a member of our open source community who has worked with SilverStripe CMS for several years. The feature Andrew is discussing here is also colloquially known as "nested URLs."
First off, describe nested URLs to the layman, and why they're so cool.
In a nutshell, the nested URLs feature means that your site URLs now reflect the actual parent/child page structure of your site. In SilverStripe 2.3 and earlier, all page URLs sat at the top level, regardless of whether the pages were nested under other pages. In 2.4, however, the URLs map directly to the chain of parent and child pages. The table below shows a quick summary of what these changes mean for your site:
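As a rough illustration of that mapping, the standalone sketch below builds both the flat and the nested URL for each page in a small, made-up page tree. The page data and the flatLink() and nestedLink() function names are hypothetical and are not part of SilverStripe; the sketch only mirrors the behaviour described above.

```php
<?php
// Hypothetical page tree: each page has a URL segment and a parent ID (0 = top level).
$pages = array(
    1 => array('segment' => 'store', 'parent' => 0),
    2 => array('segment' => 'shoes', 'parent' => 1),
    3 => array('segment' => 'boots', 'parent' => 2),
);

// Flat mode (2.3 and earlier): only the page's own segment appears in the URL.
function flatLink($pages, $id) {
    return '/' . $pages[$id]['segment'] . '/';
}

// Nested mode (2.4): walk up the parent chain and prepend each ancestor's segment.
function nestedLink($pages, $id) {
    $segments = array();
    while ($id !== 0) {
        array_unshift($segments, $pages[$id]['segment']);
        $id = $pages[$id]['parent'];
    }
    return '/' . implode('/', $segments) . '/';
}

// Print the flat and nested form of every page side by side.
foreach (array_keys($pages) as $id) {
    echo flatLink($pages, $id), '  =>  ', nestedLink($pages, $id), "\n";
}
```

In this example the "boots" page moves from /boots/ in flat mode to /store/shoes/boots/ in nested mode, while the top-level "store" page keeps the same URL in both.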
Although this does have the effect of adding more characters to URLs, it prevents URL "shortages" and makes the URLs far more descriptive. For example, if you visit example.com/store/shoes, you could then expect to be able to visit example.com/store and view all the categories. This is especially useful for large sites with lots of pages in nested categories and sections. A good example of where this is useful is the Metlink Melbourne website, one of the first websites to use this feature.
Cool. So what led you to begin writing this feature and, perhaps more importantly, sticking with it and taking it to completion?
Quite a while ago, I was working on a client's relatively large ecommerce website that had lots of similarly named products. This led to URLs like site.com/product-name-12, which were not only ugly, but the myriad of similarly named URLs became quite confusing. The client also raised concerns with the URL structure being suboptimal for SEO purposes. I created a really quick and hacky patch that added half-baked nested URL support to the site, and then posted the patch on an old website of mine and just let it sit. Although I never really advertised the patch, it received quite a bit of attention and I realised just how many people wanted a proper implementation of nested URLs.
I decided that I might as well take the time to do it properly - it was an interesting technical challenge and it felt good to give something back to the community. The development took a very long time - just over a year from start to finish - since I only worked on it in my spare time. I think the fact that I could work at my own pace with no pressure made it easier to stick with the project and get it done over the long run. I also had a lot of advice and support from the SilverStripe core team, mainly Sam Minnée, which was a huge help.
Now, delving a little into how this feature works: does it maintain internal links when you alter page URLs?
Yes. Prior to SilverStripe 2.4, links to pages were inserted into content fields as plain HTML links. Although there were systems to track these links and update them when pages were moved, they were somewhat buggy. As such, if you inserted a link and then renamed the page, the internal link could break.
A new feature added in SilverStripe 2.4 is shortcodes - an API for replacing simple BBCode-like tags in HTMLText and HTMLVarchar fields with dynamically generated content. The syntax is very similar to WordPress shortcodes and is documented at the ShortCodes developer wiki page. Basically, when you insert a link into a Content field, it inserts an href attribute containing a shortcode that refers to the page by ID, such as [sitetree_link id=12] (the ID here is only an example). When rendered into a template, this is replaced with a plain HTML link such as /software/framework/features/. Because the link points to a constant ID rather than a volatile URL, it will still be generated correctly when you move the page around the site structure or change any of its fields. People who visit the old URL of a page are also redirected to the new URL in most cases.
If you delete a linked page altogether, the link is still replaced at runtime with a link to the latest version of the page before it was deleted - which when visited just generates a plain 404 error, as is expected.
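The substitution step can be sketched in a few lines of standalone PHP. This is a simplified illustration, not SilverStripe's own parser: it assumes the [sitetree_link id=N] tag form used by SilverStripe 2.4, and the $links lookup table and the replaceLinkShortcodes() function are hypothetical stand-ins for the database lookup the CMS performs.

```php
<?php
// Hypothetical lookup of page ID => current URL; the CMS would read this from the database.
$links = array(12 => '/software/framework/features/');

// Replace each [sitetree_link id=N] shortcode with the current link for page N.
function replaceLinkShortcodes($content, $links) {
    return preg_replace_callback(
        '/\[sitetree_link id=(\d+)\]/',
        function ($match) use ($links) {
            $id = (int) $match[1];
            // Unknown IDs are left untouched in this sketch.
            return isset($links[$id]) ? $links[$id] : $match[0];
        },
        $content
    );
}

// The stored content holds the shortcode, not the URL.
$stored = '<a href="[sitetree_link id=12]">Features</a>';
echo replaceLinkShortcodes($stored, $links), "\n";
```

Because the stored content only holds the ID, changing the entry for page 12 in the lookup table changes every rendered link to that page without touching the content itself.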
How do you turn nested URLs on and off, and what are the ramifications of migrating a site to nested URL mode?
By default, a fresh install of 2.4 will have nested URLs turned on. You can enable nested URLs on an existing site by adding this line to your _config.php file:
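The call in question is the enable_nested_urls() method mentioned below. Assuming it is invoked statically on the SiteTree class, as in the SilverStripe 2.4 API, the line would look like this:

```php
<?php
// _config.php
// Turn on hierarchical (nested) URLs for this site.
SiteTree::enable_nested_urls();
```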
Note that nested URLs are disabled unless the enabling method is explicitly called, in order to maximise backwards compatibility, so simply removing the call to enable_nested_urls() from the config file turns them off again. Once nested URLs are enabled, you need to visit yoursite.com/dev/tasks/MigrateSiteTreeLinkingTask to rewrite all the existing links inserted into content fields into shortcodes. If you fail to do this, most of your internal links will no longer point to the right location because, for example, site.com/shoes might have moved to site.com/products/shoes. However, until you add a new page at site.com/shoes, SilverStripe's "404 Page Not Found" logic will automatically redirect you from site.com/shoes to site.com/products/shoes. This reduces the problem of external inbound links resulting in pages not being found.
For most sites, adding the nested URLs method call and running the linking migration task should be all that needs to be done - nested URLs is designed to be as backwards compatible as possible. You may run into a couple of problems on more complex sites, but unfortunately this is often the case when upgrading major versions. Refer to the upgrading notes on the wiki for more information on migrating a site.
What happens if you switch a site from nested to flat mode?
SilverStripe 2.4 robustly supports switching from nested URLs back to flat URLs, but you need to deal with URL conflicts. In other words, if you have two pages on your website like site.com/products/shoes and site.com/offers/shoes, they will both end up being site.com/shoes, and the behaviour when you visit that URL is undefined. To automatically resolve URL conflicts you need to visit yoursite.com/SiteTreeMaintenanceTask/makelinksunique. Using the example just given, this would rename one of these pages, so that you end up with site.com/shoes and site.com/shoes-2. The last thing to point out is that, because SilverStripe 2.4 now records unique IDs when managing internal links, the internal links will still all point to a valid URL when you switch off nested URLs. In other words, you don't need to perform any task to maintain the integrity of these links.
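The renaming step can be pictured with the standalone sketch below. It is not the code behind the maintenance task, just one plausible way to get the shoes / shoes-2 result described above; the makeSegmentsUnique() function and the page IDs are hypothetical.

```php
<?php
// Give each page a unique URL segment by appending -2, -3, ... to later duplicates.
// $segments maps hypothetical page IDs to their URL segments.
function makeSegmentsUnique($segments) {
    $taken = array();
    $result = array();
    foreach ($segments as $id => $segment) {
        $candidate = $segment;
        $suffix = 2;
        // Keep counting up until the candidate is not already in use.
        while (isset($taken[$candidate])) {
            $candidate = $segment . '-' . $suffix;
            $suffix++;
        }
        $taken[$candidate] = true;
        $result[$id] = $candidate;
    }
    return $result;
}

// products/shoes and offers/shoes both collapse to "shoes" in flat mode.
print_r(makeSegmentsUnique(array(5 => 'shoes', 9 => 'shoes', 11 => 'boots')));
```

Here the first "shoes" page keeps its segment, the second becomes "shoes-2", and "boots" is left alone because it never conflicted.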
What else is important for developers to know about nested URLs?
There are many changes to other parts of SilverStripe in order to support nested URLs. You will be able to read about them in the SilverStripe 2.4 changelog when it is finalised. But some of the important ones are:
- URLSegment can no longer be used for linking: Some people have been using $URLSegment in their templates to generate links. Although this has always been bad practice, as of 2.4 this will no longer work properly for nested pages. Instead you should be using $Link.
- You must explicitly define actions: Prior to 2.4 you could run any action on a page without defining it. As of 2.4, for an action to be accessible you must either define an entry in your controller's allowed_actions definition (the preferred way) or, if your controller does not have a list of allowed actions, give it a method with the same name as the action. Otherwise, if you attempt to visit site.com/page/action and the action doesn't exist, a 404 error will be returned.
- You need to cast data being passed into the template: The shortcode parser relies on template data being cast to HTMLText or HTMLVarchar objects in order to perform substitution. This means that if you have an action that returns something like array('Content' => $this->CustomContentField), and CustomContentField contains a shortcode, it won't get parsed. Instead you need to do array('Content' => $this->obj('CustomContentField')).
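The last two points can be combined in one short controller sketch. It is a fragment that relies on SilverStripe 2.4's ContentController class and will not run outside the framework; the Page_Controller class and the "special" action are hypothetical names chosen for illustration, while allowed_actions, obj() and CustomContentField come from the points above.

```php
<?php
// Fragment for a page type's controller; depends on SilverStripe 2.4 and is not standalone.
class Page_Controller extends ContentController {

    // Only actions listed here can be reached via site.com/page/<action>.
    public static $allowed_actions = array('special');

    // Hypothetical action, reachable at site.com/page/special.
    public function special() {
        return array(
            // obj() returns the field as a cast object, so any shortcodes in it get parsed.
            'Content' => $this->obj('CustomContentField'),
        );
    }
}
```

Listing the action explicitly keeps the set of publicly reachable URLs visible in one place, which is presumably why the allowed_actions route is the preferred one.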