Say you have a SilverStripe website. Then along comes a bot probing your site for security vulnerabilities. This bot could be written by a Chinese hacker, the NSA or, most embarrassingly, by a security firm you hired to tell you if your website is vulnerable to attack.
The next thing you know: your server’s load goes through the roof; the server runs out of memory; and the website crashes, failing to respond to any requests until you do a hard reboot.
What just happened?
The inner life of security bots
Security scanning bots work by sending requests to your website, trying to detect potentially exploitable code. The bot has a long list of possible URLs that indicate your website is running xyz software. If xyz software is detected, the bot can try various exploits on that software. Since you are running a SilverStripe website, most of the probing tests result in 404 responses. There is, for example, no “/wp-admin” on your website.
SilverStripe and 404 Page Not Found requests
Here is what happens when you send a request to a SilverStripe site, and that request results in a “Page not found” response:
- The request goes to Apache, and Apache runs rules to check for a static file matching the request.
- Apache passes the request to SilverStripe.
- SilverStripe spins up, loads 1000s of files, checks for logged-in users, etc.
- SilverStripe checks for routes that match the request.
- SilverStripe checks for controllers that match the request.
- SilverStripe does a bunch of database queries to try and match the request to URLSegments of pages in the database.
- SilverStripe fetches the “Error page” from the database.
- SilverStripe asks the “Error page” to construct a response.
- The response object is rendered to HTML.
- The server passes the HTML response back to the web browser.
That whole process typically takes somewhere between 200 and 1000 ms, depending on the complexity and page count of your website.
If the bot has no request throttling built in, it will send a flood of probing requests to your site, all resulting in 404 responses. Each of these requests will trigger multiple database queries, tying up server threads and preventing your site from responding to genuine user requests. If too many requests come in all at once, then your server experiences a Denial of Service (DoS) attack and goes down in flames.
I wrote a simple script to simulate a bot scanning for security vulnerabilities. The script sequentially queries 880 potentially vulnerable URLs and reports how long the scan takes. The longer this mock scan takes, the worse your website will fare when targeted by a real scan. The results:
- Large fully-featured SilverStripe website with 2000 pages: 302 seconds
- Base SilverStripe installation: 138 seconds
- Large fully-featured SilverStripe website with Route Whitelist installed: 8 seconds
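For readers who want to try something similar, here is a minimal sketch of such a mock scan in Python. It is not the script used for the numbers above. The function names (fetch_status, time_scan) and the idea of passing in the list of paths are assumptions made for illustration, and you would need to supply your own list of probe URLs.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Tuple


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        return err.code


def time_scan(
    base_url: str,
    paths: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Tuple[float, Dict[int, int]]:
    """Request each path one after another; return (seconds, status counts)."""
    counts: Dict[int, int] = {}
    start = time.perf_counter()
    for path in paths:
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        status = fetch(url)
        counts[status] = counts.get(status, 0) + 1
    elapsed = time.perf_counter() - start
    return elapsed, counts


# Hypothetical usage against a site you own:
# elapsed, counts = time_scan("http://localhost", ["/wp-admin"])
# print(f"{elapsed:.1f} s", counts)
```

The requests are sent sequentially, as in the test described above, so the total time is roughly the sum of the individual response times. Dividing the reported totals by 880 requests gives an average of about 343 ms, 157 ms and 9 ms per request for the three setups.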
Now I bet you are wondering: what’s this Route Whitelist thing and how does it make the security scanning problem go away?
Route Whitelist is a SilverStripe module that generates a whitelist of potentially valid URLs. That is, a list of URLs that may result in a 200 response. Any request not matching a URL in the whitelist will definitely result in a 404.
It is nearly impossible to generate a list of every possible valid URL for a large complex site. So, instead, the Route Whitelist module generates a list containing only the top-level routes, the top-level pages and the controllers. The whitelist will include, for example:
The module then adds a check to the very first step of the SilverStripe page serving process. That is, it adds an Apache htaccess rule to immediately serve a 404 if a request does not match a URL in the whitelist. That’s why it’s so fast. It cuts out 9 unnecessary steps.
But it only includes top-level items. What about subURLs like: “/who-we-are/our-team”?
Route Whitelist will match only on the first segment of a URL. A request for an invalid subURL (/who-we-are/zyx-cheap-viagra) still gets processed in the slow traditional way.
But that’s okay. We don’t need to speed up every possible “Page Not Found” request. Speeding up 99.9% of 404-requests is plenty good enough. It’s a pragmatic compromise to make this module work.
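The first-segment check can be sketched in a few lines of Python. This only illustrates the matching logic. The module itself does the check with an Apache htaccess rule, and the function names, the sample route and controller names, and the choice to always let the homepage through are assumptions made for this example.

```python
from typing import FrozenSet, Iterable


def first_segment(path: str) -> str:
    """Return the first URL segment, ignoring any query string."""
    path = path.split("?", 1)[0]
    return path.strip("/").split("/", 1)[0]


def build_whitelist(
    routes: Iterable[str],
    pages: Iterable[str],
    controllers: Iterable[str],
) -> FrozenSet[str]:
    """Collect the first segment of every top-level route, page and controller."""
    return frozenset(
        first_segment(name)
        for group in (routes, pages, controllers)
        for name in group
    )


def is_possibly_valid(path: str, whitelist: FrozenSet[str]) -> bool:
    """True if the request should be passed on to the slow path.

    False means the request can be answered with a 404 straight away.
    Assumption: an empty first segment (the homepage) is always allowed.
    """
    segment = first_segment(path)
    return segment == "" or segment in whitelist


# Hypothetical sample data; only "who-we-are" comes from the article.
whitelist = build_whitelist(
    routes=["example-route"],
    pages=["who-we-are"],
    controllers=["ExampleController"],
)

print(is_possibly_valid("/wp-admin", whitelist))                     # False
print(is_possibly_valid("/who-we-are/our-team", whitelist))          # True
print(is_possibly_valid("/who-we-are/zyx-cheap-viagra", whitelist))  # True
```

The last call shows the compromise described above: an invalid subURL under a valid top-level page passes the check and is still handled the slow way.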
No fear of the big bad bot
In summary, by installing Route Whitelist you no longer need to fear that a security scanning bot might accidentally or intentionally take down your website.
If that sounds like something you’d like to have running on your website, then go ahead, download a copy of the Route Whitelist module.