The whole idea is described in this topic: http://silverstripe.com/extending-hacking-silverstripe-forum/show/28535
The program is probably not ready to be presented yet (it will be open source, of course), but I have some screenshots to share with you :)
I have tested my program on one of my GHOP sites to check if it is formatting the markup and IDs correctly. Here are two screenshots:
Note that in the second screenshot the indentation has been corrected and the IDs now start with an uppercase letter. It should work fine with all UTF-8 sites.
peak memory usage: 1068kb
CPU time: 0.016s
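
For anyone curious what the ID correction mentioned above could look like, here is a minimal sketch in Python. It is not the program itself, whose source has not been posted; the function name `capitalize_ids` and the regex-based approach are assumptions made purely for illustration.

```python
import re

# Matches id="..." or id='...' attributes. Requiring whitespace before
# "id" keeps attributes such as data-id from being touched.
ID_ATTR = re.compile(r'''(\sid=)(["'])(.*?)\2''')

def capitalize_ids(markup):
    """Return markup with the first character of every id value uppercased."""
    def fix(match):
        prefix, quote, value = match.groups()
        # str.upper() is Unicode-aware, so non-ASCII first letters work too.
        return f"{prefix}{quote}{value[:1].upper()}{value[1:]}{quote}"
    return ID_ATTR.sub(fix, markup)

print(capitalize_ids('<div id="header"><p id="étoile">Hi</p></div>'))
# prints: <div id="Header"><p id="Étoile">Hi</p></div>
```

This sketch only rewrites the `id` attributes themselves. A real tool would also have to update anything that refers to those IDs, such as CSS selectors or `href="#..."` anchors.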