

Sanitizing HTMLText input with whitelist



Community Member, 46 Posts

23 March 2011 at 12:09am

Edited: 23/03/2011 12:48am

I have a form with multiple textarea fields, some of which use the HTMLText type. For those I give users a very stripped-down version of TinyMCE (buttons for bold, italic, and setting links only).

If someone turned JavaScript off, though, they could type raw HTML straight into the field, and all of it would be accepted, including scripts.

The data from the forms is displayed on another page. So I guess there are two issues here: how to store it in the database, and how to display what is stored in the database.

I think I would probably be happy for people to write any kind of HTML, as long as everything in script tags is removed. So that would really be more of a blacklist. I am wondering how best to achieve that.

Has anybody come across this problem, or got an idea how best to approach it? Are there any built-in methods in SilverStripe for this? Or would it be enough to write some code that removes <script>..</script> from the input of those fields? [edit]: Of course it is more complicated than that, as I have just seen :( But do I really need to use HTMLPurifier?
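To illustrate why removing <script>..</script> alone is "more complicated than that", here is a minimal sketch in Python (chosen only for a compact, dependency-free demonstration; it is not SilverStripe or HTMLPurifier code). The function name strip_script_tags and the sample inputs are hypothetical. It applies a naive blacklist regex and shows inputs that still carry script after filtering.

```python
import re

# Naive blacklist: remove <script>...</script> blocks and nothing else.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_script_tags(html: str) -> str:
    """Hypothetical helper: delete script elements in a single pass."""
    return SCRIPT_BLOCK.sub("", html)


samples = [
    # Handled as intended: the script element is removed.
    "<b>hi</b><script>alert(1)</script>",
    # Not a script tag at all: an event-handler attribute runs script.
    '<img src="x" onerror="alert(1)">',
    # Script hidden in a link target.
    '<a href="javascript:alert(1)">click</a>',
    # Removing the inner match splices a new <script> tag together.
    "<scr<script></script>ipt>alert(1)</script>",
]

for sample in samples:
    print(repr(sample), "->", repr(strip_script_tags(sample)))
```

Only the first sample comes out clean. The second and third pass through unchanged, and the fourth turns into a working script element after the single-pass removal, which is the usual argument for allowing known-good markup instead of trying to list everything bad.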



Forum Moderator, 1899 Posts

24 March 2011 at 5:02am

I don't think you'd need more than a regular expression....

I tend not to use a regex if I don't understand it, so in this case I'd probably use one of the functions that a Google search turns up. There is quite often a lot of gold in the comments on the PHP doc pages...


Community Member, 46 Posts

24 March 2011 at 6:00am

Edited: 24/03/2011 6:01am

Thanks for this, Barry

I did implement HTMLPurifier in the end, though. I found that Andrew Short had used it in the rssconnector module, which gave me an idea of how to implement it. I put the HTMLPurifier code in a thirdparty directory under mysite/, then created a class with a function that creates the config and purifier objects. I can then call this whenever I have some input that I would like to be purified.

It seems to work quite well. It might be a bit heavy, but I really like the idea that only specific tags are let through, rather than certain tags being stripped out.
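For readers who want to see the whitelist idea itself, below is a rough sketch in Python using only the standard library. It is not HTMLPurifier and not the class described above; the names WhitelistSanitizer, sanitize, ALLOWED_TAGS and the rest are hypothetical, and the allowed set (bold, italic, links) is an assumption based on the TinyMCE buttons mentioned in the first post. A vetted library is still the safer choice for production input.

```python
import re
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed whitelist, mirroring the bold / italic / link buttons.
ALLOWED_TAGS = {"b", "strong", "i", "em", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = {"", "http", "https", "mailto"}  # "" means a relative URL
DROP_CONTENT = {"script", "style"}  # drop these tags and their text


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep its text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            # Remove whitespace/control characters before checking the scheme.
            value = re.sub(r"[\x00-\x20]", "", value)
            try:
                scheme = urlparse(value).scheme.lower()
            except ValueError:
                continue
            if scheme in SAFE_SCHEMES:
                kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html: str) -> str:
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(sanitize('<b onclick="x()">hi</b><script>alert(1)</script>'))
print(sanitize('<a href="javascript:alert(1)">x</a>'))
```

The design choice is the one described above: anything not explicitly named in the whitelist is dropped, so new attack vectors fail by default instead of needing a new blacklist rule. This sketch does not balance unclosed tags or handle every edge case a full purifier covers.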