First of all, I'm not sure whether this is the correct forum section for this question, or whether I should start a discussion about it on GitHub instead.
We have a website that uses SilverStripe's Diff class to show changes between content versions in the frontend, for users who need to see the history of the content provided by the website. However, sometimes the differences between content versions are meaningless.
For example, a paragraph might have a class called "left" in the old version, while in the new version the class name has been replaced with "center". The content of the paragraph stays the same. So, <p class="left">Hello World!</p> becomes <p class="center">Hello World!</p>. Literally speaking, the paragraph _has_ changed, and the Diff class notices the change, but there is no real change in the _content_. Metadata changes are not important for the users of this website.
In fact, we have already modified the class slightly to change a few things. For example, it now ignores certain tags that are only used for styling (i.e. <strong> and <em>), so adding or removing those tags does not register as a change in the content. We have also made it ignore whitespace changes.
We have also tried to make it ignore attributes, but I now see that this doesn't always work. I'm not even 100% sure that our other modifications work every time, but at least they usually do. Our customised class might be a bit hard to read as it's not the cleanest code, so I would first like to ask for some general advice on how you would approach this. I don't think I need precise step-by-step instructions, but I should say that I don't know how the diff algorithm works internally.
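To make the idea more concrete, here is a rough sketch of what I mean by ignoring styling tags, attributes and whitespace: normalise both versions first, and only then compare or diff them. It is written in Python purely to keep it short and self-contained. It is not our customised class and not how SilverStripe's Diff works, and the names `normalise`, `Normaliser` and `STYLING_TAGS` are made up for illustration.

```python
import html
import re
from html.parser import HTMLParser

# Hypothetical list of tags treated as styling-only (taken from the examples above).
STYLING_TAGS = {"strong", "em"}


class Normaliser(HTMLParser):
    """Rebuilds the markup without attributes and without styling-only tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped on purpose: only the bare tag name is kept.
        if tag not in STYLING_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in STYLING_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so that "<" or "&" inside the content stays valid markup.
        self.parts.append(html.escape(data, quote=False))


def normalise(markup):
    """Return a canonical form of the markup for comparison purposes."""
    parser = Normaliser()
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace so whitespace-only edits disappear.
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


old = normalise('<p class="left">Hello World!</p>')
new = normalise('<p class="center">Hello <strong>World</strong>!</p>')
print(old == new)
```

With both versions reduced to the same canonical form, only real content changes would be left for the diff to report. The open question for me is whether this normalisation is better done before the Diff class sees the content, or inside it.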
Of course, I can show the customised code if needed. And if it's useful (or becomes useful after more development), perhaps it could be merged into the core framework or turned into a module. I understand that the current Diff works as intended, so if I ever suggest changes to it, they shouldn't affect the default behaviour. Instead, they should be something that developers can switch on when they use the Diff class for their own purposes.
Thank you for your support!