- Rewrite the parsing engine to use DOM instead of resetting the HTML.
- Easier customisation of search engines.
Previously, highlighting was done by grabbing the entire HTML inside the DOM element, running a regular expression substitution over it, and then resetting the DOM element with the resulting HTML. This approach had some issues, especially in correctly identifying the text, and it relied on some regexp magic that behaves differently across browsers.
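To make the "identifying the text" problem concrete, here is a minimal sketch of what a substitution over the raw HTML string can do. This is a hypothetical reconstruction for illustration, not the original SE-Hilite code; the function name `highlightViaHtml` and the `hilite` class name are assumptions.

```javascript
// Hypothetical reconstruction of the HTML-string approach (not the original code).
// Assumes `word` contains no regular expression metacharacters.
function highlightViaHtml(html, word) {
  const pattern = new RegExp(word, 'gi');
  return html.replace(pattern, (m) => '<span class="hilite">' + m + '</span>');
}

const before = '<a href="/dom/core" title="DOM Core">DOM Core</a>';
const after = highlightViaHtml(before, 'dom');
// The regexp cannot tell text from markup, so the href and title attribute
// values receive <span> markup as well as the link text, which breaks the tag.
```

In a browser this would be applied as `el.innerHTML = highlightViaHtml(el.innerHTML, word)`, which is where the corrupted markup would end up being parsed.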
In SE-Hilite 1.3 this has been changed to use DOM functionality exclusively. A DOM walker has been implemented to iterate through all text nodes and create and insert <span/> elements for matches. It also makes it possible to pause in the middle of a DOM walk, hand control back to the browser, and then resume the walk later, so that highlighting large documents can be done progressively without locking up the browser.
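A pausable walk of this kind could look roughly like the sketch below. It is not the SE-Hilite source: the function names, the `hilite` class, the slice size, and the use of `setTimeout` to yield to the browser are all assumptions made for illustration. Matching is case-sensitive to keep the example short, and `root` is assumed to be an element.

```javascript
// Hypothetical sketch of a pausable DOM walk; names are illustrative only.
const TEXT_NODE = 3;

// Next node after `node` in document order, without descending into it.
function nextSkippingChildren(node, root) {
  while (node && node !== root) {
    if (node.nextSibling) return node.nextSibling;
    node = node.parentNode;
  }
  return null;
}

// Next node in document order, descending into children first.
function nextInOrder(node, root) {
  return node.firstChild || nextSkippingChildren(node, root);
}

// Wrap the first occurrence of `word` in a <span>; return the span or null.
function wrapFirstMatch(textNode, word) {
  const index = textNode.nodeValue.indexOf(word); // case-sensitive for brevity
  if (index < 0) return null;
  const match = textNode.splitText(index); // match plus trailing text
  match.splitText(word.length);            // cut the trailing text off
  const span = textNode.ownerDocument.createElement('span');
  span.className = 'hilite';
  match.parentNode.replaceChild(span, match);
  span.appendChild(match);
  return span;
}

// Visit at most `sliceSize` nodes, then yield via `schedule` and resume later.
function highlightInSlices(root, word, sliceSize = 100,
                           schedule = (fn) => setTimeout(fn, 0),
                           done = () => {}) {
  if (!word) { done(); return; }
  let node = root; // the walk position survives between slices
  function slice() {
    let budget = sliceSize;
    while (node && budget-- > 0) {
      const span = node.nodeType === TEXT_NODE ? wrapFirstMatch(node, word) : null;
      // After a match, skip the new span and continue with the text that follows it.
      node = span ? nextSkippingChildren(span, root) : nextInOrder(node, root);
    }
    if (node) schedule(slice); else done();
  }
  slice();
}
```

In a page this might be started with `highlightInSlices(document.body, 'DOM')`. Keeping the current node in a closure is what lets each slice pick up exactly where the previous one stopped.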
Performance-wise, a DOM walk is also faster than resetting the innerHTML of a DOM element. The benchmarking document I used was W3C's Document Object Model Core at 436 KB, highlighting 4 keywords. Under Firefox 1.5beta1/Win32, both methods were on par, finishing the highlighting in around 1.6 seconds on a 1.7 GHz Pentium M. Under Internet Explorer 6 SP1, however, SE-Hilite 1.3 finished the test in 3.1 seconds, while the old innerHTML method hung the browser completely.
I have yet to test the new code on Safari 1.2/2.0, but I think it should work, as Safari supports DOM Level 2.