Over the past few days I have been working on a JavaScript-based, syntax-highlighting code editor to add to my JS library (known to some as the Juice Library).
I wanted to split each displayed line of code into a series of words with any spacing preserved, so I started off by splitting each line on any non-word character, “\W”.
var words = line.split(/\W/g);
This worked fine in Firefox, but in IE the results were not quite what I was expecting. All the words were matched, but all whitespace had been completely ignored, so instead I turned to using the word boundary “\b” as the delimiter.
var words = line.split(/\b/g);
This took me a step closer to what was required, but it was still not the desired result, as I didn’t want any non-word characters to be grouped with alphanumeric values, such as “123”.
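To make the difference between the two delimiters concrete, here is a small sketch of what each split returns in a standards-compliant engine. The sample line is made up for illustration, and the results shown are not what older versions of IE produced.

```javascript
// Hypothetical sample line, chosen only for illustration.
var sampleLine = 'x = "123";';

// Splitting on \W removes every non-word character and leaves an
// empty string wherever two delimiters sit next to each other.
var onNonWord = sampleLine.split(/\W/);
// ["x", "", "", "", "123", "", ""]

// Splitting on \b keeps every character, but runs of non-word
// characters stay glued together in a single chunk.
var onBoundary = sampleLine.split(/\b/);
// ["x", " = \"", "123", "\";"]
```

Neither result is a clean list of words and individual separators, which is why a different approach was needed.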
I spent a little, in fact too much, time browsing the web for some pointers and came across an article posted on the SitePoint blog outlining the inconsistencies of the String.prototype.split method across different browsers. It seemed to explain the problem I had run into earlier but unfortunately offered no resolution.
Now to try and find a solution to all this.
Because splitting on “\W” was almost correct, I went back to using that. Knowing what I had read over at SitePoint, I needed to somehow prevent the non-word characters from being dropped, without otherwise affecting the output.
One possibility that arose was to wrap each non-word character in the special null character “\0” and then split on that value. I tried a long-winded version of the following to see if it would take me any closer to the desired result, and simplified it afterwards.
var words = line.replace(/(\W)/g, '\0$1\0').split('\0');
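As a rough sketch of how the null-character trick might be wrapped up for reuse: the function names below are hypothetical and not part of the library, and the code assumes the input line never contains a NUL character itself. The first helper also drops the empty strings that appear wherever two non-word characters are adjacent, which is an extra step beyond the one-liner. The second helper is a different technique, matching tokens instead of splitting, shown only for comparison.

```javascript
// Hypothetical helper: wrap every non-word character in NUL markers,
// split on the marker, then discard the empty strings left between
// adjacent markers. Assumes the line contains no NUL characters.
function splitKeepingDelimiters(line) {
  var marked = line.replace(/(\W)/g, '\0$1\0');
  var parts = marked.split('\0');
  var tokens = [];
  for (var i = 0; i < parts.length; i++) {
    if (parts[i] !== '') {
      tokens.push(parts[i]);
    }
  }
  return tokens;
}

// Alternative (not the approach used in this post): collect the tokens
// directly with match(), avoiding split() altogether.
function matchTokens(line) {
  return line.match(/\w+|\W/g) || [];
}

var tokens = splitKeepingDelimiters('var x = 1;');
// ["var", " ", "x", " ", "=", " ", "1", ";"]
```

Because the split is done on a plain string rather than a regular expression, it sidesteps the regex-specific differences in split between browsers.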
The output from this was pretty much what was required in both Firefox and IE, though that does not ease my frustration at the time wasted researching and coming up with this solution.
I can now continue on with my syntax highlighter in peace, or at least until the next major cross browser issue arises.