TextMate grammars used for syntax highlighting can individually contain thousands of regexes and are, collectively, quite large. Their regexes can be optimized via minification that also improves their performance. An existing library, oniguruma-parser's Optimizer module, is made specifically for this -- it minifies Oniguruma regexes (the regex flavor used by TM grammars) without any change to what they match, and it applies automatic performance improvements to some regexes.
oniguruma-parser's optimizer has been battle-tested by the popular Shiki library, which recently started running all of its more than 220 included TM grammars through it (the underlying Oniguruma parser has been used by Shiki's JS engine for much longer). Shiki does so in tm-grammars, and tests that syntax highlighting results are identical for all grammars before and after optimization. The size reduction and performance improvements are significant for some grammars. E.g., it shaves more than 35,000 characters off of regexes in just the C++ grammar (which doesn't include any whitespace or comments). It also improves the C++ grammar's performance by significantly reducing the amount of backtracking needed by some very large, complex, and slow regexes (again, without any changes to what any of the regexes match). You can see how Shiki applies it to TM grammars here.
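To make the idea concrete, here is a minimal sketch of running an optimizer over every regex in a grammar. It is not Shiki's actual code. It assumes only the standard TM grammar shape, where regex sources live in the `match`, `begin`, `end`, and `while` fields. `optimizeGrammar`, `safeOptimize`, `OptimizeFn`, and `toyOptimize` are hypothetical names for illustration; in practice the injected function would wrap oniguruma-parser's optimizer.

```typescript
// Keys whose string values are Oniguruma regex sources in a TextMate grammar.
const REGEX_KEYS = new Set(["match", "begin", "end", "while"]);

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical signature: takes a regex source and returns an equivalent one.
type OptimizeFn = (pattern: string) => string;

// Keep the original pattern if the optimizer throws or does not shrink it.
function safeOptimize(pattern: string, optimize: OptimizeFn): string {
  try {
    const result = optimize(pattern);
    return result.length < pattern.length ? result : pattern;
  } catch {
    return pattern;
  }
}

// Recursively copy the grammar, optimizing only the regex-bearing fields.
function optimizeGrammar(node: Json, optimize: OptimizeFn): Json {
  if (Array.isArray(node)) {
    return node.map((child) => optimizeGrammar(child, optimize));
  }
  if (node !== null && typeof node === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(node)) {
      out[key] =
        REGEX_KEYS.has(key) && typeof value === "string"
          ? safeOptimize(value, optimize)
          : optimizeGrammar(value, optimize);
    }
    return out;
  }
  return node;
}

// Toy stand-in for a real optimizer (NOT safe for arbitrary patterns):
// unwraps a non-capturing group that contains a single word character.
const toyOptimize: OptimizeFn = (p) => p.replace(/\(\?:(\w)\)/g, "$1");

const demoGrammar: Json = {
  scopeName: "source.demo",
  patterns: [
    { name: "x", match: "(?:a)b+" },
    { begin: "(?:q)", end: "z", patterns: [{ include: "#r" }] },
  ],
  repository: { r: { match: "(?:c)" } },
};

const optimizedDemo = optimizeGrammar(demoGrammar, toyOptimize);
console.log(JSON.stringify(optimizedDemo, null, 2));
```

One caveat for a real integration: `end` and `while` patterns can refer back to captures from the matching `begin` pattern, so the optimizer has to accept backreferences to groups that are not defined in the same pattern. Falling back to the original pattern on any error, as `safeOptimize` does, keeps such cases from breaking a grammar.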
CCing @alexr00 since she's been handling TM grammar updates.