Regular Expression parser for Synergy/DE
Sterling Camden
Yes, Richard, I did it – I created another problem for everyone. I’ve implemented a regex parser for Synergy/DE. And because it’s written in Synergy/DE (as opposed to, say, a C DLL), it’s portable between any number of Unices, Linux, Windows, and even OpenVMS – as long as you’re running Synergy/DE version 9.1.5 or above.
It performs pretty well, too – mostly because I started with an excellent (if limited) deterministic algorithm from Amer Gerzic. I must admit that extending this algorithm to track parenthesized sub-expressions was one of the greatest programming challenges I’ve faced in quite some time. It forced me to make the engine not quite as deterministic (there can now be more than one active state for each potential start of a match), but I completely avoided back-tracking – so performance suffers very little. I was also able to optimize for the case in which no sub-expressions were compiled.
The breakthrough occurred to me while I was waiting in the doctor’s office for a colonoscopy – during which they would administer the Milk of Amnesia to make me forget the procedure. I was so worried that it would wash away my brilliant algorithm that I asked my wife for a piece of paper so I could jot down some notes. Fortunately, though, the anesthetic seemed to have only a limited effect (I was awake throughout the procedure – ewwww) and I remembered everything about my algorithm afterwards.
The trick was to abstract the notion of inputs beyond just characters received. My implementation includes a transition for entering and exiting a potential sub-expression. Whenever we can move in such a direction, we create a fork in the road for our deterministic engine – but rather than trying them one at a time, we try them all at once. Each character thereafter collects the new states for each path we’re on until they dead-end or reach an accepting state.
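For readers who want to see the general shape of that idea, here is a minimal sketch in Python rather than Synergy/DE. It is not the code from this library: it uses the well-known textbook technique of running every live path in lockstep (often called a Pike-style simulation), with a hypothetical SAVE instruction standing in for the "enter sub-expression" and "exit sub-expression" transitions. The instruction names, the hand-compiled program for `(a+)(b*)c`, and the `match` function are all assumptions made for illustration.

```python
# Hypothetical instruction set for a tiny regex machine (illustration only).
CHAR, SPLIT, JMP, SAVE, MATCH = range(5)

# Hand-compiled program for (a+)(b*)c. SAVE n records the current position
# in capture slot n: slots 0/1 bracket group 1, slots 2/3 bracket group 2.
PROG = [
    (SAVE, 0),      # 0: enter group 1
    (CHAR, "a"),    # 1
    (SPLIT, 1, 3),  # 2: loop for another 'a', or move on
    (SAVE, 1),      # 3: exit group 1
    (SAVE, 2),      # 4: enter group 2
    (SPLIT, 6, 8),  # 5: take a 'b', or leave the loop
    (CHAR, "b"),    # 6
    (JMP, 5),       # 7
    (SAVE, 3),      # 8: exit group 2
    (CHAR, "c"),    # 9
    (MATCH,),       # 10
]

def add_thread(prog, threads, seen, pc, caps, pos):
    """Follow transitions that consume no character, forking at each SPLIT."""
    if pc in seen:          # a higher-priority path already reached this state
        return
    seen.add(pc)
    op = prog[pc]
    if op[0] == JMP:
        add_thread(prog, threads, seen, op[1], caps, pos)
    elif op[0] == SPLIT:    # the fork in the road: keep both branches alive
        add_thread(prog, threads, seen, op[1], caps, pos)
        add_thread(prog, threads, seen, op[2], caps, pos)
    elif op[0] == SAVE:     # entering or exiting a sub-expression
        slot = op[1]
        new_caps = caps[:slot] + (pos,) + caps[slot + 1:]
        add_thread(prog, threads, seen, pc + 1, new_caps, pos)
    else:                   # CHAR or MATCH: wait here for the next input
        threads.append((pc, caps))

def match(prog, ncaps, text):
    """Match anchored at position 0; return the capture slots or None."""
    threads = []
    add_thread(prog, threads, set(), 0, (None,) * ncaps, 0)
    matched = None
    for pos in range(len(text) + 1):
        next_threads, seen = [], set()
        for pc, caps in threads:
            op = prog[pc]
            if op[0] == MATCH:
                matched = caps
                break       # drop lower-priority paths; no backtracking needed
            if pos < len(text) and op[0] == CHAR and text[pos] == op[1]:
                add_thread(prog, next_threads, seen, pc + 1, caps, pos + 1)
        threads = next_threads
        if not threads:     # every path has dead-ended or been cut off
            break
    return matched
```

Each path carries its own copy of the capture positions, so the input is scanned only once. Calling `match(PROG, 4, "aabbc")` yields slots that put group 1 at `text[0:2]` and group 2 at `text[2:4]`.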
I used a similar approach for handling line anchors. Character classes also have more abstract transitions than just the characters they include, which reduces the number of paths we have to maintain in the state table – though it can increase the number of paths we’re following simultaneously.
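To make that trade-off concrete, here is another small Python sketch, again my own illustration rather than the library's actual data structures. It assumes a transition table keyed by (state, input), where an input may be a literal character, a class token such as a hypothetical `DIGIT`, or a zero-width anchor (`BOL`, `EOL`). The table below encodes `^(7|\d+-\d)$`; all names and the table layout are assumptions.

```python
BOL, EOL = "<bol>", "<eol>"   # zero-width anchor inputs (hypothetical names)
DIGIT = "<digit>"             # one table column standing in for ten characters

# Transition table for ^(7|\d+-\d)$ : (state, abstract input) -> next states
TABLE = {
    (0, BOL): {1},
    (1, "7"): {6},            # the literal alternative
    (1, DIGIT): {2},          # a single entry instead of ten per-digit entries
    (2, DIGIT): {2},
    (2, "-"): {3},
    (3, DIGIT): {4},
    (4, EOL): {5},
    (6, EOL): {5},
}
ACCEPTING = {5}

def abstract_inputs(ch):
    """Every table column that one concrete character can trigger."""
    symbols = {ch}
    if ch in "0123456789":
        symbols.add(DIGIT)
    return symbols

def follow(table, states, symbol):
    """All states reachable from `states` on one abstract input."""
    return {nxt for s in states for nxt in table.get((s, symbol), ())}

def accepts(table, line):
    """True if the whole line matches, following every live path at once."""
    states = {0}
    states |= follow(table, states, BOL)      # anchor fires at line start
    for ch in line:
        new_states = set()
        for symbol in abstract_inputs(ch):    # '7' is both "7" and DIGIT
            new_states |= follow(table, states, symbol)
        states = new_states
    states |= follow(table, states, EOL)      # anchor fires at line end
    return bool(states & ACCEPTING)
```

The table stays small because `DIGIT` is a single column, but the character `7` now triggers two inputs at once, so after reading it the engine is on two paths (states 6 and 2) until one of them dead-ends.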
Anyway, it works, it’s fast, and I’m happy. You can find full documentation here, and download the sources below.
I also added “match” and “replace” methods to Var, created mappers MapMatched(regex) and MapReplace(regex, with), and added some more useful methods to ls – all of which are included below. I haven’t documented these or updated the official downloads for Var and ls yet, because it will involve adding everything from Regex.
Speaking of which, I’m thinking of consolidating the downloads for all of my Synergy/DE version 9 stuff and organizing it into a single library with a single include file and a right proper makefile. What do you Synergistas think of that plan?
OK, I’ve given you regexen, now go save the day.
UPDATE: now merged into synthesis library.







