I have just released version 2.0.2 of the Synthesis library for Synergy/DE 9.3 and above. You can download the library sources using the link at the bottom of this page. This version includes a few minor corrections, and the following enhancements:
Regular Expressions (Regex class)
You can now back-reference a captured group within the pattern itself. This is particularly useful for matching pairs of items, like quotes or HTML tags. For instance,
R$("/(['""]).*\1/") matches a quoted string, either in single or double quotes, but not mixed. Note that the duplicate " in the character class is merely used to insert a quote into a quoted string in Synergy/DE. For repeating groups, the back-reference uses the most recently encountered value. If a reference is made to a group that has not yet been captured, then a zero-length string is used.
If you want to using grouping but don’t want it numbered among potential back-references, you can now enclose it within
(?: ). That can be useful when you need more than 9 groupings, because you can’t have more than 9 back-references (at least, not until I implement named groups).
You can now insert a comment in your regex even if you aren’t using Extended syntax mode, by enclosing it within
Lists (ls class)
While tuning a QuickSort algorithm in C for a client, it occurred to me to try some of the same optimizations on the quicksort method of ls. Changing the algorithm to use the “collapse the walls” approach and falling back to an insertion sort when the list contains fewer than five elements both improved performance significantly. I experimented with trying to optimize the pivot value using both a median of 3 and a pseudomedian of 9, but neither of those approaches yielded any benefit. I think that the overhead of invoking the compare test generally outweighs the benefit of using a better pivot value, extreme cases excepted.
Then it occurred to me to take another look at my mergesort method to see if that could be tuned at all. Sure enough, I figured out how to cut down the number of copy operations by about half. This improved performance dramatically (as you might expect), and made mergesort faster than quicksort when the list contains about 100,000 elements or more (down from 900,000 in the previous version, even with the improvements to quicksort). Below 100,000 elements, quicksort still wins consistently.
So, I decided to make the sort method into a hybrid. It chooses quicksort for lists of less than 100,000 items. For more than that, it starts out as a mergesort, but for any segment that contains less than 100,000 items it switches to quicksort. Since quicksort switches to Insertion sort on less than 5 items, we get a combination of three different sort algorithms, depending on the size of the set being sorted. This hybrid performs comparably to quicksort up to 100,000 (as you would expect), but beyond that point it outperforms both mergesort and quicksort by taking advantage of the strengths of both approaches.
Of course, MergeSort and QuickSort bring other considerations to the table besides performance. MergeSort uses O(N) additional memory, while QuickSort does everything in place. However, MergeSort is stable (meaning, the order of duplicates is preserved), while my new quicksort algorithm is not. Since the sort method combines these approaches, it is neither stable nor lean on memory, but it performs best. So, depending on your needs, you can choose which sort to use:
- Fast: sort
- Lean: quicksort
- Stable: mergesort
I improved the performance of the CodeCoverage class by caching the source file object, rather than always looking it up in the associative list. That benefits sampling of executables that stay in one source file for a number of lines at a time. I also improved the code coverage for most of the test routines.