Atom Feed SITE FEED   ADD TO GOOGLE READER

SortedList, now faster

Externally, Glazed Lists is all lists. No other ADTs - no heaps, trees or graphs - just lists. But internally, much is implemented using trees. For example, FilterList remembers which elements are filtered out using a custom AVL-tree called Barcode. SortedList maps indices using two instances of our custom IndexedTree class. And since version 1.6, ListEvents keep track of change blocks using yet another custom tree called BC2. Since each tree has different requirements we need different implementations:
  • FilterList's Barcode simply stores boolean 'filtered' or 'unfiltered' state and provides methods to convert between filtered and unfiltered indices.
  • SortedList's IndexedTree allows nodes to be inserted either by index or in sorted order using a comparator. Individual nodes in an otherwise sorted tree can be made unsorted in order to preserve the location of edited values.
  • ListEvent's BC2 Tree stores nodes in one of four states - inserted, updated, deleted and unchanged. In addition, it needs to offset between the 'before' tree and the 'after' tree. In the future we plan to store deleted elements in the nodes.


It's quite unpleasant to have 3 implementations of the same basic concept, especially since each is particularly complex. Plus, the different trees haven't been uniformly tested, optimized or even documented.

This problem has prompted me to spend the last week or so to unify our internal tree classes. The first step has been removing IndexedTree from SortedList, replacing it with an improved BC2 Tree. BC2 is our best and fastest tree but it needed sorting support to be used in SortedList. I added the required fields and logic and after some effort I had it working with SortedList.

But BC2 maintains state that SortedList doesn't need, which makes it slower than IndexedTree.

Since I'm sure Glazed Lists users wouldn't like to see a performance regression, I had a problem. Options:
  1. give up, go back to IndexedTree
  2. accept the performance regression
  3. cut and paste the BC2.java class and strip out the unneeded fields


Yikes. I decided to go with a modified version of option 3. If I were to create a one-time copy of BC2, each time I fixed a bug or optimized something in BC2, I'd need to do it twice. Yuck. Instead, I annotated the source of BC2 using M4 macros to designate which fields and methods are not needed by IndexedTree. Then I can run a quick ant script and generate the 'light' version of BC2.java whenever necessary*.


The result of all my hackery? SortedList is 5-10% faster than before.



* Why do the dynamic stuff at build time with hacks like M4 macros rather than runtime with elegant abstractions like interfaces? Interfaces are great but in this rare, performance-critical situation, the runtime cost of dynamic method invocation is a dealbreaker. I ran tests with Japex comparing both approaches, and even with Hotspot magic, the M4 solution is 20% faster.