Saturday, February 7, 2015

Upcoming LZHAM decompressor optimizations

All of these changes are on github:

Over the previous week I've increased the decompressor's average speed by roughly 1.5%-2%, by enabling Huffman decoding acceleration tables on the distance LSB symbol table.

I'm currently testing several changes which decrease the time it takes for the decompressor to initialize, and increases avg. throughput on tiny (<10KB) streams:

  • The initial set of Huffman tables are very simple because the symbol frequencies are all 1's. The decompressor now quickly detects this case and bypasses the more general canonical Huffman codesize computation step.
  • The decompressor bypasses the construction of Huffman decoding acceleration tables near the beginning of the stream. It computes the approx. cost of constructing the accel. tables and decoding the next X symbols using those tables, verses just decoding the next X symbols without accel. tables, and chooses whatever is predicted to be cheapest. The Huff decode acceleration tables can be rather big (up to 2048 entries), so this is only a win at the beginning of streams (when the tables are rapidly updated).

x86 timing results on a 64 byte test file (file to file decompression, 64MB dictionary):

Before:
Total time including init, I/O, decompression: .146ms
Decompression only time: .040ms

After:
Total time including init, I/O, decompression: .120ms
Decompression only time: .020ms

x86 timing results on a 3087 byte test file:

Before:
Total: .269ms
Decomp only: .170ms

After:
Total: .230ms
Decomp only: .132ms

These optimizations should significantly reduce the time it takes to reinit() and restart decompression (hopefully around 30-50%). I'm going to profile the decompressor over the time it takes to startup and decode the first few KB next.


No comments:

Post a Comment