Huffman JavaScript Compression

Huffman encoding is based on the principle that letters that appear more frequently should have a smaller code than the ones that are used less often. So, in the English language, vowels would be used more than the letter 'z', and would get shorter codes. This javascript-based compression example uses this method to compress whatever you give it. It can work on web pages, javascript code, and tons more. The downfall is that it is extremely slow. It also re-encodes the binary data in a method similar to UUEncode, which inflates 3 bytes of binary data to 4 bytes of textual data, so some of the awesome compression that is possible will be eliminated from the expansion of data.

Here are my thoughts on this experiment:

  • It works!
  • The decompression code really isn't all that big.
  • The decompression code isn't all that slow either.
  • Re-encoding the binary data (33% increase) negates the compression savings.
  • JavaScript really shouldn't be used for compressing data (too slow)

Test it out for yourself. Insert web pages, javascript, or just simple text and then press the button. It can take quite a while (15k of a web page takes minutes on my computer). The advantages of having it run all client-side in JavaScript are that it is all client-side (you don't send any confidential data to my server) and it's quick to write. Start with small files and work your way larger.

I have a feeling that this would work great if you performed the compression across all of your pages if you could generate the "l" array based on the letter frequency of all of your web pages, and then put the decompression code and the "l" array in an external .js file. You could do this with only moderate difficulty with this script -- just edit the code and make a static Letters[] array.

Keep in mind that when you see a 0% compression ratio, I still removed two bits per byte. Because I need to re-encode the binary data back into some sort of text format, those savings are lost. If it wasn't for JavaScript and the need to represent my binary data as text, the savings would be a lot more obvious.


View results of the compression in a popup window.


No piece of paper can be folded in half more than 7 times without unfolding the paper first. Tyler Akins <>
Legal Info