This compressor uses a variant of LZ78 compression. It compresses a string by building a dictionary of character sequences it has already seen and replacing repeats with short references into that dictionary. For instance, if you use the phrase "and then" a lot, the second time it may be shortened to four or five characters. The third or fourth time might bring it down to three characters, and so on. Later occurrences may use less and less space in the compressed version because the dictionary keeps growing with the text you feed it.
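To make the dictionary idea concrete, here is a minimal sketch of textbook LZ78 in Python. It is not the variant this compressor actually uses, and the function names lz78_compress and lz78_decompress are made up for illustration. Each output item is a pair: the index of the longest phrase already in the dictionary, plus the one character that follows it.

```python
def lz78_compress(text):
    """Textbook LZ78: return a list of (phrase_index, next_char) pairs.

    Index 0 stands for the empty phrase. This is an illustrative sketch,
    not the variant used by the compressor described above.
    """
    dictionary = {}   # phrase -> index, starting at 1
    output = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the match while the dictionary knows it.
            phrase = candidate
        else:
            # Emit the longest known phrase plus the new character,
            # then remember the extended phrase for next time.
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decompress(pairs):
    """Rebuild the text by replaying the same dictionary growth."""
    phrases = [""]    # index 0 is the empty phrase
    pieces = []
    for index, ch in pairs:
        entry = phrases[index] + ch
        pieces.append(entry)
        phrases.append(entry)
    return "".join(pieces)


sample = "and then and then and then"
pairs = lz78_compress(sample)
print(len(sample), "characters became", len(pairs), "pairs")
```

The first "and then" is emitted almost one character per pair, while later repeats are covered by pairs that each stand for two or more characters, which is the effect described above.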
It is important to note that the compressor isn't perfect. Either make sure that you have several bytes of test data, or else make sure your range of character values is greater than 16. Basically, don't use "aaa"; if you include one uppercase and one lowercase letter, you should be just fine. Even if you disregard this paragraph, you will just get weird-looking output, not actual errors.
The compression stats look so poor because the base64-ish encoding I need to do to keep the data as text only preserves 6 bits of binary per character. So, if you have a 0% compression factor, this code still removed 2 bits from every 8-bit character.
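The arithmetic behind that claim can be sketched in a few lines of Python. The 64-symbol alphabet below is the standard base64 one and is only an assumption, since the encoding here is merely "base64-ish"; pack_bits and text_chars_needed are hypothetical helper names.

```python
import string

# Assumed 64-symbol alphabet (standard base64 order); the real encoder
# may use a different set of characters.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def pack_bits(bits):
    """Map a string of '0'/'1' digits to text, 6 bits per output character."""
    padded = bits + "0" * (-len(bits) % 6)   # pad to a multiple of 6
    return "".join(
        ALPHABET[int(padded[i:i + 6], 2)] for i in range(0, len(padded), 6)
    )


def text_chars_needed(n_bits):
    """Number of 6-bit text characters needed to carry n_bits."""
    return -(-n_bits // 6)   # ceiling division


input_chars = 10
input_bits = input_chars * 8             # 80 bits of 8-bit characters
print(text_chars_needed(input_bits))     # characters needed with no compression at all

# A "0% compression factor" means the output has as many characters as the
# input, but each output character only carries 6 bits.
output_bits = input_chars * 6
print(input_bits - output_bits)          # bits the compressor had to remove
```

In other words, merely re-encoding 10 characters without compressing them would take 14 text characters, so breaking even on character count already means the data shrank by a quarter.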
When Samoset first met the pilgrims in 1621, he said, "Welcome English. I am Samoset. Do you have beer?" This was the first contact the pilgrims had with any natives.
Tyler Akins