It is often a goal to compress web pages so that they use less bandwidth, take less disk space, and are harder for people to reverse engineer. This process has been called "compression," but it is closer to consolidation, and "minification" is a far better term. The process typically removes whitespace, strips comments, replaces names with shorter ones, and reworks structures into more compact equivalents. Few solutions perform actual compression of the information.
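A crude sketch of the whitespace-and-comment stripping step might look like the following. The function name and the regular expressions are my own illustration, not a safe general-purpose minifier; real tools parse the source rather than using regexes.

```javascript
// Naive minification sketch: strip /* */ and // comments,
// then collapse runs of whitespace into a single space.
// Not safe for strings or regex literals containing these patterns.
function naiveMinify(src) {
  return src
    .replace(/\/\*[\s\S]*?\*\//g, "") // remove block comments
    .replace(/\/\/[^\n]*/g, "")       // remove line comments
    .replace(/\s+/g, " ")             // collapse whitespace
    .trim();
}
```

Note that this sketch does not rename identifiers; shortening names requires understanding scope, which is why serious minifiers are built on a full parser.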
What I present here is not an ideal solution. Compression in this form is far better done at the server; if your server can compress data in transit, the browser will decompress it using native code and the compression will be significantly better. The output generated on this page is more appropriate when you need to serve data from file:// URIs, such as documentation on a CD. Another alternative is to bundle your site into a .jar file, which has built-in compression.
This algorithm operates by walking through the original text and checking whether the characters at the current position repeat something seen earlier. If they do, it encodes the starting point of the repetition and how many characters to copy.
The technique, published by Abraham Lempel and Jacob Ziv in 1977, is a "sliding window" compression algorithm, a type of dictionary coder.
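The walk-and-match idea above can be sketched as follows. The token format (offset, length, next character) and the window size are illustrative choices of mine; real LZ77 variants pack these fields into bits far more tightly.

```javascript
// LZ77 sketch: emit [offset, length, nextChar] triples by searching
// a sliding window of earlier text for the longest match.
function lz77Compress(text, windowSize = 255) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    let bestLen = 0, bestOffset = 0;
    const start = Math.max(0, pos - windowSize);
    for (let i = start; i < pos; i++) {
      let len = 0;
      while (pos + len < text.length && text[i + len] === text[pos + len]) len++;
      if (len > bestLen) { bestLen = len; bestOffset = pos - i; }
    }
    // Emit the match plus the first literal character after it.
    tokens.push([bestOffset, bestLen, text[pos + bestLen] ?? ""]);
    pos += bestLen + 1;
  }
  return tokens;
}

function lz77Decompress(tokens) {
  let out = "";
  for (const [offset, len, next] of tokens) {
    const start = out.length - offset;
    // Copy character by character so overlapping matches work.
    for (let i = 0; i < len; i++) out += out[start + i];
    out += next;
  }
  return out;
}
```

Repetitive input such as "abcabcabc" collapses into a short literal prefix followed by one long back-reference, which is where the savings come from.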
This recodes letters so that frequently used letters get shorter codes. For instance, English text is mostly lowercase letters, and the most common letter is "e". By analyzing the text to be compressed, the program counts how often each letter occurs, builds the letters into a tree, and then derives the new codes from it.
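The count-then-build-a-tree process can be sketched as below. Merging the two least frequent nodes by repeatedly sorting an array is my simplification; a real implementation would use a priority queue.

```javascript
// Huffman coding sketch: count letter frequencies, repeatedly merge
// the two least frequent nodes into a tree, then read codes off the
// tree (left edge = "0", right edge = "1").
function huffmanCodes(text) {
  const freq = {};
  for (const ch of text) freq[ch] = (freq[ch] || 0) + 1;
  let nodes = Object.entries(freq).map(([ch, f]) => ({ ch, f }));
  if (nodes.length === 1) return { [nodes[0].ch]: "0" }; // degenerate case
  while (nodes.length > 1) {
    nodes.sort((a, b) => a.f - b.f);          // cheapest stand-in for a heap
    const [a, b] = nodes.splice(0, 2);        // two least frequent nodes
    nodes.push({ f: a.f + b.f, left: a, right: b });
  }
  const codes = {};
  (function walk(node, code) {
    if (node.ch !== undefined) { codes[node.ch] = code; return; }
    walk(node.left, code + "0");
    walk(node.right, code + "1");
  })(nodes[0], "");
  return codes;
}
```

Running this on text where one letter dominates shows the effect: the common letter ends up with a one-bit code while rare letters get longer ones.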
Technically this is not compression. Instead, it is a way of taking binary data and recoding it into a text-only form. The encoding increases the size of the information by about 33% and is the method by which email attachments are sent.
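The 33% figure comes from the encoding's shape: every 3 input bytes become 4 output characters, since each Base64 character carries only 6 bits. A small Node.js illustration (the specific byte values are my own example):

```javascript
// Base64 recodes each group of 3 bytes (24 bits) as 4 characters
// of 6 bits each, hence the ~33% size growth.
const bytes = Buffer.from("Hel", "ascii"); // 3 bytes
const encoded = bytes.toString("base64");  // 4 characters: "SGVs"
const ratio = encoded.length / bytes.length;
```

In the browser the same encoding is available via btoa() and atob().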