Tillmann Steinbrecher made a wonderful command-line tool for extracting .tar and most .zip files. It is called untgz; the most recent version is 0.95, released February 17, 1997. Best of all, he released the source code with the tool and put everything under the GPL, version 2.
- Command-line tool (no GUI, much smaller).
- Executables for various Intel x86 flavors: DOS (really old machines), DOS on 386+ machines (still really old), Win32 for 32-bit Windows (95/98/NT/2000/ME/XP), and a native OS/2 binary.
- Extracts a .tgz, either decompressing it to the single .tar or extracting the separate files inside the archive.
I was looking for a tiny tool to put on my boot disk that could extract an archive to a ramdisk, so I could install more tools than a single floppy would allow. The method is similar to Bart's Modular Boot Disk. Originally, I used .zip archives because I found a 50k unzip program. Then I switched to .rar because I located a 26k unrar program. After I upgraded my Linux machine, I found out that archives created by the newer version of rar can't be decompressed with my unrar program, so I searched the web again for a standard format that I could create on Linux and decompress on DOS. I was very fortunate and found untgz.
Since I had the source at my disposal, I felt an overwhelming urge to see how much extra I could strip from this 57k program, especially since I would be using this as a decompression tool only on my boot floppy, where space is at a premium. So, with a "just extract the file" type of mindset, I made the following changes to reduce the size of the program:
- The ability to set the timestamp was removed.
- All of the informational output was removed: if the program did its job, you won't see a thing.
- Almost all of the error output was removed and just cryptic error messages remain.
- The CRC lookup table was eliminated by adding a tiny amount of code.
- All calls to printf() were replaced with
- All calls to sscanf() were replaced by a custom function.
- Command-line options were eliminated.
- Code that didn't actually extract the file was removed.
- Other code was tweaked/optimized in order to save space.
- Zip support was removed because it didn't actually support zip files; it supported zipped tar files.
- Optimization flags for the compilers were tweaked.
- UPX 1.25 was used to compress the programs further.
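Two of the changes above, dropping the CRC lookup table and replacing sscanf(), can be illustrated with a minimal sketch. This is not the code from the stripped untgz source. The function names crc32_update and parse_octal are hypothetical, and the sketch assumes that the sscanf() calls were reading the octal ASCII fields of a tar header, which the list above does not actually say. The CRC routine computes the gzip CRC-32 one bit at a time, trading speed for the 1 KB that a 256-entry table would occupy.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 using the reflected polynomial 0xEDB88320 (the CRC used
 * by gzip). Hypothetical name; slower than a table lookup, but it needs
 * no 256-entry table in the executable. Pass 0 as the initial crc. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Parse an octal ASCII field of at most `width` bytes, such as the size
 * field of a tar header. Hypothetical stand-in for sscanf(field, "%lo").
 * Skips leading spaces and stops at the first non-octal character. */
unsigned long parse_octal(const char *field, size_t width)
{
    unsigned long value = 0;
    size_t i = 0;

    while (i < width && field[i] == ' ')
        i++;
    for (; i < width && field[i] >= '0' && field[i] <= '7'; i++)
        value = (value << 3) | (unsigned long)(field[i] - '0');
    return value;
}
```

For example, a tar size field containing "00000001750" parses to 1000 bytes. Avoiding sscanf() and printf() matters on a size-constrained build because the formatted I/O routines tend to pull a lot of library code into the executable.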
In the end, I had a 9k untgz program limited to the 8.3 DOS filename standard, and a 15k 32-bit executable that supports long filenames. I would say that's a significant saving: roughly 1/6 the original size for the little one and roughly 1/4 for the 32-bit version.
I certainly don't think that this is a useful all-around tool anymore. However, if you have a specific need for an extremely small program that can extract .tar.gz files, this will work wonders for you! If the program reports an error while decompressing, use the larger version for a more detailed description of what's going on at that moment.
Use the original version first! If you have any errors with your .tgz file, you will see more detailed explanations of problems there. If you have no problems, you can then move on to the stripped version.
To achieve maximum compression and remove the "original filename" from the gzip header, you should use pipes. Pass the data directly to gzip from tar like this:
tar cvf - data_directory/ | gzip -9 > output.tgz
Using tar cvfz instead will use the default compression level (6) and will include more data in the gzip header. We're only talking about a few bytes here, but that can be enough when you are dealing with a tiny boot disk.
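To check whether a given .tgz actually carries an "original filename", you can look at the flag byte of the gzip header. The sketch below follows the header layout in RFC 1952 (magic bytes 0x1f 0x8b, compression method 8, then a flag byte where bit 0x08 means a name is stored). The function name gzip_has_original_name is hypothetical and is not part of untgz.

```c
#include <stddef.h>

#define GZIP_FLAG_FNAME 0x08  /* RFC 1952 FLG bit: original file name present */

/* Inspect the first 10 bytes of a gzip stream (hypothetical helper).
 * Returns 1 if an original file name is stored in the header,
 *         0 if it is not,
 *        -1 if the buffer is too short or is not a deflate gzip header. */
int gzip_has_original_name(const unsigned char *hdr, size_t len)
{
    if (len < 10 || hdr[0] != 0x1f || hdr[1] != 0x8b || hdr[2] != 8)
        return -1;
    return (hdr[3] & GZIP_FLAG_FNAME) ? 1 : 0;
}
```

Reading the first ten bytes of output.tgz into a buffer and passing them to this function should return 0 for an archive made with the pipe shown above, since gzip has no file name to record when it reads from standard input.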
- original version - 57k binaries for DOS, Win32, OS/2. Includes source.
- smaller version - 9k binary for DOS, 15k binary for Win32. Includes source, but large chunks of it were deleted to make the program smaller.