18 months ago, I did an informal test with GUI archivers and their various formats. Recently, I’ve been reading about data compression and formats, and got it into my head to do some more testing.
Because someone like Werner Bergmans says just about everything there is to say about compressors of all stripes, I thought my own testing would focus primarily on the sorts of programs and formats that one would see in everyday use. Thus, no esoteric command-line encoders from universities and enthusiasts will make it into this test. Rather, I will be looking at the old standby ZIP format, as well as 7-Zip’s 7z, WinRAR’s rar, WinACE’s ace, UHARC, and the stalwart *nix tools, gzip and bzip2. All of these tools are readily usable in GUIs for the Windows platform.
Two other programs that I considered but did not end up using were KGB Archiver and WinRK. The former is a frontend for PAQ6v2, and its compression time, even under the “normal” setting, was prohibitively long, though had I used the maximum setting, it might have surpassed the other formats after a few days of compressing. The same goes for WinRK: its maximum (PWCM) mode would have gotten extremely good results, but only after a ridiculous and unusable amount of time. For that reason, both have been excluded.
Windows is a tarball of my current Windows directory: almost 6,500 files in just over 1GB of data, a mix of text, binary, and media.
Linux 2.6.18 is the source for v2.6.18 of the Linux kernel.
Maximum Compression Corpus is Werner Bergmans’ test corpus, which he offers for download (a pack of the individual data types, not the 510-file corpus used for MFC tests).
Genesis ROMs is a collection of over 1,200 pure binary dumps from Sega Genesis cartridges.
Website Backup is a tarball of a website I created some time ago; most of its size comes from the original 300dpi source photographs used in the content.
|Corpus||Format||Version||Settings||Compressed size (bytes)||Ratio|
|Windows||zip||4.43*||dictionary = 64K; word size = 128||417,771,283||0.407|
|Windows||7zip||4.43||LZMA; dictionary = 64MB; word size = 128; solid||317,998,198||0.310|
|Windows||rar||3.61||“best”; dictionary = 4096K; solid||355,591,630||0.347|
|Windows||ace||2.65||“maximum”; dictionary = 4096K; solid||370,975,963||0.362|
|Windows||uharc||0.6b||-mx -md32768 -mm+||326,051,179||0.318|
|Linux 2.6.18||zip||4.43*||dictionary = 64K; word size = 128||47,386,583||0.197|
|Linux 2.6.18||7zip||4.43||LZMA; dictionary = 64MB; word size = 128; solid||33,974,276||0.141|
|Linux 2.6.18||rar||3.61||“best”; dictionary = 4096K; solid||35,342,737||0.147|
|Linux 2.6.18||ace||2.65||“maximum”; dictionary = 4096K; solid||40,184,600||0.167|
|Linux 2.6.18||uharc||0.6b||-mx -md32768 -mm+||30,251,189||0.126|
|Maximum Compression Corpus||tar||1.13||n/a||53,144,064||1.000|
|Maximum Compression Corpus||zip||4.43*||dictionary = 64K; word size = 128||13,962,749||0.263|
|Maximum Compression Corpus||gzip||1.3.5||-9||14,953,388||0.281|
|Maximum Compression Corpus||bzip2||1.0.3||-9||13,532,091||0.255|
|Maximum Compression Corpus||7zip||4.43||LZMA; dictionary = 64MB; word size = 128; solid||12,371,876||0.233|
|Maximum Compression Corpus||rar||3.61||“best”; dictionary = 4096K; solid||12,536,286||0.236|
|Maximum Compression Corpus||ace||2.65||“maximum”; dictionary = 4096K; solid||13,213,213||0.249|
|Maximum Compression Corpus||uharc||0.6b||-mx -md32768 -mm+||11,516,783||0.217|
|Genesis ROMs||zip||4.43*||dictionary = 64K; word size = 128||837,796,034||0.507|
|Genesis ROMs||7zip||4.43||LZMA; dictionary = 64MB; word size = 128; solid||515,316,845||0.312|
|Genesis ROMs||rar||3.61||“best”; dictionary = 4096K; solid||599,596,546||0.363|
|Genesis ROMs||ace||2.65||“maximum”; dictionary = 4096K; solid||611,391,679||0.370|
|Genesis ROMs||uharc||0.6b||-mx -md32768 -mm+||529,355,573||0.320|
|Website Backup||zip||4.43*||dictionary = 64K; word size = 128||271,871,915||0.983|
|Website Backup||7zip||4.43||-m0=lzma: a=1: d=0: lc=8: LP0: PB0: mf=bt2**||268,550,888||0.971|
|Website Backup||rar||3.61||“best”; dictionary = 4096K; solid||269,030,245||0.972|
|Website Backup||ace||2.65||“maximum”; dictionary = 4096K; solid||269,240,799||0.973|
|Website Backup||uharc||0.6b||-mx -md32768 -mm+||264,782,664||0.957|
* The ZIP format has a number of implementations. The open-source zip that is part of the GNU toolchain doesn’t yet support Deflate64 as an algorithm, so instead I used the ZIP capabilities in 7-Zip 4.43, which does.
** When invoked through the GUI, 7-Zip inexplicably choked on this test corpus at 24%. Invoked through the command line with the switch string shown in the table (taken from Werner Bergmans’ configuration), it compressed without issue.
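For anyone who wants to check the last column: it appears to be the compressed size divided by the size of the uncompressed tarball. Here is a minimal Python sketch using the Maximum Compression Corpus rows, the only set above that lists a tar baseline. The names `ratio` and `ratio_table` are hypothetical helpers for illustration, not part of any tool used in the test.

```python
# Sizes in bytes, copied from the Maximum Compression Corpus rows above.
ORIGINAL_TAR_BYTES = 53_144_064

COMPRESSED_BYTES = {
    "zip": 13_962_749,
    "gzip": 14_953_388,
    "bzip2": 13_532_091,
    "7zip": 12_371_876,
    "rar": 12_536_286,
    "ace": 13_213_213,
    "uharc": 11_516_783,
}


def ratio(compressed: int, original: int) -> float:
    """Compressed size as a fraction of the original size (lower is better)."""
    if original <= 0:
        raise ValueError("original size must be positive")
    return compressed / original


def ratio_table(sizes: dict, original: int) -> dict:
    """Map each format name to its ratio, formatted to three decimal places."""
    return {name: f"{ratio(size, original):.3f}" for name, size in sizes.items()}


if __name__ == "__main__":
    for name, value in ratio_table(COMPRESSED_BYTES, ORIGINAL_TAR_BYTES).items():
        print(f"{name:6s} {value}")
```

The same calculation applies to the other corpora, provided the size of the uncompressed tarball is known.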
Interestingly, the top spot seems split between 7-Zip and UHARC. 7-Zip even beats WinRAR in compressing JPEGs, which surprises me, given the former’s lack of a multimedia filter. UHARC, of course, is a heftier compressor, meaning it takes longer to run and its resultant archives are less flexible. Still, one can easily make an SFX package with it.
In general, 7-Zip and WinRAR were the fastest, even at their maximum settings; these two programs get the best compression in the best time, a fact that I will point out even though time was not a factor in my tests. They also have the nicest user interfaces, in my estimation. WinACE is a waste of time. Using the ZIP format on its highest setting resulted in mediocre compression but required one of the longest compression times, a result that doesn’t speak highly of the format.
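Compression time was not recorded in the tests above. For readers who want to measure size and time together, here is a minimal Python sketch. It uses the standard library's `zlib`, `bz2`, and `lzma` modules as stand-ins for the DEFLATE, bzip2, and LZMA algorithm families. It does not invoke any of the GUI programs tested here, so its numbers will not match the table. The function name `benchmark` and the sample data are illustrative assumptions.

```python
import bz2
import lzma
import time
import zlib


def benchmark(data: bytes) -> dict:
    """Compress `data` with three stdlib codecs.

    Returns {codec name: (compressed bytes, ratio, seconds)}.
    """
    if not data:
        raise ValueError("need non-empty input to compute a ratio")

    codecs = {
        "deflate (zlib, level 9)": lambda b: zlib.compress(b, 9),
        "bzip2 (level 9)": lambda b: bz2.compress(b, 9),
        # Preset 6 is the library default; higher presets need much more memory.
        "lzma (preset 6)": lambda b: lzma.compress(b, preset=6),
    }

    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(packed), len(packed) / len(data), elapsed)
    return results


if __name__ == "__main__":
    # Assumed sample input; replace with the bytes of a real tarball.
    sample = b"The quick brown fox jumps over the lazy dog. " * 20_000
    for name, (size, ratio, seconds) in benchmark(sample).items():
        print(f"{name:26s} {size:>10,d} bytes  ratio {ratio:.3f}  {seconds:.3f}s")
```

A single timed run like this is noisy; repeating each codec several times and taking the best time gives steadier figures.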