Posts tagged `compression`

Though I’m not the sort of person who believes that native 64-bit compilations of programs will automagically make them perform faster or better, I do like to keep an eye on the state of the art. I was an early adopter of native 64-bit OSes when AMD launched their K8 platform: I’ve been using 64-bit Linux since about Fedora Core 2 or 3, and I ran beta versions of Windows XP x64.

Previously, I casually benchmarked the JavaScript speeds of 64-bit browsers against their 32-bit counterparts; more recently, I benchmarked a 64-bit compile of FLAC against several 32-bit compiles of the same version.

This time, I decided to test various and sundry file compression utilities, specifically those that offer both 32- and 64-bit versions. This benchmark does not exhaustively test all potential combinations of compression options (if you’re interested in that, see Werner Bergmans’s excellent Maximum Compression and Matt Mahoney’s Data Compression Programs), nor does it compare the compressors to each other; it does not even list how well the programs actually compressed, since that’s not really a consideration here. The sole purpose of the benchmark is to compare the execution time of a 32-bit program with that of its 64-bit version.

Read more…

§4991 · March 6, 2010 · (No comments)

I’m a big fan of 7-Zip. It isn’t the best-looking application ever written, but that could be because its creator, Igor Pavlov, is concerned much more with its compression methods than its interface. 7-Zip has its own container format, but more important is the LZMA compression algorithm that Igor wrote and put into the public domain.

I decided to do some quick and dirty benchmarks to track the progress of LZMA/7-Zip over time. I went back as far as Igor supplied binaries, including one from the very old 3.x series. Rather than test every single release between then and now, I used only “stable” releases, with two exceptions: version 4.65, which is the latest version of any sort, and 4.66, which uses an alpha version of Igor’s new LZMA2 codec (and, as you’ll see, provides a definite performance improvement).

I used Igor’s Timer utility to time the process (global time was reported). The corpus in this case was the Linux kernel source, v2.6.28. I conducted these tests on a RAM disk to eliminate hard disk latency issues (especially for decompressions, which improved by about 25% from my initial HDD-based tests). My rig is an Intel Core 2 Quad Q6600 (2.4GHz) with 4GB of RAM (1GB of it dedicated to the RAM disk), running Vista SP1 x64.

The command line setup was an approximation of the 7-Zip GUI’s “ultra” settings: `-t7z -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on`, letting the archiver auto-choose the number of threads to spawn.

Read more…
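For anyone who wants to repeat a run like this without Igor’s Timer utility, here is a minimal sketch that builds the same switch set and measures wall-clock time from Python. It is not the setup used for these benchmarks. The executable name `7z` and the archive and folder names are assumptions, so adjust them to wherever your 7-Zip binary and corpus live.

```python
import subprocess
import sys
import time

# Switches quoted in the post (an approximation of the GUI's "ultra" preset).
ULTRA_SWITCHES = ["-t7z", "-m0=lzma", "-mx=9", "-mfb=64", "-md=32m", "-ms=on"]


def build_compress_command(archive, source, exe="7z"):
    """Return the argument list for an "add to archive" run.

    `exe` is an assumption: the name of the 7-Zip command-line binary
    on your PATH (it may be 7z, 7za, or a full path to a specific build).
    """
    return [exe, "a", *ULTRA_SWITCHES, archive, source]


def time_command(args):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(args, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical names; point these at your own archive and corpus.
    cmd = build_compress_command("linux-2.6.28.7z", "linux-2.6.28")
    print(" ".join(cmd))
    # Uncomment once 7-Zip is installed and the corpus is in place:
    # print(f"{time_command(cmd):.2f} s")
```

Wall-clock time is only a rough stand-in for the “global time” that Timer reports, so treat numbers from this sketch as comparable with each other rather than with the figures in the post.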

§3580 · February 9, 2009 · 1 comment

Last year, I moved our small programming department from using JDeveloper and editing shared files directly on a network drive to using NetBeans 6.x and a proper version control system (Subversion).

After the initial learning curve, this has all been going swimmingly. I merged my first development branch into the trunk yesterday, and this branch just so happens to dovetail nicely into the whole point of this post, which is the YUI Compressor, an open-source JavaScript and CSS minification tool developed by Yahoo’s YUI team.

Read more…

§2692 · September 22, 2008 · 7 comments

Just a few days ago, I compared the relative sizes of Microsoft’s Office Open XML (OOXML) and OASIS’s OpenDocument format (ODF). I noticed that while OOXML was smaller for smaller amounts of text, ODF was smaller for larger documents. I was curious as to the turning point for this curve, which I hypothesize has to do with the complexity of OOXML’s markup.

I ran a brief test using generated Lorem Ipsum text in approximate amounts (the leftmost column), and recorded its size (in bytes) when pasted into Notepad, and then as OpenDocument Text (OpenOffice.org 2.3.1), and then as OOXML (Office 2007 SP1).

After the data table is a graphical representation of the results. It’s clear that ODF slips below OOXML somewhere between 300KB and 400KB of raw textual data.

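Comparison of file format sizes (all sizes in bytes)

| Approx. text | Text | OOXML | ODF |
|---|---|---|---|
| 5k | 5030 | 12209 | 29408 |
| 25k | 25158 | 14173 | 29715 |
| 50k | 50318 | 15116 | 30039 |
| 100k | 100638 | 18020 | 30616 |
| 200k | 201276 | 24901 | 31670 |
| 300k | 301918 | 31238 | 32676 |
| 400k | 402558 | 37594 | 33634 |
| 800k | 805118 | 61805 | 37418 |
| 1600k | 1610238 | 110468 | 44881 |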
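
To put a rough number on the turning point, the sketch below linearly interpolates between the two table rows where OOXML stops being the smaller format. The straight-line assumption between those two measurements is mine and was not tested, so the result is only an estimate.

```python
# (text bytes, OOXML bytes, ODF bytes), copied from the table above.
ROWS = [
    (5030, 12209, 29408),
    (25158, 14173, 29715),
    (50318, 15116, 30039),
    (100638, 18020, 30616),
    (201276, 24901, 31670),
    (301918, 31238, 32676),
    (402558, 37594, 33634),
    (805118, 61805, 37418),
    (1610238, 110468, 44881),
]


def estimate_crossover(rows):
    """Estimate the raw-text size at which OOXML and ODF files are equal.

    Walks adjacent rows, finds the pair where (OOXML - ODF) changes from
    negative to non-negative, and interpolates linearly between them.
    Returns None if OOXML never overtakes ODF in the data.
    """
    for (t0, x0, o0), (t1, x1, o1) in zip(rows, rows[1:]):
        d0 = x0 - o0  # negative while OOXML is the smaller file
        d1 = x1 - o1
        if d0 < 0 <= d1:
            return t0 + (t1 - t0) * (-d0) / (d1 - d0)
    return None


if __name__ == "__main__":
    print(f"Estimated crossover: {estimate_crossover(ROWS):.0f} bytes of text")
```

With these figures the estimate lands at roughly 329,000 bytes of raw text, inside the 300k to 400k window.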

[Graph: file sizes]


A while ago, as OpenOffice.org 2.0 approached completion, I compared the file sizes of Microsoft Office’s binary format against OpenOffice’s new OpenDocument format. Recall that OpenDocument is an XML-based storage format that is ultimately compressed into a zip file, creating smaller file sizes. Microsoft’s new Office Open XML is essentially the same thing, but with a totally different XML schema.
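
Since both formats are zip packages of XML, you can look inside one with any zip library. Below is a minimal sketch using Python’s standard `zipfile` module. The demo package, its member name and its XML body are made up for illustration; real ODF and OOXML files contain several parts laid out according to their own specifications.

```python
import io
import zipfile


def list_members(package):
    """Return (name, uncompressed size, compressed size) for each zip member.

    `package` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(package) as zf:
        return [(i.filename, i.file_size, i.compress_size) for i in zf.infolist()]


def make_demo_package():
    """Build a tiny in-memory zip standing in for a real document file."""
    buf = io.BytesIO()
    # Highly repetitive markup, which is why zip compression pays off.
    xml = "<doc>" + "<p>Lorem ipsum dolor sit amet</p>" * 500 + "</doc>"
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, raw, packed in list_members(make_demo_package()):
        print(f"{name}: {raw} bytes raw, {packed} bytes compressed")
```

Passing the path of an actual `.odt` or `.docx` file to `list_members` should show how much of the on-disk size comes from each XML part.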

I decided to revisit this kind of test, and had four test files:

  1. The text of Ulysses, in HTML format. I chose HTML format to test the extra markup, as it should theoretically create a more complex document.
  2. A very large generated Lorem Ipsum block (205’000+ characters), which is pseudo-random, but with a lot of redundancy.
  3. A one-page block of Lorem Ipsum text, in order to test the handling of small files.
  4. A randomly generated CSV with multiple kinds of text and 5’000 records, converted using OpenOffice Calc and Microsoft Excel.

Read on for the data table and observations.

Read more…

§1977 · February 12, 2008 · 3 comments