# Image Compression


## An Introduction to Image Compression

Compressing an image is significantly different from compressing raw binary data. General-purpose compression programs can of course be used on images, but the result is less than optimal, because images have statistical properties that can be exploited by encoders designed specifically for them. Moreover, some of the finer details in an image can be sacrificed to save a little more bandwidth or storage space, which means lossy compression techniques are acceptable in this domain.

Lossless compression means compressing data so that, when decompressed, it is an exact replica of the original. This is required for binary data such as executables and documents, which must be reproduced exactly. Images (and audio), on the other hand, need not be reproduced exactly: an approximation of the original image is enough for most purposes, as long as the error between the original and the reconstructed image is tolerable.

### Error Metrics

Two of the error metrics used to compare the various image compression techniques are the Mean Square Error (MSE) and the Peak Signal to Noise Ratio (PSNR). The MSE is the cumulative squared error between the compressed and the original image, whereas PSNR is a measure of the peak error. The mathematical formulae for the two are

    MSE  = (1 / (M*N)) * sum over x,y of [ I(x,y) - I'(x,y) ]^2
    PSNR = 20 * log10 ( 255 / sqrt(MSE) )

where I(x,y) is the original image, I'(x,y) is the approximated version (which is actually the decompressed image) and M,N are the dimensions of the images. A lower value of MSE means less error, and as seen from the inverse relation between MSE and PSNR, this translates to a higher value of PSNR. Logically, a higher PSNR is good because it means the ratio of signal to noise is higher. Here, the 'signal' is the original image, and the 'noise' is the error in reconstruction. So a compression scheme with a lower MSE (and a correspondingly higher PSNR) can be recognised as the better one.
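As a concrete illustration, here is a minimal Python sketch of both metrics, assuming 8-bit greyscale images stored as 2-D lists of pixel values (the function names `mse` and `psnr` are our own):

```python
import math

def mse(original, approx):
    """Mean Square Error between two equally sized greyscale images,
    each given as a 2-D list (M rows by N columns) of pixel values."""
    m, n = len(original), len(original[0])
    total = sum((original[x][y] - approx[x][y]) ** 2
                for x in range(m) for y in range(n))
    return total / (m * n)

def psnr(original, approx):
    """Peak Signal to Noise Ratio in dB, assuming 8-bit pixels (peak = 255)."""
    e = mse(original, approx)
    if e == 0:
        return float('inf')  # identical images: no noise at all
    return 20 * math.log10(255 / math.sqrt(e))
```

For example, two 2x2 images differing by one grey level in two pixels give an MSE of 0.5 and a PSNR of about 51 dB.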

### The Outline

We'll take a close look at compressing greyscale images. The algorithms explained can easily be extended to colour images, either by processing each of the colour planes separately, or by transforming the image from the RGB representation to a more convenient one such as YUV, in which the processing is much easier.
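As an example of such a transform, a single pixel can be moved from RGB to YUV using the classic ITU-R BT.601 luminance weights; the sketch below shows one common form of the conversion (the exact scale factors on U and V vary between conventions):

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using ITU-R BT.601 luminance weights.
    Y carries the brightness (which the eye is most sensitive to),
    while U and V carry colour-difference information."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)   # blue-difference chroma
    v = 0.877 * (r - y)   # red-difference chroma
    return y, u, v
```

A pure grey pixel (equal R, G and B) maps to U = V = 0, which is why the colour planes of typical images compress so well after this transform.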

The usual steps involved in compressing an image are

1. Specify the Rate (bits available) and Distortion (tolerable error) parameters for the target image.
2. Divide the image data into various classes, based on their importance.
3. Divide the available bit budget among these classes, such that the distortion is a minimum.
4. Quantize each class separately, using the bit allocation information derived in step 3.
5. Encode each class separately, using an entropy coder, and write the result to the file.
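Step 4 can be sketched with the simplest possible quantizer, a uniform scalar quantizer: each value is mapped to the index of the bin it falls in, and dequantization (the reverse, used in decompression) maps the index back to a representative value. This is a toy illustration, not any particular codec's quantizer:

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each value to its bin index.
    (Python's round() resolves exact halves to the nearest even integer.)"""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Reverse of quantize: reconstruct each value from its bin index.
    The reconstruction error per value is at most step / 2."""
    return [i * step for i in indices]
```

A larger step size means fewer distinct indices (fewer bits needed) but more reconstruction error; that trade-off is exactly what the bit allocation of step 3 controls.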

Remember, this is how 'most' image compression techniques work, but there are exceptions. One example is fractal image compression, where self-similarity within the image is identified and used to reduce the amount of data required to reproduce it. Traditionally these methods have been time consuming, but some recent methods promise to speed up the process. Literature regarding fractal image compression can be found at <findout>.

Reconstructing the image from the compressed data is usually a faster process than compression. The steps involved are

1. Read in the quantized data from the file, using an entropy decoder (reverse of step 5).
2. Dequantize the data (reverse of step 4).
3. Rebuild the image (reverse of step 2).
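The entropy coding and decoding round trip can be illustrated with a toy run-length coder. Real codecs use stronger entropy coders such as Huffman or arithmetic coding, but the principle is the same: the decoder exactly inverts the encoder, so no information is lost at this stage.

```python
def rle_encode(symbols):
    """Run-length encode a list of symbols into (symbol, count) pairs.
    A toy stand-in for a real entropy coder; effective because quantized
    image data typically contains long runs of zeros."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((s, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Reverse of rle_encode: expand the (symbol, count) pairs."""
    out = []
    for s, count in runs:
        out.extend([s] * count)
    return out
```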

### Details

Details about classifying image data (using the Discrete Wavelet Transform).
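The idea behind the wavelet transform can be previewed with its simplest member, the Haar wavelet: one level of the transform splits a signal into coarse averages (the low-pass, "important" class) and detail differences (the high-pass class, usually small and cheap to code). This is a 1-D sketch; the 2-D transform applies the same step along rows and then columns.

```python
def haar_step(signal):
    """One level of the 1-D Haar wavelet transform.
    Assumes len(signal) is even. Returns (averages, details)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details  = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details
```

For a smooth signal the detail coefficients are near zero, which is precisely why this classification pays off at the quantization stage.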

Details about bit allocation (using the Rate-Distortion optimization procedure).
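One common approximation to rate-distortion-optimal bit allocation can be sketched greedily, under the standard model that a class quantized with b bits has distortion proportional to its variance times 2^(-2b). This is an illustrative sketch, not the exact procedure described on that page:

```python
def allocate_bits(variances, total_bits):
    """Greedy bit allocation: repeatedly give one bit to the class whose
    modelled distortion (variance * 2**(-2*bits)) would drop the most."""
    bits = [0] * len(variances)
    for _ in range(total_bits):
        gains = [variances[i] * (2 ** (-2 * bits[i]) - 2 ** (-2 * (bits[i] + 1)))
                 for i in range(len(variances))]
        bits[gains.index(max(gains))] += 1
    return bits
```

High-variance classes (e.g. the low-pass wavelet coefficients) receive more bits, and classes whose distortion contribution is already negligible may receive none, which matches the intuition behind step 3 of the compression outline.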