An Introduction to Image Compression
Compressing an image is significantly different from
compressing raw binary data. Of course, general-purpose
compression programs can be used to compress images, but
the result is less than optimal. This is because images
have certain statistical properties which can be
exploited by encoders specifically designed for them.
Also, some of the finer details in the image can be
sacrificed to save a little more bandwidth or storage
space. This is why lossy compression techniques can be
used for images.
Lossless compression involves compressing data
which, when decompressed, will be an exact replica of the
original data. This is the case when binary data such as
executables, documents etc. are compressed. They need to
be exactly reproduced when decompressed. On the other
hand, images (and music too) need not be reproduced
'exactly'. An approximation of the original image is
enough for most purposes, as long as the error between
the original and the reconstructed image is tolerable.
Error Metrics
Two of the error metrics used to compare the various
image compression techniques are the Mean Square Error
(MSE) and the Peak Signal to Noise Ratio (PSNR). The MSE
is the cumulative squared error between the compressed
and the original image, whereas PSNR is a measure of the
peak error. The mathematical formulae for the two
are
MSE = (1 / (M*N)) * sum over x = 1..M, y = 1..N of (I(x,y) - I'(x,y))^2
PSNR = 20 * log10 (255 / sqrt(MSE))
where I(x,y) is the original image, I'(x,y) is the
approximated version (which is actually the decompressed
image) and M, N are the dimensions of the images. A lower
value of MSE means less error, and, as the inverse
relation between MSE and PSNR shows, this translates to a
higher value of PSNR. Logically, a higher
value of PSNR is good because it means that the ratio of
Signal to Noise is higher. Here, the 'signal' is the
original image, and the 'noise' is the error in
reconstruction. So, a compression scheme that yields a
lower MSE (and hence a higher PSNR) can be recognised as
the better one.
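To make the two metrics concrete, here is a minimal
sketch in Python/NumPy for 8-bit greyscale images (the
function and array names are my own choices, not part of
the original article):

    import numpy as np

    def mse(original, reconstructed):
        """Mean Square Error between two equal-sized greyscale images."""
        diff = original.astype(np.float64) - reconstructed.astype(np.float64)
        return np.mean(diff ** 2)

    def psnr(original, reconstructed, peak=255.0):
        """Peak Signal to Noise Ratio in decibels; higher means less error."""
        error = mse(original, reconstructed)
        if error == 0:
            return float("inf")  # identical images: perfect reconstruction
        return 20.0 * np.log10(peak / np.sqrt(error))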
The Outline
We'll take a close look at compressing greyscale
images. The algorithms explained can easily be extended
to colour images, either by processing each of the colour
planes separately, or by transforming the image from the
RGB representation to other convenient representations,
such as YUV, in which the processing is much easier.
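As an illustration of the second approach, here is a
small sketch of an RGB to YUV conversion using the common
BT.601 luma weights (the exact coefficients vary between
standards, and the function name is my own):

    import numpy as np

    def rgb_to_yuv(rgb):
        """Convert an (H, W, 3) RGB image to YUV (BT.601 weights).
        Y carries most of the visually important detail, so it can
        be coded more carefully than the U and V chrominance planes."""
        rgb = rgb.astype(np.float64)
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
        u = 0.492 * (b - y)                     # blue-difference chrominance
        v = 0.877 * (r - y)                     # red-difference chrominance
        return np.stack([y, u, v], axis=-1)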
The usual steps involved in compressing an image
are
1. Specify the Rate (bits available) and Distortion
   (tolerable error) parameters for the target image.
2. Divide the image data into various classes, based
   on their importance.
3. Divide the available bit budget among these
   classes, such that the distortion is a minimum.
4. Quantize each class separately, using the bit
   allocation information derived in step 3 (a minimal
   sketch of this step follows the list).
5. Encode each class separately using an entropy
   coder and write the result to the file.
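As a minimal sketch of step 4, the snippet below uses a
uniform scalar quantizer, one of the simplest possible
choices (the article does not prescribe a particular
quantizer; the step size and names here are illustrative):

    import numpy as np

    def quantize(values, step):
        """Map each value to an integer index (the lossy part of step 4)."""
        return np.round(np.asarray(values, dtype=np.float64) / step).astype(np.int32)

    def dequantize(indices, step):
        """Reconstruct approximate values from the indices."""
        return indices.astype(np.float64) * step

A class given fewer bits gets a coarser step: fewer
distinct indices to entropy-code, at the cost of more
distortion in that class.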
Remember, this is how 'most' image compression
techniques work. But there are exceptions. One example is
the Fractal Image Compression technique, where possible
self-similarity within the image is identified and used
to reduce the amount of data required to reproduce the
image. Traditionally these methods have been
time-consuming, but newer methods promise to speed up
the process. Literature regarding fractal image
compression can be found at <findout>.
Reconstructing the image from the compressed data is
usually a faster process than compression. The steps
involved (a toy round trip follows the list) are
1. Read in the quantized data from the file, using
   an entropy decoder (reverse of step 5).
2. Dequantize the data (reverse of step 4).
3. Rebuild the image (reverse of step 2).
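To tie the two halves together, here is a self-contained
toy round trip that quantizes a random 8-bit image,
dequantizes it, and measures the reconstruction quality
with the MSE and PSNR defined earlier (the image and the
step size are arbitrary, for illustration only):

    import numpy as np

    image = np.random.randint(0, 256, size=(64, 64)).astype(np.float64)

    step = 8.0                         # coarser step -> fewer bits, more error
    indices = np.round(image / step)   # quantize (lossy step)
    rebuilt = indices * step           # dequantize (reverse of step 4)

    error = np.mean((image - rebuilt) ** 2)
    print("MSE  :", error)
    print("PSNR :", 20 * np.log10(255 / np.sqrt(error)), "dB")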
Details
- Details about classifying image data (using the
  Discrete Wavelet Transform).
- Details about bit allocation (using the
  Rate-Distortion optimization procedure).
- Details about quantization.
- Details about entropy coding.
Some interesting sites
Copyright (c) Satish Kumar. S 2001-2003. Last Modified - 22 Oct 2001
Suggestions/Broken links/queries? Write to satish@debugmode.com