Image File Formats
Signal intensities are encoded as numbers in the image file, one number
for
each pixel. For color images, things are a bit more complicated: for
each color
channel, one number has to be stored, for example one intensity each
for red,
green, and blue. Make sure you use grayscale
instead of
color images: the extra color information provides no benefit for the
subsequent analysis of 2D gels.
When your image file is, for example, a "16-bit TIFF" file, this means
that image intensities are encoded as 16-bit numbers, giving 65,536
(2 to the power of 16) possible values for each pixel. In contrast, an
8-bit image file stores only 256 different values per pixel. The number
of bits per pixel is also called the color depth. The following table
shows some examples.
Typical color depths
| color depth | intensity levels | example |
| 1 bit | 2 | black and white FAX image |
| 8 bit | 256 | GIF image |
| 10 bit | 1,024 | TIFF image |
| 12 bit | 4,096 | TIFF image |
| 16 bit | 65,536 | TIFF image |
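The relationship between color depth and intensity levels in the table is simply a power of two. A minimal sketch (the function name `intensity_levels` is only illustrative):

```python
def intensity_levels(bits_per_pixel: int) -> int:
    """Number of distinct gray values a pixel of the given color depth can hold."""
    return 2 ** bits_per_pixel


# Reproduce the rows of the table above.
for bits in (1, 8, 10, 12, 16):
    print(f"{bits:>2} bit -> {intensity_levels(bits):>6,} levels")
```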
TIFF images can have different color depths, which is one of the
reasons why TIFF is a widely used image file format.
Generally, having more possible intensity values per pixel (higher
color depth) is better for the analysis. The tradeoff is between higher
accuracy and the need for more storage space: a 16-bit TIFF file is
twice as large as the equivalent 8-bit file, but offers 256 times as
many gray levels. Another consideration is that many everyday image
processing programs cannot handle color depths greater than 8 bit.
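The size tradeoff can be checked with a little arithmetic. The sketch below assumes uncompressed pixel data and ignores file headers; the 2000 x 1500 pixel scan size and the function name `raw_image_bytes` are hypothetical values chosen for illustration.

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the uncompressed pixel data in bytes (headers ignored)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


width, height = 2000, 1500  # hypothetical scan size
size_8 = raw_image_bytes(width, height, 8)
size_16 = raw_image_bytes(width, height, 16)

print(f" 8-bit: {size_8:,} bytes")
print(f"16-bit: {size_16:,} bytes ({size_16 // size_8}x the size)")
print(f"gray levels gained: {2**16 // 2**8}x")
```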
Decreasing color depth in the scanned image file normally results in
loss of
accuracy. It is not recommended to increase color depth after the scan
is
completed: if your scanner produced only an 8-bit image you have at
most 256
different intensity values in the image. Converting the file to a
16-bit image
will only give you at most 256 out of 65,536 possible gray values.
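A small experiment makes this concrete. The sketch below rescales a synthetic 8-bit image to the 16-bit range; the pixel data and the helper name `to_16_bit` are made up for illustration. The values get larger, but the number of distinct gray values does not grow.

```python
def to_16_bit(pixels_8bit):
    """Rescale 8-bit gray values (0..255) to the 16-bit range (0..65535)."""
    return [p * 257 for p in pixels_8bit]  # 255 * 257 == 65535


# Synthetic "scan" that uses every one of the 256 possible 8-bit values.
scan_8bit = list(range(256)) * 4
converted = to_16_bit(scan_8bit)

print("distinct values before:", len(set(scan_8bit)))
print("distinct values after: ", len(set(converted)))
print("largest value after:   ", max(converted))
```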
You can find more information about image file formats in the Wikipedia
Category: Graphics File Formats. At the ProteomeInformatics.net Proteomics Image
Analysis Forum there is a discussion thread "Compatibility of Image formats across different
Image Analysis Softwares" that focuses on experiences from
the proteomics community with vendor specific formats.
Data reduction and image calibration
Some imaging devices can measure more intensity values than fit into
the available image formats. One way to deal with this is to distribute
the intensity values linearly over the whole intensity range offered by
the image file. An example: say the analog-to-digital (A/D) signal
converter can deliver 1024 intensity levels, but the image file is
limited to 256 levels (8-bit color depth). A linear transform condenses
4 A/D converter levels into 1 intensity level in the image.
Of course, this process effectively wastes accuracy.
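The linear reduction from this example can be sketched in a few lines. The function name `reduce_linear` and its parameters are illustrative; real scanner firmware may round differently.

```python
def reduce_linear(adc_value: int, adc_levels: int = 1024, image_levels: int = 256) -> int:
    """Map an A/D converter level linearly onto the image's gray levels."""
    if not 0 <= adc_value < adc_levels:
        raise ValueError("A/D value outside converter range")
    return adc_value * image_levels // adc_levels


# Four neighbouring A/D levels collapse into one image level.
print([reduce_linear(v) for v in range(8)])
print(reduce_linear(1023))
```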
Especially when light scanners are used for image generation, this
technique can be improved by encoding only the dynamic range actually
delivered by the A/D converter. For our example this could mean: the
A/D converter can deliver 1024 different intensity levels, but in our
experiment the gel only has intensity values in the range from 128 to
920. Only these 792 intensity levels have to be encoded in the image
file, so only about 3 A/D converter levels (792 / 256) have to be
condensed into 1 image intensity level. Compared to the former approach,
which wastes image levels on the unused lower and upper parts of the
range, this improvement results in a roughly 25% finer resolution.
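The restricted-range mapping for the 128 to 920 example might look like the sketch below. The function name `window_to_8bit`, the clipping of out-of-range values, and the integer rounding are assumptions made for illustration.

```python
def window_to_8bit(adc_value: int, low: int = 128, high: int = 920,
                   image_levels: int = 256) -> int:
    """Map the used A/D range [low, high] onto image levels 0..image_levels-1."""
    clipped = min(max(adc_value, low), high)  # values outside the window saturate
    return (clipped - low) * (image_levels - 1) // (high - low)


full_range_step = 1024 / 256        # A/D levels per image level, whole range
windowed_step = (920 - 128) / 256   # A/D levels per image level, used range only

print(window_to_8bit(128), window_to_8bit(524), window_to_8bit(920))
print(f"step: {full_range_step:.2f} -> {windowed_step:.2f} A/D levels per gray level")
```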
There are instruments that can
distinguish 100,000 and more intensity levels, far more than can be
encoded in a
TIFF
or similar file format. In order to save as much information as
possible in
these files, intensity levels are encoded using a nonlinear calibration
curve.
During scanning, measured intensities are converted to pixel gray
values
according to this curve. During quantitation, the image analysis
software has to
decode the pixel values to arrive at the originally measured
intensities. The
curve is designed such that lower intensities will be encoded with
higher
accuracy than higher intensities.
Figure: Example image calibration curve.
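To illustrate the idea of a nonlinear calibration curve, the sketch below uses a square-root curve. This is purely an assumption for demonstration: real calibration curves are vendor-specific, and the constants `MAX_INTENSITY` and `MAX_GRAY` as well as the function names are hypothetical.

```python
import math

MAX_INTENSITY = 100_000  # hypothetical instrument maximum
MAX_GRAY = 65_535        # largest 16-bit pixel value


def encode(intensity: float) -> int:
    """Scanner side: measured intensity -> stored gray value (square-root curve)."""
    return round(math.sqrt(intensity / MAX_INTENSITY) * MAX_GRAY)


def decode(gray: int) -> float:
    """Analysis side: stored gray value -> approximate measured intensity."""
    return (gray / MAX_GRAY) ** 2 * MAX_INTENSITY


def step_at(intensity: float) -> float:
    """Intensity difference covered by one gray level near the given intensity."""
    gray = min(encode(intensity), MAX_GRAY - 1)
    return decode(gray + 1) - decode(gray)


for intensity in (100, 10_000, 90_000):
    print(f"intensity {intensity:>6}: one gray level spans {step_at(intensity):.3f} units")
```

With such a curve, one gray level covers a much smaller intensity interval at the low end than at the high end, which is the behavior described above.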
During image analysis, it is
important that
the image analysis software recognizes the calibration curve so it can
do
quantitation using the intensities originally measured by the scanner.
General
image processing packages such as Photoshop ignore grayscale
calibration. It is
even possible that calibration information is lost during
processing with
these packages. Additionally, TIFF
files do not include calibration
information
so you will have to use vendor-specific formats.
File compression
Compression of image data means the usage of algorithms to reduce
the size of
a gel image file while retaining all or most of the image information.
If you
use calibrated image file formats then image compression is not an
option
because compressed file formats do not store the calibration
information.
Image compression algorithms can be classified as lossy or lossless
methods.
Lossy compression means that information can be lost in the image
details,
giving a higher compression ratio than lossless methods.
Uncalibrated files may be saved in file formats that use lossless data
compression, e.g. subformats of *.tiff / *.tif, or *.png. Compression
to 50% of the original size is often possible. The commonly used JPEG
format uses a lossy compression method. Large compression ratios may
heavily change or destroy your data. Even at low compression ratios,
compression has an unknown influence on spot quantities. That is why we
recommend avoiding the JPEG file format for image analysis purposes.
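What "lossless" means can be demonstrated with Python's standard `zlib` module, which implements the DEFLATE method also used by PNG. The image below is synthetic (a flat background with one bright circular spot), so the compression ratio it prints says nothing about real gel images; the point is only that the decompressed data is bit-for-bit identical to the original.

```python
import zlib

# Synthetic 16-bit "gel": flat background with one bright circular spot.
width, height = 256, 256
pixels = bytearray()
for y in range(height):
    for x in range(width):
        inside_spot = (x - 128) ** 2 + (y - 128) ** 2 < 400
        value = 60_200 if inside_spot else 200
        pixels += value.to_bytes(2, "little")
raw = bytes(pixels)

compressed = zlib.compress(raw, 9)      # 9 = strongest compression level
restored = zlib.decompress(compressed)

print(f"raw: {len(raw):,} bytes, compressed: {len(compressed):,} bytes")
print("identical after round trip:", restored == raw)
```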
Image compression only saves disk space. It does not reduce the amount
of working memory (RAM) needed for image analysis, because compressed
files are completely decompressed before the analysis starts.
The following table summarizes the properties for commonly used
image file
formats.
Commonly used image file formats
| format | compression | gray levels | use for quantitative analysis | calibration |
| gif | lossless | 256 | no | no |
| bmp | no | 2; 256 | (yes) | no |
| png | lossless | 256 | yes | no |
| jpg | lossy | 256 | no | no |
| tif(f) | no (lossless) | 2; 256; 1024; 4096; 65536 | yes | no |
| img inf | no | 1024; 4096; 65536 | yes | yes |
| gel | no | 65536 | yes | yes |