Color Spaces, Bitmaps and Pumpkins

In computer graphics, color can be encoded in several different ways. A common encoding is sRGB (standard RGB), in which a color value is split into red, green and blue components. Each component can, in turn, be stored in a number of different ways, the most common being an unsigned integer (usually 8 bits wide): a zero value in the red component, for example, means no red at all, while 255 means full red. Using this encoding, a 160×160 bitmap needs 160×160×(8+8+8) bits, or 76,800 bytes. This poses a few problems for memory-constrained devices, like old PalmOS PDAs: such a bitmap would not fit in a single resource, which is limited to just short of 65,536 bytes. Another problem is memory alignment, since each pixel requires an odd number of bytes. The Motorola 68000-compatible processors that powered those devices had a 16-bit data bus, making it inefficient to access a long stream of data aligned at 24 bits. Because of that, later PalmOS versions offered two different encodings for storing color bitmaps: 8-bit palette bitmaps, capable of using up to 256 simultaneous colors, and 16-bit direct color bitmaps, capable of using 65,536 different colors (encoded using 5 bits for red, 6 for green, and 5 for blue).
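
As a side note, converting a 24-bit color to the 5-6-5 layout simply keeps the most significant bits of each component and shifts them into place. A minimal sketch (the function name is mine, not part of the PalmOS API):

```c
#include <stdint.h>

/* Pack 8-bit R, G, B components into a 16-bit 5-6-5 value by keeping
   only the most significant 5, 6 and 5 bits of each component. */
uint16_t rgb_to_565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```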

Our main concern here is the first type: a palette stores 256 colors, indexed from 0 to 255. Each palette entry defines the red, green and blue components as unsigned 8-bit integers. Given a palette index, retrieving the color components can be achieved in constant time, or O(1) in Big O notation. The inverse operation, however, is more interesting: given an arbitrary RGB triplet, which of the 256 palette entries is closest to it? An RGB triplet can represent 16,777,216 different colors, but our palette stores only 256 unique colors. It is obviously impossible to provide an exact match for every triplet, but we can find a compromise and return an approximation. Before we can answer this question, however, we must define "closest": given two arbitrary colors in the palette, which one best represents the target RGB triplet?
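
In code, such a palette can be modeled as a plain array, which makes the forward lookup a single array access. A minimal sketch (the RGB type and variable names are illustrative, not PalmOS definitions):

```c
#include <stdint.h>

typedef struct {
    uint8_t r, g, b;      /* 8-bit color components */
} RGB;

static RGB palette[256]; /* 256-entry color palette */

/* Forward lookup: index to color is a single array access, O(1). */
RGB IndexToRGB(uint8_t index) {
    return palette[index];
}
```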

There are many ways to define such a function, and the generic algorithm is outlined in this simplified C function named RGBToIndex. It accepts an RGB target color and returns the index of the palette color that best approximates the target:

Simple algorithm to find the closest color in a palette

The function iterates over the palette and, for each entry, calls a difference function that accepts two colors, the target and the palette entry. The value returned by difference() is compared to the current minimum (initialized to a large value before the loop) and, if it is lower, the minimum is updated. At the end, the index of the entry that produced the minimum is returned. There is an obvious optimization that was left out for simplicity: if there is an exact match, the loop can be exited early, since no other color could produce a better result.
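
Since the original listing is shown as an image, here is a minimal sketch of what such a function might look like; the RGB type, the palette array and the difference() callback are illustrative, not the exact code:

```c
#include <stdint.h>

typedef struct {
    uint8_t r, g, b;
} RGB;

static RGB palette[256];

/* difference(): returns a non-negative value; smaller means closer. */
extern uint32_t difference(RGB a, RGB b);

uint8_t RGBToIndex(RGB target) {
    uint32_t min = UINT32_MAX;   /* current best (smallest) difference */
    uint8_t best = 0;            /* index of the best entry so far     */

    for (int i = 0; i < 256; i++) {
        uint32_t d = difference(target, palette[i]);
        if (d < min) {
            min = d;
            best = (uint8_t)i;
        }
    }
    return best;
}
```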

Our goal is to compare different algorithms and see how they perform, analyzing the quality of the produced image and the time it takes to run. The test image is a 320×180 frame from the Big Buck Bunny short film by the Blender Foundation, licensed under CC BY 3.0.

The test image in 24 bits color
The same image using an optimized palette

On the left is the original 24-bit RGB image, scaled down to 320×180 pixels. On the right, to keep the comparison fair (since we will be comparing images that use a color palette), is the same image converted with Gimp to an optimized palette of 256 colors. Our test images will also have a color palette, but they will use the default palette found in PalmOS, which obviously is not optimized for any specific image.

Now the hard part: how do we compare colors? We will start with a color space that represents colors in a way that mimics how humans perceive them. It turns out that we do not see all shades of red, green and blue in the same way. To use this color space, named CIELAB, we must convert the RGB values to L*a*b* values (the “*” here is part of the notation, not multiplication), where L* represents perceptual lightness and a* and b* represent positions along the green–red and blue–yellow axes of human vision. This conversion relies on somewhat complicated formulas involving floating-point numbers that are beyond the scope of this article. After the conversion, colors (L1*, a1*, b1*) and (L2*, a2*, b2*) are interpreted as points in 3D space and the Euclidean distance between them is computed:
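
ΔE*ab = √( (L2* − L1*)² + (a2* − a1*)² + (b2* − b1*)² )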

ΔE*ab is the value returned by the function difference() in the code above. Using this algorithm, the RGBToIndex function was called 1,000,000 times with a random color value as input, and the total time was measured. In my setup, the measured time was around 25 seconds, but since we are going to compare it with other methods, the absolute time is not relevant. We will interpret this time as 1.0 time units and scale other times accordingly. The image converted with this algorithm is shown below.

Image converted using Euclidean differences in CIELAB color space

Using an accurate color model is important if precision is required, but in practice, we can get away with simpler models that are much faster to compute. The first optimization is to forget about how we perceive colors differently and simply handle red, green and blue in a uniform way. We can skip the initial conversion entirely and compute the Euclidean distance directly on the RGB components:
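
d = √( (R2 − R1)² + (G2 − G1)² + (B2 − B1)² )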

Using the same random seed (so the same color sequence is generated), RGBToIndex was again called 1,000,000 times. The total time is now 0.0260 time units, a dramatic improvement.

Image converted using Euclidean differences in sRGB color space

There are noticeable differences at the base of the tree trunk, where the CIELAB method performed better, and also on the grass, where the sRGB method apparently produced a nicer-looking result.

Another improvement is possible if we realize that the absolute distance between two colors is not important, only their relative order. Since the square root is a monotonically increasing function, √a is less than √b exactly when a is less than b, so comparing squared distances gives the same ordering as comparing the distances themselves. This means we can skip a costly floating-point square root operation and work only with integers. Using this little optimization, the total time drops to 0.0015 time units, another substantial improvement.

Same as before, but without square roots
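
A sketch of such an integer-only difference function, matching the callback assumed in the RGBToIndex sketch above (names are again illustrative):

```c
#include <stdint.h>

typedef struct {
    uint8_t r, g, b;
} RGB;

/* Squared Euclidean distance in sRGB: no square root is needed because
   only the relative ordering of the results matters. */
uint32_t difference(RGB a, RGB b) {
    int32_t dr = (int32_t)a.r - b.r;
    int32_t dg = (int32_t)a.g - b.g;
    int32_t db = (int32_t)a.b - b.b;
    return (uint32_t)(dr * dr + dg * dg + db * db);
}
```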

The red line in the picture below shows the Euclidean distance between the black dot on the bottom left and the black dot on the top right (in 2D space, but the same principle applies to 3D space). The colors in the picture are arbitrary; they do not relate to RGB values. The blue and green lines use what is called Manhattan distance: instead of tracing the shortest path between two points, we are forced to walk along the gray lines, taking corners as we wish. Using this metric, the blue line is 18 units long (9 horizontal + 9 vertical). The green line is also 18 units (4 + 3 + 3 + 3 + 2 + 3), showing that, no matter which path we take on the grid, the distance is the same.

“Manhattan” distance between two points in 2D space

The theoretical advantage of this method over the previous one is that no multiplications are necessary, only additions. It no longer gives a direct representation of Euclidean distance, and the ordering it produces can occasionally differ from the Euclidean one, but it remains a reasonable measure of closeness. The RGBToIndex run with this new distance function completes in 0.0015 time units, a tie with the previous method. The quality of the image is also very similar to the previous two results that used Euclidean distances.
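
A sketch of a Manhattan-style difference function, again with illustrative names:

```c
#include <stdint.h>
#include <stdlib.h>   /* abs() */

typedef struct {
    uint8_t r, g, b;
} RGB;

/* Manhattan distance in sRGB: the sum of the absolute component
   differences, requiring only additions and absolute values. */
uint32_t difference(RGB a, RGB b) {
    return (uint32_t)(abs((int)a.r - b.r) +
                      abs((int)a.g - b.g) +
                      abs((int)a.b - b.b));
}
```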

Image converted using Manhattan differences in sRGB color space

So far, all methods use what is called a linear or sequential search. Iterating over all entries in the palette is an O(n) operation. In theory, it is possible to find the closest color without testing every palette entry, but for that, we need a special data structure called a K-D Tree, or K-Dimensional Tree. In our case, K is 3, because there are three dimensions: red, green and blue. A K-D Tree is like a binary search tree in which the splitting dimension cycles at each level of the tree. Every non-leaf node splits the color space into two half-spaces. For example, to the left of the root node are all colors with red values less than the root's, and to the right are all colors with red values greater than the root's. On the second level the green dimension is used, on the third level blue is used, and then the cycle repeats. In practice, however, the K-D Tree implementation I have tested performed worse than all the methods using Euclidean or Manhattan distances.
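
For reference, here is a minimal sketch of what such a structure and its nearest-neighbor search could look like; the node layout, the squared-distance metric and all names are my own, not the implementation I benchmarked:

```c
#include <stdint.h>
#include <stddef.h>

typedef struct KDNode {
    uint8_t color[3];            /* R, G, B of this palette entry      */
    uint8_t index;               /* index of the entry in the palette  */
    struct KDNode *left, *right; /* children on each side of the split */
} KDNode;

/* Squared Euclidean distance between two RGB triplets. */
static int32_t sqDist(const uint8_t a[3], const uint8_t b[3]) {
    int32_t dr = a[0] - b[0], dg = a[1] - b[1], db = a[2] - b[2];
    return dr*dr + dg*dg + db*db;
}

/* Recursive nearest-neighbor search. depth selects the splitting axis:
   0 = red, 1 = green, 2 = blue, then the cycle repeats. */
static void nearest(const KDNode *node, const uint8_t target[3], int depth,
                    const KDNode **best, int32_t *bestDist) {
    if (node == NULL) return;

    int32_t d = sqDist(node->color, target);
    if (d < *bestDist) { *bestDist = d; *best = node; }

    int axis = depth % 3;
    int32_t diff = (int32_t)target[axis] - node->color[axis];
    const KDNode *nearSide = diff < 0 ? node->left  : node->right;
    const KDNode *farSide  = diff < 0 ? node->right : node->left;

    nearest(nearSide, target, depth + 1, best, bestDist);
    /* Only descend into the far half if it could still contain a
       point closer than the best one found so far. */
    if (diff * diff < *bestDist)
        nearest(farSide, target, depth + 1, best, bestDist);
}
```

A tree like this still has to be built from the palette, and with only 256 entries the pointer chasing and bookkeeping can easily outweigh the savings in comparisons, which is consistent with the result reported above.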

PumpkinOS, the PalmOS re-implementation I have been developing, depends on an efficient implementation of RGBToIndex to perform complex bitmap rendering using multiple, incompatible color palettes. After much experimentation, the current distance function is based on the squared Euclidean method.
