
An Analysis of PNG Image Compression Principles

What is PNG?

The full name of PNG is Portable Network Graphics. It is currently one of the most popular image formats for online transmission and display, for the following reasons:

Lossless Compression: PNG compresses image data with an algorithm derived from LZ77, so files become smaller without any loss of data.


Small File Size: PNG uses special encoding methods to mark repeatedly occurring data, so images with large flat areas or few colors (icons, screenshots, line art) are often smaller than the same content in other lossless formats. Because bandwidth in network communication is limited, PNG is a common choice when an image must stay clear and faithful to the original.


Support for Transparency Effects: PNG supports 256 levels of transparency per pixel, so the edges of an image can blend smoothly with any background. GIF only allows a pixel to be fully transparent or fully opaque, and JPEG has no transparency at all.

Types of PNG

There are three main types of PNG images: PNG 8, PNG 24, and PNG 32.


PNG 8: The "8" in PNG 8 refers to 8 bits (one byte) per pixel. Each pixel stores an index into a color palette, and since 2^8 equals 256, a PNG 8 image can contain at most 256 distinct colors. If an image uses very few colors, PNG 8 is a very suitable choice.


PNG 24: The "24" in PNG 24 is equivalent to 3 multiplied by 8 (3×8 = 24), meaning three 8-bit channels are used to represent R (Red), G (Green), and B (Blue) respectively. With R, G, and B each ranging from 0 to 255, PNG 24 can display 256×256×256 = 16,777,216 colors. As a result, PNG 24 can represent images with richer colors than PNG 8, but it occupies relatively more storage space.


PNG 32: The "32" in PNG 32 is PNG 24 plus an 8-bit alpha (transparency) channel, giving four channels: R (Red), G (Green), B (Blue), and A (Alpha), each ranging from 0 to 255. PNG 32 can therefore display the same number of colors as PNG 24 while also supporting 256 levels of transparency per pixel, at the cost of more storage space.
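The arithmetic behind the three types can be checked with a few lines of Python. This is only a sketch: the helper name raw_size_bytes and the 100×100 image size are illustrative assumptions, and the sizes are for raw pixel data before any compression.

```python
# Bits per pixel for the three informal PNG types described above.
BITS_PER_PIXEL = {"PNG 8": 8, "PNG 24": 24, "PNG 32": 32}


def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the uncompressed pixel data, ignoring headers and filter bytes."""
    return width * height * bits_per_pixel // 8


palette_colors = 2 ** 8   # PNG 8: at most 256 palette entries
rgb_colors = 256 ** 3     # PNG 24: 256 x 256 x 256 colors
alpha_levels = 2 ** 8     # PNG 32: 256 transparency levels per pixel

print(palette_colors, rgb_colors, alpha_levels)

# Hypothetical 100 x 100 image, used only to compare the three types.
for name, bpp in BITS_PER_PIXEL.items():
    print(name, raw_size_bytes(100, 100, bpp), "bytes before compression")
```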

Data Structure of PNG Images

The data structure of a PNG image is similar to that of an HTTP request, consisting of a header followed by multiple data chunks, as shown in the figure below:


If you open a PNG image in Vim’s hex viewing mode (for example with :%!xxd), it will look like the following:


The meaning of this series of hexadecimal codes is as follows:


89504e470d0a1a0a: This is the header (signature) of a PNG image. Every PNG file starts with these 8 bytes, and image software uses them to determine whether a file is in PNG format.

0000000d: This is the length of the IHDR chunk's data, which is 13 bytes.

49484452: This is the chunk type, which is "IHDR", followed immediately by the data.

000002bc: This represents the width of the image (0x2bc = 700 pixels).

000003a5: This represents the height of the image (0x3a5 = 933 pixels).


And so on: each segment of hexadecimal code has a specific meaning. There are too many to analyze one by one here, so feel free to look up the rest on your own, friends.
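The fields walked through above can also be read programmatically. The following is a minimal sketch using only Python's standard library; the function name read_ihdr is made up for illustration, and the last five bytes of the example (bit depth, color type, and so on) are assumed values, since the original hex dump is not reproduced here.

```python
import struct

PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")


def read_ihdr(data: bytes) -> dict:
    """Parse the signature and the IHDR chunk at the start of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Every chunk starts with a 4-byte big-endian length and a 4-byte type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    # IHDR data: width, height (4 bytes each), then five 1-byte fields.
    fields = struct.unpack(">IIBBBBB", data[16:29])
    names = ("width", "height", "bit_depth", "color_type",
             "compression", "filter_method", "interlace")
    return dict(zip(names, fields))


# Bytes quoted in the text, plus ASSUMED values for the remaining IHDR fields
# (8-bit depth, color type 6 = RGBA, no interlacing).
example = bytes.fromhex(
    "89504e470d0a1a0a"  # signature
    "0000000d"          # IHDR data length = 13
    "49484452"          # "IHDR"
    "000002bc"          # width
    "000003a5"          # height
    "0806000000"        # assumed: bit depth, color type, compression, filter, interlace
)
print(read_ihdr(example))
```

In a real file the IHDR data is followed by a 4-byte CRC and then the next chunk; the sketch stops after the 13 data bytes.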

Which PNG Images Are More Suitable for Compression?

For regular PNG images, the more uniform the colors, the fewer the color values, and the smaller the color differences, the higher the compression ratio. Take the image below as an example:


It consists only of red and green. If we use "0" to represent red and "1" to represent green, the image can be expressed numerically as follows:

00000000000000000

00000000000000000

00000000000000000

1111111111111111111111111

1111111111111111111111111

1111111111111111111111111


As we can see, this image contains a lot of repeated numbers. Instead of storing every pixel, we can store each value once together with how many times it repeats: a run of 0s followed by a run of 1s. A handful of numbers is then enough to describe a large image, which significantly reduces the PNG file size.
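To make the idea concrete, here is a small run-length encoding sketch. Note that this is a simplification for illustration only: PNG itself does not use run-length encoding but Deflate, which exploits the same kind of repetition. The function names and the 17-pixel row width are assumptions.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical image: three red rows (0) then three green rows (1), 17 pixels wide.
pixels = [0] * (17 * 3) + [1] * (17 * 3)

encoded = run_length_encode(pixels)
print(encoded)  # [(0, 51), (1, 51)]
print(run_length_decode(encoded) == pixels)  # True
```

The 102 pixel values shrink to two pairs, and decoding restores them exactly, which is what "lossless" means here.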


Therefore, PNG images with more uniform colors, fewer color values, and smaller color differences have higher compression ratios and smaller file sizes.

PNG Compression

PNG image compression occurs in two stages:


  1. Prediction: This is a preprocessing stage (the PNG specification calls it filtering) that prepares the image data so that it compresses more easily. It’s similar to how a person might apply primer, lotion, and serum before putting on makeup: steps that make it easier to apply foundation and eyeshadow and to set the look later.
  2. Compression: The Deflate compression algorithm is executed, which combines the LZ77 algorithm and Huffman coding to encode the image.

Prediction

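PNG images use delta encoding for preprocessing, which processes the values of each channel in every pixel. Below, X is the value being encoded, A is the corresponding value of the pixel to its left, and B is the corresponding value of the pixel directly above it. There are several types of delta encoding: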


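No filtering (None)

X-A (Sub)

X-B (Up)

X-(A+B)/2, with the division rounded down (Average)

Paeth prediction (this one is more complex: X minus whichever of the left, upper, or upper-left neighbor is closest to left + upper - upper-left)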
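The five variants listed above can be sketched as a single function. This is a minimal illustration rather than a full PNG encoder: it works on one value at a time, the names filter_value and paeth_predictor are my own, and c stands for the upper-left neighbor, which the list above does not name. Results are taken modulo 256 so they always fit in one byte.

```python
def paeth_predictor(a: int, b: int, c: int) -> int:
    """Return whichever of a (left), b (above), c (upper-left) is closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    # Ties are broken in the order a, b, c.
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def filter_value(filter_type: int, x: int, a: int, b: int, c: int) -> int:
    """Apply filter type 0-4 to the value x, given its neighbors a, b, and c."""
    if filter_type == 0:      # no filtering
        prediction = 0
    elif filter_type == 1:    # X - A
        prediction = a
    elif filter_type == 2:    # X - B
        prediction = b
    elif filter_type == 3:    # X - (A + B) / 2, rounded down
        prediction = (a + b) // 2
    elif filter_type == 4:    # Paeth
        prediction = paeth_predictor(a, b, c)
    else:
        raise ValueError("filter type must be 0-4")
    return (x - prediction) % 256


# Example with made-up neighbor values: x=12, left=10, above=11, upper-left=9.
for t in range(5):
    print(t, filter_value(t, 12, 10, 11, 9))
```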


Suppose a PNG image looks like this:


This image is a gradient map with gradually increasing red intensity, where the red color strengthens from left to right. When mapped to an array, its values are [1, 2, 3, 4, 5, 6, 7, 8]. Using the X-A delta encoding method, the calculation would be:

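[1-0=1, 2-1=1, 3-2=1, 4-3=1, 5-4=1, 6-5=1, 7-6=1, 8-7=1]

(The first pixel has no left neighbor, so A is taken as 0. Keeping this first value is what allows the decoder to rebuild the original row.)

The resulting array is:

[1, 1, 1, 1, 1, 1, 1, 1]

This final result [1, 1, 1, 1, 1, 1, 1, 1] contains a large number of repeated numbers, making it highly suitable for compression.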
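The X-A calculation can be written out as a short sketch. It assumes a single-channel row with one byte per pixel, and it keeps the first value by treating the missing left neighbor as 0, which is how the PNG Sub filter handles the start of a row, so the output has eight entries. The function names are illustrative.

```python
def sub_filter(row):
    """Apply X - A to a row; the neighbor left of the first value is taken as 0."""
    out = []
    left = 0
    for x in row:
        out.append((x - left) % 256)  # modulo 256 keeps each result in one byte
        left = x
    return out


def undo_sub_filter(filtered):
    """Reverse sub_filter by adding each difference back onto the running value."""
    out = []
    left = 0
    for d in filtered:
        left = (d + left) % 256
        out.append(left)
    return out


row = [1, 2, 3, 4, 5, 6, 7, 8]
filtered = sub_filter(row)
print(filtered)                   # [1, 1, 1, 1, 1, 1, 1, 1]
print(undo_sub_filter(filtered))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the step is exactly reversible, no information is lost; the data is only rearranged into a form that compresses better.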


This explains why gradient images, images with minimal color value changes, and images with uniform colors are easier to compress.


The purpose of delta encoding is to convert PNG image data values into a set of repeated, low-magnitude values as much as possible, as such values are more easily compressed.


It’s important to note that delta encoding handles each color channel separately: the R (Red), G (Green), B (Blue), and A (Alpha) values of a pixel are each compared with the same channel of the neighboring pixel.

Compression

In the compression stage, the results from the preprocessing stage undergo Deflate compression, which consists of Huffman coding and LZ77 compression.


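As mentioned earlier, Deflate compression looks for repeated data in the image. LZ77 replaces a repeated sequence with a short reference back to its earlier occurrence, and Huffman coding then assigns shorter codes to the values that occur most often. Together they produce compact encoded data for the PNG image.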


Deflate is an algorithm for compressing data streams. It can be used wherever streaming compression is required.
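Python's standard zlib module exposes Deflate, so the effect of repetition on the compressed size is easy to observe. The sketch below compares highly repetitive data (like the output of delta encoding on a smooth gradient) with random bytes; the variable names and the 4096-byte size are arbitrary choices for illustration.

```python
import os
import zlib

# 4096 identical bytes, e.g. what delta encoding produces for a smooth gradient.
repetitive = bytes([1]) * 4096
# 4096 random bytes, standing in for noisy image data with no repetition.
noisy = os.urandom(4096)

for label, data in (("repetitive", repetitive), ("noisy", noisy)):
    compressed = zlib.compress(data, 9)         # level 9 = best compression
    assert zlib.decompress(compressed) == data  # lossless round trip
    print(label, len(data), "->", len(compressed), "bytes")
```

The repetitive input shrinks to a few dozen bytes, while the random input barely shrinks at all, which is why the prediction stage matters so much.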


Additionally, as we noted earlier, a PNG image is composed of many data chunks, but some of the information in these chunks is not needed to display the image. For example, when a PNG image is saved in Photoshop, there is a chunk that records "this image was created by Photoshop." Many such pieces of information are unnecessary. Using Photoshop’s "Export for Web" function can remove this information. The comparison before and after exporting for the web format is shown in the figure below:


As can be seen, after exporting in the web format and removing a lot of unnecessary information, the image has become significantly smaller.
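The same kind of cleanup can be sketched in code. In PNG, a chunk whose type starts with a lowercase letter is "ancillary" (optional), so a simple approach is to keep only the other chunks. This is an illustrative sketch, not what Photoshop does internally: the function names are my own, the tiny demo image and its "Software" text are made up, and tRNS is kept by default because dropping it would change transparency. Other ancillary chunks such as color profiles can also affect rendering, so a real tool should be more selective.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a chunk: length, type, data, then a CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk after the 8-byte signature."""
    pos = 8
    while pos < len(png):
        length, chunk_type = struct.unpack(">I4s", png[pos:pos + 8])
        yield chunk_type, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def strip_ancillary(png: bytes, keep=(b"tRNS",)) -> bytes:
    """Drop ancillary chunks (lowercase first letter) except those listed in keep."""
    out = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png):
        if chunk_type[:1].isupper() or chunk_type in keep:
            out.append(make_chunk(chunk_type, data))
    return b"".join(out)


# Made-up 1x1 grayscale PNG with a text chunk, built only for this demo.
demo = PNG_SIGNATURE + b"".join([
    make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)),
    make_chunk(b"tEXt", b"Software\x00Example Editor"),
    make_chunk(b"IDAT", zlib.compress(b"\x00\x00")),  # filter byte + one pixel
    make_chunk(b"IEND", b""),
])

stripped = strip_ancillary(demo)
print([t for t, _ in iter_chunks(demo)])
print([t for t, _ in iter_chunks(stripped)])
print(len(demo), "->", len(stripped), "bytes")
```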