HTTP Compression

Data compression is the process of encoding or reformatting information so that it occupies less space than the original representation. HTTP Compression is a common technique used to improve the performance of a website by reducing its bandwidth requirements.


Data transmitted over HTTP can be compressed in different ways. The compressibility of a file typically depends on its level of data redundancy, as well as on how much data can be removed while the result stays imperceptibly different from the original, within an acceptable tolerance. Compression algorithms differ in several ways, but they can generally be divided into those that perform lossy compression and those that perform lossless compression.

Lossy compression

Lossy compression permanently removes data. Although the original representation cannot be fully recreated, the result is intended to be imperceptibly different from the original. For example, a JPEG file is a size-reduced image that relies on the fact that the human eye detects variations in luminance (brightness) more easily than variations in color. The compression algorithm works by rounding off bits that are deemed nonessential. Similarly, audio compression algorithms routinely remove audio data that is outside the range of human hearing. Lossy compression is also used extensively for video.

Depending on the data and the algorithm being employed, there is a trade-off between quality and size. Specifically, an image or audio file can be stripped of progressively more data, depending on how much loss of quality is acceptable in the final result.
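The quality-versus-size trade-off can be illustrated with a toy quantizer. This is only a sketch and not how JPEG or any real codec works: the function name quantize, the sample values, and the idea of dropping low-order bits from 8-bit samples are assumptions made for illustration.

```python
def quantize(samples: list[int], dropped_bits: int) -> list[int]:
    """Zero the lowest `dropped_bits` bits of each sample (a lossy step)."""
    step = 1 << dropped_bits
    return [(s // step) * step for s in samples]


# Hypothetical 8-bit samples, e.g. brightness values of neighboring pixels.
samples = [12, 13, 14, 200, 201, 203, 90, 91]

for dropped_bits in (0, 2, 4):
    coarse = quantize(samples, dropped_bits)
    # Fewer distinct values means more redundancy for a later encoder,
    # at the price of a larger worst-case error.
    worst_error = max(abs(a - b) for a, b in zip(samples, coarse))
    print(f"dropped_bits={dropped_bits} "
          f"distinct_values={len(set(coarse))} worst_error={worst_error}")
```

Once the low-order bits are gone, nothing in the quantized list allows the original values to be recovered, which is what makes the step lossy.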

Lossless compression

Algorithms that employ lossless compression are not one-way, as is the case with lossy compression. Rather, they temporarily shrink data to a size smaller than the original representation and can then fully recreate it. In the interim, the compressed data might be stored, using less storage space, or, in the case of HTTP, transmitted using less bandwidth.

The trade-off for lossless compression algorithms is time versus size. Clearly, there are diminishing returns if it takes too long to compress and then decompress the file. However, if the effective cost of bandwidth is much greater than the cost of time, then compressing the data as much as possible is the best choice. In practice, a reasonable balance is struck between the time overhead and the reduction in data size.
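The time-versus-size trade-off can be observed with Python's standard zlib module, as in the sketch below. The payload is made up for illustration, and the timings depend entirely on the machine, so no particular numbers should be expected.

```python
import time
import zlib

# Hypothetical, highly redundant payload standing in for an HTML document.
payload = b"<li>HTTP compression reduces transfer size.</li>\n" * 2000

for level in (1, 6, 9):  # 1 = fastest, 9 = smallest output
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Lossless: decompressing always yields the exact original bytes.
    assert zlib.decompress(compressed) == payload
    print(f"level={level} original={len(payload)} "
          f"compressed={len(compressed)} time={elapsed_ms:.2f} ms")
```

A server would typically pick a middle level for content compressed on the fly and reserve the highest levels for static files that are compressed once and served many times.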

End-to-end compression

In this context, end-to-end compression refers to compressing data at the origin and transmitting it through as many intermediaries as needed until it reaches the destination, where it is decompressed. Proxies and middleboxes such as load-balancers or firewalls do not decompress the data. An exception is a security-related middlebox that is used for traffic inspection. Other than in this case, the only transmission delay is the transfer itself, with no overhead for intermediate decompression and examination of the message.

Several end-to-end compression algorithms exist, but the two most popular are gzip and Brotli (br). Brotli was designed with web content in mind and includes a built-in dictionary of strings that are common in such content. Servers use proactive Content Negotiation to choose an algorithm, based on what the client is willing to accept. The client is responsible for listing its supported compression algorithms in the Accept-Encoding header of the HTTP request.
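The server side of this negotiation might look like the following sketch. The function names parse_accept_encoding and choose_encoding are hypothetical, and the logic is deliberately simplified: it honors q-values and the * wildcard, but falls back to identity (no compression) in every other case, including when the header is empty.

```python
def parse_accept_encoding(header: str) -> dict[str, float]:
    """Map each listed coding to its q-value (1.0 when none is given)."""
    prefs: dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs[name.strip().lower()] = q
    return prefs


def choose_encoding(header: str, supported: list[str]) -> str:
    """Pick the client's highest-weighted coding that the server supports.

    `supported` is in server preference order, which breaks ties.
    """
    prefs = parse_accept_encoding(header)
    best, best_q = None, 0.0
    for name in supported:
        q = prefs.get(name, prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = name, q
    return best or "identity"


print(choose_encoding("zstd, gzip, br, deflate", ["gzip"]))
print(choose_encoding("gzip;q=0.5, br;q=1.0", ["gzip", "br"]))
```

Keeping the server's own preference order as the tie-breaker lets an operator favor whichever algorithm is cheapest to run when the client rates several equally.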

Other standard values include compress, deflate, deflate-raw, Efficient XML Interchange (exi), pack200-gzip, and Zstandard compression (zstd), as well as identity, which indicates that no compression is applied. Several additional unofficial compression algorithms may be available, depending on the server.


Servers are not obligated to employ compression on the data, regardless of what the client HTTP request includes.


Newer browsers, such as Chromium-based ones, also support deflate-raw, which gives web developers access to the raw deflate stream without headers or footers. This is needed, for example, when reading and writing ZIP files.
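The difference between a raw deflate stream and its wrapped forms can be sketched with Python's zlib module, where the wbits parameter selects the container: a negative value produces a raw stream, 15 adds the zlib header and checksum, and 31 adds the gzip header and trailer. The helper name deflate and the sample data are assumptions for illustration.

```python
import zlib


def deflate(data: bytes, wbits: int) -> bytes:
    """Compress `data`, with `wbits` selecting the container format."""
    compressor = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=wbits)
    return compressor.compress(data) + compressor.flush()


data = b"raw deflate has no header or footer " * 50

raw = deflate(data, -15)           # raw deflate: no header, no checksum
zlib_wrapped = deflate(data, 15)   # zlib container around the same kind of stream
gzip_wrapped = deflate(data, 31)   # gzip container around the same kind of stream

print(len(raw), len(zlib_wrapped), len(gzip_wrapped))

# The decoder has to be told which container to expect.
assert zlib.decompress(raw, wbits=-15) == data
assert zlib.decompress(zlib_wrapped, wbits=15) == data
assert zlib.decompress(gzip_wrapped, wbits=31) == data
```

Formats such as ZIP supply their own headers and checksums around each entry, which is why they need the raw stream rather than a wrapped one.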

Hop-by-hop compression

Hop-by-hop compression is similar to end-to-end compression, although it differs both in concept and operation. Importantly, the server does not compress the message body, and the client does not deal with a compressed message body. Instead, intermediary nodes use a negotiation mechanism to compress data only between hops. Because control rests with the intermediaries, the transmission between any two nodes can use a different compression algorithm, or no compression at all. For the final transmission to the client, compression is not used.

As part of this protocol, the node sending the request uses the TE header to tell the receiving node which compression algorithms it supports. When the response comes back, the algorithm in use is specified in the Transfer-Encoding header.


HTTP/2 has specific restrictions on the use of the TE header; namely, it can only contain the keyword “trailers”.


Hop-by-hop compression is rarely used and several server implementations do not have an easy way to configure the functionality.


In this end-to-end compression example, the client requests an HTML file and lists zstd, gzip, br, and deflate in the HTTP Accept-Encoding request header field. The server chooses gzip and in the HTTP response, indicates this in the HTTP Content-Encoding response header field.


GET /news.html HTTP/1.1
Accept-Encoding: zstd, gzip, br, deflate


HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 5000
Content-Encoding: gzip

<compressed version of the resource is returned>
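The exchange above can be approximated in Python with the standard gzip module. This is a minimal sketch rather than a real server or client: the function names build_gzip_response and read_body and the sample body are hypothetical, and the headers are modeled as a plain dictionary.

```python
import gzip


def build_gzip_response(body: bytes) -> tuple[dict[str, str], bytes]:
    """Server side: compress the body and describe it in the headers."""
    compressed = gzip.compress(body)
    headers = {
        "Content-Type": "text/html",
        "Content-Encoding": "gzip",
        # Content-Length counts the bytes on the wire, i.e. the compressed size.
        "Content-Length": str(len(compressed)),
    }
    return headers, compressed


def read_body(headers: dict[str, str], payload: bytes) -> bytes:
    """Client side: undo the encoding named in Content-Encoding, if any."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


html = b"<html><body>" + b"<p>Breaking news.</p>" * 200 + b"</body></html>"
headers, payload = build_gzip_response(html)
print(headers["Content-Encoding"], len(html), headers["Content-Length"])
assert read_body(headers, payload) == html
```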


HTTP Compression is used to improve performance by reducing the bandwidth required for HTTP messages. The network appears to perform better, but in fact less data is being transmitted. Depending on whether the compression is lossless or lossy, the received version is either identical to the original or imperceptibly different from it. Content Negotiation is used to choose a compression algorithm, and the process can be performed end-to-end, between server and client, or hop-by-hop, between intermediate nodes.

Last updated: August 2, 2023