HTTP Connection
Managing HTTP connections is a central concern in HTTP. The type of connection in use affects the performance, stability, and general operation of web-based applications and websites. Connection types and options have been revised with each version of HTTP, either to increase performance or to improve robustness.
Connection scope
HTTP connections between a client and a server are not created in an end-to-end manner. Rather, they are made on a hop-by-hop basis: the first HTTP connection is opened between the client and the first intermediary (such as a proxy) on the path to the server. A new HTTP connection is then created between each pair of intermediaries, and finally one more between the last intermediary and the server.
This matters because each intermediate node can use its own parameters when establishing its HTTP connections, so a single end-to-end exchange can involve connections of different types. This can affect both performance and stability.
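To make the hop-by-hop model concrete, the following Python sketch shows the bookkeeping an intermediary must do: headers such as Connection and Keep-Alive describe a single hop only, so a proxy strips them (along with anything the Connection header names) before forwarding a request over its own connection to the next hop. The header set comes from RFC 7230; the incoming request is illustrative.

    # Hop-by-hop headers (per RFC 7230) apply to one connection only,
    # so an intermediary must remove them before forwarding a message.
    HOP_BY_HOP = {
        "connection", "keep-alive", "proxy-authenticate",
        "proxy-authorization", "te", "trailers",
        "transfer-encoding", "upgrade",
    }

    def strip_hop_by_hop(headers):
        # The Connection header may also name additional per-hop headers.
        named = {h.strip().lower()
                 for h in headers.get("Connection", "").split(",") if h.strip()}
        drop = HOP_BY_HOP | named
        return {k: v for k, v in headers.items() if k.lower() not in drop}

    incoming = {
        "Host": "example.com",            # illustrative request
        "Connection": "keep-alive",       # applies only to this hop
        "Keep-Alive": "timeout=5",
        "Accept": "text/html",
    }
    print(strip_hop_by_hop(incoming))     # only Host and Accept survive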
Connection types
HTTP/1.x, which runs over TCP, offers three connection models: short-lived connections, persistent connections, and pipelining. In HTTP/1.0 and earlier, downloading the resources on a single webpage required many HTTP connections. All of these were short-lived, each transferring exactly one resource, even if that resource was just a single image on a larger multimedia page. These connections were not made in parallel, which meant that each one had to be closed before the next was established.
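The short-lived model is easy to reproduce with a raw socket: one TCP connection carries exactly one request/response exchange and is then torn down. A minimal Python sketch follows; example.com and the paths are placeholders, and Connection: close instructs the server to drop the connection after responding.

    import socket

    def fetch_once(host, path):
        # One TCP connection per resource: connect, send a single
        # request, read the single response, let the connection die.
        with socket.create_connection((host, 80)) as sock:
            request = ("GET {} HTTP/1.1\r\n"
                       "Host: {}\r\n"
                       "Connection: close\r\n\r\n").format(path, host)
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:              # server closed the connection
                    break
                chunks.append(data)
        return b"".join(chunks)

    # Fetching two resources costs two full connections (and handshakes).
    for path in ("/", "/style.css"):      # illustrative paths
        print(fetch_once("example.com", path)[:80])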
Persistent connection
To reduce the number of HTTP connections required, the persistent connection type was introduced as an extension to HTTP/1.0 (via the Connection: keep-alive header). It cuts overhead because several HTTP requests can be made over a single connection, without the cost of tearing down and re-establishing a connection for each one. On a page with many images, each of which originally required its own connection, a persistent connection is an obvious saver of time and bandwidth. In the older model, network latency was amplified primarily by the additional handshakes, each of which needed at least one full network round trip.
The persistent HTTP connection is also known as a keep-alive connection and is the default in HTTP/1.1. It has a performance advantage beyond eliminating repeated handshakes: the longer a connection stays open, the more time the TCP congestion control algorithm has to find the optimal throughput. This is because each TCP connection begins in a heavily rate-limited mode, known as slow start, to mitigate the impact of lost packets. If the network path can support more bandwidth, the sending rate is gradually increased; a connection in this tuned state is sometimes referred to as a warm connection. When an HTTP connection is dropped, this entire process starts over.
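Python's standard http.client makes the reuse visible: the sketch below sends several requests over one HTTPConnection, paying the handshake cost once and keeping the connection warm. The host and paths are placeholders; note that each response must be read in full before the next request is issued.

    import http.client

    # One persistent (keep-alive) connection serves several requests.
    conn = http.client.HTTPConnection("example.com")   # placeholder host
    for path in ("/", "/a.png", "/b.png"):             # placeholder paths
        conn.request("GET", path)       # reuses the same TCP connection
        response = conn.getresponse()
        body = response.read()          # drain before the next request
        print(path, response.status, len(body))
    conn.close()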
Persistent HTTP connections consume resources even when they are idle, so they are not maintained indefinitely. An idle connection will eventually be dropped to conserve server resources and preserve performance.
Pipelining
HTTP pipelining is similar to a persistent connection in that several HTTP requests can be made before the connection is closed. What distinguishes the two is that on a plain persistent connection, traffic proceeds as complete request/response pairs: as each request is made, the server generates and sends the corresponding response, and only then is the next request sent. With pipelining, several requests can be sent without waiting for a response, which reduces overall latency.
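As a sketch of the idea, the Python snippet below writes two requests back-to-back before reading anything, then reads both responses, which still arrive in request order. The host is a placeholder, and since modern servers and browsers have largely abandoned pipelining, an arbitrary server may not cooperate.

    import socket

    host = "example.com"                  # placeholder host
    with socket.create_connection((host, 80)) as sock:
        # Both requests are sent before any response is read; the
        # second carries Connection: close so the read loop terminates.
        pipelined = ("GET / HTTP/1.1\r\nHost: {0}\r\n\r\n"
                     "GET /about HTTP/1.1\r\nHost: {0}\r\n"
                     "Connection: close\r\n\r\n").format(host)
        sock.sendall(pipelined.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)        # responses arrive in order
            if not data:
                break
            chunks.append(data)
    print(b"".join(chunks)[:200])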
Pipelining is limited to idempotent HTTP methods such as GET, HEAD, PUT, and DELETE. The reason is that if a pipelined message is lost, the entire request can be resent without concern about side effects.
There are several problems with pipelining, and because of them it is no longer used in modern web browsers. The first stems from the complexity of a correct implementation: the size of the resources, the time required for a full network round trip, and the available bandwidth all have to be considered. Without visibility into these, lower-priority messages can arrive before critical ones, so pipelining yields only a small improvement in most cases. Implementation difficulties have also led to problems with proxies, which remain hard to detect and work around. Finally, pipelining is susceptible to the head-of-line blocking problem.
HTTP/2 replaces pipelining with a more capable mechanism: multiplexing.
Domain sharding
Domain sharding is a technique that was used to improve page load performance by spreading a site's resources across several subdomains. Because browsers cap the number of simultaneous connections per hostname, sharding lets more resources than normally allowed be retrieved concurrently. However, this technique is deprecated and is not needed with HTTP/2, which handles many requests in parallel over a single connection, subsuming the concept and any benefit of domain sharding. Worse, domain sharding can be detrimental to performance, since each extra hostname incurs its own DNS lookup, TCP handshake, and slow-start ramp-up.
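For historical context, a sharding scheme usually amounted to a deterministic mapping from asset to subdomain, so that each asset always loaded from the same shard and browser caching stayed effective. A sketch, with made-up shard hostnames, might look like this:

    import zlib

    # Historical domain-sharding sketch: spread assets across several
    # subdomains so the browser's per-host connection limit applies to
    # each shard separately. Hostnames are illustrative placeholders.
    SHARDS = ["static1.example.com", "static2.example.com",
              "static3.example.com", "static4.example.com"]

    def shard_url(asset_path):
        # A stable hash keeps each asset on one shard across page loads,
        # so cached copies are not re-fetched from a different host.
        index = zlib.crc32(asset_path.encode("utf-8")) % len(SHARDS)
        return "https://{}{}".format(SHARDS[index], asset_path)

    print(shard_url("/img/logo.png"))
    print(shard_url("/css/site.css"))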
Frames, streams, and multiplexing
HTTP/2 introduces the concept of a frame, a binary replacement for the header and body sections of an HTTP message. Simply put, a frame is the basic unit of data sent over the connection, and multiple frames make up a message. There are several frame types, including HEADERS frames and DATA frames. One advantage of packaging the HTTP header section into a binary frame is that it can be compressed (HTTP/2 uses HPACK for this), again reducing bandwidth requirements and improving performance.
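The framing itself is simple and fixed-size at the front: every HTTP/2 frame begins with a 9-octet header holding a 24-bit payload length, an 8-bit type, 8 bits of flags, and a 31-bit stream identifier (RFC 7540). The sketch below packs and unpacks that header; the frame-type table lists only a few of the defined types.

    import struct

    FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS"}

    def pack_frame_header(length, ftype, flags, stream_id):
        # 24-bit length, 8-bit type, 8-bit flags, 1 reserved bit plus a
        # 31-bit stream ID, all in network (big-endian) byte order.
        return (struct.pack(">I", length)[1:]          # low 3 bytes only
                + struct.pack(">BBI", ftype, flags, stream_id & 0x7FFFFFFF))

    def unpack_frame_header(header):
        length = int.from_bytes(header[0:3], "big")
        ftype, flags = header[3], header[4]
        stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
        return length, FRAME_TYPES.get(ftype, hex(ftype)), flags, stream_id

    raw = pack_frame_header(16, 0x1, 0x4, 1)   # HEADERS frame, stream 1
    print(len(raw), unpack_frame_header(raw))  # 9 (16, 'HEADERS', 4, 1)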
The message, or set of frames, is then carried on a stream: a bi-directional sequence of frames that can coexist with many other streams on the same HTTP connection. This is what is meant by multiplexing. Streams are identified by a stream ID to avoid collisions, and the main advantage is that a single connection can carry all of the messages in both directions. Multiplexing also solves HTTP-level head-of-line blocking, because requests sent over the same connection can be responded to out of order.
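The receiving end of multiplexing is essentially demultiplexing: group arriving frames by stream ID and reassemble each message independently. The toy sketch below shows two interleaved streams on one connection; the stream IDs and payloads are illustrative.

    from collections import defaultdict

    # Frames from two streams interleaved on a single connection.
    interleaved = [
        (1, b"HEADERS for request A"),
        (3, b"HEADERS for request B"),
        (3, b"DATA: all of B"),        # stream 3 can finish first...
        (1, b"DATA: part 1 of A"),     # ...without blocking stream 1
        (1, b"DATA: part 2 of A"),
    ]

    streams = defaultdict(list)
    for stream_id, frame in interleaved:
        streams[stream_id].append(frame)    # demultiplex by stream ID

    for stream_id, frames in sorted(streams.items()):
        print("stream", stream_id, b" | ".join(frames).decode("ascii"))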
QUIC and HTTP/3
Although HTTP/2 improves on HTTP/1.1, one remaining problem is that it only mitigates head-of-line blocking. A single TCP connection can indeed carry multiple streams, but if a packet is lost, all of the streams are blocked: TCP guarantees in-order delivery of a single byte stream, so traffic comes to a standstill until the lost packet is retransmitted and accepted. This problem is inherent to TCP and cannot easily be fixed, which is why QUIC is built on top of UDP instead.
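A toy model makes the blockage visible. TCP releases bytes to the application strictly in order, so a gap left by one lost segment holds back every HTTP/2 stream behind it, even streams that lost nothing. The segment numbers and stream labels below are illustrative.

    # Segments received so far; segment 2 was lost in transit.
    received = {
        1: "stream A, part 1",
        3: "stream B, complete",   # belongs to an unrelated stream
        4: "stream A, part 2",
    }

    delivered = []
    next_expected = 1
    while next_expected in received:   # TCP only releases in-order data
        delivered.append(received[next_expected])
        next_expected += 1

    print("Delivered to HTTP/2:", delivered)          # only segment 1
    print("Blocked awaiting retransmit of segment", next_expected)
    # Stream B's data (segment 3) is stuck even though B lost nothing.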
Because QUIC does not rely on TCP, it is not constrained by TCP's limitations. It does, however, represent a significant engineering effort: QUIC re-implements a superset of TCP's functionality, including reliability and congestion control, on top of UDP, and it tracks loss per stream so that a dropped packet stalls only the stream it belongs to.
Takeaway
HTTP connection management is central to HTTP, and it has improved with each iteration of the protocol. Connections between a client and a server are established hop-by-hop rather than end-to-end, which means that different connection types can be present along a single request's path. The version of HTTP in use dictates the available connection types, which are in turn tied to overall network performance.