Connection coalescing breaks the Internet
Connection coalescing is the dumbest idea to ever reach RFC status. I can’t believe nobody stopped it before it got this far.
It breaks everything.
Thus starts my latest opinion post.
What is connection coalescing?
It’s specified in the RFC for HTTP/2 as connection reuse, but tl;dr: If the IP addresses of hosts A and B overlap, and host A presents a TLS cert that also covers B (via an explicit CN/SAN entry or a wildcard cert), then the client is allowed to send HTTP requests directed to B on the connection that was established to A.
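As a rough sketch of that client-side decision (not any particular browser’s implementation; the function names and example hosts are made up for illustration, and real clients differ in exactly how they compare DNS results):

```python
def name_matches(cert_name: str, host: str) -> bool:
    """Match a certificate name against a hostname.

    A leading "*." is treated as covering exactly one leftmost label.
    """
    cert_name = cert_name.lower().rstrip(".")
    host = host.lower().rstrip(".")
    if not cert_name.startswith("*."):
        return cert_name == host
    first_label, sep, rest = host.partition(".")
    return bool(first_label) and sep == "." and rest == cert_name[2:]


def may_coalesce(conn_ips, new_host_ips, cert_names, new_host) -> bool:
    """Return True if requests for new_host may reuse the existing connection.

    conn_ips:     addresses the already-connected host A resolved to
    new_host_ips: addresses host B resolves to
    cert_names:   names in the cert that A presented on the connection
    """
    ips_overlap = bool(set(conn_ips) & set(new_host_ips))
    cert_covers = any(name_matches(name, new_host) for name in cert_names)
    return ips_overlap and cert_covers
```

Note what is missing from the inputs: anything at all about how the server is actually built.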
Why did they do that?
To save roundtrips and TLS handshakes. It seems like a good idea if you don’t think about it too much.
Why does it break everything?
I’ll resist just yelling “layering violation”, because that’s not helpful. Instead I’ll be more concrete.
Performing connection coalescing is a client side (e.g. browser) decision. But it implicitly mandates a very strict server architecture. It assumes that ALL affected hostnames are configured exactly the same in many regards, and indeed that the HTTP server even has the config for all hostnames.
Concrete things that this breaks:
- The server can’t have a freestanding TLS termination layer that routes to HTTP servers based on SNI.
- The HTTP server can’t reference count HTTP config fragments, since requests can come in for anything.
- Hosts with stricter TLS config and/or mTLS cannot prevent the client from leaking headers into a less secure connection by inadvertent request smuggling. Good luck not logging secrets, while still detecting it properly.
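To make the first bullet concrete, here is a toy model of an SNI-routing TLS terminator. The backend names and the `SniRoutedConnection` class are hypothetical; the only point is that the backend is picked once, at handshake time, so a coalesced request rides along to whatever backend the first hostname mapped to.

```python
# Hypothetical routing table for a freestanding TLS termination layer.
BACKENDS = {
    "a.example.com": "backend-a",
    "b.example.com": "backend-b",
}


class SniRoutedConnection:
    """One TLS connection, routed by the SNI seen in the ClientHello."""

    def __init__(self, sni: str):
        # The routing decision is made once, before any HTTP is parsed.
        self.backend = BACKENDS[sni]

    def route(self, host_header: str) -> str:
        # The TLS layer forwards bytes; it never looks at the Host header,
        # so every request on this connection lands on the same backend.
        return self.backend


conn = SniRoutedConnection("a.example.com")
first = conn.route("a.example.com")      # lands on backend-a, as intended
coalesced = conn.route("b.example.com")  # also lands on backend-a
intended = BACKENDS["b.example.com"]     # where it should have gone
```

In this model `coalesced` and `intended` differ, and nothing in the TLS layer is in a position to notice.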
I’m sure there are more ways that it breaks everything. It commits all servers everywhere forever to be locked in to how it works. Countless possible architectures can never be, because connection coalescing has already committed all servers into a very specific implementation.
Did the RFC not consider this?
Not really. It has a handwavy “oh the server can(!) send HTTP 421, and the client is then allowed to retry the request on a fresh connection”.
But how is the server even supposed to know? Merely being able to detect that this is happening forces a HUGE restriction on the server.
And it’s too late! The secret requests with cookies and other secret tokens have already been leaked to the wrong server!
Not to mention that some clients don’t implement handling 421, even if it were possible for the server to detect the situation. Which it can’t, in the general case.
So what do I do?
For any nontrivial server setup, you should probably:
- Reject all requests on a connection that don’t match the first request. And “hope” that SNI matches the first request. Or better yet, verify SNI against the Host header.
- Don’t put more than one FQDN in your TLS certs, and definitely don’t use wildcard certs. “Hope” that you catch all cases.
- Always use separate IP addresses per hostname. Like in the pre-SNI 1900s. Again “hope” that you catch all cases.
And obviously hope is not a strategy.
Nobody does any of these workarounds. The Internet (well, the web) will be broken forever.
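For what it’s worth, the first workaround could look something like the sketch below. It assumes the HTTP layer is actually handed the SNI from the handshake, which is exactly the assumption that doesn’t hold in many setups. `ConnectionGuard` and `normalize_host` are names I made up, and the reject status is a parameter because the right value is debatable.

```python
from typing import Optional


def normalize_host(value: str) -> str:
    """Lowercase a Host/:authority value, dropping any port and trailing dot."""
    host = value.strip().lower()
    if host.startswith("["):               # IPv6 literal, e.g. "[::1]:443"
        host = host[: host.find("]") + 1]
    else:
        host = host.split(":", 1)[0]       # drop an optional ":port"
    return host.rstrip(".")


class ConnectionGuard:
    """Per-connection check that every request targets a single hostname."""

    def __init__(self, sni: Optional[str], reject_status: int = 421):
        self.sni = normalize_host(sni) if sni else None
        self.first_host: Optional[str] = None
        self.reject_status = reject_status

    def status_for(self, authority: str) -> int:
        """Return 200 if the request may proceed, else the reject status."""
        host = normalize_host(authority)
        expected = self.sni if self.sni is not None else self.first_host
        if expected is None:
            # No SNI available: pin the connection to the first request.
            self.first_host = host
            return 200
        return 200 if host == expected else self.reject_status


guard = ConnectionGuard(sni="a.example.com")
ok = guard.status_for("a.example.com:443")
rejected = guard.status_for("b.example.com")
```

By the time `status_for` returns the reject status, the request has of course already been received and parsed.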
How do I even “reject”? To 421 or not to 421…
Let’s pretend that you can detect all cases of misrouted requests. What do you do? The spec allows you to return 421. But it’s a free Internet, you can do whatever you want.
If you return 421 then some clients will handle this correctly. Others will not have implemented 421 handling (it’s not mandatory), and will break in some other way.
(But remember: it’s already too late. The client has already sent you the secret request that may contain PII.)
Arguably you should return some 5xx code, so that you can more easily detect when you’ve screwed up with your certs or other SNI routing. This assumes that you monitor for 500s, in some way. Basically the logic is that it’s better to work 0% of the time than 98% of the time, since you’ll be sure to fix the former, but won’t even know why some people keep complaining when it happens to work just fine for you.
The RFC says “[421] MUST NOT be generated by proxies”. Presumably this only means forward proxies?
“A 421 response is cacheable by default”. What does a cached 421 even mean? A 421 is a layering violation. You might as well say that a TCP SYN is cacheable.
Summary
Connection coalescing considered dumb and harmful.
Further reading
- https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/
- https://blog.cloudflare.com/connection-coalescing-experiments/