by enz on 6/5/22, 1:27 PM with 46 comments
by ThePhysicist on 6/5/22, 7:04 PM
TCPLS seems like a good proposal but I won't hold my breath for it as the big players like Cloudflare, Google, Akamai, Fastly are already committed to QUIC. There's still significant churn in the proposals and some features like MASQUE are still being finalized, but overall QUIC works really well.
by scarmig on 6/5/22, 5:00 PM
by throwaway787544 on 6/5/22, 5:08 PM
Compare that to finalizing a new protocol, putting it in the kernel, and letting all applications just connect() and send() and recv(). We lose the ability to constantly mutate the protocol whenever FAANG feels like it, but we also gain the ability for every application to use the new protocol just by changing a flag passed to a syscall.
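To make the point concrete, here is a minimal sketch of what "just change a flag passed to a syscall" looks like from an application's perspective. The `IPPROTO_TCPLS` constant is invented here purely for illustration; only the TCP path below is real today.

```python
import socket

# Today an application opts into TCP explicitly via well-known constants,
# and the kernel handles the protocol machinery:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
s.close()

# A kernel-implemented TCPLS could in principle be selected the same way,
# e.g. socket.socket(AF_INET, SOCK_STREAM, IPPROTO_TCPLS) -- a hypothetical
# constant, invented here. connect()/send()/recv() would stay unchanged,
# which is exactly the "every application gets it for free" argument above.
```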
by gsliepen on 6/5/22, 6:58 PM
by a-dub on 6/5/22, 5:18 PM
by simmervigor on 6/5/22, 2:21 PM
https://twitter.com/alagoutte/status/1532013841718120449?t=W...
by badrabbit on 6/5/22, 6:44 PM
by Matthias247 on 6/5/22, 9:05 PM
There are some good ideas in this one! Keeping a TCP stream per TCPLS stream while sharing some crypto state means it can be really efficient, since all optimizations that exist for TCP (including hardware offloads) continue to work. That is also visible in the efficiency benchmarks [1] on the site.
One might even argue that with such a design, where the most expensive parts of TLS (handshakes) become no-ops and introduce no additional latency, techniques like HTTP/2 multiplexing become unnecessary - one could just keep a single TCP stream per request and enjoy better flow control, fairness, and observability than with anything application-managed.
However, there are some challenges that need to be resolved for running protocols on real-world infrastructure and datacenters, which change the picture a bit. One is that a single IP address announced by a large-scale web service doesn't necessarily point to a single server, but to e.g. a rack of them or even bigger units. This means that a second TCP connection might be (or even should be) established to a different server - one which would not be aware of the TLS session state. If the state is kept on a different server, the original one would need to forward all packets, either via IP-in-IP encapsulation or by creating a separate upstream TCP connection and proxying all data bidirectionally. All those approaches would be fairly expensive - every time a TLS session is joined, the number of packets handled by a server would be double that of the ideal version.
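The reason the second TCP connection can land on a different server is that typical L4 load balancers spread traffic by hashing the connection 4-tuple, and a new connection has a new source port. A toy sketch (the rack names, addresses, and hashing scheme are all invented here; real balancers use consistent hashing or connection tracking, but the effect is the same):

```python
import hashlib

RACK = ["rack-a", "rack-b"]  # hypothetical units announced behind one VIP

def pick(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Illustrative L4 load balancing: deterministic hash of the 4-tuple."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return RACK[h % len(RACK)]

# Two TCP connections from the same client differ only in source port,
# so they can hash to different racks -- and the second rack holds none
# of the TLS session state the first connection established.
first = pick("198.51.100.7", 50001, "203.0.113.1", 443)
second = pick("198.51.100.7", 50002, "203.0.113.1", 443)
```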
While a similar challenge exists for QUIC, it's actually cheaper on average: since multiple streams use the same QUIC connection and usually also the same source/destination IP and port pairs, traffic would usually be directed to the same host by common infrastructure and require no special handling. It would only require effort to "fix" misdirected QUIC connections in the event of connection migrations - which are rather rare. If packet forwarding is required, it can be achieved rather cheaply and easily by interpreting the QUIC connection ID fields in all UDP packets and doing raw UDP packet forwarding.
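Extracting the connection ID really is cheap: per RFC 9000, long-header packets carry an explicit Destination Connection ID length at byte 5, and short-header packets carry the DCID right after the first byte at a length the server chose itself. A rough sketch of such a router (the backend pool, the fixed 8-byte CID length, and the hash-based default are assumptions of this example, not part of any spec):

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical pool
CID_LEN = 8      # assumed fixed server-chosen CID length (deployment choice)
migrated = {}    # CID -> backend, for the rare post-migration fix-ups

def dcid(packet: bytes) -> bytes:
    """Extract the Destination Connection ID from a QUIC packet."""
    if packet[0] & 0x80:             # long header (RFC 9000, section 17.2)
        dcil = packet[5]             # explicit DCID length field
        return packet[6:6 + dcil]
    return packet[1:1 + CID_LEN]     # short header: length known by server

def route(packet: bytes) -> str:
    """Pick a backend by CID, so all packets of one connection stick."""
    cid = dcid(packet)
    if cid in migrated:              # misdirected connection being "fixed"
        return migrated[cid]
    h = int.from_bytes(hashlib.sha256(cid).digest()[:8], "big")
    return BACKENDS[h % len(BACKENDS)]
```

Because both header forms yield the same DCID, long-header handshake packets and later short-header packets of one connection route to the same backend without any per-flow state.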
Therefore, as far as I understand the proposals so far, it might not be clear-cut whether running QUIC or TCPLS would be cheaper at an ingress or load-balancing layer - which are the layers where both protocols have the biggest benefits, due to the highest number of concurrent connections and the higher client-facing RTT.
But maybe the challenge for TCPLS could be solved by having a shared place for the TLS state between hosts and making sure streams operate fully independently - though that requires a bit more thought.
I'm wondering how close such a proposal then is to the existing TLS session resumption mechanisms - just extended by allowing more than one follow-up session.
[1] Note that the diagrams are really about efficiency and not performance. They say nothing about what throughput might be available to end-users who are e.g. 100ms away from the server. For local tests, the delay and packet loss introduced by large-scale internet infrastructure don't play a role.
by bediger4000 on 6/5/22, 3:27 PM