The IETF has formed a working group to develop and maintain the new HTTP/2 standard. They are using Google's SPDY protocol as a starting point. I have of course already taken a look at SPDY and I'm not quite happy with it. I'll explain why. Of course, I'm looking at SPDY from the webserver's point of view and therefore not at what it means for the browser.
Compression: With SPDY, all content is compressed. This sounds great, but it isn't. The biggest part of the content of a website consists of images, which are already compressed and can't usefully be compressed any further. Compressing something that can't be compressed is just a waste of CPU power. Compression is a good thing, but it should be optional, just like it is now. A webserver should have a configuration option to compress only files that actually benefit from it, like HTML, CSS and JavaScript.
HTML is something that can be compressed very well. The creators of PHP know this. That's why they gave PHP compression support (via the zlib.output_compression setting). I really believe other languages should have the same.
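To make the "compress only what benefits from it" idea concrete, the decision could look roughly like this. A minimal sketch, where the list of compressible MIME types and the size check are just my own assumptions:

```python
import gzip

# MIME types that typically shrink well; images, video and archives are
# already compressed and are passed through untouched. (Assumed list.)
COMPRESSIBLE = {"text/html", "text/css", "text/plain",
                "application/javascript", "application/json", "image/svg+xml"}

def maybe_compress(body: bytes, content_type: str, accept_encoding: str):
    """Return (body, extra headers): gzip only when it is likely to pay off."""
    if content_type in COMPRESSIBLE and "gzip" in accept_encoding:
        compressed = gzip.compress(body)
        if len(compressed) < len(body):   # keep it only if it actually shrank
            return compressed, {"Content-Encoding": "gzip"}
    return body, {}
```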
Encryption: According to the SPDY specs, all content should also be encrypted. Encrypting everything is a bad idea. Most content on the internet isn't confidential, so why encrypt it? It's again a waste of CPU power. And since most people who use the internet don't know what encryption means or what to do in case of a certificate error, encrypting everything doesn't make the internet any safer.
It also makes hosting a webserver more expensive, since it requires a certificate. Especially for people, like me, who have a lot of hobby websites. If all those websites required a certificate, it would cost a few hundred euros per year. Quite expensive for some hobby websites that work just fine with HTTP/1.1.
Server push: A SPDY-enabled server can push items to a client when it knows that the client needs them. My question is: how does a server know what a client needs? Due to caching, it is very well possible that a client already has the resource the server is about to push. And for those few rare cases in which a server is 100% sure the client can't have the resource yet, is it really worth the trouble of implementing this feature?
Multiplexing: a web performance researcher has tested SPDY with real-world sites and found that SPDY doesn't offer the same performance boost as it does with sites in test environments. The reason is that most websites have resources located on multiple servers. The benefit of multiplexing only shows when all resources come from the same server. In my opinion, thinking that this will change because of SPDY is naive. The multiplexing part of SPDY is the main reason why I think that SPDY is a protocol by Google, for Google.
Instead of the SPDY features, I think the following things are far easier to implement and still offer a real improvement:
Send certain headers only once: In HTTP/2, at least the following headers should only be sent in the first request: Host, User-Agent, Accept-Encoding, Accept-Language, Accept-Charset. Those headers also count for the following requests on the same connection. Not sending them over and over again saves a lot of bytes.
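As a rough sketch of the server side of this idea (the merging logic below is just an illustration, not a wire-format proposal):

```python
# Sketch: headers from the first request on a connection are remembered and
# reused for later requests on that same connection. The header names are the
# ones listed above; everything else is an illustration.
STICKY = ("Host", "User-Agent", "Accept-Encoding", "Accept-Language", "Accept-Charset")

class ConnectionState:
    def __init__(self):
        self.sticky_headers = {}

    def effective_headers(self, request_headers: dict) -> dict:
        # Remember the sticky headers the first time we see them ...
        for name in STICKY:
            if name in request_headers:
                self.sticky_headers[name] = request_headers[name]
        # ... and fill them in for requests that left them out.
        merged = dict(self.sticky_headers)
        merged.update(request_headers)
        return merged
```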
Connection header deprecated: Keep-alive is the default in HTTP/2. A client or server can disconnect at any time. Telling it via the Connection header is unnecessary.
Related-Resource: A server application can optionally list all related resources via a Related-Resource HTTP header. Via this header, a browser knows what other resources to request before the entire body has been received. Via request pipelining, those resources can then be requested very quickly when needed. This is the same as the server hint in SPDY.
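What sending such a header could look like, as a sketch (the comma-separated value syntax is my own assumption, nothing official):

```python
# Sketch: a server application lists the resources the page will need, so a
# browser can pipeline requests for them before the body has fully arrived.
# The comma-separated syntax is an assumption, not a specification.
def related_resource_header(paths):
    return "Related-Resource: " + ", ".join(paths)

print(related_resource_header(["/css/style.css", "/js/app.js", "/img/logo.png"]))
# -> Related-Resource: /css/style.css, /js/app.js, /img/logo.png
```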
These are my first thoughts about HTTP/2 and SPDY. I'd really like to hear yours.
https://blog.torproject.org/blog/some-thoughts-crime-attack
Compression:
Very good point. From the websites I read every month, I guess about 95% of the traffic is due to images: already-compressed JPG, PNG and GIF. 1 MB of binary image data is a small image, but 1 MB of HTML text is dozens of pages!
Encryption:
Yeah! Encryption is a tricky business. If you do it right, you are well off. If you don't know what you are doing, and most people fall into this category, you get a false sense of security. A nightmare might arise. So an omnipresent encryption scheme may not be a good solution after all.
Regarding certificates, I couldn't agree more with Hugo. Trading/selling certificates is big business and the independent hobbyist or semi-pro will eventually lose. I am sure some browsers will refuse to work with self-created certificates in the future.
Server push:
I am not competent on that, but I have a feeling it opens up many security issues. Any opinions on that? Anyone?
Multiplexing:
Seems to me 99% marketing. People always want it faster, bigger, brighter.
The way you explain SPDY, I agree that it is not that useful. (But hey, it enables new discussions...) You propose some alternative HTTP/2.0 ideas. I think there are several important things wrong with your proposal.
HTTP/1.x is stateless. With your suggestion you introduce state: the server has to store initial values. This may introduce new 'challenges' to take care of. For example, how does the client know that the server has up-to-date header values? The server may have decided to remove the client's headers from its cache. A solution would be adding some sort of handshake protocol: the client sends its headers only when it thinks it is necessary; if the client does not send its headers and the server has no headers for the client, then the server can send a reply back saying "please send me your headers...". If this kind of handshake has to be executed too often, it will be an unwanted penalty.
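Something like the following, purely as an illustration; the status code, reason phrase and list of required headers are all made up:

```python
# Sketch of the handshake described above. Everything here is an illustration,
# not part of SPDY or any HTTP/2.0 draft.
REQUIRED = ("Host", "User-Agent", "Accept-Language")

def check_headers(cached_headers: dict, request_headers: dict):
    known = dict(cached_headers, **request_headers)
    if all(name in known for name in REQUIRED):
        return None                                  # handle the request normally
    return (400, "Please send me your headers")      # hypothetical "resend headers" reply
```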
I agree that keep-alive should be the standard. But it should not be used as a 'state'-control mechanism. As far as I understand it, keep-alive should always be set to a short time, so that it serves multiplexing for free. Setting the keep-alive window too high may lead to unneeded waste of server resources.
I believe that another solution to reduce the number of bytes needed for each HTTP request could be introducing binary headers (as in IPv6, for example). It would remove the need for spelling out the full text description of each field, yet the headers could still be sent to the server with every request. It may even save more bytes than your proposed "send only once", in the case that only a few resources are requested from a server.
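As an illustration of what such a binary encoding could look like (the numeric IDs and the length field are entirely made up):

```python
import struct

# Toy binary header encoding for the idea above. The numeric IDs are made up;
# a real spec would need a registry of well-known header names.
HEADER_IDS = {"Host": 1, "User-Agent": 2, "Accept-Encoding": 3, "Accept-Language": 4}

def encode_headers(headers: dict) -> bytes:
    out = b""
    for name, value in headers.items():
        data = value.encode("utf-8")
        # 1 byte header ID + 2 bytes value length + the value itself
        out += struct.pack("!BH", HEADER_IDS[name], len(data)) + data
    return out

encoded = encode_headers({"Host": "example.org", "Accept-Language": "en"})
# As text, "Host: example.org\r\n" alone is 19 bytes; here it fits in 1 + 2 + 11 = 14.
```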
Of course, one can decide to have both HTTP/1.1 and HTTP/2.0 as accepted standards. Then this whole post may be pointless... It all comes down to proper design decisions.
If the client didn't send Accept-Language, and the server doesn't remember it, the server should be able to request it from the client.
Speed-wise, this would create an additional request before *anything* useful can happen... Those extra few bytes are insignificant compared to the overhead of setting up an additional HTTP request ...
Personally, I'd go for the KISS solution and keep HTTP stateless. The advantages of not sending the headers are quite small.
Always enabling compression and encryption is senseless, just like it's almost always senseless to "always" do something, since "always" covers a whole lot of usage scenarios.
Server push is something that actually strikes me as quite useful in some cases. But I'm afraid of developers who don't understand it and will push, PUSH, *PUSH*!!!!!!!11
One more thing I'd like in HTTP/2 is virtual hosting with HTTPS.
In summary, coupling a session with a connection (whether it is multiplexed or not) is a bad idea.
For HTTP/2.0, the multiplexed connection may be the only interesting item in theory; however, I do not believe it would be worth the hassle while there is HTTP/1.1. It may just be necessary for large payloads on single connections without blocking the request queue.
Here, everyone should be careful about what "large payload" means. Does it mean commercials pushed to you?
Making SSL or compression mandatory has some interestingly stupid implications for hosting po .. ahem static files. With sendfile() you can simply hand the file descriptor to the kernel and the kernel takes it from there. The thread that initiated the transmission returns to the pool. Simple, and there's a lot of stuff like this floating around the web. Not compressible. Not private.
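For reference, the zero-copy path described here looks roughly like the following minimal sketch (Linux sendfile(); error handling and request parsing omitted):

```python
import os
import socket

# Sketch of the zero-copy static-file path described above: the kernel copies
# the file straight to the socket and the application never touches the bytes.
# Mandatory TLS or compression would force the data back into userspace.
def serve_file(client: socket.socket, path: str):
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        client.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % size)
        offset = 0
        while offset < size:
            sent = os.sendfile(client.fileno(), f.fileno(), offset, size - offset)
            offset += sent
```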
Once you introduce SSL -or- compression, you have 2 choices for big files:
1. Chunk it out and send pieces, epoll, send piece, epoll.
2. Put OpenSSL in the kernel. That sounds scary, but it's what IIS did without bothering to expose it to IOCPs.
There is no point whatsoever in compressing the vast majority of large files since they're usually already compressed, or uncompressible.
I can't help it. What do trolls eat that sates them? Any comment on my less troll-related post above? I'm a high-level programmer, but I know how the innards work to some extent. I also don't get why they compress headers. It's not worth it. The entire idea of requiring compression is stupid. Encryption I could possibly see, because cookies can leak on some wifi network somewhere. But to do it correctly, OpenSSL will also have to be in your kernel.
Anyway, mapping a file and saying "here kernel, eat this verbatim" will always be better, no?
The coffee shop problem is much less severe than something like heartbleed in the kernel.
https://lists.w3.org/Archives/Public/ietf-http-wg/2015JanMar/0478.html
Nice enhancement for Hiawatha 10.x ;-)