I think the problem I was seeing results from a race between the client sending a request and the server closing the socket as the TimeForRequest timeout kicks in. Having scanned through section 8 of RFC 2616 on persistent connections, my understanding is that this can occur:
"A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress."
It goes on to state that the client SHOULD reopen the connection and request the object again. It looks to me as though this isn't always happening - I've seen the problem with both Chrome and IE10 on Windows. Usually this showed up as missing images on a webpage, but now and again I got a 408 error page displayed in response to the initial page request.
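To make the recovery behaviour described in the RFC concrete, here is a minimal Python sketch of a client that reopens a persistent connection and repeats the request when the server has closed it underneath it. The function name request_with_retry and the choice of exceptions are assumptions for illustration, not how Chrome or IE10 actually work. It only covers the case where the socket is dropped; a 408 response sent by the server before closing is not handled.

```python
import http.client

# Errors that typically mean the server closed an idle keep-alive
# connection just as the request was being sent.
STALE_CONNECTION_ERRORS = (
    http.client.RemoteDisconnected,
    ConnectionResetError,
    BrokenPipeError,
)

# Only idempotent requests may be repeated without user interaction.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def request_with_retry(conn, method, path, retries=1):
    """Send a request on a persistent connection, reopening it if the
    server turns out to have closed it while it was idle.

    `conn` is expected to behave like http.client.HTTPConnection.
    Returns a (status, body) tuple.
    """
    attempt = 0
    while True:
        try:
            conn.request(method, path)
            response = conn.getresponse()
            return response.status, response.read()
        except STALE_CONNECTION_ERRORS:
            # Drop the dead socket; HTTPConnection reconnects
            # automatically on the next request() call.
            conn.close()
            if method.upper() not in IDEMPOTENT_METHODS or attempt >= retries:
                raise
            attempt += 1
```

With a real server this would be called as request_with_retry(http.client.HTTPSConnection("example.org"), "GET", "/"), where the host name is only a placeholder.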
By experimenting I could make these problems occur more frequently by timing when I requested a page in the browser relative to the TimeForRequest setting in the config - e.g. with TimeForRequest set to 10s, clicking on a link about 10s after the last page had finished loading produced the errors most often.
I was also getting LOTS of timeout errors listed in the Hiawatha system.log of the reverse proxy:
144.59.*.*|Sat 24 May 2014 23:45:22 +0000|Timeout while waiting for request
and these errors showed up in the log for my local IP at the point at which I was getting the errors in the browser.
This is my setup:
Hiawatha 9.5 acting as reverse proxy connecting to a couple of Hiawatha 9.5 backends on OpenVZ VMs (although I think the problem has existed since at least 9.3.1, possibly earlier). All public traffic to the reverse proxy is HTTPS, all traffic between reverse proxy and backends is HTTP. Reverse proxy caching is enabled for common static files.
The key question is: why is the client not opening a new connection and requesting the missing images again? I started to look at how the proxy was closing connections, but for HTTPS this appears to be implemented in the PolarSSL code and I quickly got lost!
It is my understanding of RFC 2068 (19.7.1.1) that adding the Keep-Alive response header informs the client how long an idle connection will be kept open, and how many requests will be served over a connection before it is closed by the server. It is better explained here:
http://tools.ietf.org/id/draft-thomson-hybi-http-timeout-01.html
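As a rough illustration of how a client could use such a header, the sketch below parses a value like "timeout=10, max=100" and decides whether an idle connection is still worth reusing. The parameter names timeout and max are the conventional ones and are assumed here; the function names and the one-second safety margin are made up for the example and do not describe any particular browser.

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as 'timeout=10, max=100'
    into a dict of integer parameters. Unknown or malformed
    parameters are ignored."""
    params = {}
    for part in value.split(","):
        name, sep, raw = part.partition("=")
        name = name.strip().lower()
        raw = raw.strip()
        if sep and name in ("timeout", "max") and raw.isdecimal():
            params[name] = int(raw)
    return params


def safe_to_reuse(idle_seconds, keep_alive, margin=1.0):
    """Return True if a connection that has been idle for
    `idle_seconds` is still comfortably inside the timeout the
    server advertised. `margin` (an assumed value) allows for
    network latency between client and server."""
    timeout = keep_alive.get("timeout")
    if timeout is None:
        # No hint from the server: reuse, but be ready to retry.
        return True
    return idle_seconds < timeout - margin
```

With a 10s timeout advertised, a connection that has been idle for 9.5s would be treated as unsafe and a new one opened instead, which avoids exactly the overlap described at the top of this post.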
It appears that common browsers do use this information to ensure that they don't send requests on connections which are about to time out. With these response headers sent by the server the problem has gone, and I haven't seen a single basic timeout error in the reverse proxy system log. I do still get a few of these:
98.21.*.*|Sun 01 Jun 2014 06:05:15 +0100|Timeout while waiting for first request
92.24.*.*|Sun 01 Jun 2014 06:05:16 +0100|Silent client disconnected
but I've assumed that these relate to the fact that most browsers open multiple connections and may on occasion end up not using one of them, so it times out without any request being sent.
Further, according to RFC 2068, the Keep-Alive response header is a "hop-by-hop" header which is not supposed to be forwarded by proxies. Does this mean that it should really be added at the reverse proxy, not the backend? I tried adding it at the reverse proxy and it didn't work, but when the headers are added at the backend they are passed through the reverse proxy and received by the browser.
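For comparison, this is roughly what a proxy that followed the hop-by-hop rule strictly would do to a backend response before passing it on. It is a sketch based on the header list in RFC 2616 section 13.5.1, not a description of Hiawatha's code; the behaviour observed above suggests the Keep-Alive header is not being filtered in this setup. The function name strip_hop_by_hop is made up for the example.

```python
# Hop-by-hop headers listed in RFC 2616 section 13.5.1.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate",
    "proxy-authorization", "te", "trailers",
    "transfer-encoding", "upgrade",
}


def strip_hop_by_hop(headers):
    """Return the (name, value) pairs a strict proxy would forward.

    `headers` is a list of (name, value) tuples from the backend
    response. Headers named in a Connection header are also dropped.
    """
    drop = set(HOP_BY_HOP)
    for name, value in headers:
        if name.lower() == "connection":
            drop.update(
                token.strip().lower()
                for token in value.split(",")
                if token.strip()
            )
    return [(n, v) for n, v in headers if n.lower() not in drop]
```

A proxy behaving like this would have to generate its own Keep-Alive header for the client side of the connection, since the one from the backend would be discarded.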
As I've said before, I'm not an expert on HTTP, so my understanding could be wrong, but it does appear to have fixed the problem and my system.log files are a lot smaller.
Hope this is helpful. If you need any more information, copies of config files etc please let me know.
Regards, David