Patchwork [1,of,3,stable] hgweb: force connection close on early response

login
register
mail settings
Submitter Augie Fackler
Date July 24, 2013, 9:07 p.m.
Message ID <54ff8d5ce7107ac20c38.1374700047@augie-macbookair>
Download mbox | patch
Permalink /patch/1949/
State Accepted
Commit 60e060f4faa9941b65d2ad406d119dd92c5f13e7
Headers show

Comments

Augie Fackler - July 24, 2013, 9:07 p.m.
# HG changeset patch
# User Augie Fackler <raf@durin42.com>
# Date 1368322815 18000
#      Sat May 11 20:40:15 2013 -0500
# Branch stable
# Node ID 54ff8d5ce7107ac20c38cd85527968e2f540806e
# Parent  32e502b26983eaa89574835a024a9b035ad72bf4
hgweb: force connection close on early response

Not all WSGI servers close the socket when an early response is sent
to a large POST request, which can cause the server to interpret the
already-sent request body as an incoming (but hopelessly invalid)
request.
Augie Fackler - July 29, 2013, 6:11 p.m.
On Jul 29, 2013, at 10:41 AM, Kevin Bullock <kbullock+mercurial@ringworld.org> wrote:

>> hgweb: force connection close on early response
>> 
>> Not all WSGI servers close the socket when an early response is sent
>> to a large POST request, which can cause the server to interpret the
>> already-sent request body as an incoming (but hopelessly invalid)
>> request.
> 
> Uhmmm... *ouch*. Would love to hear more details about this.

Sure. Each HTTP request goes through a state machine, roughly like this:

1) Connect
2) Send request headers
3) Send request body
4) Get response headers
5) Get response body

When we decide to respond early to a client, we jump to step 4 while step 3 is still happening. The RFC 2616 is silent on what you're supposed to do in this case, and doesn't even explicitly call it out as a valid case (though every server I've ever used allows it, although some only with non-200 status codes).

So you end up in a state where the client has probably sent some data on the wire, and then saw the response body[0]. This basically poisons the read side of the socket for the server, which will now see some fragment of POST body data as a new request header line, except that (except in *really* unlucky cases) the data lurking on the inbound (to the server) side of the socket won't be valid, and it'll cause the server to be angry and close the socket anyway.

The workaround is to always do "Connection: Close" on any early response, because the keepalive benefits aren't worth the risk. We only have to do this because some web servers aren't smart enough to do this on their own.

Does that help?


[0] This is actually the best case. Worst case is what httplib does, which is merrily writes request bytes until it gets an error from the server, which closed the socket because it's correctly paranoid about having sent an early response. Yay.

Patch

diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -176,6 +176,8 @@ 
                                  '').lower() != '100-continue') or
                     req.env.get('X-HgHttp2', '')):
                     req.drain()
+                else:
+                    req.headers.append(('Connection', 'Close'))
                 req.respond(inst, protocol.HGTYPE,
                             body='0\n%s\n' % inst.message)
                 return ''