Patchwork D2064: wireprotoserver: document and improve the httplib workaround

login
register
mail settings
Submitter phabricator
Date Feb. 6, 2018, 9:04 p.m.
Message ID <differential-rev-PHID-DREV-xvkqvcoyt7m4q4y2wirp-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/27387/
State Superseded
Headers show

Comments

phabricator - Feb. 6, 2018, 9:04 p.m.
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This workaround dates all the way back to https://phab.mercurial-scm.org/rHGa42d27bc809dda4939c5b3807b397bcc811ebbe0 in 2008.
  The code is esoteric enough to warrant an inline explanation.
  So I've added one.
  
  At the time the code was written, the only wire protocol command
  that accepted an HTTP request body was "unbundle." In the years
  since, we've grown the ability to accept command arguments via
  HTTP POST requests. So, the code has been changed to apply the
  httplib workaround to all HTTP POST requests.
  
  While staring at this code, I realized that the HTTP response
  body in case of error is always the same. And, it appears to
  mimic the behavior of a failed call to the "unbundle" command.
  Since we can hit this code path on theoretically any protocol
  request (since self.check_perm accepts custom auth checking
  functions which may raise), I'm having a hard time believing
  that clients react well to an "unbundle" response payload on
  any wire protocol command. I wouldn't be surprised if our test
  coverage for this feature only covers HTTP POST calls to
  "unbundle." In other words, the experimental support for sending
  arguments via HTTP POST request bodies may result in badness on
  the client. Something to investigate another time perhaps...

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2064

AFFECTED FILES
  mercurial/wireprotoserver.py

CHANGE DETAILS




To: indygreg, #hg-reviewers
Cc: mercurial-devel
phabricator - Feb. 7, 2018, 10:19 p.m.
durin42 accepted this revision.
durin42 added a comment.
This revision is now accepted and ready to land.


  I'm so sorry for this technical debt, even if it's httplib's fault. :/

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2064

To: indygreg, #hg-reviewers, durin42
Cc: durin42, mercurial-devel

Patch

diff --git a/mercurial/wireprotoserver.py b/mercurial/wireprotoserver.py
--- a/mercurial/wireprotoserver.py
+++ b/mercurial/wireprotoserver.py
@@ -286,8 +286,9 @@ 
         req.respond(HTTP_OK, HGTYPE, body=rsp)
         return []
     elif isinstance(rsp, wireproto.pusherr):
-        # drain the incoming bundle
+        # This is the httplib workaround documented in _handlehttperror().
         req.drain()
+
         proto.restore()
         rsp = '0\n%s\n' % rsp.res
         req.respond(HTTP_OK, HGTYPE, body=rsp)
@@ -300,16 +301,28 @@ 
 
 def _handlehttperror(e, req, cmd):
     """Called when an ErrorResponse is raised during HTTP request processing."""
-    # A client that sends unbundle without 100-continue will
-    # break if we respond early.
-    if (cmd == 'unbundle' and
+
+    # Clients using Python's httplib are stateful: the HTTP client
+    # won't process an HTTP response until all request data is
+    # sent to the server. The intent of this code is to ensure
+    # we always read HTTP request data from the client, thus
+    # ensuring httplib transitions to a state that allows it to read
+    # the HTTP response. In other words, it helps prevent deadlocks
+    # on clients using httplib.
+
+    if (req.env[r'REQUEST_METHOD'] == r'POST' and
+        # But not if Expect: 100-continue is being used.
         (req.env.get('HTTP_EXPECT',
                      '').lower() != '100-continue') or
+        # Or the non-httplib HTTP library is being advertised by
+        # the client.
         req.env.get('X-HgHttp2', '')):
         req.drain()
     else:
         req.headers.append((r'Connection', r'Close'))
 
+    # TODO This response body assumes the failed command was
+    # "unbundle." That assumption is not always valid.
     req.respond(e, HGTYPE, body='0\n%s\n' % e)
 
     return ''