Patchwork D231: httppeer: add support for httppostargs when we're sending a file

login
register
mail settings
Submitter phabricator
Date Aug. 4, 2017, 9:05 p.m.
Message ID <differential-rev-PHID-DREV-y2ed6335r4rwqr3e4a6x-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/22678/
State Superseded
Headers show

Comments

phabricator - Aug. 4, 2017, 9:05 p.m.
durin42 created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This is probably only used in the 'unbundle' command, but the code
  ended up being cleaner to make it generic and treat *all* httppostargs
  with a non-args request body as though they were file-like in
  nature. It also means we get test coverage more or less for free. A
  previous version of this change didn't use io.BytesIO, and it was a
  lot more complicated.
  
  This also fixes a server-side bug, so anyone using httppostargs should
  update all of their servers to this revision or later *before* this
  gets to their clients, otherwise servers will hang trying to over-read
  the POST body.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D231

AFFECTED FILES
  mercurial/hgweb/protocol.py
  mercurial/httppeer.py

CHANGE DETAILS




To: durin42, #hg-reviewers
Cc: mercurial-devel
phabricator - Aug. 5, 2017, 5:04 a.m.
martinvonz added inline comments.

INLINE COMMENTS

> httppeer.py:95
> +                raise ValueError(
> +                    'fileprepender only supports file objects that '
> +                    'have a length but this one does not:', type(f), f)

I'm guessing 'fileprepender' here and a few placed below is the old name for '_multifile' (so please update)

> httppeer.py:96
> +                    'fileprepender only supports file objects that '
> +                    'have a length but this one does not:', type(f), f)
> +        self._fileobjs = fileobjs

It doesn't matter much since it's just a programming error if it happens, but how will these arguments to ValueError be rendered?

> httppeer.py:105
> +    def read(self, amt=None):
> +        if not amt:
> +            return ''.join(f.read() for f in self._fileobjs)

i read that negative amount means the same thing (read all). do we care to follow that contract here?

> httppeer.py:111
> +            got = len(parts[-1])
> +            if got < amt:
> +                self._index += 1

nit: i think this can be "if got <= amt" to avoid an unnecessary 0-length read the next time _multifile.read() (and it does look like that will be the normal case, that the reader will read exactly the size of the first "file" first)

> httppeer.py:125
> +                'could be fixed if you need it')
> +        for f in self._fileobjs:
> +            f.seek(0)

also need to set self._index=0, no?

> httppeer.py:193
>              strargs = urlreq.urlencode(sorted(args.items()))
>              if strargs:
>                  if not data:

Nit: to reduce indentation, it looks like you can change the condition above to "if postargsok and args" instead (because bool(strargs) == bool(args) AFAICT)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D231

To: durin42, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Aug. 8, 2017, 2:41 p.m.
durin42 added inline comments.

INLINE COMMENTS

> martinvonz wrote in httppeer.py:96
> It doesn't matter much since it's just a programming error if it happens, but how will these arguments to ValueError be rendered?

ValueError: ('_multifile only supports file objects that have a length but this one does not:', <type 'file'>, <open file 'README', mode 'r' at 0x110ff0b70>)

It ain't pretty, but it's got all the information we need.

> martinvonz wrote in httppeer.py:111
> nit: i think this can be "if got <= amt" to avoid an unnecessary 0-length read the next time _multifile.read() (and it does look like that will be the normal case, that the reader will read exactly the size of the first "file" first)

No, because consider these file buffers:

'foo', 'bar'

if we read(2), we'll pull 2 bytes off the first buffer, which is not yet exhausted, but got == amt.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D231

To: durin42, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Aug. 21, 2017, 1:21 p.m.
yuja added inline comments.

INLINE COMMENTS

> httppeer.py:105
> +    def read(self, amt=None):
> +        if amt <= 0:
> +            return ''.join(f.read() for f in self._fileobjs)

Just nits:

- `read(0)` should return an empty string.
- `None <= 0` is TypeError on Python 3.

> httppeer.py:223
> +                argsio.length = len(strargs)
> +                data = _multifile(argsio, data)
> +            headers['X-HgArgs-Post'] = len(strargs)

Does this mean `data` must have `length` attribute if it isn't
a string? A plain file object has no such attribute.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D231

To: durin42, #hg-reviewers, martinvonz
Cc: yuja, martinvonz, mercurial-devel

Patch

diff --git a/mercurial/httppeer.py b/mercurial/httppeer.py
--- a/mercurial/httppeer.py
+++ b/mercurial/httppeer.py
@@ -9,6 +9,7 @@ 
 from __future__ import absolute_import
 
 import errno
+import io
 import os
 import socket
 import struct
@@ -86,6 +87,44 @@ 
 
     resp.__class__ = readerproxy
 
+class _multifile(object):
+    def __init__(self, *fileobjs):
+        for f in fileobjs:
+            if not util.safehasattr(f, 'length'):
+                raise ValueError(
+                    'fileprepender only supports file objects that '
+                    'have a length but this one does not:', type(f), f)
+        self._fileobjs = fileobjs
+        self._index = 0
+
+    @property
+    def length(self):
+        return sum(f.length for f in self._fileobjs)
+
+    def read(self, amt=None):
+        if not amt:
+            return ''.join(f.read() for f in self._fileobjs)
+        parts = []
+        while amt and self._index < len(self._fileobjs):
+            parts.append(self._fileobjs[self._index].read(amt))
+            got = len(parts[-1])
+            if got < amt:
+                self._index += 1
+            amt -= got
+        return ''.join(parts)
+
+    def seek(self, offset, whence=os.SEEK_SET):
+        if whence != os.SEEK_SET:
+            raise NotImplementedError(
+                'fileprepender does not support anything other'
+                ' than os.SEEK_SET for whence on seek()')
+        if offset != 0:
+            raise NotImplementedError(
+                'fileprepender only supports seeking to start, but that '
+                'could be fixed if you need it')
+        for f in self._fileobjs:
+            f.seek(0)
+
 class httppeer(wireproto.wirepeer):
     def __init__(self, ui, path):
         self.path = path
@@ -149,16 +188,19 @@ 
         # with infinite recursion when trying to look up capabilities
         # for the first time.
         postargsok = self.caps is not None and 'httppostargs' in self.caps
-        # TODO: support for httppostargs when data is a file-like
-        # object rather than a basestring
-        canmungedata = not data or isinstance(data, basestring)
-        if postargsok and canmungedata:
+        if postargsok:
             strargs = urlreq.urlencode(sorted(args.items()))
             if strargs:
                 if not data:
                     data = strargs
-                elif isinstance(data, basestring):
-                    data = strargs + data
+                else:
+                    if isinstance(data, basestring):
+                        i = io.BytesIO(data)
+                        i.length = len(data)
+                        data = i
+                    argsio = io.BytesIO(strargs)
+                    argsio.length = len(strargs)
+                    data = _multifile(argsio, data)
                 headers['X-HgArgs-Post'] = len(strargs)
         else:
             if len(args) > 0:
diff --git a/mercurial/hgweb/protocol.py b/mercurial/hgweb/protocol.py
--- a/mercurial/hgweb/protocol.py
+++ b/mercurial/hgweb/protocol.py
@@ -75,6 +75,9 @@ 
         return args
     def getfile(self, fp):
         length = int(self.req.env['CONTENT_LENGTH'])
+        # If httppostargs is used, we need to read Content-Length
+        # minus the amount that was consumed by args.
+        length -= int(self.req.env.get('HTTP_X_HGARGS_POST', 0))
         for s in util.filechunkiter(self.req, limit=length):
             fp.write(s)
     def redirect(self):