Patchwork [remotefilelog-ext,getfile-errors] getfile: add error reporting to getfile method

login
register
mail settings
Submitter Augie Fackler
Date Aug. 4, 2015, 7:35 p.m.
Message ID <77e2a6bed76ff7bf337b.1438716940@arthedain.pit.corp.google.com>
Download mbox | patch
Permalink /patch/10104/
State Accepted
Headers show

Comments

Augie Fackler - Aug. 4, 2015, 7:35 p.m.
# HG changeset patch
# User Augie Fackler <augie@google.com>
# Date 1438714793 14400
#      Tue Aug 04 14:59:53 2015 -0400
# Node ID 77e2a6bed76ff7bf337b40f8e02b723d3a3d328f
# Parent  73bc61b89e758824232c8eddd3ecdf0ae2b2880d
getfile: add error reporting to getfile method

Without this, the only way to report a failure of a file load in a
batched set of getfile requests is to fail the entire batch, which is
potentially painful. Instead, add our own error reporting in-band
which the client can then detect and raise.

I'm not completely happy with the somewhat adhoc error reporting here,
but we expect our server to have at least one additional error ("not
allowed to see file contents") which will require some special
handling on our end, so we need some level of flexibility in the error
reporting protocol so we can extend it later. Sigh.

Open question: should we reserve some range of error codes so that
it's easy for strange custom servers to have related monkeypatches to
client code for custom handling of unforseen-by-remotefilelog
conditions?

I couldn't figure out how to actually get the client to try loading
file contents over http in the test, but the get-with-headers test at
least proves that the server responses look the way I expect.
Durham Goode - Aug. 6, 2015, 10:09 p.m.
On 8/4/15 12:35 PM, Augie Fackler wrote:
> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1438714793 14400
> #      Tue Aug 04 14:59:53 2015 -0400
> # Node ID 77e2a6bed76ff7bf337b40f8e02b723d3a3d328f
> # Parent  73bc61b89e758824232c8eddd3ecdf0ae2b2880d
> getfile: add error reporting to getfile method
>
> Without this, the only way to report a failure of a file load in a
> batched set of getfile requests is to fail the entire batch, which is
> potentially painful. Instead, add our own error reporting in-band
> which the client can then detect and raise.
>
> I'm not completely happy with the somewhat adhoc error reporting here,
> but we expect our server to have at least one additional error ("not
> allowed to see file contents") which will require some special
> handling on our end, so we need some level of flexibility in the error
> reporting protocol so we can extend it later. Sigh.
>
> Open question: should we reserve some range of error codes so that
> it's easy for strange custom servers to have related monkeypatches to
> client code for custom handling of unforseen-by-remotefilelog
> conditions?
>
> I couldn't figure out how to actually get the client to try loading
> file contents over http in the test, but the get-with-headers test at
> least proves that the server responses look the way I expect.
Generally looks good.  I'll add a doc comment documenting the protocol 
and I'll queue it.  Any reason you chose the <stringint>\0<data> 
protocol instead of <binaryint><data>?
Augie Fackler - Aug. 7, 2015, 3:54 a.m.
> On Aug 6, 2015, at 6:09 PM, Durham Goode <durham@fb.com> wrote:
> 
> 
> 
> On 8/4/15 12:35 PM, Augie Fackler wrote:
>> # HG changeset patch
>> # User Augie Fackler <augie@google.com>
>> # Date 1438714793 14400
>> #      Tue Aug 04 14:59:53 2015 -0400
>> # Node ID 77e2a6bed76ff7bf337b40f8e02b723d3a3d328f
>> # Parent  73bc61b89e758824232c8eddd3ecdf0ae2b2880d
>> getfile: add error reporting to getfile method
>> 
>> Without this, the only way to report a failure of a file load in a
>> batched set of getfile requests is to fail the entire batch, which is
>> potentially painful. Instead, add our own error reporting in-band
>> which the client can then detect and raise.
>> 
>> I'm not completely happy with the somewhat adhoc error reporting here,
>> but we expect our server to have at least one additional error ("not
>> allowed to see file contents") which will require some special
>> handling on our end, so we need some level of flexibility in the error
>> reporting protocol so we can extend it later. Sigh.
>> 
>> Open question: should we reserve some range of error codes so that
>> it's easy for strange custom servers to have related monkeypatches to
>> client code for custom handling of unforseen-by-remotefilelog
>> conditions?
>> 
>> I couldn't figure out how to actually get the client to try loading
>> file contents over http in the test, but the get-with-headers test at
>> least proves that the server responses look the way I expect.
> Generally looks good.  I'll add a doc comment documenting the protocol and I'll queue it.  Any reason you chose the <stringint>\0<data> protocol instead of <binaryint><data>?

A mix of laziness, misguided hope for future extensibility, and consistency with the lookup wireproto command. Mostly laziness, I guess.

Patch

diff --git a/remotefilelog/fileserverclient.py b/remotefilelog/fileserverclient.py
--- a/remotefilelog/fileserverclient.py
+++ b/remotefilelog/fileserverclient.py
@@ -43,7 +43,10 @@  def peersetup(ui, peer):
                     'configured remotefile server does not support getfile')
             f = wireproto.future()
             yield {'file': file, 'node': node}, f
-            yield f.value
+            code, data = f.value.split('\0', 1)
+            if int(code):
+                raise error.LookupError(data)
+            yield data
     peer.__class__ = remotefilepeer
 
 class cacheconnection(object):
diff --git a/remotefilelog/remotefilelogserver.py b/remotefilelog/remotefilelogserver.py
--- a/remotefilelog/remotefilelogserver.py
+++ b/remotefilelog/remotefilelogserver.py
@@ -214,14 +214,14 @@  def _loadfileblob(repo, cachepath, path,
 
 def getfile(repo, proto, file, node):
     if shallowrepo.requirement in repo.requirements:
-        raise util.Abort(_('cannot fetch remote files from shallow repo'))
+        return '1\0' + _('cannot fetch remote files from shallow repo')
     cachepath = repo.ui.config("remotefilelog", "servercachepath")
     if not cachepath:
         cachepath = os.path.join(repo.path, "remotefilelogcache")
     node = bin(node.strip())
     if node == nullid:
-        return ''
-    return _loadfileblob(repo, cachepath, file, node)
+        return '0\0'
+    return '0\0' + _loadfileblob(repo, cachepath, file, node)
 
 def getfiles(repo, proto):
     """A server api for requesting particular versions of particular files.
diff --git a/tests/test-http.t b/tests/test-http.t
--- a/tests/test-http.t
+++ b/tests/test-http.t
@@ -10,6 +10,10 @@ 
   $ hg commit -qAm x
   $ hg serve -p $HGPORT -d --pid-file=../hg1.pid -E ../error.log
 
+Build a query string for later use:
+  $ GET=`hg debugdata -m 0 | python -c \
+  > 'import sys ; print [("?cmd=getfile&file=%s&node=%s" % tuple(s.split("\0"))) for s in sys.stdin.read().splitlines()][0]'`
+
   $ cd ..
   $ cat hg1.pid >> $DAEMON_PIDS
 
@@ -32,3 +36,36 @@  as the getfile method it offers doesn't 
   400 no such method: this-command-does-not-exist
   $ get-with-headers.py localhost:$HGPORT '?cmd=getfiles' | head -n 1
   400 no such method: getfiles
+
+Verify serving from a shallow clone doesn't allow for remotefile
+fetches. This also serves to test the error handling for our batchable
+getfile RPC.
+
+  $ cd shallow
+  $ hg serve -p $HGPORT1 -d --pid-file=../hg2.pid -E ../error2.log
+  $ cd ..
+  $ cat hg2.pid >> $DAEMON_PIDS
+
+This GET should work, because this server is serving master, which is
+a full clone.
+
+  $ get-with-headers.py localhost:$HGPORT "$GET"
+  200 Script output follows
+  
+  0\x00U\x00\x00\x00\xff (esc)
+  2\x00x (esc)
+  \x14\x06\xe7A\x18bv\x94&\x84\x17I\x1f\x01\x8aJ\x881R\xf0\x00\x01\x00\x14\xf0\x06\xb2\x92\xc1\xe31\x1f\xd0\xf1:\xe8;@\x9c\xaa\xe4\xa6\xd1\xfb4\x8c\x00 (no-eol) (esc)
+
+This GET should fail using the in-band signalling mechanism, because
+it's not a full clone. Note that it's also plausible for servers to
+refuse to serve file contents for other reasons, like the file
+contents not being visible to the current user.
+
+  $ get-with-headers.py localhost:$HGPORT1 "$GET"
+  200 Script output follows
+  
+  1\x00cannot fetch remote files from shallow repo (no-eol) (esc)
+
+Both error logs should be empty:
+  $ cat error.log
+  $ cat error2.log