Patchwork lfs: add the 'Authorization' property to the Batch API response, if present

login
register
mail settings
Submitter Matt Harbison
Date April 15, 2018, 11:41 p.m.
Message ID <6d8c47590030244033d5.1523835692@Envy>
Download mbox | patch
Permalink /patch/31086/
State Accepted
Headers show

Comments

Matt Harbison - April 15, 2018, 11:41 p.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1523027627 14400
#      Fri Apr 06 11:13:47 2018 -0400
# Node ID 6d8c47590030244033d51c2d0b390d2ee6337dea
# Parent  acd5a25c179269df689b8799aa7cbc52d5451251
lfs: add the 'Authorization' property to the Batch API response, if present

The client copies all of these properties under 'header' to the HTTP Headers of
the subsequent GET or PUT request that it performs.  That allows the Basic HTTP
authentication used to authorize the Batch API request to also authorize the
upload/download action.

There's likely further work to do here.  There's an 'authenticated' boolean key
in the Batch API response that can be set, and there is an 'LFS-Authenticate'
header that is used instead of 'WWW-Authenticate'[1].  (We likely need to
support both, since some hosting solutions are likely to only respond with the
latter.)  In any event, this works with SCM Manager, so there is real world
benefit.

I'm limiting the headers returned to 'Basic', because that's all the lfs spec
calls out.  In practice, I've seen gitbucket emit custom header content[2].

[1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md#response-errors
[2] https://github.com/gitbucket/gitbucket/blob/35655f33c7713f08515ed640ece0948acd6d6168/src/main/scala/gitbucket/core/servlet/GitRepositoryServlet.scala#L119
Yuya Nishihara - April 16, 2018, 11:58 a.m.
On Sun, 15 Apr 2018 19:41:32 -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1523027627 14400
> #      Fri Apr 06 11:13:47 2018 -0400
> # Node ID 6d8c47590030244033d51c2d0b390d2ee6337dea
> # Parent  acd5a25c179269df689b8799aa7cbc52d5451251
> lfs: add the 'Authorization' property to the Batch API response, if present
> 
> The client copies all of these properties under 'header' to the HTTP Headers of
> the subsequent GET or PUT request that it performs.  That allows the Basic HTTP
> authentication used to authorize the Batch API request to also authorize the
> upload/download action.

I'm not pretty sure, but I think it's up to the client to resend an
Authorization header depending on the realm provided by the server. Doesn't
the server request authentication for batch requests?
Matt Harbison - April 16, 2018, 12:22 p.m.
> On Apr 16, 2018, at 7:58 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Sun, 15 Apr 2018 19:41:32 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1523027627 14400
>> #      Fri Apr 06 11:13:47 2018 -0400
>> # Node ID 6d8c47590030244033d51c2d0b390d2ee6337dea
>> # Parent  acd5a25c179269df689b8799aa7cbc52d5451251
>> lfs: add the 'Authorization' property to the Batch API response, if present
>> 
>> The client copies all of these properties under 'header' to the HTTP Headers of
>> the subsequent GET or PUT request that it performs.  That allows the Basic HTTP
>> authentication used to authorize the Batch API request to also authorize the
>> upload/download action.
> 
> I'm not pretty sure, but I think it's up to the client to resend an
> Authorization header depending on the realm provided by the server. Doesn't
> the server request authentication for batch requests?

It does request authentication for batch requests, but doesn’t for the transfer, which surprised me.  Somewhere I think I read that the authentication request is also tied to the URI, which would explain why the client isn’t resending on its own.

I wireshark traced git-lfs to a couple of different servers, and this seemed to be what it was doing.  That gitbucket footnote shows it rolling its own authorization token that it expects on transfer, so I thought this was by design.
Yuya Nishihara - April 17, 2018, 11:40 a.m.
On Mon, 16 Apr 2018 08:22:13 -0400, Matt Harbison wrote:
> 
> > On Apr 16, 2018, at 7:58 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> > 
> >> On Sun, 15 Apr 2018 19:41:32 -0400, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1523027627 14400
> >> #      Fri Apr 06 11:13:47 2018 -0400
> >> # Node ID 6d8c47590030244033d51c2d0b390d2ee6337dea
> >> # Parent  acd5a25c179269df689b8799aa7cbc52d5451251
> >> lfs: add the 'Authorization' property to the Batch API response, if present
> >> 
> >> The client copies all of these properties under 'header' to the HTTP Headers of
> >> the subsequent GET or PUT request that it performs.  That allows the Basic HTTP
> >> authentication used to authorize the Batch API request to also authorize the
> >> upload/download action.
> > 
> > I'm not pretty sure, but I think it's up to the client to resend an
> > Authorization header depending on the realm provided by the server. Doesn't
> > the server request authentication for batch requests?
> 
> It does request authentication for batch requests, but doesn’t for the transfer, which surprised me.  Somewhere I think I read that the authentication request is also tied to the URI, which would explain why the client isn’t resending on its own.

Queued, but can you investigate further why the server doesn't send 401
response?

> I wireshark traced git-lfs to a couple of different servers, and this seemed to be what it was doing.  That gitbucket footnote shows it rolling its own authorization token that it expects on transfer, so I thought this was by design.

Sending new token might make some sense, but echoing back the original
Authorization header seems a bit weird.
Matt Harbison - April 17, 2018, 11:57 a.m.
> On Apr 17, 2018, at 7:40 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Mon, 16 Apr 2018 08:22:13 -0400, Matt Harbison wrote:
>> 
>>>> On Apr 16, 2018, at 7:58 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>>>> 
>>>> On Sun, 15 Apr 2018 19:41:32 -0400, Matt Harbison wrote:
>>>> # HG changeset patch
>>>> # User Matt Harbison <matt_harbison@yahoo.com>
>>>> # Date 1523027627 14400
>>>> #      Fri Apr 06 11:13:47 2018 -0400
>>>> # Node ID 6d8c47590030244033d51c2d0b390d2ee6337dea
>>>> # Parent  acd5a25c179269df689b8799aa7cbc52d5451251
>>>> lfs: add the 'Authorization' property to the Batch API response, if present
>>>> 
>>>> The client copies all of these properties under 'header' to the HTTP Headers of
>>>> the subsequent GET or PUT request that it performs.  That allows the Basic HTTP
>>>> authentication used to authorize the Batch API request to also authorize the
>>>> upload/download action.
>>> 
>>> I'm not pretty sure, but I think it's up to the client to resend an
>>> Authorization header depending on the realm provided by the server. Doesn't
>>> the server request authentication for batch requests?
>> 
>> It does request authentication for batch requests, but doesn’t for the transfer, which surprised me.  Somewhere I think I read that the authentication request is also tied to the URI, which would explain why the client isn’t resending on its own.
> 
> Queued, but can you investigate further why the server doesn't send 401
> response?

Will do.  There’s more to look at in this area in general, and the lfs spec in this area is a bit vague, at least to me.

>> I wireshark traced git-lfs to a couple of different servers, and this seemed to be what it was doing.  That gitbucket footnote shows it rolling its own authorization token that it expects on transfer, so I thought this was by design.
> 
> Sending new token might make some sense, but echoing back the original
> Authorization header seems a bit weird.

Patch

diff --git a/hgext/lfs/wireprotolfsserver.py b/hgext/lfs/wireprotolfsserver.py
--- a/hgext/lfs/wireprotolfsserver.py
+++ b/hgext/lfs/wireprotolfsserver.py
@@ -238,18 +238,27 @@  def _batchresponseobjects(req, objects, 
 
         expiresat = datetime.datetime.now() + datetime.timedelta(minutes=10)
 
+        def _buildheader():
+            # The spec doesn't mention the Accept header here, but avoid
+            # a gratuitous deviation from lfs-test-server in the test
+            # output.
+            hdr = {
+                'Accept': 'application/vnd.git-lfs'
+            }
+
+            auth = req.headers.get('Authorization', '')
+            if auth.startswith('Basic '):
+                hdr['Authorization'] = auth
+
+            return hdr
+
         rsp['actions'] = {
             '%s' % action: {
                 'href': '%s%s/.hg/lfs/objects/%s'
                     % (req.baseurl, req.apppath, oid),
                 # datetime.isoformat() doesn't include the 'Z' suffix
                 "expires_at": expiresat.strftime('%Y-%m-%dT%H:%M:%SZ'),
-                'header': {
-                    # The spec doesn't mention the Accept header here, but avoid
-                    # a gratuitous deviation from lfs-test-server in the test
-                    # output.
-                    'Accept': 'application/vnd.git-lfs'
-                }
+                'header': _buildheader(),
             }
         }
 
diff --git a/tests/test-lfs-serve-access.t b/tests/test-lfs-serve-access.t
--- a/tests/test-lfs-serve-access.t
+++ b/tests/test-lfs-serve-access.t
@@ -331,3 +331,105 @@  Test a checksum failure during the proce
   $LOCALIP - - [$ERRDATE$] HG error:      hint=_('run hg verify')) (glob)
   $LOCALIP - - [$ERRDATE$] HG error:  LfsCorruptionError: detected corrupt lfs object: 276f73cfd75f9fb519810df5f5d96d6594ca2521abd86cbcd92122f7d51a1f3d (glob)
   $LOCALIP - - [$ERRDATE$] HG error:   (glob)
+
+Basic Authorization headers are returned by the Batch API, and sent back with
+the GET/PUT request.
+
+  $ rm -f $TESTTMP/access.log $TESTTMP/errors.log
+
+  $ cat >> $HGRCPATH << EOF
+  > [experimental]
+  > lfs.disableusercache = True
+  > [auth]
+  > l.schemes=http
+  > l.prefix=lo
+  > l.username=user
+  > l.password=pass
+  > EOF
+
+  $ cat << EOF > userpass.py
+  > import base64
+  > from mercurial.hgweb import common
+  > def perform_authentication(hgweb, req, op):
+  >     auth = req.headers.get(b'Authorization')
+  >     if not auth:
+  >         raise common.ErrorResponse(common.HTTP_UNAUTHORIZED, b'who',
+  >                 [(b'WWW-Authenticate', b'Basic Realm="mercurial"')])
+  >     if base64.b64decode(auth.split()[1]).split(b':', 1) != [b'user',
+  >                                                             b'pass']:
+  >         raise common.ErrorResponse(common.HTTP_FORBIDDEN, b'no')
+  > def extsetup():
+  >     common.permhooks.insert(0, perform_authentication)
+  > EOF
+
+  $ hg --config extensions.x=$TESTTMP/userpass.py \
+  >    -R server serve -d -p $HGPORT1 --pid-file=hg.pid \
+  >    -A $TESTTMP/access.log -E $TESTTMP/errors.log
+  $ mv hg.pid $DAEMON_PIDS
+
+  $ hg clone --debug http://localhost:$HGPORT1 auth_clone | egrep '^[{}]|  '
+  {
+    "objects": [
+      {
+        "actions": {
+          "download": {
+            "expires_at": "$ISO_8601_DATE_TIME$"
+            "header": {
+              "Accept": "application/vnd.git-lfs"
+              "Authorization": "Basic dXNlcjpwYXNz"
+            }
+            "href": "http://localhost:$HGPORT1/.hg/lfs/objects/276f73cfd75f9fb519810df5f5d96d6594ca2521abd86cbcd92122f7d51a1f3d"
+          }
+        }
+        "oid": "276f73cfd75f9fb519810df5f5d96d6594ca2521abd86cbcd92122f7d51a1f3d"
+        "size": 14
+      }
+    ]
+    "transfer": "basic"
+  }
+
+  $ echo 'another blob' > auth_clone/lfs.blob
+  $ hg -R auth_clone ci -Aqm 'add blob'
+  $ hg -R auth_clone --debug push | egrep '^[{}]|  '
+  {
+    "objects": [
+      {
+        "actions": {
+          "upload": {
+            "expires_at": "$ISO_8601_DATE_TIME$"
+            "header": {
+              "Accept": "application/vnd.git-lfs"
+              "Authorization": "Basic dXNlcjpwYXNz"
+            }
+            "href": "http://localhost:$HGPORT1/.hg/lfs/objects/df14287d8d75f076a6459e7a3703ca583ca9fb3f4918caed10c77ac8622d49b3"
+          }
+        }
+        "oid": "df14287d8d75f076a6459e7a3703ca583ca9fb3f4918caed10c77ac8622d49b3"
+        "size": 13
+      }
+    ]
+    "transfer": "basic"
+  }
+
+  $ $PYTHON $RUNTESTDIR/killdaemons.py $DAEMON_PIDS
+
+  $ cat $TESTTMP/access.log $TESTTMP/errors.log
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=capabilities HTTP/1.1" 401 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=capabilities HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=getbundle HTTP/1.1" 200 - x-hgarg-1:bookmarks=1&bundlecaps=HG20%2Cbundle2%3DHG20%250Abookmarks%250Achangegroup%253D01%252C02%252C03%250Adigests%253Dmd5%252Csha1%252Csha512%250Aerror%253Dabort%252Cunsupportedcontent%252Cpushraced%252Cpushkey%250Ahgtagsfnodes%250Alistkeys%250Aphases%253Dheads%250Apushkey%250Aremote-changegroup%253Dhttp%252Chttps%250Arev-branch-cache%250Astream%253Dv2&cg=1&common=0000000000000000000000000000000000000000&heads=506bf3d83f78c54b89e81c6411adee19fdf02156+525251863cad618e55d483555f3d00a2ca99597e&listkeys=bookmarks&phases=1 x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "POST /.git/info/lfs/objects/batch HTTP/1.1" 401 - (glob)
+  $LOCALIP - - [$LOGDATE$] "POST /.git/info/lfs/objects/batch HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /.hg/lfs/objects/276f73cfd75f9fb519810df5f5d96d6594ca2521abd86cbcd92122f7d51a1f3d HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=capabilities HTTP/1.1" 401 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=capabilities HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D525251863cad618e55d483555f3d00a2ca99597e+4d9397055dc0c205f3132f331f36353ab1a525a3 x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=branchmap HTTP/1.1" 200 - x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "POST /.git/info/lfs/objects/batch HTTP/1.1" 401 - (glob)
+  $LOCALIP - - [$LOGDATE$] "POST /.git/info/lfs/objects/batch HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [$LOGDATE$] "PUT /.hg/lfs/objects/df14287d8d75f076a6459e7a3703ca583ca9fb3f4918caed10c77ac8622d49b3 HTTP/1.1" 201 - (glob)
+  $LOCALIP - - [$LOGDATE$] "POST /?cmd=unbundle HTTP/1.1" 200 - x-hgarg-1:heads=666f726365 x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)
+  $LOCALIP - - [$LOGDATE$] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (glob)