Patchwork [V2] lfs: add an experimental config to masquerade as git for the blob transfer

login
register
mail settings
Submitter Matt Harbison
Date Dec. 13, 2017, 11:14 p.m.
Message ID <a8d7f9c5012763e87460.1513206841@MATT7H-PC.attotech.com>
Download mbox | patch
Permalink /patch/26273/
State Rejected, archived
Headers show

Comments

Matt Harbison - Dec. 13, 2017, 11:14 p.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1513109762 18000
#      Tue Dec 12 15:16:02 2017 -0500
# Node ID a8d7f9c5012763e87460f90a8f06fee9d043ad79
# Parent  4937db58b663faa6893c51a41cec28114a165dd0
lfs: add an experimental config to masquerade as git for the blob transfer

As we were trying to transition off of the non production lfs-test-server for
further experimenting, one of the problems we ran into was interoperability.  A
coworker setup gitbucket[1] to act as the blob server, tested with git, and
passed it off to me.  But push failed with a message saying "abort: LFS server
returns invalid JSON:", and then proceeded to dump a huge HTML page to the
screen.  It turns out that it is assuming that git is the only thing that wants
to do a blob transfer, and everything else is a web browser wanting HTML.

It's only a single data point, but I suspect other things may be doing this too.
I didn't turn it on by default, because hopefully these things can be reported
and fixed.  I made it experimental because I suspect we will need it for longer
than the extension is experimental, but hopefully it can go away.  There's
nothing significant about the version of git chosen, other than it is the
current version.

RFC7231 says that the order of product identifiers are listed in decreasing
order of significance by convention. [2]  It works with gitbucket in any order,
but I figured listing git first would help in cases where the server
implementation is using strncmp() instead of strstr().

[1] https://github.com/gitbucket/gitbucket/
[2] https://tools.ietf.org/html/rfc7231#page-46

Patch

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -28,6 +28,11 @@  Configs::
     # the local directory to store lfs files for sharing across local clones.
     # If not set, the cache is located in an OS specific cache location.
     usercache = /path/to/global/cache
+
+    [experimental]
+    # Some servers will check the User-Agent header, and send back html if it
+    # doesn't look like a git client.  Set this to True to masquerade as git.
+    lfs.git-user-agent = True
 """
 
 from __future__ import absolute_import
@@ -63,6 +68,10 @@  testedwith = 'ships-with-hg-core'
 configtable = {}
 configitem = registrar.configitem(configtable)
 
+configitem('experimental', 'lfs.git-user-agent',
+    default=False,
+)
+
 configitem('lfs', 'url',
     default=configitem.dynamicdefault,
 )
diff --git a/hgext/lfs/blobstore.py b/hgext/lfs/blobstore.py
--- a/hgext/lfs/blobstore.py
+++ b/hgext/lfs/blobstore.py
@@ -101,7 +101,10 @@  class _gitlfsremote(object):
         self.ui = ui
         baseurl, authinfo = url.authinfo()
         self.baseurl = baseurl.rstrip('/')
-        self.urlopener = urlmod.opener(ui, authinfo)
+        useragent = None
+        if repo.ui.configbool('experimental', 'lfs.git-user-agent'):
+            useragent = 'git/2.15.1 mercurial/%s' % util.version()
+        self.urlopener = urlmod.opener(ui, authinfo, useragent)
         self.retry = ui.configint('lfs', 'retry')
 
     def writebatch(self, pointers, fromstore):
diff --git a/mercurial/url.py b/mercurial/url.py
--- a/mercurial/url.py
+++ b/mercurial/url.py
@@ -466,7 +466,7 @@  class cookiehandler(urlreq.basehandler):
 
 handlerfuncs = []
 
-def opener(ui, authinfo=None):
+def opener(ui, authinfo=None, useragent=None):
     '''
     construct an opener suitable for urllib2
     authinfo will be added to the password manager
@@ -512,8 +512,14 @@  def opener(ui, authinfo=None):
     # own distribution name. Since servers should not be using the user
     # agent string for anything, clients should be able to define whatever
     # user agent they deem appropriate.
-    agent = 'mercurial/proto-1.0 (Mercurial %s)' % util.version()
-    opener.addheaders = [(r'User-agent', pycompat.sysstr(agent))]
+    #
+    # The custom user agent is for lfs, because unfortunately some servers
+    # do look at this value.
+    if not useragent:
+        agent = 'mercurial/proto-1.0 (Mercurial %s)' % util.version()
+        opener.addheaders = [(r'User-agent', pycompat.sysstr(agent))]
+    else:
+        opener.addheaders = [(r'User-agent', pycompat.sysstr(useragent))]
 
     # This header should only be needed by wire protocol requests. But it has
     # been sent on all requests since forever. We keep sending it for backwards