Patchwork [5,of,6] lfs: infer the blob store URL from an explicit push source or default-push

login
register
mail settings
Submitter Matt Harbison
Date April 9, 2018, 4:26 a.m.
Message ID <b8c871c097d3153d1eb7.1523248005@Envy>
Download mbox | patch
Permalink /patch/30585/
State Accepted
Headers show

Comments

Matt Harbison - April 9, 2018, 4:26 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1523211732 14400
#      Sun Apr 08 14:22:12 2018 -0400
# Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
# Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
lfs: infer the blob store URL from an explicit push source or default-push

Same idea as pull, but the push command needs to hold onto the '_subtopath'
field slightly longer.  Since this code has already resolved an explicit path or
'default-push' or 'default', it seems reasonable to make this simple tweak to
avoid recalculating that.
Yuya Nishihara - April 9, 2018, 1:43 p.m.
On Mon, 09 Apr 2018 00:26:45 -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1523211732 14400
> #      Sun Apr 08 14:22:12 2018 -0400
> # Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
> # Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
> lfs: infer the blob store URL from an explicit push source or default-push
> 
> Same idea as pull, but the push command needs to hold onto the '_subtopath'
> field slightly longer.  Since this code has already resolved an explicit path or
> 'default-push' or 'default', it seems reasonable to make this simple tweak to
> avoid recalculating that.

> +        opargs = dict(opts.get('opargs', {})) # copy opargs since it may mutate
> +        opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
> +
> +        pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
> +                               newbranch=opts.get('new_branch'),
> +                               bookmarks=opts.get('bookmark', ()),
> +                               opargs=opargs)
> +        # _subtopath stays for this repo push to assist LFS server discovery
>      finally:
>          del repo._subtoppath
>  
> -    opargs = dict(opts.get('opargs', {})) # copy opargs since we may mutate it
> -    opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
> -
> -    pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
> -                           newbranch=opts.get('new_branch'),
> -                           bookmarks=opts.get('bookmark', ()),
> -                           opargs=opargs)

Can't we wrap e.g. exchange.push/pull() to temporarily replace
repo.lfsremoteblobstore based on the remote repository?

I think the root problem is that we can't decide the remoteblobstore at
reposetup().
Matt Harbison - April 9, 2018, 3:19 p.m.
> On Apr 9, 2018, at 9:43 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Mon, 09 Apr 2018 00:26:45 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1523211732 14400
>> #      Sun Apr 08 14:22:12 2018 -0400
>> # Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
>> # Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
>> lfs: infer the blob store URL from an explicit push source or default-push
>> 
>> Same idea as pull, but the push command needs to hold onto the '_subtopath'
>> field slightly longer.  Since this code has already resolved an explicit path or
>> 'default-push' or 'default', it seems reasonable to make this simple tweak to
>> avoid recalculating that.
> 
>> +        opargs = dict(opts.get('opargs', {})) # copy opargs since it may mutate
>> +        opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
>> +
>> +        pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
>> +                               newbranch=opts.get('new_branch'),
>> +                               bookmarks=opts.get('bookmark', ()),
>> +                               opargs=opargs)
>> +        # _subtopath stays for this repo push to assist LFS server discovery
>>     finally:
>>         del repo._subtoppath
>> 
>> -    opargs = dict(opts.get('opargs', {})) # copy opargs since we may mutate it
>> -    opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
>> -
>> -    pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
>> -                           newbranch=opts.get('new_branch'),
>> -                           bookmarks=opts.get('bookmark', ()),
>> -                           opargs=opargs)
> 
> Can't we wrap e.g. exchange.push/pull() to temporarily replace
> repo.lfsremoteblobstore based on the remote repository?

I’ll take another look at it.  That was where I started, but it seemed confusing to pass in a path to blobstore.remote() (but only sometimes), and then use lfs.url or paths.default.  It also would require changing the parameters to uploadblobsfromrevs(), which is flagged as called by other (FB?) extensions. (Probably a path=None would work here.)

> I think the root problem is that we can't decide the remoteblobstore at
> reposetup().

Exactly.  And the path info isn’t conveniently available in other places.
Yuya Nishihara - April 9, 2018, 3:42 p.m.
On Mon, 9 Apr 2018 11:19:24 -0400, Matt Harbison wrote:
> 
> > On Apr 9, 2018, at 9:43 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> > 
> >> On Mon, 09 Apr 2018 00:26:45 -0400, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1523211732 14400
> >> #      Sun Apr 08 14:22:12 2018 -0400
> >> # Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
> >> # Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
> >> lfs: infer the blob store URL from an explicit push source or default-push
> >> 
> >> Same idea as pull, but the push command needs to hold onto the '_subtopath'
> >> field slightly longer.  Since this code has already resolved an explicit path or
> >> 'default-push' or 'default', it seems reasonable to make this simple tweak to
> >> avoid recalculating that.
> > 
> >> +        opargs = dict(opts.get('opargs', {})) # copy opargs since it may mutate
> >> +        opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
> >> +
> >> +        pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
> >> +                               newbranch=opts.get('new_branch'),
> >> +                               bookmarks=opts.get('bookmark', ()),
> >> +                               opargs=opargs)
> >> +        # _subtopath stays for this repo push to assist LFS server discovery
> >>     finally:
> >>         del repo._subtoppath
> >> 
> >> -    opargs = dict(opts.get('opargs', {})) # copy opargs since we may mutate it
> >> -    opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
> >> -
> >> -    pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
> >> -                           newbranch=opts.get('new_branch'),
> >> -                           bookmarks=opts.get('bookmark', ()),
> >> -                           opargs=opargs)
> > 
> > Can't we wrap e.g. exchange.push/pull() to temporarily replace
> > repo.lfsremoteblobstore based on the remote repository?
> 
> I’ll take another look at it.  That was where I started, but it seemed confusing to pass in a path to blobstore.remote() (but only sometimes), and then use lfs.url or paths.default.  It also would require changing the parameters to uploadblobsfromrevs(), which is flagged as called by other (FB?) extensions. (Probably a path=None would work here.)

FWIW, thinking about paths.<name>:lfsurl, we might have to add a hook point to
commands.push/pull() such that a resolved "path" object can be intercepted.
Matt Harbison - April 9, 2018, 9:43 p.m.
> On Apr 9, 2018, at 9:43 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Mon, 09 Apr 2018 00:26:45 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1523211732 14400
>> #      Sun Apr 08 14:22:12 2018 -0400
>> # Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
>> # Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
>> lfs: infer the blob store URL from an explicit push source or default-push
>> 
>> Same idea as pull, but the push command needs to hold onto the '_subtopath'
>> field slightly longer.  Since this code has already resolved an explicit path or
>> 'default-push' or 'default', it seems reasonable to make this simple tweak to
>> avoid recalculating that.
> 
>> +        opargs = dict(opts.get('opargs', {})) # copy opargs since it may mutate
>> +        opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
>> +
>> +        pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
>> +                               newbranch=opts.get('new_branch'),
>> +                               bookmarks=opts.get('bookmark', ()),
>> +                               opargs=opargs)
>> +        # _subtopath stays for this repo push to assist LFS server discovery
>>     finally:
>>         del repo._subtoppath
>> 
>> -    opargs = dict(opts.get('opargs', {})) # copy opargs since we may mutate it
>> -    opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
>> -
>> -    pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
>> -                           newbranch=opts.get('new_branch'),
>> -                           bookmarks=opts.get('bookmark', ()),
>> -                           opargs=opargs)
> 
> Can't we wrap e.g. exchange.push/pull() to temporarily replace
> repo.lfsremoteblobstore based on the remote repository?

That works for exchange.push(), where the files are sent in a prepush hook.  That doesn’t work for pull -u, where the update occurs after the exchange.  Commands.postincoming() doesn’t have path info (that’s where repo._subtoppath comes in), and we can’t wrap commands.pull(), because that won’t help subrepos.

What about a hybrid approach of wrapping push(), but still looking at _subtoppath to handle pull?  That way, the core doesn’t have to change.

> I think the root problem is that we can't decide the remoteblobstore at
> reposetup().
Yuya Nishihara - April 10, 2018, 1:34 p.m.
On Mon, 9 Apr 2018 17:43:47 -0400, Matt Harbison wrote:
> 
> > On Apr 9, 2018, at 9:43 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> > 
> >> On Mon, 09 Apr 2018 00:26:45 -0400, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1523211732 14400
> >> #      Sun Apr 08 14:22:12 2018 -0400
> >> # Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
> >> # Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
> >> lfs: infer the blob store URL from an explicit push source or default-push
> >> 
> >> Same idea as pull, but the push command needs to hold onto the '_subtopath'
> >> field slightly longer.  Since this code has already resolved an explicit path or
> >> 'default-push' or 'default', it seems reasonable to make this simple tweak to
> >> avoid recalculating that.
> > 
> >> +        opargs = dict(opts.get('opargs', {})) # copy opargs since it may mutate
> >> +        opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
> >> +
> >> +        pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
> >> +                               newbranch=opts.get('new_branch'),
> >> +                               bookmarks=opts.get('bookmark', ()),
> >> +                               opargs=opargs)
> >> +        # _subtopath stays for this repo push to assist LFS server discovery
> >>     finally:
> >>         del repo._subtoppath
> >> 
> >> -    opargs = dict(opts.get('opargs', {})) # copy opargs since we may mutate it
> >> -    opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
> >> -
> >> -    pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
> >> -                           newbranch=opts.get('new_branch'),
> >> -                           bookmarks=opts.get('bookmark', ()),
> >> -                           opargs=opargs)
> > 
> > Can't we wrap e.g. exchange.push/pull() to temporarily replace
> > repo.lfsremoteblobstore based on the remote repository?
> 
> That works for exchange.push(), where the files are sent in a prepush hook.  That doesn’t work for pull -u, where the update occurs after the exchange.  Commands.postincoming() doesn’t have path info (that’s where repo._subtoppath comes in), and we can’t wrap commands.pull(), because that won’t help subrepos.
> 
> What about a hybrid approach of wrapping push(), but still looking at _subtoppath to handle pull?  That way, the core doesn’t have to change.

Ok, abusing _subtoppath would be no worse than the current state since the
situation is quite similar to subrepo pull. So maybe it's okay for now.

Can you try to fix the mess after the 4.6 release? For "pull -u" of
subrepos/lfs, we'll probably need

 a) an API to explicitly prefetch changes (recursively?) from the specified
    peer repository
 b) or, a mechanism to carry peer path/repository while updating (something
    like pull/pushop)
 c) or, a formal API to temporarily bind peer path/repository to the local
    repo (i.e. formalize _subtoppath in some way)
Matt Harbison - April 10, 2018, 3:24 p.m.
> On Apr 10, 2018, at 9:34 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Mon, 9 Apr 2018 17:43:47 -0400, Matt Harbison wrote:
>> 
>>>> On Apr 9, 2018, at 9:43 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>>>> 
>>>> On Mon, 09 Apr 2018 00:26:45 -0400, Matt Harbison wrote:
>>>> # HG changeset patch
>>>> # User Matt Harbison <matt_harbison@yahoo.com>
>>>> # Date 1523211732 14400
>>>> #      Sun Apr 08 14:22:12 2018 -0400
>>>> # Node ID b8c871c097d3153d1eb71e0072cb7edf3356360c
>>>> # Parent  2aecf5b7dfdaeb12a6f6ac151d40c3b60f789abc
>>>> lfs: infer the blob store URL from an explicit push source or default-push
>>>> 
>> That works for exchange.push(), where the files are sent in a prepush hook.  That doesn’t work for pull -u, where the update occurs after the exchange.  Commands.postincoming() doesn’t have path info (that’s where repo._subtoppath comes in), and we can’t wrap commands.pull(), because that won’t help subrepos.
>> 
>> What about a hybrid approach of wrapping push(), but still looking at _subtoppath to handle pull?  That way, the core doesn’t have to change.
> 
> Ok, abusing _subtoppath would be no worse than the current state since the
> situation is quite similar to subrepo pull. So maybe it's okay for now.
> 
> Can you try to fix the mess after the 4.6 release? For "pull -u" of
> subrepos/lfs, we'll probably need
> 
> a) an API to explicitly prefetch changes (recursively?) from the specified
>    peer repository

Probably this one, for pull anyway. The current state of things is if you `hg pull -u <path>`, it does a parent checkout, and then tries to pull the subrepo.  If that fails, you’re left in the intermediate state that’s hard to recover from, especially if you had uncommitted changes.

It’s been on my list for awhile, but I’ll try to give it priority.

> b) or, a mechanism to carry peer path/repository while updating (something
>    like pull/pushop)
> c) or, a formal API to temporarily bind peer path/repository to the local
>    repo (i.e. formalize _subtoppath in some way)

Patch

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -87,10 +87,10 @@  Configs::
     #   git-lfs endpoint
     # - file:///tmp/path
     #   local filesystem, usually for testing
-    # if unset, lfs will assume the repository at ``paths.default`` also handles
-    # blob storage for http(s) and ssh URLs.  Https is substituted for ssh, but
-    # the URL is otherwise unchanged.  If ``paths.default`` points to a file or
-    # is unset, lfs will prompt to set this when it must use this value.
+    # if unset, lfs will assume the remote repository also handles blob storage
+    # for http(s) and ssh URLs.  Https is substituted for ssh, but the URL is
+    # otherwise unchanged.  If ``paths.default`` points to a local path or is
+    # unset, lfs will prompt to set this when it must use this value.
     # (default: unset)
     url = https://example.com/repo.git/info/lfs
 
diff --git a/hgext/lfs/blobstore.py b/hgext/lfs/blobstore.py
--- a/hgext/lfs/blobstore.py
+++ b/hgext/lfs/blobstore.py
@@ -540,8 +540,8 @@  def remote(repo):
     if url.scheme is None:
         # The pull command sets this during the optional update phase, which
         # tells exactly where the pull originated, whether 'paths.default' or
-        # explicit.  Sadly, the push command only sets this for the duration of
-        # any subrepo push, and clears it before pushing the main repo.
+        # explicit.  Likewise, the push command sets this to the appropriate
+        # path, possibly including 'paths.default-push'.
         if util.safehasattr(repo, '_subtoppath'):
             defaulturl = util.url(repo._subtoppath)
         else:
diff --git a/hgext/lfs/wrapper.py b/hgext/lfs/wrapper.py
--- a/hgext/lfs/wrapper.py
+++ b/hgext/lfs/wrapper.py
@@ -367,7 +367,9 @@  def uploadblobs(repo, pointers):
     if not pointers:
         return
 
-    remoteblob = repo.svfs.lfsremoteblobstore
+    # By building a new remote store here, we can account for any explicit path
+    # given on the command line, or subrepo direct by its parent repo.
+    remoteblob = blobstore.remote(repo)
     remoteblob.writebatch(pointers, repo.svfs.lfslocalblobstore)
 
 def upgradefinishdatamigration(orig, ui, srcrepo, dstrepo, requirements):
diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -4169,17 +4169,18 @@  def push(ui, repo, dest=None, **opts):
             result = c.sub(s).push(opts)
             if result == 0:
                 return not result
+
+        opargs = dict(opts.get('opargs', {})) # copy opargs since it may mutate
+        opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
+
+        pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
+                               newbranch=opts.get('new_branch'),
+                               bookmarks=opts.get('bookmark', ()),
+                               opargs=opargs)
+        # _subtopath stays for this repo push to assist LFS server discovery
     finally:
         del repo._subtoppath
 
-    opargs = dict(opts.get('opargs', {})) # copy opargs since we may mutate it
-    opargs.setdefault('pushvars', []).extend(opts.get('pushvars', []))
-
-    pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
-                           newbranch=opts.get('new_branch'),
-                           bookmarks=opts.get('bookmark', ()),
-                           opargs=opargs)
-
     result = not pushop.cgresult
 
     if pushop.bkresult is not None:
diff --git a/tests/test-lfs-serve.t b/tests/test-lfs-serve.t
--- a/tests/test-lfs-serve.t
+++ b/tests/test-lfs-serve.t
@@ -241,7 +241,23 @@  lfs content, and the extension enabled.
   $ echo 'this is another lfs file' > lfs2.txt
   $ hg ci -Aqm 'lfs file with lfs client'
 
-  $ hg push -q
+  $ hg --config paths.default= push -v http://localhost:$HGPORT
+  pushing to http://localhost:$HGPORT/
+  searching for changes
+  remote has heads on branch 'default' that are not known locally: 8374dc4052cb
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  lfs: uploading a82f1c5cea0d40e3bb3a849686bb4e6ae47ca27e614de55c1ed0325698ef68de (25 bytes)
+  lfs: processed: a82f1c5cea0d40e3bb3a849686bb4e6ae47ca27e614de55c1ed0325698ef68de
+  lfs: uploaded 1 files (25 bytes)
+  1 changesets found
+  uncompressed size of bundle content:
+       206 (changelog)
+       172 (manifests)
+       275  lfs2.txt
+  remote: adding changesets
+  remote: adding manifests
+  remote: adding file changes
+  remote: added 1 changesets with 1 changes to 1 files
   $ grep 'lfs' .hg/requires $SERVER_REQUIRES
   .hg/requires:lfs
   $TESTTMP/server/.hg/requires:lfs
diff --git a/tests/test-lfs-test-server.t b/tests/test-lfs-test-server.t
--- a/tests/test-lfs-test-server.t
+++ b/tests/test-lfs-test-server.t
@@ -81,6 +81,7 @@  store.
   checking for updated bookmarks
   listing keys for "bookmarks"
   lfs: computing set of blobs to upload
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 309 (git-server !)
   Content-Length: 350 (hg-server !)
@@ -217,6 +218,7 @@  actions property completely.
   listing keys for "bookmarks"
   listing keys for "bookmarks"
   lfs: computing set of blobs to upload
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 901 (git-server !)
   Content-Length: 755 (hg-server !)
@@ -485,6 +487,7 @@  Test a corrupted file upload
   listing keys for "bookmarks"
   listing keys for "bookmarks"
   lfs: computing set of blobs to upload
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 309 (git-server !)
   Content-Length: 350 (hg-server !)