Patchwork [1,of,5,V2] pull: add --subrepos flag

Submitter Angel Ezquerra
Date March 3, 2013, 9:05 p.m.
Message ID <c5e164d3282a4a2fb541.1362344745@Angel-PC.localdomain>
Permalink /patch/1075/
State Rejected, archived

Comments

Angel Ezquerra - March 3, 2013, 9:05 p.m.
# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1360519226 -3600
# Node ID c5e164d3282a4a2fb54171e6246b3b2e8a520080
# Parent  a07be895373394be66ba38b1ff111e26aca03ac8
pull: add --subrepos flag

The purpose of this new flag is to ensure that you are able to update to any
incoming revision without requiring any network access. The idea is to make sure
that the repository is self-contained after doing hg pull --subrepos, as long as
it was already self-contained before the pull.

When the --subrepos flag is enabled, pull will also pull (or clone) all subrepos
that are present on the current revision and those that are referenced by any of
the incoming revisions. Pulls are recursive (i.e. subrepos within subrepos will
also be pulled as needed), but clones are not recursive yet (that requires adding
a similar --subrepos flag to clone, which will be done in another patch).

If the incoming revisions refer to subrepos that are not in the working
directory yet, they will be cloned. If any of the subrepositories changes its
pull source (as defined in the .hgsub file), it will be pulled from both the
current and the new source.

This first patch only supports mercurial subrepos (a NotImplementedError
exception will be raised for all other subrepo types). Future patches will add
support for other subrepo types.
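
The collection step described above can be sketched outside Mercurial as follows. This is a minimal, standalone illustration rather than the patch's code: SUBSTATES is a made-up history, collect_subs_to_pull is a hypothetical helper, and OrderedDict stands in for config.sortdict (its ordering for repeated keys may differ). Keying on (name, source, kind) is what makes a subrepo whose source changed show up once per source.

```python
from collections import OrderedDict

# Hypothetical history: revision -> {subrepo path: (source, node, kind)}.
# The tuple layout mirrors how the patch reads ctx.substate entries.
SUBSTATES = {
    0: {"lib": ("http://example.com/lib", "aaa", "hg")},
    1: {"lib": ("http://example.com/lib", "bbb", "hg"),
        "docs": ("../docs", "ccc", "hg")},
    2: {"lib": ("http://mirror.example.com/lib", "ddd", "hg"),
        "docs": ("../docs", "ccc", "hg")},
}


def collect_subs_to_pull(revs, substates):
    """Map each distinct (name, source, kind) to the last revision using it."""
    subs = OrderedDict()
    for rev in revs:
        state = substates[rev]
        # Visit subrepo names alphabetically so the result is deterministic.
        for name in sorted(state):
            source, _node, kind = state[name]
            subs[(name, source, kind)] = rev
    return subs


targets = collect_subs_to_pull([0, 1, 2], SUBSTATES)
for (name, source, kind), rev in targets.items():
    print("%s subrepo %s <- %s (last seen in rev %d)" % (kind, name, source, rev))
```

In this toy history "lib" appears twice in the result, once per source, which corresponds to pulling from both the current and the new source.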
Matt Mackall - April 16, 2013, 11:39 p.m.
On Sun, 2013-03-03 at 22:05 +0100, Angel Ezquerra wrote:
> # HG changeset patch
> # User Angel Ezquerra <angel.ezquerra@gmail.com>
> # Date 1360519226 -3600
> # Node ID c5e164d3282a4a2fb54171e6246b3b2e8a520080
> # Parent  a07be895373394be66ba38b1ff111e26aca03ac8
> pull: add --subrepos flag
> 
> The purpose of this new flag is to ensure that you are able to update to any
> incoming revision without requiring any network access. The idea is to make sure
> that the repository is self-contained after doing hg pull --subrepos, as long as
> it was already self-contained before the pull.
> 
> When the --subrepos flag is enabled, pull will also pull (or clone) all subrepos
> that are present on the current revision and those that are referenced by any of
> the incoming revisions.

Sorry for the long wait on this one.

I'm afraid this is not quite what I'd envisioned. I was expecting it to
just visit heads or even just the head we're likely to update to. 

The largefiles code has a similar issue of having bits that might want
to be pulled before we disconnect and they've gone through a couple
iterations of approaches before finally arriving at:

- only pull what we're going to need for update
- have a separate flag that specifies how to pull more (--lfpull)

I think -S shouldn't try to be fancy for now: just recurse the current
subrepos in the working dir and pull. Then there's no question of
"what's going to get pulled", "where we're going to put it", and "what
do we do if sources conflict", which are big unsolved problems.

It might not work when we next try to update. Oh well.

Generally speaking, pulling every subrepo referenced in history isn't
even going to work for a lot of people: they've committed broken .hgsub
files at some point or the repos have moved.
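
For comparison, the simpler behaviour suggested in this reply (recurse only into the subrepos currently in the working directory and pull each one) might look roughly like the sketch below. Repo and pull_recursive are hypothetical stand-ins, not Mercurial APIs, and the "pull" is simulated by recording the path visited.

```python
class Repo(object):
    """Toy stand-in for a repository with nested working-directory subrepos."""

    def __init__(self, name, subrepos=()):
        self.name = name
        self.subrepos = list(subrepos)


def pull_recursive(repo, pulled=None, prefix=""):
    """Simulate pulling repo, then each subrepo in its working directory."""
    if pulled is None:
        pulled = []
    path = prefix + repo.name
    pulled.append(path)  # a real implementation would pull here
    for sub in sorted(repo.subrepos, key=lambda r: r.name):
        pull_recursive(sub, pulled, path + "/")
    return pulled


top = Repo("main", [Repo("lib", [Repo("vendor")]), Repo("docs")])
order = pull_recursive(top)
print(order)
```

No history is inspected in this approach, so there is nothing to decide about where to clone missing subrepos or which of several conflicting sources to use.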

Patch

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -4654,6 +4654,8 @@ 
     ('B', 'bookmark', [], _("bookmark to pull"), _('BOOKMARK')),
     ('b', 'branch', [], _('a specific branch you would like to pull'),
      _('BRANCH')),
+    ('S', 'subrepos', None,
+     _('pull current and incoming subrepos recursively')),
     ] + remoteopts,
     _('[-u] [-f] [-r REV]... [-e CMD] [--remotecmd CMD] [SOURCE]'))
 def pull(ui, repo, source="default", **opts):
@@ -4698,7 +4700,9 @@ 
                     "so a rev cannot be specified.")
             raise util.Abort(err)
 
-    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
+    modheads = repo.pull(other, heads=revs, force=opts.get('force'),
+        subrepos=opts.get('subrepos'))
+
     bookmarks.updatefromremote(ui, repo, other, source)
     if checkout:
         checkout = str(repo.changelog.rev(other.lookup(checkout)))
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -10,12 +10,14 @@ 
 import changelog, dirstate, filelog, manifest, context, bookmarks, phases
 import lock, transaction, store, encoding, base85
 import scmutil, util, extensions, hook, error, revset
+import config
 import match as matchmod
 import merge as mergemod
 import tags as tagsmod
 from lock import release
 import weakref, errno, os, time, inspect
 import branchmap
+import itertools
 propertycache = util.propertycache
 filecache = scmutil.filecache
 
@@ -1646,11 +1648,13 @@ 
 
         return r
 
-    def pull(self, remote, heads=None, force=False):
+    def pull(self, remote, heads=None, force=False, subrepos=False):
         # don't open transaction for nothing or you break future useful
         # rollback call
         tr = None
         trname = 'pull\n' + util.hidepassword(remote.url())
+        if subrepos:
+            oldtip = len(self)
         lock = self.lock()
         try:
             tmp = discovery.findcommonincoming(self, remote, heads=heads,
@@ -1730,8 +1734,43 @@ 
                 tr.release()
             lock.release()
 
+        # update current and new subrepos
+        if subrepos:
+            # pull (or clone) the subrepos that are referenced by the
+            # current revision or by any of the incoming revisions
+            revstocheck = itertools.chain(['.'], self.changelog.revs(oldtip))
+            self.getsubrepos(revs=revstocheck, source=remote.url())
+
         return result
 
+    def getsubrepos(self, revs=None, source=None):
+        """Get (clone or pull) the subrepos that are referenced
+        on any of the revisions on the given revision list
+        """
+        if revs is None:
+            # check all revisions
+            revs = self.changelog.revs()
+        # use a sortdict to make sure that we get the subrepos
+        # in the order they are found
+        substopull = config.sortdict()
+        for rev in revs:
+            ctx = self[rev]
+            # read the substate items in alphabetical order to ensure
+            # that we always process the subrepos in the same order
+            for sname in sorted(ctx.substate):
+                sinfo = ctx.substate[sname]
+                substopull[(sname, sinfo[0], sinfo[2])] = ctx
+        try:
+            self._subtoppath = source
+            for (sname, ssource, stype), ctx in substopull.items():
+                try:
+                    ctx.sub(sname).pull(ssource)
+                except (error.RepoError, urllib2.HTTPError), ex:
+                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
+                                 % (sname, ssource, str(ex)))
+        finally:
+            del self._subtoppath
+
     def checkpush(self, force, revs):
         """Extensions can override this function if additional checks have
         to be performed before pushing, or call it if they override push
diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -330,6 +330,11 @@ 
         """
         raise NotImplementedError
 
+    def pull(self, source):
+        """pull from the given parent repository source
+        """
+        raise NotImplementedError
+
     def get(self, state, overwrite=False):
         """run whatever commands are needed to put the subrepo into
         this state
@@ -528,8 +533,13 @@ 
         self._repo.ui.note(_('removing subrepo %s\n') % subrelpath(self))
         hg.clean(self._repo, node.nullid, False)
 
+    def pull(self, source):
+        state = (source, None, 'hg')
+        return self._get(state)
+
     def _get(self, state):
         source, revision, kind = state
+        subrepos = revision is None
         if revision not in self._repo:
             self._repo._subsource = source
             srcurl = _abssource(self._repo)
@@ -547,7 +557,7 @@ 
             else:
                 self._repo.ui.status(_('pulling subrepo %s from %s\n')
                                      % (subrelpath(self), srcurl))
-                self._repo.pull(other)
+                self._repo.pull(other, subrepos=subrepos)
                 bookmarks.updatefromremote(self._repo.ui, self._repo, other,
                                            srcurl)
 
diff --git a/tests/test-debugcomplete.t b/tests/test-debugcomplete.t
--- a/tests/test-debugcomplete.t
+++ b/tests/test-debugcomplete.t
@@ -204,7 +204,7 @@ 
   init: ssh, remotecmd, insecure
   log: follow, follow-first, date, copies, keyword, rev, removed, only-merges, user, only-branch, branch, prune, patch, git, limit, no-merges, stat, graph, style, template, include, exclude
   merge: force, rev, preview, tool
-  pull: update, force, rev, bookmark, branch, ssh, remotecmd, insecure
+  pull: update, force, rev, bookmark, branch, subrepos, ssh, remotecmd, insecure
   push: force, rev, bookmark, branch, new-branch, ssh, remotecmd, insecure
   remove: after, force, include, exclude
   serve: accesslog, daemon, daemon-pipefds, errorlog, port, address, prefix, name, web-conf, webdir-conf, pid-file, stdio, cmdserver, templates, style, ipv6, certificate