Patchwork [1,of,5] pull: add --subrev option

login
register
mail settings
Submitter Angel Ezquerra
Date Nov. 23, 2013, 8:28 p.m.
Message ID <ecd1ca99f1b3e4d0f6ea.1385238480@Angel-PC.localdomain>
Download mbox | patch
Permalink /patch/3112/
State Superseded
Headers show

Comments

Angel Ezquerra - Nov. 23, 2013, 8:28 p.m.
# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1372976561 -7200
#      Fri Jul 05 00:22:41 2013 +0200
# Node ID ecd1ca99f1b3e4d0f6ea8abd606990ef24667779
# Parent  1c46b18b0e1c47fa4cecf21b78c083a54ae9903f
pull: add --subrev option

The purpose of this new option is to make it possible to pull subrepos at the
same time that you pull the parent repository, instead of having to update to
the newly pulled revision to pull the corresponding subrepos.

This has multiple advantages. First of all it lets you pull the subrepos from
a non default path (when subrepos are pulled during udpate they are always
pulled from the default path). Also, it lets you pull subrepos referenced in all
new revisions in one go, rather than potentially having to update to every new
revision.

The main use case for this new flag is to be able to ensure that your
repositories are alwasys "self contained". That is, that you will be able to
update to any incoming revision without requiring any network access. This can
be done by adding "--subrev .:tip" to your pull command (it would be nicer if
there was an "incoming()" revset expression, akin to the existing
"outgoing(path)" expression).

When the --subrev flag is enabled, pull will also pull (or clone) all subrepos
that are present on the revisions matching the --subrev revset(s). Pulls are
recursive (i.e. subrepos withing subrepos will be also pulled as needed), but
clones are not recursive yet (it requires adding a similar --subrev flag to
clone, which will be done on another patch).

If the selected --subrev revisions refer to subrepos that are not on the working
directory yet they will be cloned. If any of the subrepositories changes its
pull source (as defined on the .hgsub file) it will be pulled from the current
and the new source. If a pulled subrepo also refers to one of its own subrepos
on any of the incoming subrepo revisions those "sub-subrepos" will be pulled
too.

This first patch only supports mercurial subrepos (a NotImplementedError
exception will be raised for all other subrepo types). Future patches will add
support for other subrepo types.
Pierre-Yves David - Jan. 15, 2014, 10:15 p.m.
On Sat, Nov 23, 2013 at 09:28:00PM +0100, Angel Ezquerra wrote:
> # HG changeset patch
> # User Angel Ezquerra <angel.ezquerra@gmail.com>
> # Date 1372976561 -7200
> #      Fri Jul 05 00:22:41 2013 +0200
> # Node ID ecd1ca99f1b3e4d0f6ea8abd606990ef24667779
> # Parent  1c46b18b0e1c47fa4cecf21b78c083a54ae9903f
> pull: add --subrev option

Why not --subrepos as for other -S option ?

> 
> The purpose of this new option is to make it possible to pull subrepos at the
> same time that you pull the parent repository, instead of having to update to
> the newly pulled revision to pull the corresponding subrepos.

I love that!

> This has multiple advantages. First of all it lets you pull the subrepos from
> a non default path (when subrepos are pulled during udpate they are always
> pulled from the default path). Also, it lets you pull subrepos referenced in all
> new revisions in one go, rather than potentially having to update to every new
> revision.
> 
> The main use case for this new flag is to be able to ensure that your
> repositories are alwasys "self contained".

This is one of the biggest pain point of subrepo so its great to have it solved.

> That is, that you will be able to
> update to any incoming revision without requiring any network access. This can
> be done by adding "--subrev .:tip" to your pull command (it would be nicer if
> there was an "incoming()" revset expression, akin to the existing
> "outgoing(path)" expression).

Wait what? your new -S argument takes an argument to expression which revision
you want to synchronise? why don't he just check all incoming one by default ?

> When the --subrev flag is enabled, pull will also pull (or clone) all subrepos
> that are present on the revisions matching the --subrev revset(s). Pulls are
> recursive (i.e. subrepos withing subrepos will be also pulled as needed), but
> clones are not recursive yet (it requires adding a similar --subrev flag to
> clone, which will be done on another patch).
> 
> If the selected --subrev revisions refer to subrepos that are not on the working
> directory yet they will be cloned.

What is the clone destination then? somewhere in .hg/ ?

> If any of the subrepositories changes its
> pull source (as defined on the .hgsub file) it will be pulled from the current
> and the new source. If a pulled subrepo also refers to one of its own subrepos
> on any of the incoming subrepo revisions those "sub-subrepos" will be pulled
> too.
> 
> This first patch only supports mercurial subrepos (a NotImplementedError
> exception will be raised for all other subrepo types). Future patches will add
> support for other subrepo types.
> 
> diff --git a/mercurial/commands.py b/mercurial/commands.py
> --- a/mercurial/commands.py
> +++ b/mercurial/commands.py
> @@ -4563,6 +4563,7 @@
>      ('B', 'bookmark', [], _("bookmark to pull"), _('BOOKMARK')),
>      ('b', 'branch', [], _('a specific branch you would like to pull'),
>       _('BRANCH')),
> +    ('S', 'subrev', [], _('pull subrepos from these revisions')),
>      ] + remoteopts,
>      _('[-u] [-f] [-r REV]... [-e CMD] [--remotecmd CMD] [SOURCE]'))
>  def pull(ui, repo, source="default", **opts):
> @@ -4608,7 +4609,8 @@
>                      "so a rev cannot be specified.")
>              raise util.Abort(err)
>  
> -    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
> +    modheads = repo.pull(other, heads=revs, force=opts.get('force'),
> +        subrev=opts.get('subrev'))
>      bookmarks.updatefromremote(ui, repo, remotebookmarks, source)
>      if checkout:
>          checkout = str(repo.changelog.rev(other.lookup(checkout)))
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -11,6 +11,7 @@
>  import lock as lockmod
>  import transaction, store, encoding
>  import scmutil, util, extensions, hook, error, revset
> +import config, urllib2
>  import match as matchmod
>  import merge as mergemod
>  import tags as tagsmod
> @@ -1666,7 +1667,7 @@
>  
>          return r
>  
> -    def pull(self, remote, heads=None, force=False):
> +    def pull(self, remote, heads=None, force=False, subrev=None):
>          if remote.local():
>              missing = set(remote.requirements) - self.supported
>              if missing:
> @@ -1757,8 +1758,49 @@
>                  tr.release()
>              lock.release()
>  
> +        # update current and new subrepos
> +        if subrev:
> +            # pull (or clone) the subrepos that are on any of the revisions
> +            # on the subrev list
> +            self.getsubrepos(subrev=subrev, source=remote.url())
> +
>          return result
>  
> +    def getsubrepos(self, subrev=None, source=None):

1) DO NOT ADD NEW FUNCTION IN LOCALREPOSITORY CLASS!

If you function is not to be called all over the place by a lot of people it belong somewhere else. Please make it a `getsubrepos(repo, subrev=None, source=None)` function in the `mercurial.subrepo` module.

I also think the name in unfortunate: by "get" I would expect something just
returning the list of subrepo object. while you don't return anything but
change their content.

I would rename that `pullsubrepos` (or something related)

> +        """Get (clone or pull) the subrepos that are referenced
> +        on any of the revisions on the given revision list
> +        """
> +        if not subrev:
> +            return
> +        try:
> +            revs = self.revs('%lr', subrev)
> +        except error.RepoLookupError, ex:
> +            # No matches to the subrev revset
> +            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))

I think this should go to standard error output

> +            return
> +        revs.sort()
> +        # use a sortdict to make sure that we get the subrepos
> +        # in the order they are found
> +        substopull = config.sortdict()
> +        for rev in revs:
> +            ctx = self[rev]
> +            # read the substate items in alphabetical order to ensure
> +            # that we always process the subrepos in the same order
> +            for sname in sorted(ctx.substate):
> +                sinfo = ctx.substate[sname]
> +                substopull[(sname, sinfo[0], sinfo[2])] = ctx
> +        try:
> +            self._subtoppath = source
> +            for (sname, ssource, stype), ctx in substopull.items():
> +                try:
> +                    ctx.sub(sname).pull(ssource)
> +                except (error.RepoError, urllib2.HTTPError,
> +                        subrepo.SubrepoAbort), ex:
> +                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
> +                                 % (sname, ssource, str(ex)))

consider honoring the --traceback option here.

> +        finally:
> +            del self._subtoppath
> +
>      def checkpush(self, force, revs):
>          """Extensions can override this function if additional checks have
>          to be performed before pushing, or call it if they override push
> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> --- a/mercurial/subrepo.py
> +++ b/mercurial/subrepo.py
> @@ -380,6 +380,11 @@
>          """
>          raise NotImplementedError
>  
> +    def pull(self, source):
> +        """pull from the given parent repository source
> +        """
> +        raise NotImplementedError
> +
>      def get(self, state, overwrite=False):
>          """run whatever commands are needed to put the subrepo into
>          this state
> @@ -646,34 +651,49 @@
>          self._repo.ui.note(_('removing subrepo %s\n') % subrelpath(self))
>          hg.clean(self._repo, node.nullid, False)
>  
> -    def _get(self, state):
> +    def pull(self, source, heads=None, subrev=None):
> +        '''pull (or clone) the subrepository and its subrepos'''
> +        if subrev is None:
> +            # Pull subrepos referenced by all incoming changesets
> +            revs = list(self._repo.changelog.revs())
> +            if revs:
> +                tip = revs[-1]
> +                subrev = ['%d:' % (tip + 1)]
> +            else:
> +                subrev = ['0:']
> +        return self._pullorclone(source, heads=heads, subrev=subrev)
> +
> +    def _get(self, state, subrev=None):
>          source, revision, kind = state
>          if revision not in self._repo:
> -            self._repo._subsource = source
> -            srcurl = _abssource(self._repo)
> -            other = hg.peer(self._repo, {}, srcurl)
> -            if len(self._repo) == 0:
> -                self._repo.ui.status(_('cloning subrepo %s from %s\n')
> -                                     % (subrelpath(self), srcurl))
> -                parentrepo = self._repo._subparent
> -                shutil.rmtree(self._repo.path)
> -                other, cloned = hg.clone(self._repo._subparent.baseui, {},
> -                                         other, self._repo.root,
> -                                         update=False)
> -                self._repo = cloned.local()
> -                self._initrepo(parentrepo, source, create=True)
> +            self._pullorclone(source, subrev=subrev)
> +
> +    def _pullorclone(self, source, heads=None, subrev=None):
> +        self._repo._subsource = source
> +        srcurl = _abssource(self._repo)
> +        other = hg.peer(self._repo, {}, srcurl)
> +        if len(self._repo) == 0:
> +            self._repo.ui.status(_('cloning subrepo %s from %s\n')
> +                                 % (subrelpath(self), srcurl))
> +            parentrepo = self._repo._subparent
> +            shutil.rmtree(self._repo.path)
> +            other, cloned = hg.clone(self._repo._subparent.baseui, {},
> +                                     other, self._repo.root,
> +                                     update=False, heads=heads)
> +            self._repo = cloned.local()
> +            self._initrepo(parentrepo, source, create=True)
> +            self._cachestorehash(srcurl)
> +        else:
> +            self._repo.ui.status(_('pulling subrepo %s from %s\n')
> +                                 % (subrelpath(self), srcurl))
> +            cleansub = self.storeclean(srcurl)
> +            remotebookmarks = other.listkeys('bookmarks')
> +            self._repo.pull(other, heads=heads, subrev=subrev)
> +            bookmarks.updatefromremote(self._repo.ui, self._repo,
> +                                       remotebookmarks, srcurl)
> +            if cleansub:
> +                # keep the repo clean after pull
>                  self._cachestorehash(srcurl)
> -            else:
> -                self._repo.ui.status(_('pulling subrepo %s from %s\n')
> -                                     % (subrelpath(self), srcurl))
> -                cleansub = self.storeclean(srcurl)
> -                remotebookmarks = other.listkeys('bookmarks')
> -                self._repo.pull(other)
> -                bookmarks.updatefromremote(self._repo.ui, self._repo,
> -                                           remotebookmarks, srcurl)
> -                if cleansub:
> -                    # keep the repo clean after pull
> -                    self._cachestorehash(srcurl)
>  
>      @annotatesubrepoerror
>      def get(self, state, overwrite=False):
> diff --git a/tests/test-completion.t b/tests/test-completion.t
> --- a/tests/test-completion.t
> +++ b/tests/test-completion.t
> @@ -206,7 +206,7 @@
>    init: ssh, remotecmd, insecure
>    log: follow, follow-first, date, copies, keyword, rev, removed, only-merges, user, only-branch, branch, prune, patch, git, limit, no-merges, stat, graph, style, template, include, exclude
>    merge: force, rev, preview, tool
> -  pull: update, force, rev, bookmark, branch, ssh, remotecmd, insecure
> +  pull: update, force, rev, bookmark, branch, subrev, ssh, remotecmd, insecure
>    push: force, rev, bookmark, branch, new-branch, ssh, remotecmd, insecure
>    remove: after, force, include, exclude
>    serve: accesslog, daemon, daemon-pipefds, errorlog, port, address, prefix, name, web-conf, webdir-conf, pid-file, stdio, cmdserver, templates, style, ipv6, certificate

> # HG changeset patch
> # User Angel Ezquerra <angel.ezquerra@gmail.com>
> # Date 1372976561 -7200
> #      Fri Jul 05 00:22:41 2013 +0200
> # Node ID ecd1ca99f1b3e4d0f6ea8abd606990ef24667779
> # Parent  1c46b18b0e1c47fa4cecf21b78c083a54ae9903f
> pull: add --subrev option
> 
> The purpose of this new option is to make it possible to pull subrepos at the
> same time that you pull the parent repository, instead of having to update to
> the newly pulled revision to pull the corresponding subrepos.
> 
> This has multiple advantages. First of all it lets you pull the subrepos from
> a non default path (when subrepos are pulled during udpate they are always
> pulled from the default path). Also, it lets you pull subrepos referenced in all
> new revisions in one go, rather than potentially having to update to every new
> revision.
> 
> The main use case for this new flag is to be able to ensure that your
> repositories are alwasys "self contained". That is, that you will be able to
> update to any incoming revision without requiring any network access. This can
> be done by adding "--subrev .:tip" to your pull command (it would be nicer if
> there was an "incoming()" revset expression, akin to the existing
> "outgoing(path)" expression).
> 
> When the --subrev flag is enabled, pull will also pull (or clone) all subrepos
> that are present on the revisions matching the --subrev revset(s). Pulls are
> recursive (i.e. subrepos withing subrepos will be also pulled as needed), but
> clones are not recursive yet (it requires adding a similar --subrev flag to
> clone, which will be done on another patch).
> 
> If the selected --subrev revisions refer to subrepos that are not on the working
> directory yet they will be cloned. If any of the subrepositories changes its
> pull source (as defined on the .hgsub file) it will be pulled from the current
> and the new source. If a pulled subrepo also refers to one of its own subrepos
> on any of the incoming subrepo revisions those "sub-subrepos" will be pulled
> too.
> 
> This first patch only supports mercurial subrepos (a NotImplementedError
> exception will be raised for all other subrepo types). Future patches will add
> support for other subrepo types.
> 
> diff --git a/mercurial/commands.py b/mercurial/commands.py
> --- a/mercurial/commands.py
> +++ b/mercurial/commands.py
> @@ -4563,6 +4563,7 @@
>      ('B', 'bookmark', [], _("bookmark to pull"), _('BOOKMARK')),
>      ('b', 'branch', [], _('a specific branch you would like to pull'),
>       _('BRANCH')),
> +    ('S', 'subrev', [], _('pull subrepos from these revisions')),
>      ] + remoteopts,
>      _('[-u] [-f] [-r REV]... [-e CMD] [--remotecmd CMD] [SOURCE]'))
>  def pull(ui, repo, source="default", **opts):
> @@ -4608,7 +4609,8 @@
>                      "so a rev cannot be specified.")
>              raise util.Abort(err)
>  
> -    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
> +    modheads = repo.pull(other, heads=revs, force=opts.get('force'),
> +        subrev=opts.get('subrev'))
>      bookmarks.updatefromremote(ui, repo, remotebookmarks, source)
>      if checkout:
>          checkout = str(repo.changelog.rev(other.lookup(checkout)))
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -11,6 +11,7 @@
>  import lock as lockmod
>  import transaction, store, encoding
>  import scmutil, util, extensions, hook, error, revset
> +import config, urllib2
>  import match as matchmod
>  import merge as mergemod
>  import tags as tagsmod
> @@ -1666,7 +1667,7 @@
>  
>          return r
>  
> -    def pull(self, remote, heads=None, force=False):
> +    def pull(self, remote, heads=None, force=False, subrev=None):
>          if remote.local():
>              missing = set(remote.requirements) - self.supported
>              if missing:
> @@ -1757,8 +1758,49 @@
>                  tr.release()
>              lock.release()
>  
> +        # update current and new subrepos
> +        if subrev:
> +            # pull (or clone) the subrepos that are on any of the revisions
> +            # on the subrev list
> +            self.getsubrepos(subrev=subrev, source=remote.url())
> +
>          return result
>  
> +    def getsubrepos(self, subrev=None, source=None):
> +        """Get (clone or pull) the subrepos that are referenced
> +        on any of the revisions on the given revision list
> +        """
> +        if not subrev:
> +            return
> +        try:
> +            revs = self.revs('%lr', subrev)
> +        except error.RepoLookupError, ex:
> +            # No matches to the subrev revset
> +            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))
> +            return
> +        revs.sort()
> +        # use a sortdict to make sure that we get the subrepos
> +        # in the order they are found
> +        substopull = config.sortdict()
> +        for rev in revs:
> +            ctx = self[rev]
> +            # read the substate items in alphabetical order to ensure
> +            # that we always process the subrepos in the same order
> +            for sname in sorted(ctx.substate):
> +                sinfo = ctx.substate[sname]
> +                substopull[(sname, sinfo[0], sinfo[2])] = ctx
> +        try:
> +            self._subtoppath = source
> +            for (sname, ssource, stype), ctx in substopull.items():
> +                try:
> +                    ctx.sub(sname).pull(ssource)
> +                except (error.RepoError, urllib2.HTTPError,
> +                        subrepo.SubrepoAbort), ex:
> +                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
> +                                 % (sname, ssource, str(ex)))
> +        finally:
> +            del self._subtoppath

This _subtoppath stuff is weird. can't we do something cleaner ?

>      def checkpush(self, force, revs):
>          """Extensions can override this function if additional checks have
>          to be performed before pushing, or call it if they override push
> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> --- a/mercurial/subrepo.py
> +++ b/mercurial/subrepo.py
> @@ -380,6 +380,11 @@
>          """
>          raise NotImplementedError
>  
> +    def pull(self, source):
> +        """pull from the given parent repository source
> +        """
> +        raise NotImplementedError
> +
>      def get(self, state, overwrite=False):
>          """run whatever commands are needed to put the subrepo into
>          this state
> @@ -646,34 +651,49 @@
>          self._repo.ui.note(_('removing subrepo %s\n') % subrelpath(self))
>          hg.clean(self._repo, node.nullid, False)
>  
> -    def _get(self, state):
> +    def pull(self, source, heads=None, subrev=None):
> +        '''pull (or clone) the subrepository and its subrepos'''
> +        if subrev is None:
> +            # Pull subrepos referenced by all incoming changesets
> +            revs = list(self._repo.changelog.revs())
> +            if revs:
> +                tip = revs[-1]
> +                subrev = ['%d:' % (tip + 1)]
> +            else:
> +                subrev = ['0:']
> +        return self._pullorclone(source, heads=heads, subrev=subrev)
> +
> +    def _get(self, state, subrev=None):
>          source, revision, kind = state
>          if revision not in self._repo:
> -            self._repo._subsource = source
> -            srcurl = _abssource(self._repo)
> -            other = hg.peer(self._repo, {}, srcurl)
> -            if len(self._repo) == 0:
> -                self._repo.ui.status(_('cloning subrepo %s from %s\n')
> -                                     % (subrelpath(self), srcurl))
> -                parentrepo = self._repo._subparent
> -                shutil.rmtree(self._repo.path)
> -                other, cloned = hg.clone(self._repo._subparent.baseui, {},
> -                                         other, self._repo.root,
> -                                         update=False)
> -                self._repo = cloned.local()
> -                self._initrepo(parentrepo, source, create=True)
> +            self._pullorclone(source, subrev=subrev)
> +
> +    def _pullorclone(self, source, heads=None, subrev=None):
> +        self._repo._subsource = source
> +        srcurl = _abssource(self._repo)
> +        other = hg.peer(self._repo, {}, srcurl)
> +        if len(self._repo) == 0:
> +            self._repo.ui.status(_('cloning subrepo %s from %s\n')
> +                                 % (subrelpath(self), srcurl))
> +            parentrepo = self._repo._subparent
> +            shutil.rmtree(self._repo.path)
> +            other, cloned = hg.clone(self._repo._subparent.baseui, {},
> +                                     other, self._repo.root,
> +                                     update=False, heads=heads)
> +            self._repo = cloned.local()
> +            self._initrepo(parentrepo, source, create=True)
> +            self._cachestorehash(srcurl)
> +        else:
> +            self._repo.ui.status(_('pulling subrepo %s from %s\n')
> +                                 % (subrelpath(self), srcurl))
> +            cleansub = self.storeclean(srcurl)
> +            remotebookmarks = other.listkeys('bookmarks')
> +            self._repo.pull(other, heads=heads, subrev=subrev)
> +            bookmarks.updatefromremote(self._repo.ui, self._repo,
> +                                       remotebookmarks, srcurl)
> +            if cleansub:
> +                # keep the repo clean after pull
>                  self._cachestorehash(srcurl)
> -            else:
> -                self._repo.ui.status(_('pulling subrepo %s from %s\n')
> -                                     % (subrelpath(self), srcurl))
> -                cleansub = self.storeclean(srcurl)
> -                remotebookmarks = other.listkeys('bookmarks')
> -                self._repo.pull(other)
> -                bookmarks.updatefromremote(self._repo.ui, self._repo,
> -                                           remotebookmarks, srcurl)
> -                if cleansub:
> -                    # keep the repo clean after pull
> -                    self._cachestorehash(srcurl)
>  
>      @annotatesubrepoerror
>      def get(self, state, overwrite=False):
> diff --git a/tests/test-completion.t b/tests/test-completion.t
> --- a/tests/test-completion.t
> +++ b/tests/test-completion.t
> @@ -206,7 +206,7 @@
>    init: ssh, remotecmd, insecure
>    log: follow, follow-first, date, copies, keyword, rev, removed, only-merges, user, only-branch, branch, prune, patch, git, limit, no-merges, stat, graph, style, template, include, exclude
>    merge: force, rev, preview, tool
> -  pull: update, force, rev, bookmark, branch, ssh, remotecmd, insecure
> +  pull: update, force, rev, bookmark, branch, subrev, ssh, remotecmd, insecure
>    push: force, rev, bookmark, branch, new-branch, ssh, remotecmd, insecure
>    remove: after, force, include, exclude
>    serve: accesslog, daemon, daemon-pipefds, errorlog, port, address, prefix, name, web-conf, webdir-conf, pid-file, stdio, cmdserver, templates, style, ipv6, certificate

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Angel Ezquerra - Jan. 16, 2014, 7:33 a.m.
On Wed, Jan 15, 2014 at 2:15 PM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
> On Sat, Nov 23, 2013 at 09:28:00PM +0100, Angel Ezquerra wrote:
>> # HG changeset patch
>> # User Angel Ezquerra <angel.ezquerra@gmail.com>
>> # Date 1372976561 -7200
>> #      Fri Jul 05 00:22:41 2013 +0200
>> # Node ID ecd1ca99f1b3e4d0f6ea8abd606990ef24667779
>> # Parent  1c46b18b0e1c47fa4cecf21b78c083a54ae9903f
>> pull: add --subrev option
>
> Why not --subrepos as for other -S option ?

I thought it would be confusing since usually the --subrepos flag
takes no arguments. Maybe what we should do is to _not_ have a short
-S form, to make it clear that is not just another --subrepo flag?

>>
>> The purpose of this new option is to make it possible to pull subrepos at the
>> same time that you pull the parent repository, instead of having to update to
>> the newly pulled revision to pull the corresponding subrepos.
>
> I love that!
>
>> This has multiple advantages. First of all it lets you pull the subrepos from
>> a non default path (when subrepos are pulled during udpate they are always
>> pulled from the default path). Also, it lets you pull subrepos referenced in all
>> new revisions in one go, rather than potentially having to update to every new
>> revision.
>>
>> The main use case for this new flag is to be able to ensure that your
>> repositories are alwasys "self contained".
>
> This is one of the biggest pain point of subrepo so its great to have it solved.

Yes! :-D

>> That is, that you will be able to
>> update to any incoming revision without requiring any network access. This can
>> be done by adding "--subrev .:tip" to your pull command (it would be nicer if
>> there was an "incoming()" revset expression, akin to the existing
>> "outgoing(path)" expression).
>
> Wait what? your new -S argument takes an argument to expression which revision
> you want to synchronise? why don't he just check all incoming one by default ?

The original version of this patch which I sent on the previous cycle
I believe did just that. I believe that it was Matt who said that you
cannot just blindly check subrepos found on all incoming revisions
because perhaps there could be conflicts, source repository changes,
etc. Instead it seems better to give users a way to specify which
revisions should be checked for subrepos to pull or clone.

>> When the --subrev flag is enabled, pull will also pull (or clone) all subrepos
>> that are present on the revisions matching the --subrev revset(s). Pulls are
>> recursive (i.e. subrepos withing subrepos will be also pulled as needed), but
>> clones are not recursive yet (it requires adding a similar --subrev flag to
>> clone, which will be done on another patch).
>>
>> If the selected --subrev revisions refer to subrepos that are not on the working
>> directory yet they will be cloned.
>
> What is the clone destination then? somewhere in .hg/ ?

No, with this patch series they are just placed on the working
directory. This is the same that would happen if you first pull and
then update to the corresponding revision.

If we used this series in combination with my "subrepo cache" series
then it could be possible and natural to simply get them into the
subrepo cache instead, which would definitely be cleaner. It would
perhaps also get rid of the main reason why we cannot just pull or
clone the subrepos found on all the incoming revisions...

>> If any of the subrepositories changes its
>> pull source (as defined on the .hgsub file) it will be pulled from the current
>> and the new source. If a pulled subrepo also refers to one of its own subrepos
>> on any of the incoming subrepo revisions those "sub-subrepos" will be pulled
>> too.
>>
>> This first patch only supports mercurial subrepos (a NotImplementedError
>> exception will be raised for all other subrepo types). Future patches will add
>> support for other subrepo types.
>>
>> diff --git a/mercurial/commands.py b/mercurial/commands.py
>> --- a/mercurial/commands.py
>> +++ b/mercurial/commands.py
>> @@ -4563,6 +4563,7 @@
>>      ('B', 'bookmark', [], _("bookmark to pull"), _('BOOKMARK')),
>>      ('b', 'branch', [], _('a specific branch you would like to pull'),
>>       _('BRANCH')),
>> +    ('S', 'subrev', [], _('pull subrepos from these revisions')),
>>      ] + remoteopts,
>>      _('[-u] [-f] [-r REV]... [-e CMD] [--remotecmd CMD] [SOURCE]'))
>>  def pull(ui, repo, source="default", **opts):
>> @@ -4608,7 +4609,8 @@
>>                      "so a rev cannot be specified.")
>>              raise util.Abort(err)
>>
>> -    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
>> +    modheads = repo.pull(other, heads=revs, force=opts.get('force'),
>> +        subrev=opts.get('subrev'))
>>      bookmarks.updatefromremote(ui, repo, remotebookmarks, source)
>>      if checkout:
>>          checkout = str(repo.changelog.rev(other.lookup(checkout)))
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -11,6 +11,7 @@
>>  import lock as lockmod
>>  import transaction, store, encoding
>>  import scmutil, util, extensions, hook, error, revset
>> +import config, urllib2
>>  import match as matchmod
>>  import merge as mergemod
>>  import tags as tagsmod
>> @@ -1666,7 +1667,7 @@
>>
>>          return r
>>
>> -    def pull(self, remote, heads=None, force=False):
>> +    def pull(self, remote, heads=None, force=False, subrev=None):
>>          if remote.local():
>>              missing = set(remote.requirements) - self.supported
>>              if missing:
>> @@ -1757,8 +1758,49 @@
>>                  tr.release()
>>              lock.release()
>>
>> +        # update current and new subrepos
>> +        if subrev:
>> +            # pull (or clone) the subrepos that are on any of the revisions
>> +            # on the subrev list
>> +            self.getsubrepos(subrev=subrev, source=remote.url())
>> +
>>          return result
>>
>> +    def getsubrepos(self, subrev=None, source=None):
>
> 1) DO NOT ADD NEW FUNCTION IN LOCALREPOSITORY CLASS!
>
> If you function is not to be called all over the place by a lot of people it belong somewhere else. Please make it a `getsubrepos(repo, subrev=None, source=None)` function in the `mercurial.subrepo` module.

OK, good point. I'll look into it when I prepare a new version of this series.

> I also think the name in unfortunate: by "get" I would expect something just
> returning the list of subrepo object. while you don't return anything but
> change their content.
>
> I would rename that `pullsubrepos` (or something related)

The reason I named it this way is that in the subrepo.py module "get"
is used as meaning "get a subrepo" in the sense of getting a given
revision on a subrepo by either pulling or cloning it. Maybe if I move
getsubrepos to the subrepo.py module the name would not feel so
confusing?


>> +        """Get (clone or pull) the subrepos that are referenced
>> +        on any of the revisions on the given revision list
>> +        """
>> +        if not subrev:
>> +            return
>> +        try:
>> +            revs = self.revs('%lr', subrev)
>> +        except error.RepoLookupError, ex:
>> +            # No matches to the subrev revset
>> +            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))
>
> I think this should go to standard error output

I did not want to stop the pull if revs does not match any revisions.
Do you think it should? If not perhaps this should just be a warning?

>> +            return
>> +        revs.sort()
>> +        # use a sortdict to make sure that we get the subrepos
>> +        # in the order they are found
>> +        substopull = config.sortdict()
>> +        for rev in revs:
>> +            ctx = self[rev]
>> +            # read the substate items in alphabetical order to ensure
>> +            # that we always process the subrepos in the same order
>> +            for sname in sorted(ctx.substate):
>> +                sinfo = ctx.substate[sname]
>> +                substopull[(sname, sinfo[0], sinfo[2])] = ctx
>> +        try:
>> +            self._subtoppath = source
>> +            for (sname, ssource, stype), ctx in substopull.items():
>> +                try:
>> +                    ctx.sub(sname).pull(ssource)
>> +                except (error.RepoError, urllib2.HTTPError,
>> +                        subrepo.SubrepoAbort), ex:
>> +                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
>> +                                 % (sname, ssource, str(ex)))
>
> consider honoring the --traceback option here.

I'm not sure what you mean. Isn't --traceback a global option that is
automatically taken into account?


>> +    def getsubrepos(self, subrev=None, source=None):
>> +        """Get (clone or pull) the subrepos that are referenced
>> +        on any of the revisions on the given revision list
>> +        """
>> +        if not subrev:
>> +            return
>> +        try:
>> +            revs = self.revs('%lr', subrev)
>> +        except error.RepoLookupError, ex:
>> +            # No matches to the subrev revset
>> +            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))
>> +            return
>> +        revs.sort()
>> +        # use a sortdict to make sure that we get the subrepos
>> +        # in the order they are found
>> +        substopull = config.sortdict()
>> +        for rev in revs:
>> +            ctx = self[rev]
>> +            # read the substate items in alphabetical order to ensure
>> +            # that we always process the subrepos in the same order
>> +            for sname in sorted(ctx.substate):
>> +                sinfo = ctx.substate[sname]
>> +                substopull[(sname, sinfo[0], sinfo[2])] = ctx
>> +        try:
>> +            self._subtoppath = source
>> +            for (sname, ssource, stype), ctx in substopull.items():
>> +                try:
>> +                    ctx.sub(sname).pull(ssource)
>> +                except (error.RepoError, urllib2.HTTPError,
>> +                        subrepo.SubrepoAbort), ex:
>> +                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
>> +                                 % (sname, ssource, str(ex)))
>> +        finally:
>> +            del self._subtoppath
>
> This _subtoppath stuff is weird. can't we do something cleaner ?

It is a bit weird but it is not something that this patch introduces.
The patch just works around the existing weirdness (perhaps making it
a little bit weirder in the process :-P ).

If I understood how it works correctly we must temporarily change the
_subtoppath (which is used to get an absolute path to a subrepo) to
the source of the pull, and clear it afterwards so that the next
operation does not use the wrong one to calculate the absoluate
subrepo paths.

In other words, I don't think we can make it cleaner, or if we do I
don't think it should be done in this series.

Thanks a lot for the detailed review!

Angel

Patch

# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1372976561 -7200
#      Fri Jul 05 00:22:41 2013 +0200
# Node ID ecd1ca99f1b3e4d0f6ea8abd606990ef24667779
# Parent  1c46b18b0e1c47fa4cecf21b78c083a54ae9903f
pull: add --subrev option

The purpose of this new option is to make it possible to pull subrepos at the
same time that you pull the parent repository, instead of having to update to
the newly pulled revision to pull the corresponding subrepos.

This has multiple advantages. First of all it lets you pull the subrepos from
a non default path (when subrepos are pulled during udpate they are always
pulled from the default path). Also, it lets you pull subrepos referenced in all
new revisions in one go, rather than potentially having to update to every new
revision.

The main use case for this new flag is to be able to ensure that your
repositories are alwasys "self contained". That is, that you will be able to
update to any incoming revision without requiring any network access. This can
be done by adding "--subrev .:tip" to your pull command (it would be nicer if
there was an "incoming()" revset expression, akin to the existing
"outgoing(path)" expression).

When the --subrev flag is enabled, pull will also pull (or clone) all subrepos
that are present on the revisions matching the --subrev revset(s). Pulls are
recursive (i.e. subrepos withing subrepos will be also pulled as needed), but
clones are not recursive yet (it requires adding a similar --subrev flag to
clone, which will be done on another patch).

If the selected --subrev revisions refer to subrepos that are not on the working
directory yet they will be cloned. If any of the subrepositories changes its
pull source (as defined on the .hgsub file) it will be pulled from the current
and the new source. If a pulled subrepo also refers to one of its own subrepos
on any of the incoming subrepo revisions those "sub-subrepos" will be pulled
too.

This first patch only supports mercurial subrepos (a NotImplementedError
exception will be raised for all other subrepo types). Future patches will add
support for other subrepo types.

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -4563,6 +4563,7 @@ 
     ('B', 'bookmark', [], _("bookmark to pull"), _('BOOKMARK')),
     ('b', 'branch', [], _('a specific branch you would like to pull'),
      _('BRANCH')),
+    ('S', 'subrev', [], _('pull subrepos from these revisions')),
     ] + remoteopts,
     _('[-u] [-f] [-r REV]... [-e CMD] [--remotecmd CMD] [SOURCE]'))
 def pull(ui, repo, source="default", **opts):
@@ -4608,7 +4609,8 @@ 
                     "so a rev cannot be specified.")
             raise util.Abort(err)
 
-    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
+    modheads = repo.pull(other, heads=revs, force=opts.get('force'),
+        subrev=opts.get('subrev'))
     bookmarks.updatefromremote(ui, repo, remotebookmarks, source)
     if checkout:
         checkout = str(repo.changelog.rev(other.lookup(checkout)))
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -11,6 +11,7 @@ 
 import lock as lockmod
 import transaction, store, encoding
 import scmutil, util, extensions, hook, error, revset
+import config, urllib2
 import match as matchmod
 import merge as mergemod
 import tags as tagsmod
@@ -1666,7 +1667,7 @@ 
 
         return r
 
-    def pull(self, remote, heads=None, force=False):
+    def pull(self, remote, heads=None, force=False, subrev=None):
         if remote.local():
             missing = set(remote.requirements) - self.supported
             if missing:
@@ -1757,8 +1758,49 @@ 
                 tr.release()
             lock.release()
 
+        # update current and new subrepos
+        if subrev:
+            # pull (or clone) the subrepos that are on any of the revisions
+            # on the subrev list
+            self.getsubrepos(subrev=subrev, source=remote.url())
+
         return result
 
+    def getsubrepos(self, subrev=None, source=None):
+        """Get (clone or pull) the subrepos that are referenced
+        on any of the revisions on the given revision list
+        """
+        if not subrev:
+            return
+        try:
+            revs = self.revs('%lr', subrev)
+        except error.RepoLookupError, ex:
+            # No matches to the subrev revset
+            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))
+            return
+        revs.sort()
+        # use a sortdict to make sure that we get the subrepos
+        # in the order they are found
+        substopull = config.sortdict()
+        for rev in revs:
+            ctx = self[rev]
+            # read the substate items in alphabetical order to ensure
+            # that we always process the subrepos in the same order
+            for sname in sorted(ctx.substate):
+                sinfo = ctx.substate[sname]
+                substopull[(sname, sinfo[0], sinfo[2])] = ctx
+        try:
+            self._subtoppath = source
+            for (sname, ssource, stype), ctx in substopull.items():
+                try:
+                    ctx.sub(sname).pull(ssource)
+                except (error.RepoError, urllib2.HTTPError,
+                        subrepo.SubrepoAbort), ex:
+                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
+                                 % (sname, ssource, str(ex)))
+        finally:
+            del self._subtoppath
+
     def checkpush(self, force, revs):
         """Extensions can override this function if additional checks have
         to be performed before pushing, or call it if they override push
diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -380,6 +380,11 @@ 
         """
         raise NotImplementedError
 
+    def pull(self, source):
+        """pull from the given parent repository source
+        """
+        raise NotImplementedError
+
     def get(self, state, overwrite=False):
         """run whatever commands are needed to put the subrepo into
         this state
@@ -646,34 +651,49 @@ 
         self._repo.ui.note(_('removing subrepo %s\n') % subrelpath(self))
         hg.clean(self._repo, node.nullid, False)
 
-    def _get(self, state):
+    def pull(self, source, heads=None, subrev=None):
+        '''pull (or clone) the subrepository and its subrepos'''
+        if subrev is None:
+            # Pull subrepos referenced by all incoming changesets
+            revs = list(self._repo.changelog.revs())
+            if revs:
+                tip = revs[-1]
+                subrev = ['%d:' % (tip + 1)]
+            else:
+                subrev = ['0:']
+        return self._pullorclone(source, heads=heads, subrev=subrev)
+
+    def _get(self, state, subrev=None):
         source, revision, kind = state
         if revision not in self._repo:
-            self._repo._subsource = source
-            srcurl = _abssource(self._repo)
-            other = hg.peer(self._repo, {}, srcurl)
-            if len(self._repo) == 0:
-                self._repo.ui.status(_('cloning subrepo %s from %s\n')
-                                     % (subrelpath(self), srcurl))
-                parentrepo = self._repo._subparent
-                shutil.rmtree(self._repo.path)
-                other, cloned = hg.clone(self._repo._subparent.baseui, {},
-                                         other, self._repo.root,
-                                         update=False)
-                self._repo = cloned.local()
-                self._initrepo(parentrepo, source, create=True)
+            self._pullorclone(source, subrev=subrev)
+
+    def _pullorclone(self, source, heads=None, subrev=None):
+        self._repo._subsource = source
+        srcurl = _abssource(self._repo)
+        other = hg.peer(self._repo, {}, srcurl)
+        if len(self._repo) == 0:
+            self._repo.ui.status(_('cloning subrepo %s from %s\n')
+                                 % (subrelpath(self), srcurl))
+            parentrepo = self._repo._subparent
+            shutil.rmtree(self._repo.path)
+            other, cloned = hg.clone(self._repo._subparent.baseui, {},
+                                     other, self._repo.root,
+                                     update=False, heads=heads)
+            self._repo = cloned.local()
+            self._initrepo(parentrepo, source, create=True)
+            self._cachestorehash(srcurl)
+        else:
+            self._repo.ui.status(_('pulling subrepo %s from %s\n')
+                                 % (subrelpath(self), srcurl))
+            cleansub = self.storeclean(srcurl)
+            remotebookmarks = other.listkeys('bookmarks')
+            self._repo.pull(other, heads=heads, subrev=subrev)
+            bookmarks.updatefromremote(self._repo.ui, self._repo,
+                                       remotebookmarks, srcurl)
+            if cleansub:
+                # keep the repo clean after pull
                 self._cachestorehash(srcurl)
-            else:
-                self._repo.ui.status(_('pulling subrepo %s from %s\n')
-                                     % (subrelpath(self), srcurl))
-                cleansub = self.storeclean(srcurl)
-                remotebookmarks = other.listkeys('bookmarks')
-                self._repo.pull(other)
-                bookmarks.updatefromremote(self._repo.ui, self._repo,
-                                           remotebookmarks, srcurl)
-                if cleansub:
-                    # keep the repo clean after pull
-                    self._cachestorehash(srcurl)
 
     @annotatesubrepoerror
     def get(self, state, overwrite=False):
diff --git a/tests/test-completion.t b/tests/test-completion.t
--- a/tests/test-completion.t
+++ b/tests/test-completion.t
@@ -206,7 +206,7 @@ 
   init: ssh, remotecmd, insecure
   log: follow, follow-first, date, copies, keyword, rev, removed, only-merges, user, only-branch, branch, prune, patch, git, limit, no-merges, stat, graph, style, template, include, exclude
   merge: force, rev, preview, tool
-  pull: update, force, rev, bookmark, branch, ssh, remotecmd, insecure
+  pull: update, force, rev, bookmark, branch, subrev, ssh, remotecmd, insecure
   push: force, rev, bookmark, branch, new-branch, ssh, remotecmd, insecure
   remove: after, force, include, exclude
   serve: accesslog, daemon, daemon-pipefds, errorlog, port, address, prefix, name, web-conf, webdir-conf, pid-file, stdio, cmdserver, templates, style, ipv6, certificate