Patchwork [V2] convert: support incremental conversion with hg subrepos

Submitter Matt Harbison
Date June 12, 2015, 4:12 p.m.
Message ID <386fabac4a15250e3700.1434125557@MATT7H-PC.attotech.com>
Permalink /patch/9609/
State Accepted

Comments

Matt Harbison - June 12, 2015, 4:12 p.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1432920334 14400
#      Fri May 29 13:25:34 2015 -0400
# Node ID 386fabac4a15250e3700cfd6505a030be662c2d2
# Parent  c0995cd8ff6fdc44ff20835e005771f08452a353
convert: support incremental conversion with hg subrepos

This was implied in issue3486, which specifically asked for subrepo support in
lfconvert.  Now that lfconvert uses the convert extension internally when going
to normal files, the issue is half fixed.  But now even non-largefile repos
benefit when other transformations are needed.

Supporting a full subrepo tree conversion from a single command doesn't seem
reasonable, given the number of options that can be provided, and the
transformations that would need to occur when entering a subrepo (consider
'filemap' paths).  Instead, this allows the user to incrementally convert each
hg subrepo from bottom up like so:

  # so convert knows the dest type when it sees a non empty dest dir
  $ hg init converted

  $ hg convert orig/sub1 converted/sub1
  $ hg convert orig/sub2 converted/sub2
  $ hg convert orig converted

This allows different options to be applied to different subrepos more readily.
It assumes the shamap is in the default location in each converted subrepo for
simplicity.  It also allows for a subrepo to be cloned into place, in case _it_
doesn't need a conversion.  I was able to convert away from using
largefiles/bfiles in several subrepos with this mechanism.
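
For illustration only (not part of the patch): a minimal Python sketch of the
idea described above, where each subrepo revision recorded in .hgsubstate is
looked up in that subrepo's shamap and replaced with the converted id.  The
helper names parse_shamap and rewrite_substate are hypothetical, plain dicts
stand in for the convert extension's mapfile objects, and the shamap is assumed
to hold one "<source id> <converted id>" pair per line.  Unlike the patch, the
sketch does not warn when a cloned-in subrepo has an empty map.

```python
# Illustrative sketch only; this is not the code from the patch.
NULLID_HEX = "0" * 40  # hex form of Mercurial's null revision id


def parse_shamap(text):
    """Parse shamap text (assumed "<source id> <converted id>" per line)."""
    revmap = {}
    for line in text.splitlines():
        parts = line.rsplit(" ", 1)
        if len(parts) == 2:
            revmap[parts[0]] = parts[1]
    return revmap


def rewrite_substate(data, subrevmaps, warn=print):
    """Return .hgsubstate text with subrepo ids mapped to converted ids.

    data is the original .hgsubstate text ("<id> <path>" per line);
    subrevmaps maps each subrepo path to its parsed shamap (a dict).
    """
    out = []
    for line in data.splitlines():
        parts = line.split(" ", 1)
        if len(parts) != 2:
            continue  # skip malformed lines
        revid, subpath = parts
        if revid != NULLID_HEX:
            # A subrepo that was cloned into place has no map entries,
            # so its recorded id is left unchanged.
            revmap = subrevmaps.get(subpath, {})
            newid = revmap.get(revid)
            if newid:
                revid = newid
            elif revmap:
                # A non-empty map lacking this id suggests the subrepo
                # conversion is incomplete.
                warn("%s is missing from %s/.hg/shamap" % (revid, subpath))
        out.append("%s %s\n" % (revid, subpath))
    return "".join(out)


# Made-up ids: sub1 was converted, sub2 was cloned into place.
old_sub1 = "a" * 40
new_sub1 = "b" * 40
cloned_sub2 = "c" * 40

shamaps = {"sub1": parse_shamap("%s %s\n" % (old_sub1, new_sub1))}
substate = "%s sub1\n%s sub2\n" % (old_sub1, cloned_sub2)
result = rewrite_substate(substate, shamaps)
```

Here result records the converted id for sub1 and keeps the original id for
sub2, which is why the subrepos have to be converted (or cloned) before the
parent repository.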
Augie Fackler - June 12, 2015, 10:06 p.m.
On Fri, Jun 12, 2015 at 12:12:37PM -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1432920334 14400
> #      Fri May 29 13:25:34 2015 -0400
> # Node ID 386fabac4a15250e3700cfd6505a030be662c2d2
> # Parent  c0995cd8ff6fdc44ff20835e005771f08452a353
> convert: support incremental conversion with hg subrepos

Queued this, many thanks.


Patch

diff --git a/hgext/convert/__init__.py b/hgext/convert/__init__.py
--- a/hgext/convert/__init__.py
+++ b/hgext/convert/__init__.py
@@ -328,6 +328,23 @@  def convert(ui, src, dest=None, revmapfi
     Mercurial Destination
     #####################
 
+    The Mercurial destination will recognize Mercurial subrepositories in the
+    destination directory, and update the .hgsubstate file automatically if the
+    destination subrepositories contain the <dest>/<sub>/.hg/shamap file.
+    Converting a repository with subrepositories requires converting a single
+    repository at a time, from the bottom up.
+
+    .. container:: verbose
+
+       An example showing how to convert a repository with subrepositories::
+
+         # so convert knows the type when it sees a non empty destination
+         $ hg init converted
+
+         $ hg convert orig/sub1 converted/sub1
+         $ hg convert orig/sub2 converted/sub2
+         $ hg convert orig converted
+
     The following options are supported:
 
     :convert.hg.clonebranches: dispatch source branches in separate
diff --git a/hgext/convert/hg.py b/hgext/convert/hg.py
--- a/hgext/convert/hg.py
+++ b/hgext/convert/hg.py
@@ -23,7 +23,7 @@  from mercurial.i18n import _
 from mercurial.node import bin, hex, nullid
 from mercurial import hg, util, context, bookmarks, error, scmutil, exchange
 
-from common import NoRepo, commit, converter_source, converter_sink
+from common import NoRepo, commit, converter_source, converter_sink, mapfile
 
 import re
 sha1re = re.compile(r'\b[0-9a-f]{12,40}\b')
@@ -59,6 +59,7 @@  class mercurial_sink(converter_sink):
         self.lock = None
         self.wlock = None
         self.filemapmode = False
+        self.subrevmaps = {}
 
     def before(self):
         self.ui.debug('run hg sink pre-conversion action\n')
@@ -135,6 +136,45 @@  class mercurial_sink(converter_sink):
             fp.write('%s %s\n' % (revid, s[1]))
         return fp.getvalue()
 
+    def _rewritesubstate(self, source, data):
+        fp = cStringIO.StringIO()
+        for line in data.splitlines():
+            s = line.split(' ', 1)
+            if len(s) != 2:
+                continue
+
+            revid = s[0]
+            subpath = s[1]
+            if revid != hex(nullid):
+                revmap = self.subrevmaps.get(subpath)
+                if revmap is None:
+                    revmap = mapfile(self.ui,
+                                     self.repo.wjoin(subpath, '.hg/shamap'))
+                    self.subrevmaps[subpath] = revmap
+
+                    # It is reasonable that one or more of the subrepos don't
+                    # need to be converted, in which case they can be cloned
+                    # into place instead of converted.  Therefore, only warn
+                    # once.
+                    msg = _('no ".hgsubstate" updates will be made for "%s"\n')
+                    if len(revmap) == 0:
+                        sub = self.repo.wvfs.reljoin(subpath, '.hg')
+
+                        if self.repo.wvfs.exists(sub):
+                            self.ui.warn(msg % subpath)
+
+                newid = revmap.get(revid)
+                if not newid:
+                    if len(revmap) > 0:
+                        self.ui.warn(_("%s is missing from %s/.hg/shamap\n") %
+                                     (revid, subpath))
+                else:
+                    revid = newid
+
+            fp.write('%s %s\n' % (revid, subpath))
+
+        return fp.getvalue()
+
     def putcommit(self, files, copies, parents, commit, source, revmap, full,
                   cleanp2):
         files = dict(files)
@@ -152,6 +192,8 @@  class mercurial_sink(converter_sink):
                 return None
             if f == '.hgtags':
                 data = self._rewritetags(source, revmap, data)
+            if f == '.hgsubstate':
+                data = self._rewritesubstate(source, data)
             return context.memfilectx(self.repo, f, data, 'l' in mode,
                                       'x' in mode, copies.get(f))
 
diff --git a/tests/test-convert.t b/tests/test-convert.t
--- a/tests/test-convert.t
+++ b/tests/test-convert.t
@@ -279,6 +279,12 @@ 
       Mercurial Destination
       #####################
   
+      The Mercurial destination will recognize Mercurial subrepositories in the
+      destination directory, and update the .hgsubstate file automatically if
+      the destination subrepositories contain the <dest>/<sub>/.hg/shamap file.
+      Converting a repository with subrepositories requires converting a single
+      repository at a time, from the bottom up.
+  
       The following options are supported:
   
       convert.hg.clonebranches
diff --git a/tests/test-fileset.t b/tests/test-fileset.t
--- a/tests/test-fileset.t
+++ b/tests/test-fileset.t
@@ -180,16 +180,54 @@  Test subrepo predicate
   $ hg -R sub add sub/suba
   $ hg -R sub ci -m sub
   $ echo 'sub = sub' > .hgsub
+  $ hg init sub2
+  $ echo b > sub2/b
+  $ hg -R sub2 ci -Am sub2
+  adding b
+  $ echo 'sub2 = sub2' >> .hgsub
   $ fileset 'subrepo()'
   $ hg add .hgsub
   $ fileset 'subrepo()'
   sub
+  sub2
   $ fileset 'subrepo("sub")'
   sub
   $ fileset 'subrepo("glob:*")'
   sub
+  sub2
   $ hg ci -m subrepo
 
+Test that .hgsubstate is updated as appropriate during a conversion.  The
+saverev property is enough to alter the hashes of the subrepo.
+
+  $ hg init ../converted
+  $ hg --config extensions.convert= convert --config convert.hg.saverev=True  \
+  >      sub ../converted/sub
+  initializing destination ../converted/sub repository
+  scanning source...
+  sorting...
+  converting...
+  0 sub
+  $ hg clone -U sub2 ../converted/sub2
+  $ hg --config extensions.convert= convert --config convert.hg.saverev=True  \
+  >      . ../converted
+  scanning source...
+  sorting...
+  converting...
+  4 addfiles
+  3 manychanges
+  2 diverging
+  1 merge
+  0 subrepo
+  no ".hgsubstate" updates will be made for "sub2"
+  $ hg up -q -R ../converted -r tip
+  $ hg --cwd ../converted cat sub/suba sub2/b -r tip
+  a
+  b
+  $ oldnode=`hg log -r tip -T "{node}\n"`
+  $ newnode=`hg log -R ../converted -r tip -T "{node}\n"`
+  $ [[ "$oldnode" != "$newnode" ]] || echo "nothing changed"
+
 Test with a revision
 
   $ hg log -G --template '{rev} {desc}\n'
@@ -241,6 +279,7 @@  Test with a revision
 
   $ fileset -r4 'subrepo("re:su.*")'
   sub
+  sub2
   $ fileset -r4 'subrepo("sub")'
   sub
   $ fileset -r4 'b2 or c1'