Patchwork [3,of,3] convert: fix convert dropping p2 contents during filemap merge

login
register
mail settings
Submitter Durham Goode
Date Aug. 15, 2015, 9:11 p.m.
Message ID <70d9adcbe24113713282.1439673078@dev2000.prn2.facebook.com>
Download mbox | patch
Permalink /patch/10214/
State Accepted
Delegated to: Augie Fackler
Headers show

Comments

Durham Goode - Aug. 15, 2015, 9:11 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1439590967 25200
#      Fri Aug 14 15:22:47 2015 -0700
# Branch stable
# Node ID 70d9adcbe24113713282836e3d678cf89784015d
# Parent  7f5f25699b524f730664993416601a1df3ba9b9f
convert: fix convert dropping p2 contents during filemap merge

When converting a merge commit using a filemap convert (i.e. when moving
contents from the root of the repo into subdir1/), convert would silently drop
the entire contents of the target repo's p2. This was because when it built the
target commit, it did so by taking the target p1 and adding only the files that
changed in the source repo's merge commit.

This breaks in the case where the target repo has files that are unrelated to
the source repo (like in the case where you use convert to import a repo as a
subdirectory of another).

The fix is to use Mercurial's merge logic to detect which files in p2 we should
carry over to the merge. It follows three rules:

1) if the file belongs to the source, don't try to merge it. Rely on the list of
files provided to putcommit to be correct.

2) if the file requires merging or user input (change vs deleted), throw an
exception. We don't have enough info to do this.

3) if p2 has the newest, non-merge-requiring version of the file, take it

I've also added a test to cover this issue.
Augie Fackler - Aug. 17, 2015, 6:23 p.m.
On Sat, Aug 15, 2015 at 02:11:18PM -0700, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1439590967 25200
> #      Fri Aug 14 15:22:47 2015 -0700
> # Branch stable
> # Node ID 70d9adcbe24113713282836e3d678cf89784015d
> # Parent  7f5f25699b524f730664993416601a1df3ba9b9f
> convert: fix convert dropping p2 contents during filemap merge

Looks like a strict improvement. Might be nice to have a "known-broken" test for the case where there are include/exclude lines, but I don't feel that strongly about it since we're already doing strictly better than without this series. Thanks!

(Queued!)

>
> When converting a merge commit using a filemap convert (i.e. when moving
> contents from the root of the repo into subdir1/), convert would silently drop
> the entire contents of the target repo's p2. This was because when it built the
> target commit, it did so by taking the target p1 and adding only the files that
> changed in the source repo's merge commit.
>
> This breaks in the case where the target repo has files that are unrelated to
> the source repo (like in the case where you use convert to import a repo as a
> subdirectory of another).
>
> The fix is to use Mercurial's merge logic to detect which files in p2 we should
> carry over to the merge. It follows three rules:
>
> 1) if the file belongs to the source, don't try to merge it. Rely on the list of
> files provided to putcommit to be correct.
>
> 2) if the file requires merging or user input (change vs deleted), throw an
> exception. We don't have enough info to do this.
>
> 3) if p2 has the newest, non-merge-requiring version of the file, take it
>
> I've also added a test to cover this issue.
>
> diff --git a/hgext/convert/hg.py b/hgext/convert/hg.py
> --- a/hgext/convert/hg.py
> +++ b/hgext/convert/hg.py
> @@ -23,6 +23,7 @@ from mercurial.i18n import _
>  from mercurial.node import bin, hex, nullid
>  from mercurial import hg, util, context, bookmarks, error, scmutil, exchange
>  from mercurial import phases
> +from mercurial import merge as mergemod
>
>  from common import NoRepo, commit, converter_source, converter_sink, mapfile
>
> @@ -176,12 +177,51 @@ class mercurial_sink(converter_sink):
>
>          return fp.getvalue()
>
> +    def _calculatemergedfiles(self, source, p1ctx, p2ctx):
> +        """Calculates the files from p2 that we need to pull in when merging p1
> +        and p2, given that the merge is coming from the given source.
> +
> +        This prevents us from losing files that only exist in the target p2 and
> +        that don't come from the source repo (like if you're merging multiple
> +        repositories together).
> +        """
> +        anc = [p1ctx.ancestor(p2ctx)]
> +        # Calculate what files are coming from p2
> +        actions, diverge, rename = mergemod.calculateupdates(
> +            self.repo, p1ctx, p2ctx, anc,
> +            True,  # branchmerge
> +            True,  # force
> +            False, # partial
> +            False, # acceptremote
> +            False, # followcopies
> +        )
> +
> +        for file, (action, info, msg) in actions.iteritems():
> +            if source.targetfilebelongstosource(file):
> +                # If the file belongs to the source repo, ignore the p2
> +                # since it will be covered by the existing fileset.
> +                continue
> +
> +            # If the file requires actual merging, abort. We don't have enough
> +            # context to resolve merges correctly.
> +            if action in ['m', 'dm', 'cd', 'dc']:
> +                raise util.Abort(_("unable to convert merge commit "
> +                    "since target parents do not merge cleanly (file "
> +                    "%s, parents %s and %s)") % (file, p1ctx,
> +                                                 p2ctx))
> +            elif action == 'k':
> +                # 'keep' means nothing changed from p1
> +                continue
> +            else:
> +                # Any other change means we want to take the p2 version
> +                yield file
> +
>      def putcommit(self, files, copies, parents, commit, source, revmap, full,
>                    cleanp2):
>          files = dict(files)
>
>          def getfilectx(repo, memctx, f):
> -            if p2ctx and f in cleanp2 and f not in copies:
> +            if p2ctx and f in p2files and f not in copies:
>                  self.ui.debug('reusing %s from p2\n' % f)
>                  return p2ctx[f]
>              try:
> @@ -255,6 +295,7 @@ class mercurial_sink(converter_sink):
>          while parents:
>              p1 = p2
>              p2 = parents.pop(0)
> +            p1ctx = self.repo[p1]
>              p2ctx = None
>              if p2 != nullid:
>                  p2ctx = self.repo[p2]
> @@ -262,6 +303,13 @@ class mercurial_sink(converter_sink):
>              if full:
>                  fileset.update(self.repo[p1])
>                  fileset.update(self.repo[p2])
> +
> +            if p2ctx:
> +                p2files = set(cleanp2)
> +                for file in self._calculatemergedfiles(source, p1ctx, p2ctx):
> +                    p2files.add(file)
> +                    fileset.add(file)
> +
>              ctx = context.memctx(self.repo, (p1, p2), text, fileset,
>                                   getfilectx, commit.author, commit.date, extra)
>
> diff --git a/tests/test-convert-filemap.t b/tests/test-convert-filemap.t
> --- a/tests/test-convert-filemap.t
> +++ b/tests/test-convert-filemap.t
> @@ -671,3 +671,61 @@ test named branch pruning
>    |/
>    o  0:c334dc3be0da@default "add" files: a
>
> +  $ cd ..
> +
> +test converting merges into a repo that contains other files
> +
> +  $ hg init merge-test1
> +  $ cd merge-test1
> +  $ touch a && hg commit -Aqm a
> +  $ hg up -q null
> +  $ touch b && hg commit -Aqm b
> +  $ hg merge -q 0 && hg commit -qm merge
> +  $ cd ..
> +  $ hg init merge-test2
> +  $ cd merge-test2
> +  $ mkdir converted
> +  $ touch converted/a && hg commit -Aqm 'a'
> +  $ touch x && hg commit -Aqm 'x'
> +  $ cd ..
> +  $ hg log -G -T '{node}' -R merge-test1
> +  @    ea7c1a7ae9588677a715ce4f204cd89c28d5471f
> +  |\
> +  | o  d7486e00c6f1b633dcadc0582f78006d805c7a0f
> +  |
> +  o  3903775176ed42b1458a6281db4a0ccf4d9f287a
> +
> +  $ hg log -G -T '{node}' -R merge-test2
> +  @  34f1aa7da42559bae87920880b522d47b3ddbc0d
> +  |
> +  o  e01a12b07b4fdfd61ff90a2a1b4560a7a776f323
> +
> +- Build a shamap where the target converted/a is in on top of an unrelated
> +- change to 'x'. This simulates using convert to merge several repositories
> +- together.
> +  $ cat >> merge-test2/.hg/shamap <<EOF
> +  > 3903775176ed42b1458a6281db4a0ccf4d9f287a 34f1aa7da42559bae87920880b522d47b3ddbc0d
> +  > EOF
> +  $ cat >> merge-test-filemap <<EOF
> +  > rename . converted/
> +  > EOF
> +  $ hg convert --filemap merge-test-filemap merge-test1 merge-test2 --traceback
> +  scanning source...
> +  sorting...
> +  converting...
> +  1 b
> +  0 merge
> +  $ hg -R merge-test2 manifest -r tip
> +  converted/a
> +  converted/b
> +  x
> +  $ hg -R merge-test2 log -G -T '{node}\n{files % "{file}\n"}'
> +  o    4b5e2f0218d3442a0c14892b18685bf9c8059c4a
> +  |\
> +  | o  214325dd2e4cff981dcf00cb120cd39e1ea36dcc
> +  |    converted/b
> +  @  34f1aa7da42559bae87920880b522d47b3ddbc0d
> +  |  x
> +  o  e01a12b07b4fdfd61ff90a2a1b4560a7a776f323
> +     converted/a
> +
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/hgext/convert/hg.py b/hgext/convert/hg.py
--- a/hgext/convert/hg.py
+++ b/hgext/convert/hg.py
@@ -23,6 +23,7 @@  from mercurial.i18n import _
 from mercurial.node import bin, hex, nullid
 from mercurial import hg, util, context, bookmarks, error, scmutil, exchange
 from mercurial import phases
+from mercurial import merge as mergemod
 
 from common import NoRepo, commit, converter_source, converter_sink, mapfile
 
@@ -176,12 +177,51 @@  class mercurial_sink(converter_sink):
 
         return fp.getvalue()
 
+    def _calculatemergedfiles(self, source, p1ctx, p2ctx):
+        """Calculates the files from p2 that we need to pull in when merging p1
+        and p2, given that the merge is coming from the given source.
+
+        This prevents us from losing files that only exist in the target p2 and
+        that don't come from the source repo (like if you're merging multiple
+        repositories together).
+        """
+        anc = [p1ctx.ancestor(p2ctx)]
+        # Calculate what files are coming from p2
+        actions, diverge, rename = mergemod.calculateupdates(
+            self.repo, p1ctx, p2ctx, anc,
+            True,  # branchmerge
+            True,  # force
+            False, # partial
+            False, # acceptremote
+            False, # followcopies
+        )
+
+        for file, (action, info, msg) in actions.iteritems():
+            if source.targetfilebelongstosource(file):
+                # If the file belongs to the source repo, ignore the p2
+                # since it will be covered by the existing fileset.
+                continue
+
+            # If the file requires actual merging, abort. We don't have enough
+            # context to resolve merges correctly.
+            if action in ['m', 'dm', 'cd', 'dc']:
+                raise util.Abort(_("unable to convert merge commit "
+                    "since target parents do not merge cleanly (file "
+                    "%s, parents %s and %s)") % (file, p1ctx,
+                                                 p2ctx))
+            elif action == 'k':
+                # 'keep' means nothing changed from p1
+                continue
+            else:
+                # Any other change means we want to take the p2 version
+                yield file
+
     def putcommit(self, files, copies, parents, commit, source, revmap, full,
                   cleanp2):
         files = dict(files)
 
         def getfilectx(repo, memctx, f):
-            if p2ctx and f in cleanp2 and f not in copies:
+            if p2ctx and f in p2files and f not in copies:
                 self.ui.debug('reusing %s from p2\n' % f)
                 return p2ctx[f]
             try:
@@ -255,6 +295,7 @@  class mercurial_sink(converter_sink):
         while parents:
             p1 = p2
             p2 = parents.pop(0)
+            p1ctx = self.repo[p1]
             p2ctx = None
             if p2 != nullid:
                 p2ctx = self.repo[p2]
@@ -262,6 +303,13 @@  class mercurial_sink(converter_sink):
             if full:
                 fileset.update(self.repo[p1])
                 fileset.update(self.repo[p2])
+
+            if p2ctx:
+                p2files = set(cleanp2)
+                for file in self._calculatemergedfiles(source, p1ctx, p2ctx):
+                    p2files.add(file)
+                    fileset.add(file)
+
             ctx = context.memctx(self.repo, (p1, p2), text, fileset,
                                  getfilectx, commit.author, commit.date, extra)
 
diff --git a/tests/test-convert-filemap.t b/tests/test-convert-filemap.t
--- a/tests/test-convert-filemap.t
+++ b/tests/test-convert-filemap.t
@@ -671,3 +671,61 @@  test named branch pruning
   |/
   o  0:c334dc3be0da@default "add" files: a
   
+  $ cd ..
+
+test converting merges into a repo that contains other files
+
+  $ hg init merge-test1
+  $ cd merge-test1
+  $ touch a && hg commit -Aqm a
+  $ hg up -q null
+  $ touch b && hg commit -Aqm b
+  $ hg merge -q 0 && hg commit -qm merge
+  $ cd ..
+  $ hg init merge-test2
+  $ cd merge-test2
+  $ mkdir converted
+  $ touch converted/a && hg commit -Aqm 'a'
+  $ touch x && hg commit -Aqm 'x'
+  $ cd ..
+  $ hg log -G -T '{node}' -R merge-test1
+  @    ea7c1a7ae9588677a715ce4f204cd89c28d5471f
+  |\
+  | o  d7486e00c6f1b633dcadc0582f78006d805c7a0f
+  |
+  o  3903775176ed42b1458a6281db4a0ccf4d9f287a
+  
+  $ hg log -G -T '{node}' -R merge-test2
+  @  34f1aa7da42559bae87920880b522d47b3ddbc0d
+  |
+  o  e01a12b07b4fdfd61ff90a2a1b4560a7a776f323
+  
+- Build a shamap where the target converted/a is in on top of an unrelated
+- change to 'x'. This simulates using convert to merge several repositories
+- together.
+  $ cat >> merge-test2/.hg/shamap <<EOF
+  > 3903775176ed42b1458a6281db4a0ccf4d9f287a 34f1aa7da42559bae87920880b522d47b3ddbc0d
+  > EOF
+  $ cat >> merge-test-filemap <<EOF
+  > rename . converted/
+  > EOF
+  $ hg convert --filemap merge-test-filemap merge-test1 merge-test2 --traceback
+  scanning source...
+  sorting...
+  converting...
+  1 b
+  0 merge
+  $ hg -R merge-test2 manifest -r tip
+  converted/a
+  converted/b
+  x
+  $ hg -R merge-test2 log -G -T '{node}\n{files % "{file}\n"}'
+  o    4b5e2f0218d3442a0c14892b18685bf9c8059c4a
+  |\
+  | o  214325dd2e4cff981dcf00cb120cd39e1ea36dcc
+  |    converted/b
+  @  34f1aa7da42559bae87920880b522d47b3ddbc0d
+  |  x
+  o  e01a12b07b4fdfd61ff90a2a1b4560a7a776f323
+     converted/a
+