Patchwork [V3,STABLE] amend: fix amending rename commit with diverged topologies (issue4405)

login
register
mail settings
Submitter Ryan McElroy
Date Oct. 22, 2014, 6:53 p.m.
Message ID <9cfb5bd8d8fa729cd180.1414004037@devbig105.prn2.facebook.com>
Download mbox | patch
Permalink /patch/6442/
State Accepted
Headers show

Comments

Ryan McElroy - Oct. 22, 2014, 6:53 p.m.
# HG changeset patch
# User Ryan McElroy <rmcelroy@fb.com>
# Date 1413466506 25200
#      Thu Oct 16 06:35:06 2014 -0700
# Node ID 9cfb5bd8d8fa729cd1806574d53b9e0e8bfc605a
# Parent  48c0b101a9de1fdbd638daa858da845cd05a6be7
amend: fix amending rename commit with diverged topologies (issue4405)

This addresses the bug described in issue4405: when obsolescence markers are
enabled, amending a commit with a file move can lead to the copy information
being lost.

However, the bug is more general and can be reproduced without obsmarkers as
well, as demonstracted by Pierre-Yves and put into the updated test.
Specifically, graph topology divergences between the filelogs and the changelog
can cause copy information to be lost during amends.
Matt Mackall - Oct. 22, 2014, 10:36 p.m.
On Wed, 2014-10-22 at 11:53 -0700, Ryan McElroy wrote:
> # HG changeset patch
> # User Ryan McElroy <rmcelroy@fb.com>
> # Date 1413466506 25200
> #      Thu Oct 16 06:35:06 2014 -0700
> # Node ID 9cfb5bd8d8fa729cd1806574d53b9e0e8bfc605a
> # Parent  48c0b101a9de1fdbd638daa858da845cd05a6be7
> amend: fix amending rename commit with diverged topologies (issue4405)

Queued for stable, thanks.

Patch

diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -19,9 +19,15 @@ 
     return f[:s]
 
 def _findlimit(repo, a, b):
-    """Find the earliest revision that's an ancestor of a or b but not both,
+    """
+    Find the last revision that needs to be checked to ensure that a full
+    transitive closure for file copies can be properly calculated.
+    Generally, this means finding the earliest revision number that's an
+    ancestor of a or b but not both, except when a or b is a direct descendent
+    of the other, in which case we can return the minimum revnum of a and b.
     None if no such revision exists.
     """
+
     # basic idea:
     # - mark a and b with different sides
     # - if a parent's children are all on the same side, the parent is
@@ -73,7 +79,29 @@ 
 
     if not hascommonancestor:
         return None
-    return limit
+
+    # Consider the following flow (see test-commit-amend.t under issue4405):
+    # 1/ File 'a0' committed
+    # 2/ File renamed from 'a0' to 'a1' in a new commit (call it 'a1')
+    # 3/ Move back to first commit
+    # 4/ Create a new commit via revert to contents of 'a1' (call it 'a1-amend')
+    # 5/ Rename file from 'a1' to 'a2' and commit --amend 'a1-msg'
+    #
+    # During the amend in step five, we will be in this state:
+    #
+    # @  3 temporary amend commit for a1-amend
+    # |
+    # o  2 a1-amend
+    # |
+    # | o  1 a1
+    # |/
+    # o  0 a0
+    #
+    # When findlimit is called, a and b are revs 3 and 0, so limit will be 2,
+    # yet the filelog has the copy information in rev 1 and we will not look
+    # back far enough unless we also look at the a and b as candidates.
+    # This only occurs when a is a descendent of b or visa-versa.
+    return min(limit, a, b)
 
 def _chain(src, dst, a, b):
     '''chain two sets of copies a->b'''
diff --git a/tests/test-commit-amend.t b/tests/test-commit-amend.t
--- a/tests/test-commit-amend.t
+++ b/tests/test-commit-amend.t
@@ -856,3 +856,61 @@ 
   HG: changed obs.py
   $ hg parents --template "{desc}\n"
   editor should be invoked
+
+Check for issue4405
+-------------------
+
+Setup the repo with a file that gets moved in a second commit.
+  $ hg init repo
+  $ cd repo
+  $ touch a0
+  $ hg add a0
+  $ hg commit -m a0
+  $ hg mv a0 a1
+  $ hg commit -m a1
+  $ hg up -q 0
+  $ hg log -G --template '{rev} {desc}'
+  o  1 a1
+  |
+  @  0 a0
+  
+
+Now we branch the repro, but re-use the file contents, so we have a divergence
+in the file revlog topology and the changelog topology.
+  $ hg revert --rev 1 --all
+  removing a0
+  adding a1
+  $ hg ci -qm 'a1-amend'
+  $ hg log -G --template '{rev} {desc}'
+  @  2 a1-amend
+  |
+  | o  1 a1
+  |/
+  o  0 a0
+  
+
+The way mercurial does amends is to create a temporary commit (rev 3) and then
+fold the new and old commits together into another commit (rev 4). During this
+process, findlimit is called to check how far back to look for the transitive
+closure of file copy information, but due to the divergence of the filelog
+and changelog graph topologies, before findlimit was fixed, it returned a rev
+which was not far enough back in this case.
+  $ hg mv a1 a2
+  $ hg status --copies --rev 0
+  A a2
+    a0
+  R a0
+  $ hg ci --amend -q
+  $ hg log -G --template '{rev} {desc}'
+  @  4 a1-amend
+  |
+  | o  1 a1
+  |/
+  o  0 a0
+  
+
+Before the fix, the copy information was lost.
+  $ hg status --copies --rev 0
+  A a2
+    a0
+  R a0