Submitter | Siddharth Agarwal |
---|---|
Date | Dec. 21, 2012, 9:08 p.m. |
Message ID | <af618ad15aa529d53062.1356124103@sid0x220> |
Download | mbox | patch |
Permalink | /patch/264/ |
State | Accepted |
Commit | f23dea2b296e0ec2845fc27746dad6341fa2e338 |
Headers | show |
Comments
On Fri, 2012-12-21 at 13:08 -0800, Siddharth Agarwal wrote: > # HG changeset patch > # User Siddharth Agarwal <sid0 at fb.com> > # Date 1356123985 28800 > # Node ID af618ad15aa529d53062f7cf73d45f539f72ebfd > # Parent 554d6cba20701d6b44870a79389cc89812173aed > copies: do not track backward copies, only renames (issue3739) > > The inverse of a rename is a rename back, but the inverse of a copy is a > remove, not a "copy back". Strictly speaking, it's not this simple. Insofar as a rename is "simply" a copy+remove, there is a real sense that a copy does have an inverse. For instance, consider this scenario: changeset 1: create a changeset 2: copy a to b changeset 3: delete a Now we can clearly say what the following diffs should look like: 1->2: show a copy 2->3: show a delete 1->3 = 1->2 + 2->3: show a rename And we can even say what some of the reverse diffs should look like: 3->1: show a rename(!) 3->2: show a file add But what should 2->1 look like? If it doesn't mention some kind of relation between a and b, we've lost information relative to the the forward diffs, nor can we construct 3->2 + 2->1 = 3->1. So we've got some sort of relation here (let's call it a 'ypoc') such that 'add' + 'ypoc' = 'rename' in the same way that 'copy' + 'remove' = 'rename'. And as it happens, ypocs are actually a real thing in Mercurial, though we admittedly don't handle them very well, nor do we have any way to display them. A ypoc exists in the following not entirely uncommon situation: 1-2-4 \ / 3 1: create a 2: copy a to b 3: modify a 4: merge Here, merge should merge the changes of a into b and record both a and b as ancestors for the result[1], giving us an 2:1 relationship that's the inverse of a typical 1:n copy. If we then say "we don't really want a" and commit that removal with the merge, we've got a genuine ypoc. In other words, a ypoc is a cross-file merge where the 'other' file doesn't exist afterwards. And that actually suggests that copy itself not really the fundamental relationship, because you can have all of the following topologies: copy ypoc rename rename ??? ??? a-a a-a a a a-a a-a \ / \ / / \ b b b b b-b b-b (the ??? case comes up naturally in merges involving copies as we've seen above) So really, what we've got is a 'cross-filename inheritance relation' and depending on the topology, we can think of it as a copy, rename, ypoc, etc. In theory, pathcopies actually conceptually cares about this. Graft and rebase care about a topology that looks like this: o-o-o-o-x-o-o-o-o-a \ o-o-b where pathcopies(a,b) = pathcopies(a,x) + pathcopies(x,b). If there's a 'add' on the a->x path (aka a removal on x->a) and a 'ypoc' on the x->b path, then we may have the equivalent of a rename on the joined path. But again, we handle these things very badly at present, which is not surprising given how mind-warping it is: - we can't display ypocs or ???s usefully (though they matter for annotate!) - we probably don't/can't record some of these cases properly - pathcopies and mergecopies are not capable of representing them - we get confused by them when doing graft/rebase As you've seen, pathcopies uses a dict, so it can only represent 1:n relationships, and not n:1 ones, nor n:m, which can also exist! But as you've also seen, we actually want to get some of this sort of information out of at least mergecopies, hence the multiple dictionaries that are returned. For now, you're probably right that whatever the correct and complete semantics should be, what _backwardsrename is doing is wrong-ish as a ypoc is definitely not a copy, nor do we say we're returning 'cross-file inheritance relations'. You also may have discovered that mergecopies does not yet properly handle the idea that the given ancestor may not actually be a topological ancestor (as needed by rebase/graft). At some point, I want to rationalize things so that mergecopies is defined in terms of pathcopies so that it handles backwards legs, and maybe eventually have a well-defined theory for what should happen with the reverse-copy-like cases beyond 'punt'. All of this is a long way of saying that I don't think your commit comment is correct. [1] See http://www.selenic.com/hg/file/98687cdddcb1/mercurial/localrepo.py#l1026
On 12/21/2012 03:42 PM, Matt Mackall wrote: > All of this is a long way of saying that I don't think your commit > comment is correct. Thanks for the correction. I agree that "ypoc" is something we should eventually handle, perhaps with a richer data structure that supports the full range of possibilities. What do you think of keeping the code the same, but updating the commit message to: copies: do not track backward copies, only renames (issue3739) The inverse of a rename is a rename, but the inverse of a copy is not a copy. Presenting it as such -- in particular, stuffing it into the same dict as real copies -- causes bugs because other code starts believing the inverse copies are real. The only test whose output changes is test-mv-cp-st-diff.t. When a backwards status -C command is run where a copy is involved, the inverse copy (which was hitherto presented as a real copy) is no longer displayed. Keeping track of inverse copies is useful in some situations -- composability of diffs, for example, since a delete of "b" followed by an inverse copy "b" to "a" is equivalent to a rename "b" to "a". However, representing them would require a more complex data structure than the same dict in which real copies are also stored.
On 12/21/2012 07:50 PM, Siddharth Agarwal wrote: > Keeping track of inverse copies is useful in some situations -- > composability > of diffs, for example, since a delete of "b" followed by an inverse > copy "b" A delete of "a", not of "b".
On Fri, 2012-12-21 at 19:50 -0800, Siddharth Agarwal wrote: > On 12/21/2012 03:42 PM, Matt Mackall wrote: > > All of this is a long way of saying that I don't think your commit > > comment is correct. > > Thanks for the correction. I agree that "ypoc" is something we should > eventually handle, perhaps with a richer data structure that supports > the full range of possibilities. What do you think of keeping the code > the same, but updating the commit message to: > > copies: do not track backward copies, only renames (issue3739) > > The inverse of a rename is a rename, but the inverse of a copy is not a > copy. > Presenting it as such -- in particular, stuffing it into the same dict > as real > copies -- causes bugs because other code starts believing the inverse copies > are real. > > The only test whose output changes is test-mv-cp-st-diff.t. When a backwards > status -C command is run where a copy is involved, the inverse copy > (which was > hitherto presented as a real copy) is no longer displayed. > > Keeping track of inverse copies is useful in some situations -- > composability > of diffs, for example, since a delete of "b" followed by an inverse copy "b" > to "a" is equivalent to a rename "b" to "a". However, representing them > would > require a more complex data structure than the same dict in which real > copies > are also stored. Seems fine.
Patch
diff --git a/mercurial/copies.py b/mercurial/copies.py --- a/mercurial/copies.py +++ b/mercurial/copies.py @@ -145,12 +145,16 @@ def _forwardcopies(a, b): return cm -def _backwardcopies(a, b): - # because the forward mapping is 1:n, we can lose renames here - # in particular, we find renames better than copies +def _backwardrenames(a, b): + # Even though we're not taking copies into account, 1:n rename situations + # can still exist (e.g. hg cp a b; hg mv a c). In those cases we + # arbitrarily pick one of the renames. f = _forwardcopies(b, a) r = {} for k, v in f.iteritems(): + # remove copies + if v in a: + continue r[v] = k return r @@ -162,8 +166,8 @@ def pathcopies(x, y): if a == x: return _forwardcopies(x, y) if a == y: - return _backwardcopies(x, y) - return _chain(x, y, _backwardcopies(x, a), _forwardcopies(a, y)) + return _backwardrenames(x, y) + return _chain(x, y, _backwardrenames(x, a), _forwardcopies(a, y)) def mergecopies(repo, c1, c2, ca): """ diff --git a/tests/test-mv-cp-st-diff.t b/tests/test-mv-cp-st-diff.t --- a/tests/test-mv-cp-st-diff.t +++ b/tests/test-mv-cp-st-diff.t @@ -670,7 +670,6 @@ single copy % hg st -C --rev . --rev 0 M a - b R b % hg diff --git --rev . --rev 0 @@ -728,7 +727,6 @@ single copy % hg st -C --rev . --rev 2 M a - b A x/y R b @@ -1072,7 +1070,6 @@ copy chain % hg st -C --rev . --rev 0 M a - b R b R c @@ -1148,7 +1145,6 @@ copy chain % hg st -C --rev . --rev 2 M a - b A x/y R b R c diff --git a/tests/test-rebase-rename.t b/tests/test-rebase-rename.t --- a/tests/test-rebase-rename.t +++ b/tests/test-rebase-rename.t @@ -20,7 +20,10 @@ $ hg ci -Am B adding b - $ hg up -q -C 0 + $ hg mv b b-renamed + $ hg ci -m 'rename B' + + $ hg up -q -C 1 $ hg mv a a-renamed @@ -28,28 +31,32 @@ created new head $ hg tglog - @ 2: 'rename A' + @ 3: 'rename A' | - | o 1: 'B' + | o 2: 'rename B' |/ + o 1: 'B' + | o 0: 'A' Rename is tracked: $ hg tlog -p --git -r tip - 2: 'rename A' + 3: 'rename A' diff --git a/a b/a-renamed rename from a rename to a-renamed Rebase the revision containing the rename: - $ hg rebase -s 2 -d 1 + $ hg rebase -s 3 -d 2 saved backup bundle to $TESTTMP/a/.hg/strip-backup/*-backup.hg (glob) $ hg tglog - @ 2: 'rename A' + @ 3: 'rename A' + | + o 2: 'rename B' | o 1: 'B' | @@ -59,11 +66,32 @@ Rebase the revision containing the renam Rename is not lost: $ hg tlog -p --git -r tip - 2: 'rename A' + 3: 'rename A' diff --git a/a b/a-renamed rename from a rename to a-renamed + +Rebased revision does not contain information about b (issue3739) + + $ hg log -r 3 --debug + changeset: 3:3b905b1064f14ace3ad02353b79dd42d32981655 + tag: tip + phase: draft + parent: 2:920a371a5635af23a26a011ca346cecd1cfcb942 + parent: -1:0000000000000000000000000000000000000000 + manifest: 3:c4a62b2b64593c8fe0523d4c1ba2e243a8bd4dce + user: test + date: Thu Jan 01 00:00:00 1970 +0000 + files+: a-renamed + files-: a + extra: branch=default + extra: rebase_source=89af05cb38a281f891c6f5581dd027092da29166 + description: + rename A + + + $ cd .. @@ -78,47 +106,75 @@ Rename is not lost: $ hg ci -Am B adding b - $ hg up -q -C 0 + $ hg cp b b-copied + $ hg ci -Am 'copy B' + + $ hg up -q -C 1 $ hg cp a a-copied $ hg ci -m 'copy A' created new head $ hg tglog - @ 2: 'copy A' + @ 3: 'copy A' | - | o 1: 'B' + | o 2: 'copy B' |/ + o 1: 'B' + | o 0: 'A' Copy is tracked: $ hg tlog -p --git -r tip - 2: 'copy A' + 3: 'copy A' diff --git a/a b/a-copied copy from a copy to a-copied Rebase the revision containing the copy: - $ hg rebase -s 2 -d 1 + $ hg rebase -s 3 -d 2 saved backup bundle to $TESTTMP/b/.hg/strip-backup/*-backup.hg (glob) $ hg tglog - @ 2: 'copy A' + @ 3: 'copy A' + | + o 2: 'copy B' | o 1: 'B' | o 0: 'A' + Copy is not lost: $ hg tlog -p --git -r tip - 2: 'copy A' + 3: 'copy A' diff --git a/a b/a-copied copy from a copy to a-copied + +Rebased revision does not contain information about b (issue3739) + + $ hg log -r 3 --debug + changeset: 3:98f6e6dbf45ab54079c2237fbd11066a5c41a11d + tag: tip + phase: draft + parent: 2:39e588434882ff77d01229d169cdc77f29e8855e + parent: -1:0000000000000000000000000000000000000000 + manifest: 3:2232f329d66fffe3930d43479ae624f66322b04d + user: test + date: Thu Jan 01 00:00:00 1970 +0000 + files+: a-copied + extra: branch=default + extra: rebase_source=0a8162ff18a8900df8df8ef7ac0046955205613e + description: + copy A + + + $ cd ..