Patchwork [1,of,2,v8] graft: support grafting across move/copy (issue4028)

login
register
mail settings
Submitter Gábor Stefanik
Date Aug. 26, 2016, 4:16 p.m.
Message ID <f32aa28298164aa38830.1472228191@waste.org>
Download mbox | patch
Permalink /patch/16452/
State Accepted
Headers show

Comments

Gábor Stefanik - Aug. 26, 2016, 4:16 p.m.
# HG changeset patch
# User Gábor Stefanik <gabor.stefanik@nng.com>
# Date 1472225958 -7200
#      Fri Aug 26 17:39:18 2016 +0200
# Node ID f32aa28298164aa38830e83369c57c9553c6ff08
# Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b
graft: support grafting across move/copy (issue4028)

Graft performs a merge with a false common ancestor, which must be taken into
account when tracking copies. Explicitly pass the real common ancestor in this
case, and track copies between the real and false common ancestors in reverse.

With this change, when grafting a commit with a change to a file moved earlier
on the graft's source branch, the change is merged as expected into the original
(unmoved) file, rather than recreating it under its new name.
It should also make it possible to eventually enable cross-branch updates with
merge.

v2: handle the case when target branch also has a rename
v3: address review comments
v4: move ancestry check to mergecopies, split tests to separate commit
v5: split out parameter change, address review comments, re-include tests
v6: fix reversed logic
v7: partial rewrite to correctly handle divergent renames, address review
    comments, adjust more tests, and polish up some edge cases
v8: fix some delete-after-rename scenarios
Yuya Nishihara - Sept. 11, 2016, 4:28 p.m.
On Fri, 26 Aug 2016 11:16:31 -0500, Gábor Stefanik wrote:
> # HG changeset patch
> # User Gábor Stefanik <gabor.stefanik@nng.com>
> # Date 1472225958 -7200
> #      Fri Aug 26 17:39:18 2016 +0200
> # Node ID f32aa28298164aa38830e83369c57c9553c6ff08
> # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b
> graft: support grafting across move/copy (issue4028)

Sorry for late reply, and I don't fully understand how this patch handles
reverse copy tracking correctly yet. But weekend is over, so I just wrote
random comments.

> +    # In certain scenarios (e.g. graft, update or rebase), ca can be overridden
> +    # We still need to know a real common ancestor in this case
> +    # We can't just compute _c1.ancestor(_c2) and compare it to ca, because
> +    # there can be multiple common ancestors, e.g. in case of bidmerge.
> +    cta = ca
> +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
> +    _c1 = c1.p1() if c1.rev() is None else c1
> +    _c2 = c2.p1() if c2.rev() is None else c2
> +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
> +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
> +    graft = dirty_c1 or dirty_c2
> +    if graft:
> +        cta = _c1.ancestor(_c2)

Can we know if we're doing graft-type merge beforehand? I think ca.descendant()
isn't fast so it should be avoided for normal merges.

>      # find interesting file sets from manifests
> +    if graft:
> +        repo.ui.debug("  computing unmatched files in rotated DAG\n")
>      addedinm1 = m1.filesnotin(ma)
>      addedinm2 = m2.filesnotin(ma)
> -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
> +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
> +    if not graft:
> +        u1, u2 = _u1, _u2
> +    else: # need to recompute this for directory move handling when grafting
> +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")
> +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),
> +                                                  m2.filesnotin(mta))
> +
>      bothnew = sorted(addedinm1 & addedinm2)
>  
>      for f in u1:
> -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)
> +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,
> +                    fullcopy1, incomplete1, incompletediverge)
>  
>      for f in u2:
> -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)
> +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,
> +                    fullcopy2, incomplete2, incompletediverge)

[snip]

> +    # combine partial copy paths discovered in the previous step
> +    copyfrom, copyto = incomplete1, incomplete2
> +    if dirty_c1:
> +        copyfrom, copyto = incomplete2, incomplete1
> +    for f in copyfrom:
> +        if f in copyto:
> +            copy[copyto[f]] = copyfrom[f]
> +            del copyto[f]
> +    for f in incompletediverge:
> +        assert f not in diverge
> +        ic = incompletediverge[f]
> +        if ic[0] in copyto:
> +            diverge[f] = [copyto[ic[0]], ic[1]]

According to Matt's comment, we need two copy traces split at 'ca', but we use
ones split at 'cta' (and somewhat 'ca' is taken into account?), because it
wouldn't be easy to track copies backwards.

https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/086915.html

I guess that are addressed these "incomplete" dicts and "if"s in checkcopies(),
but I wonder if there are unhandled cases such as non-linear DAG to be rotated,
which might include "ypoc". I'm not sure, though.
Gábor Stefanik - Sept. 12, 2016, 10:13 a.m.
Hi,

>



--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Yuya Nishihara [mailto:youjah@gmail.com] On Behalf Of Yuya

> Nishihara

> Sent: Sunday, September 11, 2016 6:29 PM

> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>

> Cc: mercurial-devel@mercurial-scm.org

> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy

> (issue4028)

>

> On Fri, 26 Aug 2016 11:16:31 -0500, Gábor Stefanik wrote:

> > # HG changeset patch

> > # User Gábor Stefanik <gabor.stefanik@nng.com> # Date 1472225958 -7200

> > #      Fri Aug 26 17:39:18 2016 +0200

> > # Node ID f32aa28298164aa38830e83369c57c9553c6ff08

> > # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b

> > graft: support grafting across move/copy (issue4028)

>

> Sorry for late reply, and I don't fully understand how this patch handles

> reverse copy tracking correctly yet. But weekend is over, so I just wrote

> random comments.

>

> > +    # In certain scenarios (e.g. graft, update or rebase), ca can be

> overridden

> > +    # We still need to know a real common ancestor in this case

> > +    # We can't just compute _c1.ancestor(_c2) and compare it to ca,

> because

> > +    # there can be multiple common ancestors, e.g. in case of bidmerge.

> > +    cta = ca

> > +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that

> > +    _c1 = c1.p1() if c1.rev() is None else c1

> > +    _c2 = c2.p1() if c2.rev() is None else c2

> > +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))

> > +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))

> > +    graft = dirty_c1 or dirty_c2

> > +    if graft:

> > +        cta = _c1.ancestor(_c2)

>

> Can we know if we're doing graft-type merge beforehand? I think

> ca.descendant() isn't fast so it should be avoided for normal merges.


This has been argued repeatedly. Basically the only way you can know in advance that your merge is going to be graftlike is by  doing a few descendant() calls yourself. So, with the exception of the "hg merge" command (which is guaranteed to yield ungraftlike merges), all commands wishing to do a merge will have to run through this whole descendant game.

In the first few versions of the patches, I actually did graftlikeness detection in mergemod.graft(), but it was a nightmare to get it to work properly, and as it turns out, graft() isn't the only thing doing a graftlike merge. I was informed that calling descendant() once in a command is fine, it's only too slow for calling in a loop.

I don't think complicating the code, potentially missing some edge cases of unusual commands doing graftlike merges, and breaking extension compatibility is warranted for a few ms speedup of "hg merge".

>

> >      # find interesting file sets from manifests

> > +    if graft:

> > +        repo.ui.debug("  computing unmatched files in rotated DAG\n")

> >      addedinm1 = m1.filesnotin(ma)

> >      addedinm2 = m2.filesnotin(ma)

> > -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)

> > +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)

> > +    if not graft:

> > +        u1, u2 = _u1, _u2

> > +    else: # need to recompute this for directory move handling when

> grafting

> > +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")

> > +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),

> > +                                                  m2.filesnotin(mta))

> > +

> >      bothnew = sorted(addedinm1 & addedinm2)

> >

> >      for f in u1:

> > -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)

> > +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,

> > +                    fullcopy1, incomplete1, incompletediverge)

> >

> >      for f in u2:

> > -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)

> > +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,

> > +                    fullcopy2, incomplete2, incompletediverge)

>

> [snip]

>

> > +    # combine partial copy paths discovered in the previous step

> > +    copyfrom, copyto = incomplete1, incomplete2

> > +    if dirty_c1:

> > +        copyfrom, copyto = incomplete2, incomplete1

> > +    for f in copyfrom:

> > +        if f in copyto:

> > +            copy[copyto[f]] = copyfrom[f]

> > +            del copyto[f]

> > +    for f in incompletediverge:

> > +        assert f not in diverge

> > +        ic = incompletediverge[f]

> > +        if ic[0] in copyto:

> > +            diverge[f] = [copyto[ic[0]], ic[1]]

>

> According to Matt's comment, we need two copy traces split at 'ca', but we

> use ones split at 'cta' (and somewhat 'ca' is taken into account?), because it

> wouldn't be easy to track copies backwards.

>

> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-

> August/086915.html

>

> I guess that are addressed these "incomplete" dicts and "if"s in

> checkcopies(), but I wonder if there are unhandled cases such as non-linear

> DAG to be rotated, which might include "ypoc". I'm not sure, though.


See the tests. Virtually every imaginable case is tested.

Matt actually wanted 3 copy traces, one between "cta" and "ca", one from one parent to "ca", and one from the other parent to "cta". The problem with this approach is that checkcopies can't just stop after going "behind" some cutoff revision, since it's operating in a low-level way in which "behind" doesn't really make sense. This is presumably for perf reasons. As a result, the checkcopies pass going from a parent to "ca" will actually go back to "cta", and find spurious copies. We could perhaps identify and remove those spurious copies by comparing the output to that of the ca->cta pass, but then we would need post-processing as complex as what this patch has to accomplish that, so we win nothing by going 3-pass.
Yuya Nishihara - Sept. 12, 2016, 3:49 p.m.
On Mon, 12 Sep 2016 10:13:20 +0000, Gábor STEFANIK wrote:
> > > +    # In certain scenarios (e.g. graft, update or rebase), ca can be
> > overridden
> > > +    # We still need to know a real common ancestor in this case
> > > +    # We can't just compute _c1.ancestor(_c2) and compare it to ca,
> > because
> > > +    # there can be multiple common ancestors, e.g. in case of bidmerge.
> > > +    cta = ca
> > > +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
> > > +    _c1 = c1.p1() if c1.rev() is None else c1
> > > +    _c2 = c2.p1() if c2.rev() is None else c2
> > > +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
> > > +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
> > > +    graft = dirty_c1 or dirty_c2
> > > +    if graft:
> > > +        cta = _c1.ancestor(_c2)
> >
> > Can we know if we're doing graft-type merge beforehand? I think
> > ca.descendant() isn't fast so it should be avoided for normal merges.
> 
> This has been argued repeatedly. Basically the only way you can know in advance that your merge is going to be graftlike is by  doing a few descendant() calls yourself. So, with the exception of the "hg merge" command (which is guaranteed to yield ungraftlike merges), all commands wishing to do a merge will have to run through this whole descendant game.
> 
> In the first few versions of the patches, I actually did graftlikeness detection in mergemod.graft(), but it was a nightmare to get it to work properly, and as it turns out, graft() isn't the only thing doing a graftlike merge. I was informed that calling descendant() once in a command is fine, it's only too slow for calling in a loop.

I thought callers could compute ca and cta by themselves since they explicitly
pass a pseudo ca, but maybe I'm wrong. If that's already been discussed, there's
no reason to complain. Sorry for the noise.

> > According to Matt's comment, we need two copy traces split at 'ca', but we
> > use ones split at 'cta' (and somewhat 'ca' is taken into account?), because it
> > wouldn't be easy to track copies backwards.
> >
> > https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-
> > August/086915.html
> >
> > I guess that are addressed these "incomplete" dicts and "if"s in
> > checkcopies(), but I wonder if there are unhandled cases such as non-linear
> > DAG to be rotated, which might include "ypoc". I'm not sure, though.
> 
> See the tests. Virtually every imaginable case is tested.
> 
> Matt actually wanted 3 copy traces, one between "cta" and "ca", one from one parent to "ca", and one from the other parent to "cta". The problem with this approach is that checkcopies can't just stop after going "behind" some cutoff revision, since it's operating in a low-level way in which "behind" doesn't really make sense. This is presumably for perf reasons. As a result, the checkcopies pass going from a parent to "ca" will actually go back to "cta", and find spurious copies. We could perhaps identify and remove those spurious copies by comparing the output to that of the ca->cta pass, but then we would need post-processing as complex as what this patch has to accomplish that, so we win nothing by going 3-pass.

My vague concern is that we always walk graph from top to bottom keeping linear
data for fixing up rotated copy trace.

I'm not sure if this is a relevant example, but given the following history,
I got different results by a) "graft 6 to 10" and b) "graft 7 to 10", which
seems suspicious.

@  10: a
|
| +  9: c
| |
| +  8: d->()
| |
| | +  7: d
| |/
| | +  6: c
| |/
| +    5: c,d
| |\
| | +  4: b->d
| | |
| + |  3: b->c
| |/
| +  2: a->b
|/
o  1: a
|
o  0: ()->a


hg init foo
cd foo

echo 0 >> a
hg add a
hg ci -m '()->a'

echo 1 >> a
hg ci -m 'a'

hg mv a b
echo 2 >> b
hg ci -m 'a->b'

hg mv b c
echo 3 >> c
hg ci -m 'b->c'

hg up 2
hg mv b d
echo 4 >> d
hg ci -m 'b->d'

hg up 3
hg merge 4
echo 5 >> c
echo 5 >> d
hg ci -m 'c,d'

echo 6 >> c
hg ci -m 'c'

hg up 5
echo 7 >> d
hg ci -m 'd'

hg up 5
hg rm d
hg ci -m 'd->()'

echo 9 >> c
hg ci -m 'c'

hg up 1
echo 10 >> a
hg ci -m 'a'

hg log -G -T '{rev}: {desc|firstline}\n'


# a) c->a (but there is also d->a)
hg graft 6
other [graft] changed c which local [local] deleted

# b) d->a (but there is also c->a)
hg graft 7
merging a and d to a

# c) c->a (d was deleted)
hg graft 9
merging a and c to a
Gábor Stefanik - Sept. 12, 2016, 5:25 p.m.
>



--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Yuya Nishihara [mailto:youjah@gmail.com] On Behalf Of Yuya

> Nishihara

> Sent: Monday, September 12, 2016 5:49 PM

> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>

> Cc: mercurial-devel@mercurial-scm.org

> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy

> (issue4028)

>

> On Mon, 12 Sep 2016 10:13:20 +0000, Gábor STEFANIK wrote:

> > > > +    # In certain scenarios (e.g. graft, update or rebase), ca can

> > > > + be

> > > overridden

> > > > +    # We still need to know a real common ancestor in this case

> > > > +    # We can't just compute _c1.ancestor(_c2) and compare it to

> > > > + ca,

> > > because

> > > > +    # there can be multiple common ancestors, e.g. in case of bidmerge.

> > > > +    cta = ca

> > > > +    # ca.descendant(wc) and ca.descendant(ca) are False, work around

> that

> > > > +    _c1 = c1.p1() if c1.rev() is None else c1

> > > > +    _c2 = c2.p1() if c2.rev() is None else c2

> > > > +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))

> > > > +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))

> > > > +    graft = dirty_c1 or dirty_c2

> > > > +    if graft:

> > > > +        cta = _c1.ancestor(_c2)

> > >

> > > Can we know if we're doing graft-type merge beforehand? I think

> > > ca.descendant() isn't fast so it should be avoided for normal merges.

> >

> > This has been argued repeatedly. Basically the only way you can know in

> advance that your merge is going to be graftlike is by  doing a few

> descendant() calls yourself. So, with the exception of the "hg merge"

> command (which is guaranteed to yield ungraftlike merges), all commands

> wishing to do a merge will have to run through this whole descendant game.

> >

> > In the first few versions of the patches, I actually did graftlikeness

> detection in mergemod.graft(), but it was a nightmare to get it to work

> properly, and as it turns out, graft() isn't the only thing doing a graftlike

> merge. I was informed that calling descendant() once in a command is fine,

> it's only too slow for calling in a loop.

>

> I thought callers could compute ca and cta by themselves since they explicitly

> pass a pseudo ca, but maybe I'm wrong. If that's already been discussed,

> there's no reason to complain. Sorry for the noise.


In certain "hg update" scenarios, mergemod.update may be called with ancestor=None,
and we can still end up with a fake ca (in a completely sane way). Also, there are quite a few extensions
both in and out of tree, that call mergemod.update with or without an explicit ca, which may or may not
be a real common ancestor.

Then there is also the case when "ca" is a common ancestor of both c1 and c2, but ca != c1.ancestor(c2)
This case is an ungraftlike merge, and should not use the graft logic. It can happen e.g. due to bidmerge.

Finally, there are cases of "hg graft" where ca==cta, and thus a graft command can involve an ungraftlike
merge. There is no way for the caller to detect this, except by calling descendant().

>

> > > According to Matt's comment, we need two copy traces split at 'ca',

> > > but we use ones split at 'cta' (and somewhat 'ca' is taken into

> > > account?), because it wouldn't be easy to track copies backwards.

> > >

> > > https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-

> > > August/086915.html

> > >

> > > I guess that are addressed these "incomplete" dicts and "if"s in

> > > checkcopies(), but I wonder if there are unhandled cases such as

> > > non-linear DAG to be rotated, which might include "ypoc". I'm not sure,

> though.

> >

> > See the tests. Virtually every imaginable case is tested.

> >

> > Matt actually wanted 3 copy traces, one between "cta" and "ca", one from

> one parent to "ca", and one from the other parent to "cta". The problem

> with this approach is that checkcopies can't just stop after going "behind"

> some cutoff revision, since it's operating in a low-level way in which "behind"

> doesn't really make sense. This is presumably for perf reasons. As a result,

> the checkcopies pass going from a parent to "ca" will actually go back to "cta",

> and find spurious copies. We could perhaps identify and remove those

> spurious copies by comparing the output to that of the ca->cta pass, but then

> we would need post-processing as complex as what this patch has to

> accomplish that, so we win nothing by going 3-pass.

>

> My vague concern is that we always walk graph from top to bottom keeping

> linear data for fixing up rotated copy trace.

>

> I'm not sure if this is a relevant example, but given the following history, I got

> different results by a) "graft 6 to 10" and b) "graft 7 to 10", which seems

> suspicious.

>

> @  10: a

> |

> | +  9: c

> | |

> | +  8: d->()

> | |

> | | +  7: d

> | |/

> | | +  6: c

> | |/

> | +    5: c,d

> | |\

> | | +  4: b->d

> | | |

> | + |  3: b->c

> | |/

> | +  2: a->b

> |/

> o  1: a

> |

> o  0: ()->a

>

>

> hg init foo

> cd foo

>

> echo 0 >> a

> hg add a

> hg ci -m '()->a'

>

> echo 1 >> a

> hg ci -m 'a'

>

> hg mv a b

> echo 2 >> b

> hg ci -m 'a->b'

>

> hg mv b c

> echo 3 >> c

> hg ci -m 'b->c'

>

> hg up 2

> hg mv b d

> echo 4 >> d

> hg ci -m 'b->d'

>

> hg up 3

> hg merge 4

> echo 5 >> c

> echo 5 >> d

> hg ci -m 'c,d'

>

> echo 6 >> c

> hg ci -m 'c'

>

> hg up 5

> echo 7 >> d

> hg ci -m 'd'

>

> hg up 5

> hg rm d

> hg ci -m 'd->()'

>

> echo 9 >> c

> hg ci -m 'c'

>

> hg up 1

> echo 10 >> a

> hg ci -m 'a'

>

> hg log -G -T '{rev}: {desc|firstline}\n'

>

>

> # a) c->a (but there is also d->a)

> hg graft 6

> other [graft] changed c which local [local] deleted

>

> # b) d->a (but there is also c->a)

> hg graft 7

> merging a and d to a

>

> # c) c->a (d was deleted)

> hg graft 9

> merging a and c to a



This looks like a separate (though related) issue to me. Turning the DAG around causes "a" to be both "moved from c" and "moved from d", a ypoc-like scenario. The problem is, mergecopies' callers expect the "copy" return value to be a many-to-one mapping of targets to sources, when it's actually many-to-many in this scenario. Fixable, but I would rather handle this in a followup issue (that I intend to do still in the 4.0 timeframe), since it's not a regression, and the patch has already grown quite big.
In fact, this may be something we want to support explicitly in the tree as well, to handle refactors that cause two files to be combined into one (e.g. allowing "module.h" to be moved from both "module_pub.h" and "module_priv.h"), but that's yet another issue, probably beyond 4.0.
Pierre-Yves David - Sept. 13, 2016, 9:42 a.m.
On 09/12/2016 12:13 PM, Gábor STEFANIK wrote:
> -----Original Message-----
>> From: Yuya Nishihara [mailto:youjah@gmail.com] On Behalf Of Yuya
>> Nishihara
>> Sent: Sunday, September 11, 2016 6:29 PM
>> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>
>> Cc: mercurial-devel@mercurial-scm.org
>> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy
>> (issue4028)
>>
>> On Fri, 26 Aug 2016 11:16:31 -0500, Gábor Stefanik wrote:
>>> # HG changeset patch
>>> # User Gábor Stefanik <gabor.stefanik@nng.com> # Date 1472225958 -7200
>>> #      Fri Aug 26 17:39:18 2016 +0200
>>> # Node ID f32aa28298164aa38830e83369c57c9553c6ff08
>>> # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b
>>> graft: support grafting across move/copy (issue4028)
>>
>> Sorry for late reply, and I don't fully understand how this patch handles
>> reverse copy tracking correctly yet. But weekend is over, so I just wrote
>> random comments.
>>
>>> +    # In certain scenarios (e.g. graft, update or rebase), ca can be
>> overridden
>>> +    # We still need to know a real common ancestor in this case
>>> +    # We can't just compute _c1.ancestor(_c2) and compare it to ca,
>> because
>>> +    # there can be multiple common ancestors, e.g. in case of bidmerge.
>>> +    cta = ca
>>> +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
>>> +    _c1 = c1.p1() if c1.rev() is None else c1
>>> +    _c2 = c2.p1() if c2.rev() is None else c2
>>> +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
>>> +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
>>> +    graft = dirty_c1 or dirty_c2
>>> +    if graft:
>>> +        cta = _c1.ancestor(_c2)
>>
>> Can we know if we're doing graft-type merge beforehand? I think
>> ca.descendant() isn't fast so it should be avoided for normal merges.
>
> This has been argued repeatedly. Basically the only way you can know in advance that your merge is going to be graftlike is by  doing a few descendant() calls yourself. So, with the exception of the "hg merge" command (which is guaranteed to yield ungraftlike merges), all commands wishing to do a merge will have to run through this whole descendant game.
>
> In the first few versions of the patches, I actually did graftlikeness detection in mergemod.graft(), but it was a nightmare to get it to work properly, and as it turns out, graft() isn't the only thing doing a graftlike merge. I was informed that calling descendant() once in a command is fine, it's only too slow for calling in a loop.
>
> I don't think complicating the code, potentially missing some edge cases of unusual commands doing graftlike merges, and breaking extension compatibility is warranted for a few ms speedup of "hg merge".

Given how often you had to explain this, it is probably worth adding 
something about this in the inline comment about it.

>>>      # find interesting file sets from manifests
>>> +    if graft:
>>> +        repo.ui.debug("  computing unmatched files in rotated DAG\n")
>>>      addedinm1 = m1.filesnotin(ma)
>>>      addedinm2 = m2.filesnotin(ma)
>>> -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
>>> +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
>>> +    if not graft:
>>> +        u1, u2 = _u1, _u2
>>> +    else: # need to recompute this for directory move handling when
>> grafting
>>> +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")
>>> +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),
>>> +                                                  m2.filesnotin(mta))
>>> +
>>>      bothnew = sorted(addedinm1 & addedinm2)
>>>
>>>      for f in u1:
>>> -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)
>>> +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,
>>> +                    fullcopy1, incomplete1, incompletediverge)
>>>
>>>      for f in u2:
>>> -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)
>>> +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,
>>> +                    fullcopy2, incomplete2, incompletediverge)
>>
>> [snip]
>>
>>> +    # combine partial copy paths discovered in the previous step
>>> +    copyfrom, copyto = incomplete1, incomplete2
>>> +    if dirty_c1:
>>> +        copyfrom, copyto = incomplete2, incomplete1
>>> +    for f in copyfrom:
>>> +        if f in copyto:
>>> +            copy[copyto[f]] = copyfrom[f]
>>> +            del copyto[f]
>>> +    for f in incompletediverge:
>>> +        assert f not in diverge
>>> +        ic = incompletediverge[f]
>>> +        if ic[0] in copyto:
>>> +            diverge[f] = [copyto[ic[0]], ic[1]]
>>
>> According to Matt's comment, we need two copy traces split at 'ca', but we
>> use ones split at 'cta' (and somewhat 'ca' is taken into account?), because it
>> wouldn't be easy to track copies backwards.
>>
>> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-
>> August/086915.html
>>
>> I guess that are addressed these "incomplete" dicts and "if"s in
>> checkcopies(), but I wonder if there are unhandled cases such as non-linear
>> DAG to be rotated, which might include "ypoc". I'm not sure, though.
>
> See the tests. Virtually every imaginable case is tested.
>
> Matt actually wanted 3 copy traces, one between "cta" and "ca", one from one parent to "ca", and one from the other parent to "cta". The problem with this approach is that checkcopies can't just stop after going "behind" some cutoff revision, since it's operating in a low-level way in which "behind" doesn't really make sense. This is presumably for perf reasons. As a result, the checkcopies pass going from a parent to "ca" will actually go back to "cta", and find spurious copies. We could perhaps identify and remove those spurious copies by comparing the output to that of the ca->cta pass, but then we would need post-processing as complex as what this patch has to accomplish that, so we win nothing by going 3-pass.

Okay this is the first time I read bout this 'behind' thing, I need to 
investigate that but that should probably extensibility documented as a 
comment.
Pierre-Yves David - Sept. 13, 2016, 10 a.m.
On 08/26/2016 06:16 PM, Gábor Stefanik wrote:
> # HG changeset patch
> # User Gábor Stefanik <gabor.stefanik@nng.com>
> # Date 1472225958 -7200
> #      Fri Aug 26 17:39:18 2016 +0200
> # Node ID f32aa28298164aa38830e83369c57c9553c6ff08
> # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b
> graft: support grafting across move/copy (issue4028)
>
> Graft performs a merge with a false common ancestor, which must be taken into
> account when tracking copies. Explicitly pass the real common ancestor in this
> case, and track copies between the real and false common ancestors in reverse.
>
> With this change, when grafting a commit with a change to a file moved earlier
> on the graft's source branch, the change is merged as expected into the original
> (unmoved) file, rather than recreating it under its new name.
> It should also make it possible to eventually enable cross-branch updates with
> merge.

I've made a small review pass, forcusing on the test are they are the 
basis that will allow use to make sure the copy tracing is actually 
catching all cases. Unfortunately, the test are a bit too opaque for me, 
I've made some comment inline.

I think we should list possible variants and draw a table to make sure 
we define and test all of them.

Random example of changeset with table 
https://selenic.com/repo/hg//rev/6c81b8ebf66e


> diff --git a/mercurial/copies.py b/mercurial/copies.py
> --- a/mercurial/copies.py
> +++ b/mercurial/copies.py
> @@ -321,6 +321,20 @@
>      if repo.ui.configbool('experimental', 'disablecopytrace'):
>          return {}, {}, {}, {}
>
> +    # In certain scenarios (e.g. graft, update or rebase), ca can be overridden
> +    # We still need to know a real common ancestor in this case
> +    # We can't just compute _c1.ancestor(_c2) and compare it to ca, because
> +    # there can be multiple common ancestors, e.g. in case of bidmerge.
> +    cta = ca
> +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
> +    _c1 = c1.p1() if c1.rev() is None else c1
> +    _c2 = c2.p1() if c2.rev() is None else c2
> +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
> +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
> +    graft = dirty_c1 or dirty_c2
> +    if graft:
> +        cta = _c1.ancestor(_c2)
> +
>      limit = _findlimit(repo, c1.rev(), c2.rev())
>      if limit is None:
>          # no common ancestor, no copies
> @@ -330,28 +344,56 @@
>      m1 = c1.manifest()
>      m2 = c2.manifest()
>      ma = ca.manifest()
> +    mta = cta.manifest()
>
> -    copy1, copy2, = {}, {}
> +    # see checkcopies documentation below for these dicts
> +    copy1, copy2 = {}, {}
> +    incomplete1, incomplete2 = {}, {}
>      movewithdir1, movewithdir2 = {}, {}
>      fullcopy1, fullcopy2 = {}, {}
> -    diverge = {}
> +    diverge, incompletediverge = {}, {}
>
>      # find interesting file sets from manifests
> +    if graft:
> +        repo.ui.debug("  computing unmatched files in rotated DAG\n")
>      addedinm1 = m1.filesnotin(ma)
>      addedinm2 = m2.filesnotin(ma)
> -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
> +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
> +    if not graft:
> +        u1, u2 = _u1, _u2
> +    else: # need to recompute this for directory move handling when grafting
> +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")
> +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),
> +                                                  m2.filesnotin(mta))
> +
>      bothnew = sorted(addedinm1 & addedinm2)
>
>      for f in u1:
> -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)
> +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,
> +                    fullcopy1, incomplete1, incompletediverge)
>
>      for f in u2:
> -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)
> +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,
> +                    fullcopy2, incomplete2, incompletediverge)
>
>      copy = dict(copy1.items() + copy2.items())
>      movewithdir = dict(movewithdir1.items() + movewithdir2.items())
>      fullcopy = dict(fullcopy1.items() + fullcopy2.items())
>
> +    # combine partial copy paths discovered in the previous step
> +    copyfrom, copyto = incomplete1, incomplete2
> +    if dirty_c1:
> +        copyfrom, copyto = incomplete2, incomplete1
> +    for f in copyfrom:
> +        if f in copyto:
> +            copy[copyto[f]] = copyfrom[f]
> +            del copyto[f]
> +    for f in incompletediverge:
> +        assert f not in diverge
> +        ic = incompletediverge[f]
> +        if ic[0] in copyto:
> +            diverge[f] = [copyto[ic[0]], ic[1]]
> +
>      renamedelete = {}
>      renamedeleteset = set()
>      divergeset = set()
> @@ -369,10 +411,29 @@
>      if bothnew:
>          repo.ui.debug("  unmatched files new in both:\n   %s\n"
>                        % "\n   ".join(bothnew))
> -    bothdiverge, _copy, _fullcopy = {}, {}, {}
> +    bothdiverge = {}
> +    _copy, _fullcopy = {}, {} # dummy dicts
> +    _incomplete1, _incomplete2, _diverge = {}, {}, {}
>      for f in bothnew:
> -        checkcopies(c1, f, m1, m2, ca, limit, bothdiverge, _copy, _fullcopy)
> -        checkcopies(c2, f, m2, m1, ca, limit, bothdiverge, _copy, _fullcopy)
> +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, bothdiverge,
> +                    _copy, _fullcopy, _incomplete1, _diverge)
> +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, bothdiverge,
> +                    _copy, _fullcopy, _incomplete2, _diverge)
> +    _incomplete = _incomplete2
> +    if dirty_c1:
> +        _incomplete = _incomplete1
> +        assert _incomplete2 == {}
> +    else:
> +        assert _incomplete1 == {}
> +    for f in _diverge:
> +        assert f not in bothdiverge
> +        ic = _diverge[f]
> +        if ic[0] in _incomplete:
> +            bothdiverge[f] = [_incomplete[ic[0]], ic[1]]
> +        elif ic[0] in (m1 if dirty_c1 else m2):
> +            # backed-out rename on one side, but watch out for deleted files
> +            bothdiverge[f] = ic
> +
>      for of, fl in bothdiverge.items():
>          if len(fl) == 2 and fl[0] == fl[1]:
>              copy[fl[0]] = of # not actually divergent, just matching renames
> @@ -438,7 +499,7 @@
>                        (d, dirmove[d]))
>
>      # check unaccounted nonoverlapping files against directory moves
> -    for f in u1 + u2:
> +    for f in _u1 + _u2:
>          if f not in fullcopy:
>              for d in dirmove:
>                  if f.startswith(d):
> @@ -452,7 +513,8 @@
>
>      return copy, movewithdir, diverge, renamedelete
>
> -def checkcopies(ctx, f, m1, m2, ca, limit, diverge, copy, fullcopy):
> +def checkcopies(ctx, f, m1, m2, ca, cta, remote_ca, limit, diverge, copy,
> +                fullcopy, incomplete, incompletediverge):
>      """
>      check possible copies of f from m1 to m2
>
> @@ -460,14 +522,20 @@
>      f = the filename to check
>      m1 = the source manifest
>      m2 = the destination manifest
> -    ca = the changectx of the common ancestor
> +    ca = the changectx of the common ancestor, overridden on graft
> +    cta = topological common ancestor for graft-like scenarios
> +    remote_ca = True if ca is outside cta::m1, False otherwise
>      limit = the rev number to not search beyond
>      diverge = record all diverges in this dict
>      copy = record all non-divergent copies in this dict
>      fullcopy = record all copies in this dict
> +    incomplete = record non-divergent partial copies here
> +    incompletediverge = record divergent partial copies here
>      """
>
>      ma = ca.manifest()
> +    mta = cta.manifest()
> +    backwards = ca != cta and not remote_ca and f in ma
>      getfctx = _makegetfctx(ctx)
>
>      def _related(f1, f2, limit):
> @@ -502,31 +570,57 @@
>              return False
>
>      of = None
> -    seen = set([f])
> -    for oc in getfctx(f, m1[f]).ancestors():
> +    seen = {f: [getfctx(f, m1[f])]}
> +    for oc in seen[f][0].ancestors():
>          ocr = oc.linkrev()
>          of = oc.path()
>          if of in seen:
> +            seen[of].append(oc)
>              # check limit late - grab last rename before
>              if ocr < limit:
>                  break
>              continue
> -        seen.add(of)
> +        seen[of] = [oc]
>
> -        fullcopy[f] = of # remember for dir rename detection
> +        # remember for dir rename detection
> +        if backwards:
> +            fullcopy[of] = f # grafting backwards through renames
> +        else:
> +            fullcopy[f] = of
>          if of not in m2:
>              continue # no match, keep looking
>          if m2[of] == ma.get(of):
> -            break # no merge needed, quit early
> +            return # no merge needed, quit early
>          c2 = getfctx(of, m2[of])
> -        cr = _related(oc, c2, ca.rev())
> +        cr = _related(oc, c2, cta.rev())
>          if cr and (of == f or of == c2.path()): # non-divergent
> -            copy[f] = of
> -            of = None
> -            break
> +            if backwards:
> +                copy[of] = f
> +            elif of in ma:
> +                copy[f] = of
> +            elif remote_ca: # special case: a <- b <- a -> b "ping-pong" rename
> +                copy[of] = f
> +                del fullcopy[f]
> +                fullcopy[of] = f
> +            else: # divergence w.r.t. graft CA on one side of topological CA
> +                for sf in seen:
> +                    if sf in ma and getfctx(sf, ma[sf]) in seen[sf]:
> +                        assert sf not in diverge
> +                        diverge[sf] = [f, of]
> +                        break
> +            return
>
> -    if of in ma:
> -        diverge.setdefault(of, []).append(f)
> +    if of in mta:
> +        if backwards or remote_ca:
> +            incomplete[of] = f
> +        else:
> +            for sf in seen:
> +                if sf in ma and getfctx(sf, ma[sf]) in seen[sf]:
> +                    if cta == ca:
> +                        diverge.setdefault(sf, []).append(f)
> +                    else:
> +                        incompletediverge[sf] = [of, f]
> +                    return
>
>  def duplicatecopies(repo, rev, fromrev, skiprev=None):
>      '''reproduce copies from fromrev to rev in the dirstate
> diff --git a/tests/test-graft.t b/tests/test-graft.t
> --- a/tests/test-graft.t
> +++ b/tests/test-graft.t
> @@ -179,6 +179,13 @@
>    committing changelog
>    grafting 5:97f8bfe72746 "5"
>      searching for copies back to rev 1
> +    computing unmatched files in rotated DAG
> +    computing unmatched files in unrotated DAG
> +    unmatched files in other:
> +     c
> +    all copies found (* = to merge, ! = divergent, % = renamed and deleted):
> +     src: 'c' -> dst: 'b' *
> +    checking for directory renames
>    resolving manifests
>     branchmerge: True, force: True, partial: False
>     ancestor: 4c60f11aa304, local: 6b9e5368ca4e+, remote: 97f8bfe72746
> @@ -193,6 +200,13 @@
>    scanning for duplicate grafts
>    grafting 4:9c233e8e184d "4"
>      searching for copies back to rev 1
> +    computing unmatched files in rotated DAG
> +    computing unmatched files in unrotated DAG
> +    unmatched files in other:
> +     c
> +    all copies found (* = to merge, ! = divergent, % = renamed and deleted):
> +     src: 'c' -> dst: 'b' *
> +    checking for directory renames
>    resolving manifests
>     branchmerge: True, force: True, partial: False
>     ancestor: 4c60f11aa304, local: 1905859650ec+, remote: 9c233e8e184d
> @@ -427,8 +441,8 @@
>    $ hg graft 3 --log -u foo
>    grafting 3:4c60f11aa304 "3"
>    warning: can't find ancestor for 'c' copied from 'b'!
> -  $ hg log --template '{rev} {parents} {desc}\n' -r tip
> -  14 1:5d205f8b35b6  3
> +  $ hg log --template '{rev}:{node|short} {parents} {desc}\n' -r tip
> +  14:0c921c65ef1e 1:5d205f8b35b6  3
>    (grafted from 4c60f11aa304a54ae1c199feb94e7fc771e51ed8)
>
>  Resolve conflicted graft
> @@ -620,7 +634,7 @@
>    date:        Thu Jan 01 00:00:00 1970 +0000
>    summary:     2
>
> -  changeset:   14:f64defefacee
> +  changeset:   14:0c921c65ef1e
>    parent:      1:5d205f8b35b6
>    user:        foo
>    date:        Thu Jan 01 00:00:00 1970 +0000
> @@ -842,3 +856,290 @@
>    |/
>    o  0
>
> +Graft from behind a move or rename

Use regularly use ReST like title to mark important section so make it

Graft from behind a move or rename
==================================


> +NOTE: This is affected by bug 5343, and will need updating when it's fixed

Please use 'issue5343' to help future archeology'

> +f50 tests a "ping-pong" rename case, where a file is renamed to the same name
> +on both branches, then the rename is backed out on one branch, and the backout
> +is grafted to the other branch. This creates a challenging b <- a <- b -> a
> +rename sequence in the graft target, topological CA, graft CA and graft source,
> +respectively. Since rename detection will run on the target side for such a
> +sequence (as for technical reasons, we split the source and target sides not at
> +the graft CA, but at the topological CA), it will pick up a false rename, and
> +cause a spurious merge conflict. This false rename is always exactly the
> +reverse of the true rename that would be detected on the source side, so deal
> +with it by detecting this condition and reversing as necessary.

I would probably have started with a less complexe test case (but I've 
to admit, that one looks fun)

> +
> +  $ hg init ../graftmove
> +  $ cd ../graftmove
> +  $ echo c10 > f10
> +  $ echo c20 > f20
> +  $ echo c30 > f30
> +  $ echo c40 > f40
> +  $ echo c50 > f50
> +  $ hg ci -qAm 0

Please use Letter for your commit message, otherwise it creates 
confusion between revision number and commit message, especially when 
commit starts to be amended/rebased/grafted.

You could also use descriptive commit message if that can help the 
readability of your changes.

> +  $ hg mv f10 f11
> +  $ hg mv f30 f31
> +  $ hg mv f50 f51
> +  $ hg ci -qAm 1
> +  $ echo c12 > f11
> +  $ hg mv f20 f22
> +  $ hg mv f51 f50
> +  $ hg ci -qAm 2
> +  $ hg mv f31 f33
> +  $ echo c43 > f40
> +  $ hg ci -qAm 3
> +  $ hg up -q 0

At that point, it is probably worthwhile to include a graphlog output to 
have a vision of the resulting stack (even if in this case it turn out 
to be quick linear and simple).

Shouldn't '2' alter the content of 'f50' so that we can check it get 
properly applied to 'f50' ? (if not, this is probably a sign that we 
need more details about what individual graft are testing .

> +  $ hg graft -r 2
> +  grafting 2:52f80b3a6c3e "2"
> +  merging f10 and f11 to f10
> +  warning: can't find ancestor for 'f50' copied from 'f51'!

What is this warning about ?

> +  $ hg status --change .
> +  M f10
> +  A f22
> +  R f20
> +  $ hg cat f10
> +  c12
> +  $ hg cat f11
> +  f11: no such file in rev 54f7d8995b01
> +  [1]
> +  $ hg graft -r 3
> +  grafting 3:0f6524134ac0 "3"
> +  note: possible conflict - f31 was renamed multiple times to:
> +   f33
> +   f30
> +  warning: can't find ancestor for 'f33' copied from 'f31'!

Again, I need some explanation of what we are testing here.

> +  $ hg up -q 0
> +  $ hg mv f10 f16
> +  $ echo c26 > f20
> +  $ hg mv f30 f36
> +  $ hg mv f40 f46
> +  $ hg mv f50 f51
> +  $ hg ci -qAm 6

Probably need some graph output here for tracking.

> +  $ hg graft -r 2
> +  grafting 2:52f80b3a6c3e "2"
> +  merging f16 and f11 to f16
> +  merging f20 and f22 to f22
> +  $ hg graft -r 3
> +  grafting 3:0f6524134ac0 "3"
> +  note: possible conflict - f31 was renamed multiple times to:
> +   f36
> +   f33
> +  merging f46 and f40 to f46
> +  warning: can't find ancestor for 'f33' copied from 'f31'!

Same here, that warning is confusing me and I'm not sure what we are 
testing.

> +  $ hg log -CGv --patch --git
> +  @  changeset:   8:db9925d9208e
> +  |  tag:         tip
> +  |  user:        test
> +  |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  |  files:       f33 f46
> +  |  description:
> +  |  3
> +  |
> +  |
> +  |  diff --git a/f33 b/f33
> +  |  new file mode 100644
> +  |  --- /dev/null
> +  |  +++ b/f33
> +  |  @@ -0,0 +1,1 @@
> +  |  +c30
> +  |  diff --git a/f46 b/f46
> +  |  --- a/f46
> +  |  +++ b/f46
> +  |  @@ -1,1 +1,1 @@
> +  |  -c40
> +  |  +c43
> +  |
> +  o  changeset:   7:08113e8ff5c8
> +  |  user:        test
> +  |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  |  files:       f16 f20 f22 f50 f51
> +  |  copies:      f22 (f20) f50 (f51)
> +  |  description:
> +  |  2
> +  |
> +  |
> +  |  diff --git a/f16 b/f16
> +  |  --- a/f16
> +  |  +++ b/f16
> +  |  @@ -1,1 +1,1 @@
> +  |  -c10
> +  |  +c12
> +  |  diff --git a/f20 b/f22
> +  |  rename from f20
> +  |  rename to f22
> +  |  diff --git a/f51 b/f50
> +  |  rename from f51
> +  |  rename to f50
> +  |
> +  o  changeset:   6:365d1fa57acb
> +  |  parent:      0:1060512e332e
> +  |  user:        test
> +  |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  |  files:       f10 f16 f20 f30 f36 f40 f46 f50 f51
> +  |  copies:      f16 (f10) f36 (f30) f46 (f40) f51 (f50)
> +  |  description:
> +  |  6
> +  |
> +  |
> +  |  diff --git a/f10 b/f16
> +  |  rename from f10
> +  |  rename to f16
> +  |  diff --git a/f20 b/f20
> +  |  --- a/f20
> +  |  +++ b/f20
> +  |  @@ -1,1 +1,1 @@
> +  |  -c20
> +  |  +c26
> +  |  diff --git a/f30 b/f36
> +  |  rename from f30
> +  |  rename to f36
> +  |  diff --git a/f40 b/f46
> +  |  rename from f40
> +  |  rename to f46
> +  |  diff --git a/f50 b/f51
> +  |  rename from f50
> +  |  rename to f51
> +  |
> +  | o  changeset:   5:d5015e1a1de5
> +  | |  user:        test
> +  | |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  | |  files:       f33 f40
> +  | |  description:
> +  | |  3
> +  | |
> +  | |
> +  | |  diff --git a/f33 b/f33
> +  | |  new file mode 100644
> +  | |  --- /dev/null
> +  | |  +++ b/f33
> +  | |  @@ -0,0 +1,1 @@
> +  | |  +c30
> +  | |  diff --git a/f40 b/f40
> +  | |  --- a/f40
> +  | |  +++ b/f40
> +  | |  @@ -1,1 +1,1 @@
> +  | |  -c40
> +  | |  +c43
> +  | |
> +  | o  changeset:   4:54f7d8995b01
> +  |/   parent:      0:1060512e332e
> +  |    user:        test
> +  |    date:        Thu Jan 01 00:00:00 1970 +0000
> +  |    files:       f10 f20 f22
> +  |    copies:      f22 (f20)
> +  |    description:
> +  |    2
> +  |
> +  |
> +  |    diff --git a/f10 b/f10
> +  |    --- a/f10
> +  |    +++ b/f10
> +  |    @@ -1,1 +1,1 @@
> +  |    -c10
> +  |    +c12
> +  |    diff --git a/f20 b/f22
> +  |    rename from f20
> +  |    rename to f22
> +  |
> +  | o  changeset:   3:0f6524134ac0
> +  | |  user:        test
> +  | |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  | |  files:       f31 f33 f40
> +  | |  copies:      f33 (f31)
> +  | |  description:
> +  | |  3
> +  | |
> +  | |
> +  | |  diff --git a/f31 b/f33
> +  | |  rename from f31
> +  | |  rename to f33
> +  | |  diff --git a/f40 b/f40
> +  | |  --- a/f40
> +  | |  +++ b/f40
> +  | |  @@ -1,1 +1,1 @@
> +  | |  -c40
> +  | |  +c43
> +  | |
> +  | o  changeset:   2:52f80b3a6c3e
> +  | |  user:        test
> +  | |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  | |  files:       f11 f20 f22 f50 f51
> +  | |  copies:      f22 (f20) f50 (f51)
> +  | |  description:
> +  | |  2
> +  | |
> +  | |
> +  | |  diff --git a/f11 b/f11
> +  | |  --- a/f11
> +  | |  +++ b/f11
> +  | |  @@ -1,1 +1,1 @@
> +  | |  -c10
> +  | |  +c12
> +  | |  diff --git a/f20 b/f22
> +  | |  rename from f20
> +  | |  rename to f22
> +  | |  diff --git a/f51 b/f50
> +  | |  rename from f51
> +  | |  rename to f50
> +  | |
> +  | o  changeset:   1:7607a972f96d
> +  |/   user:        test
> +  |    date:        Thu Jan 01 00:00:00 1970 +0000
> +  |    files:       f10 f11 f30 f31 f50 f51
> +  |    copies:      f11 (f10) f31 (f30) f51 (f50)
> +  |    description:
> +  |    1
> +  |
> +  |
> +  |    diff --git a/f10 b/f11
> +  |    rename from f10
> +  |    rename to f11
> +  |    diff --git a/f30 b/f31
> +  |    rename from f30
> +  |    rename to f31
> +  |    diff --git a/f50 b/f51
> +  |    rename from f50
> +  |    rename to f51
> +  |
> +  o  changeset:   0:1060512e332e
> +     user:        test
> +     date:        Thu Jan 01 00:00:00 1970 +0000
> +     files:       f10 f20 f30 f40 f50
> +     description:
> +     0
> +
> +
> +     diff --git a/f10 b/f10
> +     new file mode 100644
> +     --- /dev/null
> +     +++ b/f10
> +     @@ -0,0 +1,1 @@
> +     +c10
> +     diff --git a/f20 b/f20
> +     new file mode 100644
> +     --- /dev/null
> +     +++ b/f20
> +     @@ -0,0 +1,1 @@
> +     +c20
> +     diff --git a/f30 b/f30
> +     new file mode 100644
> +     --- /dev/null
> +     +++ b/f30
> +     @@ -0,0 +1,1 @@
> +     +c30
> +     diff --git a/f40 b/f40
> +     new file mode 100644
> +     --- /dev/null
> +     +++ b/f40
> +     @@ -0,0 +1,1 @@
> +     +c40
> +     diff --git a/f50 b/f50
> +     new file mode 100644
> +     --- /dev/null
> +     +++ b/f50
> +     @@ -0,0 +1,1 @@
> +     +c50
> +
> +  $ hg cat f22
> +  c26
> diff --git a/tests/test-rebase-conflicts.t b/tests/test-rebase-conflicts.t
> --- a/tests/test-rebase-conflicts.t
> +++ b/tests/test-rebase-conflicts.t
> @@ -238,6 +238,10 @@
>     merge against 9:e31216eec445
>       detach base 8:8e4e2c1a07ae
>      searching for copies back to rev 3
> +    computing unmatched files in rotated DAG
> +    computing unmatched files in unrotated DAG
> +    unmatched files in other:
> +     f2.txt
>    resolving manifests
>     branchmerge: True, force: True, partial: False
>     ancestor: 8e4e2c1a07ae, local: 4bc80088dc6b+, remote: e31216eec445
> @@ -255,6 +259,10 @@
>     merge against 10:2f2496ddf49d
>       detach base 9:e31216eec445
>      searching for copies back to rev 3
> +    computing unmatched files in rotated DAG
> +    computing unmatched files in unrotated DAG
> +    unmatched files in other:
> +     f2.txt
>    resolving manifests
>     branchmerge: True, force: True, partial: False
>     ancestor: e31216eec445, local: 19c888675e13+, remote: 2f2496ddf49d
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Gábor Stefanik - Sept. 13, 2016, 1:37 p.m.
>



--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Pierre-Yves David [mailto:pierre-yves.david@ens-lyon.org]

> Sent: Tuesday, September 13, 2016 11:42 AM

> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>; Yuya Nishihara

> <yuya@tcha.org>

> Cc: mercurial-devel@mercurial-scm.org

> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy

> (issue4028)

>

>

>

> On 09/12/2016 12:13 PM, Gábor STEFANIK wrote:

> > -----Original Message-----

> >> From: Yuya Nishihara [mailto:youjah@gmail.com] On Behalf Of Yuya

> >> Nishihara

> >> Sent: Sunday, September 11, 2016 6:29 PM

> >> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>

> >> Cc: mercurial-devel@mercurial-scm.org

> >> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across

> >> move/copy

> >> (issue4028)

> >>

> >> On Fri, 26 Aug 2016 11:16:31 -0500, Gábor Stefanik wrote:

> >>> # HG changeset patch

> >>> # User Gábor Stefanik <gabor.stefanik@nng.com> # Date 1472225958 -

> 7200

> >>> #      Fri Aug 26 17:39:18 2016 +0200

> >>> # Node ID f32aa28298164aa38830e83369c57c9553c6ff08

> >>> # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b

> >>> graft: support grafting across move/copy (issue4028)

> >>

> >> Sorry for late reply, and I don't fully understand how this patch

> >> handles reverse copy tracking correctly yet. But weekend is over, so

> >> I just wrote random comments.

> >>

> >>> +    # In certain scenarios (e.g. graft, update or rebase), ca can

> >>> + be

> >> overridden

> >>> +    # We still need to know a real common ancestor in this case

> >>> +    # We can't just compute _c1.ancestor(_c2) and compare it to ca,

> >> because

> >>> +    # there can be multiple common ancestors, e.g. in case of bidmerge.

> >>> +    cta = ca

> >>> +    # ca.descendant(wc) and ca.descendant(ca) are False, work around

> that

> >>> +    _c1 = c1.p1() if c1.rev() is None else c1

> >>> +    _c2 = c2.p1() if c2.rev() is None else c2

> >>> +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))

> >>> +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))

> >>> +    graft = dirty_c1 or dirty_c2

> >>> +    if graft:

> >>> +        cta = _c1.ancestor(_c2)

> >>

> >> Can we know if we're doing graft-type merge beforehand? I think

> >> ca.descendant() isn't fast so it should be avoided for normal merges.

> >

> > This has been argued repeatedly. Basically the only way you can know in

> advance that your merge is going to be graftlike is by  doing a few

> descendant() calls yourself. So, with the exception of the "hg merge"

> command (which is guaranteed to yield ungraftlike merges), all commands

> wishing to do a merge will have to run through this whole descendant game.

> >

> > In the first few versions of the patches, I actually did graftlikeness

> detection in mergemod.graft(), but it was a nightmare to get it to work

> properly, and as it turns out, graft() isn't the only thing doing a graftlike

> merge. I was informed that calling descendant() once in a command is fine,

> it's only too slow for calling in a loop.

> >

> > I don't think complicating the code, potentially missing some edge cases of

> unusual commands doing graftlike merges, and breaking extension

> compatibility is warranted for a few ms speedup of "hg merge".

>

> Given how often you had to explain this, it is probably worth adding

> something about this in the inline comment about it.


Will do, for the sake of posterity.

The actual discussion I believe happened on either the mailing list or on IRC.
I was specifically assured that a small constant number of descendant() calls is OK.

>

> >>>      # find interesting file sets from manifests

> >>> +    if graft:

> >>> +        repo.ui.debug("  computing unmatched files in rotated

> >>> + DAG\n")

> >>>      addedinm1 = m1.filesnotin(ma)

> >>>      addedinm2 = m2.filesnotin(ma)

> >>> -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)

> >>> +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1,

> addedinm2)

> >>> +    if not graft:

> >>> +        u1, u2 = _u1, _u2

> >>> +    else: # need to recompute this for directory move handling when

> >> grafting

> >>> +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")

> >>> +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),

> >>> +

> >>> + m2.filesnotin(mta))

> >>> +

> >>>      bothnew = sorted(addedinm1 & addedinm2)

> >>>

> >>>      for f in u1:

> >>> -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)

> >>> +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,

> >>> +                    fullcopy1, incomplete1, incompletediverge)

> >>>

> >>>      for f in u2:

> >>> -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)

> >>> +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,

> >>> +                    fullcopy2, incomplete2, incompletediverge)

> >>

> >> [snip]

> >>

> >>> +    # combine partial copy paths discovered in the previous step

> >>> +    copyfrom, copyto = incomplete1, incomplete2

> >>> +    if dirty_c1:

> >>> +        copyfrom, copyto = incomplete2, incomplete1

> >>> +    for f in copyfrom:

> >>> +        if f in copyto:

> >>> +            copy[copyto[f]] = copyfrom[f]

> >>> +            del copyto[f]

> >>> +    for f in incompletediverge:

> >>> +        assert f not in diverge

> >>> +        ic = incompletediverge[f]

> >>> +        if ic[0] in copyto:

> >>> +            diverge[f] = [copyto[ic[0]], ic[1]]

> >>

> >> According to Matt's comment, we need two copy traces split at 'ca',

> >> but we use ones split at 'cta' (and somewhat 'ca' is taken into

> >> account?), because it wouldn't be easy to track copies backwards.

> >>

> >> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-

> >> August/086915.html

> >>

> >> I guess that are addressed these "incomplete" dicts and "if"s in

> >> checkcopies(), but I wonder if there are unhandled cases such as

> >> non-linear DAG to be rotated, which might include "ypoc". I'm not sure,

> though.

> >

> > See the tests. Virtually every imaginable case is tested.

> >

> > Matt actually wanted 3 copy traces, one between "cta" and "ca", one from

> one parent to "ca", and one from the other parent to "cta". The problem

> with this approach is that checkcopies can't just stop after going "behind"

> some cutoff revision, since it's operating in a low-level way in which "behind"

> doesn't really make sense. This is presumably for perf reasons. As a result,

> the checkcopies pass going from a parent to "ca" will actually go back to "cta",

> and find spurious copies. We could perhaps identify and remove those

> spurious copies by comparing the output to that of the ca->cta pass, but then

> we would need post-processing as complex as what this patch has to

> accomplish that, so we win nothing by going 3-pass.

>

> Okay this is the first time I read bout this 'behind' thing, I need to investigate

> that but that should probably extensibility documented as a comment.


The existing comment (and the signature of the function) implies this heavily, as there
is actually no parameter provided for identifying a "bottom" revision.

The ca parameter doesn't serve as a stopping point, because the common ancestor's
manifest may actually contain a file from a much earlier revision. Same for the new cta.

There is a limit parameter, but it uses integer arithmetic on revision numbers, which
don't work well in a rotated DAG. Also, limit is apparently only a performance optimization,
and it provides no guarantees that it won't visit "uninteresting" revisions. This is because
in the 2-pass scenario, visiting an uninteresting revision just wastes time, but with 3 passes,
it's possible to accidentally visit a revision that would be uninteresting in the current pass,
but interesting in another, in which case copies in that revision will be recorded in the
wrong direction. Getting rid of that requires either rewriting the limit logic completely
to eliminate the usage of integer arithmetic (potentially severely slowing down rename detection),
or adding filter logic to get rid of these spurious reversed copies, which would be more complex
than the current filtering and combination logic for 2-pass.

Maybe document that limit is only an optimization, and doesn't provide guarantees?

>

> --

> Pierre-Yves David
Yuya Nishihara - Sept. 14, 2016, 4:09 p.m.
On Mon, 12 Sep 2016 17:25:45 +0000, Gábor STEFANIK wrote:
> > I thought callers could compute ca and cta by themselves since they explicitly
> > pass a pseudo ca, but maybe I'm wrong. If that's already been discussed,
> > there's no reason to complain. Sorry for the noise.
> 
> In certain "hg update" scenarios, mergemod.update may be called with ancestor=None,
> and we can still end up with a fake ca (in a completely sane way). Also, there are quite a few extensions
> both in and out of tree, that call mergemod.update with or without an explicit ca, which may or may not
> be a real common ancestor.
> 
> Then there is also the case when "ca" is a common ancestor of both c1 and c2, but ca != c1.ancestor(c2)
> This case is an ungraftlike merge, and should not use the graft logic. It can happen e.g. due to bidmerge.
> 
> Finally, there are cases of "hg graft" where ca==cta, and thus a graft command can involve an ungraftlike
> merge. There is no way for the caller to detect this, except by calling descendant().

Great explanation, many thanks. I didn't know the bidmerge case.

> > I'm not sure if this is a relevant example, but given the following history, I got
> > different results by a) "graft 6 to 10" and b) "graft 7 to 10", which seems
> > suspicious.
> >
> > @  10: a
> > |
> > | +  9: c
> > | |
> > | +  8: d->()
> > | |
> > | | +  7: d
> > | |/
> > | | +  6: c
> > | |/
> > | +    5: c,d
> > | |\
> > | | +  4: b->d
> > | | |
> > | + |  3: b->c
> > | |/
> > | +  2: a->b
> > |/
> > o  1: a
> > |
> > o  0: ()->a

[snip]

> This looks like a separate (though related) issue to me. Turning the DAG around causes "a" to be both "moved from c" and "moved from d", a ypoc-like scenario. The problem is, mergecopies' callers expect the "copy" return value to be a many-to-one mapping of targets to sources, when it's actually many-to-many in this scenario. Fixable, but I would rather handle this in a followup issue (that I intend to do still in the 4.0 timeframe), since it's not a regression, and the patch has already grown quite big.

Yes, this is ypoc-like, both "graft 6" and "graft 7" are ambiguous since
there are two sources 'c' and 'd' possibly to be merged to 'a'.

checkcopies() is getting complicated to support rotated DAG, and I'm afraid
it will be more complicated in future (or we'll have to decide a complete
rewrite.) That makes me feel extending checkcopies() isn't the best way
to work around simple graft scenarios for 4.0 timeframe.
Gábor Stefanik - Sept. 15, 2016, 10:06 a.m.
>



--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Yuya Nishihara [mailto:youjah@gmail.com] On Behalf Of Yuya

> Nishihara

> Sent: Wednesday, September 14, 2016 6:10 PM

> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>

> Cc: mercurial-devel@mercurial-scm.org

> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy

> (issue4028)

>

> On Mon, 12 Sep 2016 17:25:45 +0000, Gábor STEFANIK wrote:

> > > I thought callers could compute ca and cta by themselves since they

> > > explicitly pass a pseudo ca, but maybe I'm wrong. If that's already

> > > been discussed, there's no reason to complain. Sorry for the noise.

> >

> > In certain "hg update" scenarios, mergemod.update may be called with

> > ancestor=None, and we can still end up with a fake ca (in a completely

> > sane way). Also, there are quite a few extensions both in and out of

> > tree, that call mergemod.update with or without an explicit ca, which may

> or may not be a real common ancestor.

> >

> > Then there is also the case when "ca" is a common ancestor of both c1

> > and c2, but ca != c1.ancestor(c2) This case is an ungraftlike merge, and

> should not use the graft logic. It can happen e.g. due to bidmerge.

> >

> > Finally, there are cases of "hg graft" where ca==cta, and thus a graft

> > command can involve an ungraftlike merge. There is no way for the caller

> to detect this, except by calling descendant().

>

> Great explanation, many thanks. I didn't know the bidmerge case.

>

> > > I'm not sure if this is a relevant example, but given the following

> > > history, I got different results by a) "graft 6 to 10" and b) "graft

> > > 7 to 10", which seems suspicious.

> > >

> > > @  10: a

> > > |

> > > | +  9: c

> > > | |

> > > | +  8: d->()

> > > | |

> > > | | +  7: d

> > > | |/

> > > | | +  6: c

> > > | |/

> > > | +    5: c,d

> > > | |\

> > > | | +  4: b->d

> > > | | |

> > > | + |  3: b->c

> > > | |/

> > > | +  2: a->b

> > > |/

> > > o  1: a

> > > |

> > > o  0: ()->a

>

> [snip]

>

> > This looks like a separate (though related) issue to me. Turning the DAG

> around causes "a" to be both "moved from c" and "moved from d", a ypoc-

> like scenario. The problem is, mergecopies' callers expect the "copy" return

> value to be a many-to-one mapping of targets to sources, when it's actually

> many-to-many in this scenario. Fixable, but I would rather handle this in a

> followup issue (that I intend to do still in the 4.0 timeframe), since it's not a

> regression, and the patch has already grown quite big.

>

> Yes, this is ypoc-like, both "graft 6" and "graft 7" are ambiguous since there

> are two sources 'c' and 'd' possibly to be merged to 'a'.

>

> checkcopies() is getting complicated to support rotated DAG, and I'm afraid it

> will be more complicated in future (or we'll have to decide a complete

> rewrite.) That makes me feel extending checkcopies() isn't the best way to

> work around simple graft scenarios for 4.0 timeframe.


The issue you described isn't specific to backwards grafting through renames.
In fact, it can be reproduced also with a regular, ungraftlike merge command:

hg init test
cd test
for i in 1 2 3 4 5 6 7 8; do echo $i >> a; done
for i in a b c d e f g h; do echo $i >> b; done
hg add a b
hg ci -m initial
hg mv a c
hg ci -m 'a->c'
hg up 0
hg mv b c
hg ci -m 'b->c'
hg merge
<resolve the conflict so that both files' contents are preserved, numbers first, then letters>
hg commit -m merge
hg up 0
sed -e s/4/44/ a > a
hg ci -m 44
hg up 0
sed -e s/e/ee/ b > b
hg ci -m ee
hg up 3
hg merge 4 # works
hg ci -m goodmerge
hg up 3
hg merge 5 # fails, "other changed b which local deleted"

I have a plan for fixing this, and it doesn't involve further extending checkcopies.
Gábor Stefanik - Sept. 15, 2016, 10:42 a.m.
>



--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Mercurial-devel [mailto:mercurial-devel-bounces@mercurial-scm.org]

> On Behalf Of Gábor STEFANIK

> Sent: Thursday, September 15, 2016 12:07 PM

> To: Yuya Nishihara <yuya@tcha.org>

> Cc: mercurial-devel@mercurial-scm.org

> Subject: RE: [PATCH 1 of 2 v8] graft: support grafting across move/copy

> (issue4028)

>

> >

>

>

> --------------------------------------------------------------------------

> This message, including its attachments, is confidential. For more information

> please read NNG's email policy here:

> http://www.nng.com/emailpolicy/

> By responding to this email you accept the email policy.

>

>

> -----Original Message-----

> > From: Yuya Nishihara [mailto:youjah@gmail.com] On Behalf Of Yuya

> > Nishihara

> > Sent: Wednesday, September 14, 2016 6:10 PM

> > To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>

> > Cc: mercurial-devel@mercurial-scm.org

> > Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across

> > move/copy

> > (issue4028)

> >

> > On Mon, 12 Sep 2016 17:25:45 +0000, Gábor STEFANIK wrote:

> > > > I thought callers could compute ca and cta by themselves since

> > > > they explicitly pass a pseudo ca, but maybe I'm wrong. If that's

> > > > already been discussed, there's no reason to complain. Sorry for the

> noise.

> > >

> > > In certain "hg update" scenarios, mergemod.update may be called with

> > > ancestor=None, and we can still end up with a fake ca (in a

> > > completely sane way). Also, there are quite a few extensions both in

> > > and out of tree, that call mergemod.update with or without an

> > > explicit ca, which may

> > or may not be a real common ancestor.

> > >

> > > Then there is also the case when "ca" is a common ancestor of both

> > > c1 and c2, but ca != c1.ancestor(c2) This case is an ungraftlike

> > > merge, and

> > should not use the graft logic. It can happen e.g. due to bidmerge.

> > >

> > > Finally, there are cases of "hg graft" where ca==cta, and thus a

> > > graft command can involve an ungraftlike merge. There is no way for

> > > the caller

> > to detect this, except by calling descendant().

> >

> > Great explanation, many thanks. I didn't know the bidmerge case.

> >

> > > > I'm not sure if this is a relevant example, but given the

> > > > following history, I got different results by a) "graft 6 to 10"

> > > > and b) "graft

> > > > 7 to 10", which seems suspicious.

> > > >

> > > > @  10: a

> > > > |

> > > > | +  9: c

> > > > | |

> > > > | +  8: d->()

> > > > | |

> > > > | | +  7: d

> > > > | |/

> > > > | | +  6: c

> > > > | |/

> > > > | +    5: c,d

> > > > | |\

> > > > | | +  4: b->d

> > > > | | |

> > > > | + |  3: b->c

> > > > | |/

> > > > | +  2: a->b

> > > > |/

> > > > o  1: a

> > > > |

> > > > o  0: ()->a

> >

> > [snip]

> >

> > > This looks like a separate (though related) issue to me. Turning the

> > > DAG

> > around causes "a" to be both "moved from c" and "moved from d", a

> > ypoc- like scenario. The problem is, mergecopies' callers expect the

> > "copy" return value to be a many-to-one mapping of targets to sources,

> > when it's actually many-to-many in this scenario. Fixable, but I would

> > rather handle this in a followup issue (that I intend to do still in

> > the 4.0 timeframe), since it's not a regression, and the patch has already

> grown quite big.

> >

> > Yes, this is ypoc-like, both "graft 6" and "graft 7" are ambiguous

> > since there are two sources 'c' and 'd' possibly to be merged to 'a'.

> >

> > checkcopies() is getting complicated to support rotated DAG, and I'm

> > afraid it will be more complicated in future (or we'll have to decide

> > a complete

> > rewrite.) That makes me feel extending checkcopies() isn't the best

> > way to work around simple graft scenarios for 4.0 timeframe.

>

> The issue you described isn't specific to backwards grafting through renames.

> In fact, it can be reproduced also with a regular, ungraftlike merge command:

>

> hg init test

> cd test

> for i in 1 2 3 4 5 6 7 8; do echo $i >> a; done for i in a b c d e f g h; do echo $i >>

> b; done hg add a b hg ci -m initial hg mv a c hg ci -m 'a->c'

> hg up 0

> hg mv b c

> hg ci -m 'b->c'

> hg merge

> <resolve the conflict so that both files' contents are preserved, numbers

> first, then letters> hg commit -m merge hg up 0 sed -e s/4/44/ a > a hg ci -m

> 44 hg up 0 sed -e s/e/ee/ b > b hg ci -m ee hg up 3 hg merge 4 # works hg ci -

> m goodmerge hg up 3 hg merge 5 # fails, "other changed b which local

> deleted"


Ouch. That got mangled.
See a readable version in https://bz.mercurial-scm.org/show_bug.cgi?id=5369

>

> I have a plan for fixing this, and it doesn't involve further extending

> checkcopies.

> _______________________________________________

> Mercurial-devel mailing list

> Mercurial-devel@mercurial-scm.org

> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Gábor Stefanik - Sept. 16, 2016, 8:44 p.m.
>



--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Pierre-Yves David [mailto:pierre-yves.david@ens-lyon.org]

> Sent: Tuesday, September 13, 2016 12:00 PM

> To: Gábor STEFANIK <Gabor.STEFANIK@nng.com>; mercurial-

> devel@mercurial-scm.org

> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy

> (issue4028)

>

>

>

> On 08/26/2016 06:16 PM, Gábor Stefanik wrote:

> > # HG changeset patch

> > # User Gábor Stefanik <gabor.stefanik@nng.com> # Date 1472225958 -7200

> > #      Fri Aug 26 17:39:18 2016 +0200

> > # Node ID f32aa28298164aa38830e83369c57c9553c6ff08

> > # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b

> > graft: support grafting across move/copy (issue4028)

> >

> > Graft performs a merge with a false common ancestor, which must be

> > taken into account when tracking copies. Explicitly pass the real

> > common ancestor in this case, and track copies between the real and false

> common ancestors in reverse.

> >

> > With this change, when grafting a commit with a change to a file moved

> > earlier on the graft's source branch, the change is merged as expected

> > into the original

> > (unmoved) file, rather than recreating it under its new name.

> > It should also make it possible to eventually enable cross-branch

> > updates with merge.

>

> I've made a small review pass, forcusing on the test are they are the basis

> that will allow use to make sure the copy tracing is actually catching all cases.

> Unfortunately, the test are a bit too opaque for me, I've made some

> comment inline.

>

> I think we should list possible variants and draw a table to make sure we

> define and test all of them.

>

> Random example of changeset with table

> https://selenic.com/repo/hg//rev/6c81b8ebf66e


Done.

>

>

> > diff --git a/mercurial/copies.py b/mercurial/copies.py

> > --- a/mercurial/copies.py

> > +++ b/mercurial/copies.py

> > @@ -321,6 +321,20 @@

> >      if repo.ui.configbool('experimental', 'disablecopytrace'):

> >          return {}, {}, {}, {}

> >

> > +    # In certain scenarios (e.g. graft, update or rebase), ca can be

> overridden

> > +    # We still need to know a real common ancestor in this case

> > +    # We can't just compute _c1.ancestor(_c2) and compare it to ca,

> because

> > +    # there can be multiple common ancestors, e.g. in case of bidmerge.

> > +    cta = ca

> > +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that

> > +    _c1 = c1.p1() if c1.rev() is None else c1

> > +    _c2 = c2.p1() if c2.rev() is None else c2

> > +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))

> > +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))

> > +    graft = dirty_c1 or dirty_c2

> > +    if graft:

> > +        cta = _c1.ancestor(_c2)

> > +

> >      limit = _findlimit(repo, c1.rev(), c2.rev())

> >      if limit is None:

> >          # no common ancestor, no copies

> > @@ -330,28 +344,56 @@

> >      m1 = c1.manifest()

> >      m2 = c2.manifest()

> >      ma = ca.manifest()

> > +    mta = cta.manifest()

> >

> > -    copy1, copy2, = {}, {}

> > +    # see checkcopies documentation below for these dicts

> > +    copy1, copy2 = {}, {}

> > +    incomplete1, incomplete2 = {}, {}

> >      movewithdir1, movewithdir2 = {}, {}

> >      fullcopy1, fullcopy2 = {}, {}

> > -    diverge = {}

> > +    diverge, incompletediverge = {}, {}

> >

> >      # find interesting file sets from manifests

> > +    if graft:

> > +        repo.ui.debug("  computing unmatched files in rotated DAG\n")

> >      addedinm1 = m1.filesnotin(ma)

> >      addedinm2 = m2.filesnotin(ma)

> > -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)

> > +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)

> > +    if not graft:

> > +        u1, u2 = _u1, _u2

> > +    else: # need to recompute this for directory move handling when

> grafting

> > +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")

> > +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),

> > +                                                  m2.filesnotin(mta))

> > +

> >      bothnew = sorted(addedinm1 & addedinm2)

> >

> >      for f in u1:

> > -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)

> > +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,

> > +                    fullcopy1, incomplete1, incompletediverge)

> >

> >      for f in u2:

> > -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)

> > +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,

> > +                    fullcopy2, incomplete2, incompletediverge)

> >

> >      copy = dict(copy1.items() + copy2.items())

> >      movewithdir = dict(movewithdir1.items() + movewithdir2.items())

> >      fullcopy = dict(fullcopy1.items() + fullcopy2.items())

> >

> > +    # combine partial copy paths discovered in the previous step

> > +    copyfrom, copyto = incomplete1, incomplete2

> > +    if dirty_c1:

> > +        copyfrom, copyto = incomplete2, incomplete1

> > +    for f in copyfrom:

> > +        if f in copyto:

> > +            copy[copyto[f]] = copyfrom[f]

> > +            del copyto[f]

> > +    for f in incompletediverge:

> > +        assert f not in diverge

> > +        ic = incompletediverge[f]

> > +        if ic[0] in copyto:

> > +            diverge[f] = [copyto[ic[0]], ic[1]]

> > +

> >      renamedelete = {}

> >      renamedeleteset = set()

> >      divergeset = set()

> > @@ -369,10 +411,29 @@

> >      if bothnew:

> >          repo.ui.debug("  unmatched files new in both:\n   %s\n"

> >                        % "\n   ".join(bothnew))

> > -    bothdiverge, _copy, _fullcopy = {}, {}, {}

> > +    bothdiverge = {}

> > +    _copy, _fullcopy = {}, {} # dummy dicts

> > +    _incomplete1, _incomplete2, _diverge = {}, {}, {}

> >      for f in bothnew:

> > -        checkcopies(c1, f, m1, m2, ca, limit, bothdiverge, _copy, _fullcopy)

> > -        checkcopies(c2, f, m2, m1, ca, limit, bothdiverge, _copy, _fullcopy)

> > +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, bothdiverge,

> > +                    _copy, _fullcopy, _incomplete1, _diverge)

> > +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, bothdiverge,

> > +                    _copy, _fullcopy, _incomplete2, _diverge)

> > +    _incomplete = _incomplete2

> > +    if dirty_c1:

> > +        _incomplete = _incomplete1

> > +        assert _incomplete2 == {}

> > +    else:

> > +        assert _incomplete1 == {}

> > +    for f in _diverge:

> > +        assert f not in bothdiverge

> > +        ic = _diverge[f]

> > +        if ic[0] in _incomplete:

> > +            bothdiverge[f] = [_incomplete[ic[0]], ic[1]]

> > +        elif ic[0] in (m1 if dirty_c1 else m2):

> > +            # backed-out rename on one side, but watch out for deleted files

> > +            bothdiverge[f] = ic

> > +

> >      for of, fl in bothdiverge.items():

> >          if len(fl) == 2 and fl[0] == fl[1]:

> >              copy[fl[0]] = of # not actually divergent, just matching renames

> > @@ -438,7 +499,7 @@

> >                        (d, dirmove[d]))

> >

> >      # check unaccounted nonoverlapping files against directory moves

> > -    for f in u1 + u2:

> > +    for f in _u1 + _u2:

> >          if f not in fullcopy:

> >              for d in dirmove:

> >                  if f.startswith(d):

> > @@ -452,7 +513,8 @@

> >

> >      return copy, movewithdir, diverge, renamedelete

> >

> > -def checkcopies(ctx, f, m1, m2, ca, limit, diverge, copy, fullcopy):

> > +def checkcopies(ctx, f, m1, m2, ca, cta, remote_ca, limit, diverge, copy,

> > +                fullcopy, incomplete, incompletediverge):

> >      """

> >      check possible copies of f from m1 to m2

> >

> > @@ -460,14 +522,20 @@

> >      f = the filename to check

> >      m1 = the source manifest

> >      m2 = the destination manifest

> > -    ca = the changectx of the common ancestor

> > +    ca = the changectx of the common ancestor, overridden on graft

> > +    cta = topological common ancestor for graft-like scenarios

> > +    remote_ca = True if ca is outside cta::m1, False otherwise

> >      limit = the rev number to not search beyond

> >      diverge = record all diverges in this dict

> >      copy = record all non-divergent copies in this dict

> >      fullcopy = record all copies in this dict

> > +    incomplete = record non-divergent partial copies here

> > +    incompletediverge = record divergent partial copies here

> >      """

> >

> >      ma = ca.manifest()

> > +    mta = cta.manifest()

> > +    backwards = ca != cta and not remote_ca and f in ma

> >      getfctx = _makegetfctx(ctx)

> >

> >      def _related(f1, f2, limit):

> > @@ -502,31 +570,57 @@

> >              return False

> >

> >      of = None

> > -    seen = set([f])

> > -    for oc in getfctx(f, m1[f]).ancestors():

> > +    seen = {f: [getfctx(f, m1[f])]}

> > +    for oc in seen[f][0].ancestors():

> >          ocr = oc.linkrev()

> >          of = oc.path()

> >          if of in seen:

> > +            seen[of].append(oc)

> >              # check limit late - grab last rename before

> >              if ocr < limit:

> >                  break

> >              continue

> > -        seen.add(of)

> > +        seen[of] = [oc]

> >

> > -        fullcopy[f] = of # remember for dir rename detection

> > +        # remember for dir rename detection

> > +        if backwards:

> > +            fullcopy[of] = f # grafting backwards through renames

> > +        else:

> > +            fullcopy[f] = of

> >          if of not in m2:

> >              continue # no match, keep looking

> >          if m2[of] == ma.get(of):

> > -            break # no merge needed, quit early

> > +            return # no merge needed, quit early

> >          c2 = getfctx(of, m2[of])

> > -        cr = _related(oc, c2, ca.rev())

> > +        cr = _related(oc, c2, cta.rev())

> >          if cr and (of == f or of == c2.path()): # non-divergent

> > -            copy[f] = of

> > -            of = None

> > -            break

> > +            if backwards:

> > +                copy[of] = f

> > +            elif of in ma:

> > +                copy[f] = of

> > +            elif remote_ca: # special case: a <- b <- a -> b "ping-pong" rename

> > +                copy[of] = f

> > +                del fullcopy[f]

> > +                fullcopy[of] = f

> > +            else: # divergence w.r.t. graft CA on one side of topological CA

> > +                for sf in seen:

> > +                    if sf in ma and getfctx(sf, ma[sf]) in seen[sf]:

> > +                        assert sf not in diverge

> > +                        diverge[sf] = [f, of]

> > +                        break

> > +            return

> >

> > -    if of in ma:

> > -        diverge.setdefault(of, []).append(f)

> > +    if of in mta:

> > +        if backwards or remote_ca:

> > +            incomplete[of] = f

> > +        else:

> > +            for sf in seen:

> > +                if sf in ma and getfctx(sf, ma[sf]) in seen[sf]:

> > +                    if cta == ca:

> > +                        diverge.setdefault(sf, []).append(f)

> > +                    else:

> > +                        incompletediverge[sf] = [of, f]

> > +                    return

> >

> >  def duplicatecopies(repo, rev, fromrev, skiprev=None):

> >      '''reproduce copies from fromrev to rev in the dirstate

> > diff --git a/tests/test-graft.t b/tests/test-graft.t

> > --- a/tests/test-graft.t

> > +++ b/tests/test-graft.t

> > @@ -179,6 +179,13 @@

> >    committing changelog

> >    grafting 5:97f8bfe72746 "5"

> >      searching for copies back to rev 1

> > +    computing unmatched files in rotated DAG

> > +    computing unmatched files in unrotated DAG

> > +    unmatched files in other:

> > +     c

> > +    all copies found (* = to merge, ! = divergent, % = renamed and

> deleted):

> > +     src: 'c' -> dst: 'b' *

> > +    checking for directory renames

> >    resolving manifests

> >     branchmerge: True, force: True, partial: False

> >     ancestor: 4c60f11aa304, local: 6b9e5368ca4e+, remote: 97f8bfe72746

> > @@ -193,6 +200,13 @@

> >    scanning for duplicate grafts

> >    grafting 4:9c233e8e184d "4"

> >      searching for copies back to rev 1

> > +    computing unmatched files in rotated DAG

> > +    computing unmatched files in unrotated DAG

> > +    unmatched files in other:

> > +     c

> > +    all copies found (* = to merge, ! = divergent, % = renamed and

> deleted):

> > +     src: 'c' -> dst: 'b' *

> > +    checking for directory renames

> >    resolving manifests

> >     branchmerge: True, force: True, partial: False

> >     ancestor: 4c60f11aa304, local: 1905859650ec+, remote: 9c233e8e184d

> > @@ -427,8 +441,8 @@

> >    $ hg graft 3 --log -u foo

> >    grafting 3:4c60f11aa304 "3"

> >    warning: can't find ancestor for 'c' copied from 'b'!

> > -  $ hg log --template '{rev} {parents} {desc}\n' -r tip

> > -  14 1:5d205f8b35b6  3

> > +  $ hg log --template '{rev}:{node|short} {parents} {desc}\n' -r tip

> > +  14:0c921c65ef1e 1:5d205f8b35b6  3

> >    (grafted from 4c60f11aa304a54ae1c199feb94e7fc771e51ed8)

> >

> >  Resolve conflicted graft

> > @@ -620,7 +634,7 @@

> >    date:        Thu Jan 01 00:00:00 1970 +0000

> >    summary:     2

> >

> > -  changeset:   14:f64defefacee

> > +  changeset:   14:0c921c65ef1e

> >    parent:      1:5d205f8b35b6

> >    user:        foo

> >    date:        Thu Jan 01 00:00:00 1970 +0000

> > @@ -842,3 +856,290 @@

> >    |/

> >    o  0

> >

> > +Graft from behind a move or rename

>

> Use regularly use ReST like title to mark important section so make it

>

> Graft from behind a move or rename

> ==================================


Done.

>

>

> > +NOTE: This is affected by bug 5343, and will need updating when it's fixed

>

> Please use 'issue5343' to help future archeology'


Done.

>

> > +f50 tests a "ping-pong" rename case, where a file is renamed to the same

> name

> > +on both branches, then the rename is backed out on one branch, and the

> backout

> > +is grafted to the other branch. This creates a challenging b <- a <- b -> a

> > +rename sequence in the graft target, topological CA, graft CA and graft

> source,

> > +respectively. Since rename detection will run on the target side for such a

> > +sequence (as for technical reasons, we split the source and target sides

> not at

> > +the graft CA, but at the topological CA), it will pick up a false rename, and

> > +cause a spurious merge conflict. This false rename is always exactly the

> > +reverse of the true rename that would be detected on the source side, so

> deal

> > +with it by detecting this condition and reversing as necessary.

>

> I would probably have started with a less complexe test case (but I've

> to admit, that one looks fun)


This is not actually the first testcase, rather, it's the only one that requires a detailed explanation.

>

> > +

> > +  $ hg init ../graftmove

> > +  $ cd ../graftmove

> > +  $ echo c10 > f10

> > +  $ echo c20 > f20

> > +  $ echo c30 > f30

> > +  $ echo c40 > f40

> > +  $ echo c50 > f50

> > +  $ hg ci -qAm 0

>

> Please use Letter for your commit message, otherwise it creates

> confusion between revision number and commit message, especially when

> commit starts to be amended/rebased/grafted.

>

> You could also use descriptive commit message if that can help the

> readability of your changes.


Done.

The file names/contents and the commit messages are intended to be aligned,
for easy identification of where a change comes from.

>

> > +  $ hg mv f10 f11

> > +  $ hg mv f30 f31

> > +  $ hg mv f50 f51

> > +  $ hg ci -qAm 1

> > +  $ echo c12 > f11

> > +  $ hg mv f20 f22

> > +  $ hg mv f51 f50

> > +  $ hg ci -qAm 2

> > +  $ hg mv f31 f33

> > +  $ echo c43 > f40

> > +  $ hg ci -qAm 3

> > +  $ hg up -q 0

>

> At that point, it is probably worthwhile to include a graphlog output to

> have a vision of the resulting stack (even if in this case it turn out

> to be quick linear and simple).


Done.

>

> Shouldn't '2' alter the content of 'f50' so that we can check it get

> properly applied to 'f50' ? (if not, this is probably a sign that we

> need more details about what individual graft are testing .


Good idea, done. Originally, I was just testing that the rename itself gets
grafted correctly.

>

> > +  $ hg graft -r 2

> > +  grafting 2:52f80b3a6c3e "2"

> > +  merging f10 and f11 to f10

> > +  warning: can't find ancestor for 'f50' copied from 'f51'!

>

> What is this warning about ?


This comes from the "record" phase of the graft. It tries to record
"f50" as copied from "f51" in the graft commit, but the graft's
parent doesn't contain a file with that name.

This is related to bug 5343. We shouldn't be trying to record
any copy information for convergent renames like this.

>

> > +  $ hg status --change .

> > +  M f10

> > +  A f22

> > +  R f20

> > +  $ hg cat f10

> > +  c12

> > +  $ hg cat f11

> > +  f11: no such file in rev 54f7d8995b01

> > +  [1]

> > +  $ hg graft -r 3

> > +  grafting 3:0f6524134ac0 "3"

> > +  note: possible conflict - f31 was renamed multiple times to:

> > +   f33

> > +   f30

> > +  warning: can't find ancestor for 'f33' copied from 'f31'!

>

> Again, I need some explanation of what we are testing here.


This is a test of divergent rename handling (the "note" right before the warning).
The warning is the same as before, only this time it's not as obvious what the correct
thing would be to do.
It's debatable if recording f33 as  renamed from f31 is correct in this case - this is something
that needs to be discussed in bug 5343.

>

> > +  $ hg up -q 0

> > +  $ hg mv f10 f16

> > +  $ echo c26 > f20

> > +  $ hg mv f30 f36

> > +  $ hg mv f40 f46

> > +  $ hg mv f50 f51

> > +  $ hg ci -qAm 6

>

> Probably need some graph output here for tracking.


Done.

>

> > +  $ hg graft -r 2

> > +  grafting 2:52f80b3a6c3e "2"

> > +  merging f16 and f11 to f16

> > +  merging f20 and f22 to f22

> > +  $ hg graft -r 3

> > +  grafting 3:0f6524134ac0 "3"

> > +  note: possible conflict - f31 was renamed multiple times to:

> > +   f36

> > +   f33

> > +  merging f46 and f40 to f46

> > +  warning: can't find ancestor for 'f33' copied from 'f31'!

>

> Same here, that warning is confusing me and I'm not sure what we are

> testing.


Again, divergent rename test (this time a more complicated case involving
the incomplete divergence concept), triggering a variant of bug 5343.

>

> > +  $ hg log -CGv --patch --git

> > +  @  changeset:   8:db9925d9208e

> > +  |  tag:         tip

> > +  |  user:        test

> > +  |  date:        Thu Jan 01 00:00:00 1970 +0000

> > +  |  files:       f33 f46

> > +  |  description:

> > +  |  3

> > +  |

> > +  |

> > +  |  diff --git a/f33 b/f33

> > +  |  new file mode 100644

> > +  |  --- /dev/null

> > +  |  +++ b/f33

> > +  |  @@ -0,0 +1,1 @@

> > +  |  +c30

> > +  |  diff --git a/f46 b/f46

> > +  |  --- a/f46

> > +  |  +++ b/f46

> > +  |  @@ -1,1 +1,1 @@

> > +  |  -c40

> > +  |  +c43

> > +  |

> > +  o  changeset:   7:08113e8ff5c8

> > +  |  user:        test

> > +  |  date:        Thu Jan 01 00:00:00 1970 +0000

> > +  |  files:       f16 f20 f22 f50 f51

> > +  |  copies:      f22 (f20) f50 (f51)

> > +  |  description:

> > +  |  2

> > +  |

> > +  |

> > +  |  diff --git a/f16 b/f16

> > +  |  --- a/f16

> > +  |  +++ b/f16

> > +  |  @@ -1,1 +1,1 @@

> > +  |  -c10

> > +  |  +c12

> > +  |  diff --git a/f20 b/f22

> > +  |  rename from f20

> > +  |  rename to f22

> > +  |  diff --git a/f51 b/f50

> > +  |  rename from f51

> > +  |  rename to f50

> > +  |

> > +  o  changeset:   6:365d1fa57acb

> > +  |  parent:      0:1060512e332e

> > +  |  user:        test

> > +  |  date:        Thu Jan 01 00:00:00 1970 +0000

> > +  |  files:       f10 f16 f20 f30 f36 f40 f46 f50 f51

> > +  |  copies:      f16 (f10) f36 (f30) f46 (f40) f51 (f50)

> > +  |  description:

> > +  |  6

> > +  |

> > +  |

> > +  |  diff --git a/f10 b/f16

> > +  |  rename from f10

> > +  |  rename to f16

> > +  |  diff --git a/f20 b/f20

> > +  |  --- a/f20

> > +  |  +++ b/f20

> > +  |  @@ -1,1 +1,1 @@

> > +  |  -c20

> > +  |  +c26

> > +  |  diff --git a/f30 b/f36

> > +  |  rename from f30

> > +  |  rename to f36

> > +  |  diff --git a/f40 b/f46

> > +  |  rename from f40

> > +  |  rename to f46

> > +  |  diff --git a/f50 b/f51

> > +  |  rename from f50

> > +  |  rename to f51

> > +  |

> > +  | o  changeset:   5:d5015e1a1de5

> > +  | |  user:        test

> > +  | |  date:        Thu Jan 01 00:00:00 1970 +0000

> > +  | |  files:       f33 f40

> > +  | |  description:

> > +  | |  3

> > +  | |

> > +  | |

> > +  | |  diff --git a/f33 b/f33

> > +  | |  new file mode 100644

> > +  | |  --- /dev/null

> > +  | |  +++ b/f33

> > +  | |  @@ -0,0 +1,1 @@

> > +  | |  +c30

> > +  | |  diff --git a/f40 b/f40

> > +  | |  --- a/f40

> > +  | |  +++ b/f40

> > +  | |  @@ -1,1 +1,1 @@

> > +  | |  -c40

> > +  | |  +c43

> > +  | |

> > +  | o  changeset:   4:54f7d8995b01

> > +  |/   parent:      0:1060512e332e

> > +  |    user:        test

> > +  |    date:        Thu Jan 01 00:00:00 1970 +0000

> > +  |    files:       f10 f20 f22

> > +  |    copies:      f22 (f20)

> > +  |    description:

> > +  |    2

> > +  |

> > +  |

> > +  |    diff --git a/f10 b/f10

> > +  |    --- a/f10

> > +  |    +++ b/f10

> > +  |    @@ -1,1 +1,1 @@

> > +  |    -c10

> > +  |    +c12

> > +  |    diff --git a/f20 b/f22

> > +  |    rename from f20

> > +  |    rename to f22

> > +  |

> > +  | o  changeset:   3:0f6524134ac0

> > +  | |  user:        test

> > +  | |  date:        Thu Jan 01 00:00:00 1970 +0000

> > +  | |  files:       f31 f33 f40

> > +  | |  copies:      f33 (f31)

> > +  | |  description:

> > +  | |  3

> > +  | |

> > +  | |

> > +  | |  diff --git a/f31 b/f33

> > +  | |  rename from f31

> > +  | |  rename to f33

> > +  | |  diff --git a/f40 b/f40

> > +  | |  --- a/f40

> > +  | |  +++ b/f40

> > +  | |  @@ -1,1 +1,1 @@

> > +  | |  -c40

> > +  | |  +c43

> > +  | |

> > +  | o  changeset:   2:52f80b3a6c3e

> > +  | |  user:        test

> > +  | |  date:        Thu Jan 01 00:00:00 1970 +0000

> > +  | |  files:       f11 f20 f22 f50 f51

> > +  | |  copies:      f22 (f20) f50 (f51)

> > +  | |  description:

> > +  | |  2

> > +  | |

> > +  | |

> > +  | |  diff --git a/f11 b/f11

> > +  | |  --- a/f11

> > +  | |  +++ b/f11

> > +  | |  @@ -1,1 +1,1 @@

> > +  | |  -c10

> > +  | |  +c12

> > +  | |  diff --git a/f20 b/f22

> > +  | |  rename from f20

> > +  | |  rename to f22

> > +  | |  diff --git a/f51 b/f50

> > +  | |  rename from f51

> > +  | |  rename to f50

> > +  | |

> > +  | o  changeset:   1:7607a972f96d

> > +  |/   user:        test

> > +  |    date:        Thu Jan 01 00:00:00 1970 +0000

> > +  |    files:       f10 f11 f30 f31 f50 f51

> > +  |    copies:      f11 (f10) f31 (f30) f51 (f50)

> > +  |    description:

> > +  |    1

> > +  |

> > +  |

> > +  |    diff --git a/f10 b/f11

> > +  |    rename from f10

> > +  |    rename to f11

> > +  |    diff --git a/f30 b/f31

> > +  |    rename from f30

> > +  |    rename to f31

> > +  |    diff --git a/f50 b/f51

> > +  |    rename from f50

> > +  |    rename to f51

> > +  |

> > +  o  changeset:   0:1060512e332e

> > +     user:        test

> > +     date:        Thu Jan 01 00:00:00 1970 +0000

> > +     files:       f10 f20 f30 f40 f50

> > +     description:

> > +     0

> > +

> > +

> > +     diff --git a/f10 b/f10

> > +     new file mode 100644

> > +     --- /dev/null

> > +     +++ b/f10

> > +     @@ -0,0 +1,1 @@

> > +     +c10

> > +     diff --git a/f20 b/f20

> > +     new file mode 100644

> > +     --- /dev/null

> > +     +++ b/f20

> > +     @@ -0,0 +1,1 @@

> > +     +c20

> > +     diff --git a/f30 b/f30

> > +     new file mode 100644

> > +     --- /dev/null

> > +     +++ b/f30

> > +     @@ -0,0 +1,1 @@

> > +     +c30

> > +     diff --git a/f40 b/f40

> > +     new file mode 100644

> > +     --- /dev/null

> > +     +++ b/f40

> > +     @@ -0,0 +1,1 @@

> > +     +c40

> > +     diff --git a/f50 b/f50

> > +     new file mode 100644

> > +     --- /dev/null

> > +     +++ b/f50

> > +     @@ -0,0 +1,1 @@

> > +     +c50

> > +

> > +  $ hg cat f22

> > +  c26

> > diff --git a/tests/test-rebase-conflicts.t b/tests/test-rebase-conflicts.t

> > --- a/tests/test-rebase-conflicts.t

> > +++ b/tests/test-rebase-conflicts.t

> > @@ -238,6 +238,10 @@

> >     merge against 9:e31216eec445

> >       detach base 8:8e4e2c1a07ae

> >      searching for copies back to rev 3

> > +    computing unmatched files in rotated DAG

> > +    computing unmatched files in unrotated DAG

> > +    unmatched files in other:

> > +     f2.txt

> >    resolving manifests

> >     branchmerge: True, force: True, partial: False

> >     ancestor: 8e4e2c1a07ae, local: 4bc80088dc6b+, remote: e31216eec445

> > @@ -255,6 +259,10 @@

> >     merge against 10:2f2496ddf49d

> >       detach base 9:e31216eec445

> >      searching for copies back to rev 3

> > +    computing unmatched files in rotated DAG

> > +    computing unmatched files in unrotated DAG

> > +    unmatched files in other:

> > +     f2.txt

> >    resolving manifests

> >     branchmerge: True, force: True, partial: False

> >     ancestor: e31216eec445, local: 19c888675e13+, remote: 2f2496ddf49d

> > _______________________________________________

> > Mercurial-devel mailing list

> > Mercurial-devel@mercurial-scm.org

> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

> >

>

> --

> Pierre-Yves David

Patch

diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -321,6 +321,20 @@ 
     if repo.ui.configbool('experimental', 'disablecopytrace'):
         return {}, {}, {}, {}
 
+    # In certain scenarios (e.g. graft, update or rebase), ca can be overridden
+    # We still need to know a real common ancestor in this case
+    # We can't just compute _c1.ancestor(_c2) and compare it to ca, because
+    # there can be multiple common ancestors, e.g. in case of bidmerge.
+    cta = ca
+    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
+    _c1 = c1.p1() if c1.rev() is None else c1
+    _c2 = c2.p1() if c2.rev() is None else c2
+    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
+    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
+    graft = dirty_c1 or dirty_c2
+    if graft:
+        cta = _c1.ancestor(_c2)
+
     limit = _findlimit(repo, c1.rev(), c2.rev())
     if limit is None:
         # no common ancestor, no copies
@@ -330,28 +344,56 @@ 
     m1 = c1.manifest()
     m2 = c2.manifest()
     ma = ca.manifest()
+    mta = cta.manifest()
 
-    copy1, copy2, = {}, {}
+    # see checkcopies documentation below for these dicts
+    copy1, copy2 = {}, {}
+    incomplete1, incomplete2 = {}, {}
     movewithdir1, movewithdir2 = {}, {}
     fullcopy1, fullcopy2 = {}, {}
-    diverge = {}
+    diverge, incompletediverge = {}, {}
 
     # find interesting file sets from manifests
+    if graft:
+        repo.ui.debug("  computing unmatched files in rotated DAG\n")
     addedinm1 = m1.filesnotin(ma)
     addedinm2 = m2.filesnotin(ma)
-    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
+    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
+    if not graft:
+        u1, u2 = _u1, _u2
+    else: # need to recompute this for directory move handling when grafting
+        repo.ui.debug("  computing unmatched files in unrotated DAG\n")
+        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),
+                                                  m2.filesnotin(mta))
+
     bothnew = sorted(addedinm1 & addedinm2)
 
     for f in u1:
-        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)
+        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,
+                    fullcopy1, incomplete1, incompletediverge)
 
     for f in u2:
-        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)
+        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,
+                    fullcopy2, incomplete2, incompletediverge)
 
     copy = dict(copy1.items() + copy2.items())
     movewithdir = dict(movewithdir1.items() + movewithdir2.items())
     fullcopy = dict(fullcopy1.items() + fullcopy2.items())
 
+    # combine partial copy paths discovered in the previous step
+    copyfrom, copyto = incomplete1, incomplete2
+    if dirty_c1:
+        copyfrom, copyto = incomplete2, incomplete1
+    for f in copyfrom:
+        if f in copyto:
+            copy[copyto[f]] = copyfrom[f]
+            del copyto[f]
+    for f in incompletediverge:
+        assert f not in diverge
+        ic = incompletediverge[f]
+        if ic[0] in copyto:
+            diverge[f] = [copyto[ic[0]], ic[1]]
+
     renamedelete = {}
     renamedeleteset = set()
     divergeset = set()
@@ -369,10 +411,29 @@ 
     if bothnew:
         repo.ui.debug("  unmatched files new in both:\n   %s\n"
                       % "\n   ".join(bothnew))
-    bothdiverge, _copy, _fullcopy = {}, {}, {}
+    bothdiverge = {}
+    _copy, _fullcopy = {}, {} # dummy dicts
+    _incomplete1, _incomplete2, _diverge = {}, {}, {}
     for f in bothnew:
-        checkcopies(c1, f, m1, m2, ca, limit, bothdiverge, _copy, _fullcopy)
-        checkcopies(c2, f, m2, m1, ca, limit, bothdiverge, _copy, _fullcopy)
+        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, bothdiverge,
+                    _copy, _fullcopy, _incomplete1, _diverge)
+        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, bothdiverge,
+                    _copy, _fullcopy, _incomplete2, _diverge)
+    _incomplete = _incomplete2
+    if dirty_c1:
+        _incomplete = _incomplete1
+        assert _incomplete2 == {}
+    else:
+        assert _incomplete1 == {}
+    for f in _diverge:
+        assert f not in bothdiverge
+        ic = _diverge[f]
+        if ic[0] in _incomplete:
+            bothdiverge[f] = [_incomplete[ic[0]], ic[1]]
+        elif ic[0] in (m1 if dirty_c1 else m2):
+            # backed-out rename on one side, but watch out for deleted files
+            bothdiverge[f] = ic
+
     for of, fl in bothdiverge.items():
         if len(fl) == 2 and fl[0] == fl[1]:
             copy[fl[0]] = of # not actually divergent, just matching renames
@@ -438,7 +499,7 @@ 
                       (d, dirmove[d]))
 
     # check unaccounted nonoverlapping files against directory moves
-    for f in u1 + u2:
+    for f in _u1 + _u2:
         if f not in fullcopy:
             for d in dirmove:
                 if f.startswith(d):
@@ -452,7 +513,8 @@ 
 
     return copy, movewithdir, diverge, renamedelete
 
-def checkcopies(ctx, f, m1, m2, ca, limit, diverge, copy, fullcopy):
+def checkcopies(ctx, f, m1, m2, ca, cta, remote_ca, limit, diverge, copy,
+                fullcopy, incomplete, incompletediverge):
     """
     check possible copies of f from m1 to m2
 
@@ -460,14 +522,20 @@ 
     f = the filename to check
     m1 = the source manifest
     m2 = the destination manifest
-    ca = the changectx of the common ancestor
+    ca = the changectx of the common ancestor, overridden on graft
+    cta = topological common ancestor for graft-like scenarios
+    remote_ca = True if ca is outside cta::m1, False otherwise
     limit = the rev number to not search beyond
     diverge = record all diverges in this dict
     copy = record all non-divergent copies in this dict
     fullcopy = record all copies in this dict
+    incomplete = record non-divergent partial copies here
+    incompletediverge = record divergent partial copies here
     """
 
     ma = ca.manifest()
+    mta = cta.manifest()
+    backwards = ca != cta and not remote_ca and f in ma
     getfctx = _makegetfctx(ctx)
 
     def _related(f1, f2, limit):
@@ -502,31 +570,57 @@ 
             return False
 
     of = None
-    seen = set([f])
-    for oc in getfctx(f, m1[f]).ancestors():
+    seen = {f: [getfctx(f, m1[f])]}
+    for oc in seen[f][0].ancestors():
         ocr = oc.linkrev()
         of = oc.path()
         if of in seen:
+            seen[of].append(oc)
             # check limit late - grab last rename before
             if ocr < limit:
                 break
             continue
-        seen.add(of)
+        seen[of] = [oc]
 
-        fullcopy[f] = of # remember for dir rename detection
+        # remember for dir rename detection
+        if backwards:
+            fullcopy[of] = f # grafting backwards through renames
+        else:
+            fullcopy[f] = of
         if of not in m2:
             continue # no match, keep looking
         if m2[of] == ma.get(of):
-            break # no merge needed, quit early
+            return # no merge needed, quit early
         c2 = getfctx(of, m2[of])
-        cr = _related(oc, c2, ca.rev())
+        cr = _related(oc, c2, cta.rev())
         if cr and (of == f or of == c2.path()): # non-divergent
-            copy[f] = of
-            of = None
-            break
+            if backwards:
+                copy[of] = f
+            elif of in ma:
+                copy[f] = of
+            elif remote_ca: # special case: a <- b <- a -> b "ping-pong" rename
+                copy[of] = f
+                del fullcopy[f]
+                fullcopy[of] = f
+            else: # divergence w.r.t. graft CA on one side of topological CA
+                for sf in seen:
+                    if sf in ma and getfctx(sf, ma[sf]) in seen[sf]:
+                        assert sf not in diverge
+                        diverge[sf] = [f, of]
+                        break
+            return
 
-    if of in ma:
-        diverge.setdefault(of, []).append(f)
+    if of in mta:
+        if backwards or remote_ca:
+            incomplete[of] = f
+        else:
+            for sf in seen:
+                if sf in ma and getfctx(sf, ma[sf]) in seen[sf]:
+                    if cta == ca:
+                        diverge.setdefault(sf, []).append(f)
+                    else:
+                        incompletediverge[sf] = [of, f]
+                    return
 
 def duplicatecopies(repo, rev, fromrev, skiprev=None):
     '''reproduce copies from fromrev to rev in the dirstate
diff --git a/tests/test-graft.t b/tests/test-graft.t
--- a/tests/test-graft.t
+++ b/tests/test-graft.t
@@ -179,6 +179,13 @@ 
   committing changelog
   grafting 5:97f8bfe72746 "5"
     searching for copies back to rev 1
+    computing unmatched files in rotated DAG
+    computing unmatched files in unrotated DAG
+    unmatched files in other:
+     c
+    all copies found (* = to merge, ! = divergent, % = renamed and deleted):
+     src: 'c' -> dst: 'b' *
+    checking for directory renames
   resolving manifests
    branchmerge: True, force: True, partial: False
    ancestor: 4c60f11aa304, local: 6b9e5368ca4e+, remote: 97f8bfe72746
@@ -193,6 +200,13 @@ 
   scanning for duplicate grafts
   grafting 4:9c233e8e184d "4"
     searching for copies back to rev 1
+    computing unmatched files in rotated DAG
+    computing unmatched files in unrotated DAG
+    unmatched files in other:
+     c
+    all copies found (* = to merge, ! = divergent, % = renamed and deleted):
+     src: 'c' -> dst: 'b' *
+    checking for directory renames
   resolving manifests
    branchmerge: True, force: True, partial: False
    ancestor: 4c60f11aa304, local: 1905859650ec+, remote: 9c233e8e184d
@@ -427,8 +441,8 @@ 
   $ hg graft 3 --log -u foo
   grafting 3:4c60f11aa304 "3"
   warning: can't find ancestor for 'c' copied from 'b'!
-  $ hg log --template '{rev} {parents} {desc}\n' -r tip
-  14 1:5d205f8b35b6  3
+  $ hg log --template '{rev}:{node|short} {parents} {desc}\n' -r tip
+  14:0c921c65ef1e 1:5d205f8b35b6  3
   (grafted from 4c60f11aa304a54ae1c199feb94e7fc771e51ed8)
 
 Resolve conflicted graft
@@ -620,7 +634,7 @@ 
   date:        Thu Jan 01 00:00:00 1970 +0000
   summary:     2
   
-  changeset:   14:f64defefacee
+  changeset:   14:0c921c65ef1e
   parent:      1:5d205f8b35b6
   user:        foo
   date:        Thu Jan 01 00:00:00 1970 +0000
@@ -842,3 +856,290 @@ 
   |/
   o  0
   
+Graft from behind a move or rename
+NOTE: This is affected by bug 5343, and will need updating when it's fixed
+
+f50 tests a "ping-pong" rename case, where a file is renamed to the same name
+on both branches, then the rename is backed out on one branch, and the backout
+is grafted to the other branch. This creates a challenging b <- a <- b -> a
+rename sequence in the graft target, topological CA, graft CA and graft source,
+respectively. Since rename detection will run on the target side for such a
+sequence (as for technical reasons, we split the source and target sides not at
+the graft CA, but at the topological CA), it will pick up a false rename, and
+cause a spurious merge conflict. This false rename is always exactly the
+reverse of the true rename that would be detected on the source side, so deal
+with it by detecting this condition and reversing as necessary.
+
+  $ hg init ../graftmove
+  $ cd ../graftmove
+  $ echo c10 > f10
+  $ echo c20 > f20
+  $ echo c30 > f30
+  $ echo c40 > f40
+  $ echo c50 > f50
+  $ hg ci -qAm 0
+  $ hg mv f10 f11
+  $ hg mv f30 f31
+  $ hg mv f50 f51
+  $ hg ci -qAm 1
+  $ echo c12 > f11
+  $ hg mv f20 f22
+  $ hg mv f51 f50
+  $ hg ci -qAm 2
+  $ hg mv f31 f33
+  $ echo c43 > f40
+  $ hg ci -qAm 3
+  $ hg up -q 0
+  $ hg graft -r 2
+  grafting 2:52f80b3a6c3e "2"
+  merging f10 and f11 to f10
+  warning: can't find ancestor for 'f50' copied from 'f51'!
+  $ hg status --change .
+  M f10
+  A f22
+  R f20
+  $ hg cat f10
+  c12
+  $ hg cat f11
+  f11: no such file in rev 54f7d8995b01
+  [1]
+  $ hg graft -r 3
+  grafting 3:0f6524134ac0 "3"
+  note: possible conflict - f31 was renamed multiple times to:
+   f33
+   f30
+  warning: can't find ancestor for 'f33' copied from 'f31'!
+  $ hg up -q 0
+  $ hg mv f10 f16
+  $ echo c26 > f20
+  $ hg mv f30 f36
+  $ hg mv f40 f46
+  $ hg mv f50 f51
+  $ hg ci -qAm 6
+  $ hg graft -r 2
+  grafting 2:52f80b3a6c3e "2"
+  merging f16 and f11 to f16
+  merging f20 and f22 to f22
+  $ hg graft -r 3
+  grafting 3:0f6524134ac0 "3"
+  note: possible conflict - f31 was renamed multiple times to:
+   f36
+   f33
+  merging f46 and f40 to f46
+  warning: can't find ancestor for 'f33' copied from 'f31'!
+  $ hg log -CGv --patch --git
+  @  changeset:   8:db9925d9208e
+  |  tag:         tip
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  files:       f33 f46
+  |  description:
+  |  3
+  |
+  |
+  |  diff --git a/f33 b/f33
+  |  new file mode 100644
+  |  --- /dev/null
+  |  +++ b/f33
+  |  @@ -0,0 +1,1 @@
+  |  +c30
+  |  diff --git a/f46 b/f46
+  |  --- a/f46
+  |  +++ b/f46
+  |  @@ -1,1 +1,1 @@
+  |  -c40
+  |  +c43
+  |
+  o  changeset:   7:08113e8ff5c8
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  files:       f16 f20 f22 f50 f51
+  |  copies:      f22 (f20) f50 (f51)
+  |  description:
+  |  2
+  |
+  |
+  |  diff --git a/f16 b/f16
+  |  --- a/f16
+  |  +++ b/f16
+  |  @@ -1,1 +1,1 @@
+  |  -c10
+  |  +c12
+  |  diff --git a/f20 b/f22
+  |  rename from f20
+  |  rename to f22
+  |  diff --git a/f51 b/f50
+  |  rename from f51
+  |  rename to f50
+  |
+  o  changeset:   6:365d1fa57acb
+  |  parent:      0:1060512e332e
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  files:       f10 f16 f20 f30 f36 f40 f46 f50 f51
+  |  copies:      f16 (f10) f36 (f30) f46 (f40) f51 (f50)
+  |  description:
+  |  6
+  |
+  |
+  |  diff --git a/f10 b/f16
+  |  rename from f10
+  |  rename to f16
+  |  diff --git a/f20 b/f20
+  |  --- a/f20
+  |  +++ b/f20
+  |  @@ -1,1 +1,1 @@
+  |  -c20
+  |  +c26
+  |  diff --git a/f30 b/f36
+  |  rename from f30
+  |  rename to f36
+  |  diff --git a/f40 b/f46
+  |  rename from f40
+  |  rename to f46
+  |  diff --git a/f50 b/f51
+  |  rename from f50
+  |  rename to f51
+  |
+  | o  changeset:   5:d5015e1a1de5
+  | |  user:        test
+  | |  date:        Thu Jan 01 00:00:00 1970 +0000
+  | |  files:       f33 f40
+  | |  description:
+  | |  3
+  | |
+  | |
+  | |  diff --git a/f33 b/f33
+  | |  new file mode 100644
+  | |  --- /dev/null
+  | |  +++ b/f33
+  | |  @@ -0,0 +1,1 @@
+  | |  +c30
+  | |  diff --git a/f40 b/f40
+  | |  --- a/f40
+  | |  +++ b/f40
+  | |  @@ -1,1 +1,1 @@
+  | |  -c40
+  | |  +c43
+  | |
+  | o  changeset:   4:54f7d8995b01
+  |/   parent:      0:1060512e332e
+  |    user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    files:       f10 f20 f22
+  |    copies:      f22 (f20)
+  |    description:
+  |    2
+  |
+  |
+  |    diff --git a/f10 b/f10
+  |    --- a/f10
+  |    +++ b/f10
+  |    @@ -1,1 +1,1 @@
+  |    -c10
+  |    +c12
+  |    diff --git a/f20 b/f22
+  |    rename from f20
+  |    rename to f22
+  |
+  | o  changeset:   3:0f6524134ac0
+  | |  user:        test
+  | |  date:        Thu Jan 01 00:00:00 1970 +0000
+  | |  files:       f31 f33 f40
+  | |  copies:      f33 (f31)
+  | |  description:
+  | |  3
+  | |
+  | |
+  | |  diff --git a/f31 b/f33
+  | |  rename from f31
+  | |  rename to f33
+  | |  diff --git a/f40 b/f40
+  | |  --- a/f40
+  | |  +++ b/f40
+  | |  @@ -1,1 +1,1 @@
+  | |  -c40
+  | |  +c43
+  | |
+  | o  changeset:   2:52f80b3a6c3e
+  | |  user:        test
+  | |  date:        Thu Jan 01 00:00:00 1970 +0000
+  | |  files:       f11 f20 f22 f50 f51
+  | |  copies:      f22 (f20) f50 (f51)
+  | |  description:
+  | |  2
+  | |
+  | |
+  | |  diff --git a/f11 b/f11
+  | |  --- a/f11
+  | |  +++ b/f11
+  | |  @@ -1,1 +1,1 @@
+  | |  -c10
+  | |  +c12
+  | |  diff --git a/f20 b/f22
+  | |  rename from f20
+  | |  rename to f22
+  | |  diff --git a/f51 b/f50
+  | |  rename from f51
+  | |  rename to f50
+  | |
+  | o  changeset:   1:7607a972f96d
+  |/   user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    files:       f10 f11 f30 f31 f50 f51
+  |    copies:      f11 (f10) f31 (f30) f51 (f50)
+  |    description:
+  |    1
+  |
+  |
+  |    diff --git a/f10 b/f11
+  |    rename from f10
+  |    rename to f11
+  |    diff --git a/f30 b/f31
+  |    rename from f30
+  |    rename to f31
+  |    diff --git a/f50 b/f51
+  |    rename from f50
+  |    rename to f51
+  |
+  o  changeset:   0:1060512e332e
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     files:       f10 f20 f30 f40 f50
+     description:
+     0
+  
+  
+     diff --git a/f10 b/f10
+     new file mode 100644
+     --- /dev/null
+     +++ b/f10
+     @@ -0,0 +1,1 @@
+     +c10
+     diff --git a/f20 b/f20
+     new file mode 100644
+     --- /dev/null
+     +++ b/f20
+     @@ -0,0 +1,1 @@
+     +c20
+     diff --git a/f30 b/f30
+     new file mode 100644
+     --- /dev/null
+     +++ b/f30
+     @@ -0,0 +1,1 @@
+     +c30
+     diff --git a/f40 b/f40
+     new file mode 100644
+     --- /dev/null
+     +++ b/f40
+     @@ -0,0 +1,1 @@
+     +c40
+     diff --git a/f50 b/f50
+     new file mode 100644
+     --- /dev/null
+     +++ b/f50
+     @@ -0,0 +1,1 @@
+     +c50
+  
+  $ hg cat f22
+  c26
diff --git a/tests/test-rebase-conflicts.t b/tests/test-rebase-conflicts.t
--- a/tests/test-rebase-conflicts.t
+++ b/tests/test-rebase-conflicts.t
@@ -238,6 +238,10 @@ 
    merge against 9:e31216eec445
      detach base 8:8e4e2c1a07ae
     searching for copies back to rev 3
+    computing unmatched files in rotated DAG
+    computing unmatched files in unrotated DAG
+    unmatched files in other:
+     f2.txt
   resolving manifests
    branchmerge: True, force: True, partial: False
    ancestor: 8e4e2c1a07ae, local: 4bc80088dc6b+, remote: e31216eec445
@@ -255,6 +259,10 @@ 
    merge against 10:2f2496ddf49d
      detach base 9:e31216eec445
     searching for copies back to rev 3
+    computing unmatched files in rotated DAG
+    computing unmatched files in unrotated DAG
+    unmatched files in other:
+     f2.txt
   resolving manifests
    branchmerge: True, force: True, partial: False
    ancestor: e31216eec445, local: 19c888675e13+, remote: 2f2496ddf49d