Patchwork merge: extract _resolvetrivial() function

login
register
mail settings
Submitter Martin von Zweigbergk
Date Dec. 10, 2014, 9:18 p.m.
Message ID <9b8dc07b08a2721d1db6.1418246307@martinvonz.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/7050/
State Accepted
Commit 416c133145ee78c8e83865b7370e185eed69c1be
Headers show

Comments

Martin von Zweigbergk - Dec. 10, 2014, 9:18 p.m.
# HG changeset patch
# User Martin von Zweigbergk <martinvonz@google.com>
# Date 1417643428 28800
#      Wed Dec 03 13:50:28 2014 -0800
# Node ID 9b8dc07b08a2721d1db6f218bee06880a40d168d
# Parent  07fcb956212223676b35055d39dddac88ca990b6
merge: extract _resolvetrivial() function

We would eventually like to move the resolution of modify/delete and
delete/modify conflicts to the resolve phase. However, we don't want
to move the checks for identical content that were added in
902554884335 (merge: before cd/dc prompt, check that changed side
really changed, 2014-12-01). Let's instead move these out to a new
_resolvetrivial() function that processes the actions from
manifestmerge() and replaces any false cd/dc conflicts. The function
will also provide a natural place for us to later add code for
resolving false 'm' conflicts.
Matt Mackall - Dec. 10, 2014, 11:18 p.m.
On Wed, 2014-12-10 at 13:18 -0800, Martin von Zweigbergk wrote:
> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz@google.com>
> # Date 1417643428 28800
> #      Wed Dec 03 13:50:28 2014 -0800
> # Node ID 9b8dc07b08a2721d1db6f218bee06880a40d168d
> # Parent  07fcb956212223676b35055d39dddac88ca990b6
> merge: extract _resolvetrivial() function

Queued for default, thanks.
Mads Kiilerich - Dec. 10, 2014, 11:43 p.m.
On 12/10/2014 10:18 PM, Martin von Zweigbergk wrote:
> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz@google.com>
> # Date 1417643428 28800
> #      Wed Dec 03 13:50:28 2014 -0800
> # Node ID 9b8dc07b08a2721d1db6f218bee06880a40d168d
> # Parent  07fcb956212223676b35055d39dddac88ca990b6
> merge: extract _resolvetrivial() function
>
> We would eventually like to move the resolution of modify/delete and
> delete/modify conflicts to the resolve phase. However, we don't want
> to move the checks for identical content that were added in
> 902554884335 (merge: before cd/dc prompt, check that changed side
> really changed, 2014-12-01). Let's instead move these out to a new
> _resolvetrivial() function that processes the actions from
> manifestmerge() and replaces any false cd/dc conflicts. The function
> will also provide a natural place for us to later add code for
> resolving false 'm' conflicts.

Hmm ... it doesn't sound wrong ... but so far we don't have any use of 
it and it doesn't add much value by itself. It could perhaps wait until 
we have some use of it and know it is going in the right direction?

Resolving 'm' conflicts in this function seems to imply that we 
duplicate a lot of manifestmerge in this function. Meh.

More important, I would like to somehow move this "changed but same" 
check into manifestmerge, before scheduling any wrong actions. Bid merge 
needs that so it avoids picking unnecessarily hard actions - especially 
merge actions, but also cd/dc. It would be an unfortunate layering 
violation to consider actual file content in manifestmerge and it could 
be costly to retrieve file content an extra time and at the wrong time 
... but that seems to be what it takes. This patch will not help us much 
going in that direction ... if we should end up going in that direction.

/Mads
Martin von Zweigbergk - Dec. 10, 2014, 11:53 p.m.
On Wed Dec 10 2014 at 3:44:00 PM Mads Kiilerich <mads@kiilerich.com> wrote:

> On 12/10/2014 10:18 PM, Martin von Zweigbergk wrote:
> > # HG changeset patch
> > # User Martin von Zweigbergk <martinvonz@google.com>
> > # Date 1417643428 28800
> > #      Wed Dec 03 13:50:28 2014 -0800
> > # Node ID 9b8dc07b08a2721d1db6f218bee06880a40d168d
> > # Parent  07fcb956212223676b35055d39dddac88ca990b6
> > merge: extract _resolvetrivial() function
> >
> > We would eventually like to move the resolution of modify/delete and
> > delete/modify conflicts to the resolve phase. However, we don't want
> > to move the checks for identical content that were added in
> > 902554884335 (merge: before cd/dc prompt, check that changed side
> > really changed, 2014-12-01). Let's instead move these out to a new
> > _resolvetrivial() function that processes the actions from
> > manifestmerge() and replaces any false cd/dc conflicts. The function
> > will also provide a natural place for us to later add code for
> > resolving false 'm' conflicts.
>
> Hmm ... it doesn't sound wrong ... but so far we don't have any use of
> it and it doesn't add much value by itself. It could perhaps wait until
> we have some use of it and know it is going in the right direction?
>
> Resolving 'm' conflicts in this function seems to imply that we
> duplicate a lot of manifestmerge in this function. Meh.
>

Hopefully it would be moving it rather than duplicating it.

More important, I would like to somehow move this "changed but same"
> check into manifestmerge, before scheduling any wrong actions.


I have also considered that. I think the function has a good and clear
scope. Nothing is preventing the call site from moving. I.e., even if
manifestmerge() later ends up calling the function, I think it clarifies
the code by having it extracted into its own function.


> Bid merge
> needs that so it avoids picking unnecessarily hard actions - especially
> merge actions, but also cd/dc. It would be an unfortunate layering
> violation to consider actual file content in manifestmerge and it could
> be costly to retrieve file content an extra time and at the wrong time
> ... but that seems to be what it takes. This patch will not help us much
> going in that direction ... if we should end up going in that direction.
>

Yes, we'll see if we end up going in that direction. I played with it a
bit, but didn't find a clean way of doing without extra performance cost. I
didn't mean to imply that I think it has to happen, just that it's natural
place if it does happen.
Matt Mackall - Dec. 11, 2014, 6:58 p.m.
On Thu, 2014-12-11 at 00:43 +0100, Mads Kiilerich wrote:
> On 12/10/2014 10:18 PM, Martin von Zweigbergk wrote:
> > # HG changeset patch
> > # User Martin von Zweigbergk <martinvonz@google.com>
> > # Date 1417643428 28800
> > #      Wed Dec 03 13:50:28 2014 -0800
> > # Node ID 9b8dc07b08a2721d1db6f218bee06880a40d168d
> > # Parent  07fcb956212223676b35055d39dddac88ca990b6
> > merge: extract _resolvetrivial() function
> >
> > We would eventually like to move the resolution of modify/delete and
> > delete/modify conflicts to the resolve phase. However, we don't want
> > to move the checks for identical content that were added in
> > 902554884335 (merge: before cd/dc prompt, check that changed side
> > really changed, 2014-12-01). Let's instead move these out to a new
> > _resolvetrivial() function that processes the actions from
> > manifestmerge() and replaces any false cd/dc conflicts. The function
> > will also provide a natural place for us to later add code for
> > resolving false 'm' conflicts.
> 
> Hmm ... it doesn't sound wrong ... but so far we don't have any use of 
> it and it doesn't add much value by itself. It could perhaps wait until 
> we have some use of it and know it is going in the right direction?
> 
> Resolving 'm' conflicts in this function seems to imply that we 
> duplicate a lot of manifestmerge in this function. Meh.

No, it just means moving this line out of filemerge:

http://www.selenic.com/hg/file/416c133145ee/mercurial/filemerge.py#l381

> More important, I would like to somehow move this "changed but same" 
> check into manifestmerge, before scheduling any wrong actions. Bid merge 
> needs that so it avoids picking unnecessarily hard actions - especially 
> merge actions, but also cd/dc. It would be an unfortunate layering 
> violation to consider actual file content in manifestmerge and it could 
> be costly to retrieve file content an extra time and at the wrong time 
> ... but that seems to be what it takes. This patch will not help us much 
> going in that direction ... if we should end up going in that direction.

Seems this will actually give us an opportunity to defer those expensive
comparisons. If our bids on a file are:

keep
merge

..we can just jump to keep without discovering whether the merge is
actually trivial, right? 

However, when we're trying to pick between two merges, we obviously want
to go with the trivial one, so we might want a two-pass approach.
Mads Kiilerich - Dec. 11, 2014, 7:56 p.m.
On 12/11/2014 07:58 PM, Matt Mackall wrote:
> On Thu, 2014-12-11 at 00:43 +0100, Mads Kiilerich wrote:
>> On 12/10/2014 10:18 PM, Martin von Zweigbergk wrote:
>>> # HG changeset patch
>>> # User Martin von Zweigbergk <martinvonz@google.com>
>>> # Date 1417643428 28800
>>> #      Wed Dec 03 13:50:28 2014 -0800
>>> # Node ID 9b8dc07b08a2721d1db6f218bee06880a40d168d
>>> # Parent  07fcb956212223676b35055d39dddac88ca990b6
>>> merge: extract _resolvetrivial() function
>>>
>>> We would eventually like to move the resolution of modify/delete and
>>> delete/modify conflicts to the resolve phase. However, we don't want
>>> to move the checks for identical content that were added in
>>> 902554884335 (merge: before cd/dc prompt, check that changed side
>>> really changed, 2014-12-01). Let's instead move these out to a new
>>> _resolvetrivial() function that processes the actions from
>>> manifestmerge() and replaces any false cd/dc conflicts. The function
>>> will also provide a natural place for us to later add code for
>>> resolving false 'm' conflicts.
>> Hmm ... it doesn't sound wrong ... but so far we don't have any use of
>> it and it doesn't add much value by itself. It could perhaps wait until
>> we have some use of it and know it is going in the right direction?
>>
>> Resolving 'm' conflicts in this function seems to imply that we
>> duplicate a lot of manifestmerge in this function. Meh.
> No, it just means moving this line out of filemerge:
>
> http://www.selenic.com/hg/file/416c133145ee/mercurial/filemerge.py#l381

Agreed, assuming we want it to show up as something that is a resolved 
merge, instead of showing the simpler actions actions usually computed 
in manifestmerge 
http://www.selenic.com/hg/file/416c133145ee/mercurial/merge.py#l425 . I 
assume there is a reason we don't postpone all of these to filemerge in 
the first place?

>
>> More important, I would like to somehow move this "changed but same"
>> check into manifestmerge, before scheduling any wrong actions. Bid merge
>> needs that so it avoids picking unnecessarily hard actions - especially
>> merge actions, but also cd/dc. It would be an unfortunate layering
>> violation to consider actual file content in manifestmerge and it could
>> be costly to retrieve file content an extra time and at the wrong time
>> ... but that seems to be what it takes. This patch will not help us much
>> going in that direction ... if we should end up going in that direction.
> Seems this will actually give us an opportunity to defer those expensive
> comparisons. If our bids on a file are:
>
> keep
> merge
>
> ..we can just jump to keep without discovering whether the merge is
> actually trivial, right?

Yes, but especially when recovering from a criss cross merge, we will 
very rarely have a "same filelog node" situation. There will have been 
merges, and assuming things have been resolved "correctly", we will have 
a "different filelog node but same content" situation on one side. We 
want to pick that side. Currently we cannot.

Having the "changed but same" check in manifestmerge will _give_ us the 
"keep/merge" situation that bid merge can handle optimally. We will need 
that before the bid auction.

> However, when we're trying to pick between two merges, we obviously want
> to go with the trivial one, so we might want a two-pass approach.

Yes, we can implement the "changed but same" check for 'm', 'cd' and 
'cd' in/before bid merge too. That would be pretty much the same as we 
already have to do after bid merge (for cd/dc) or in filemerge (for m). 
I think it would be simpler and cleaner to make the first pass in 
manifestmerge handle "changed but same" too. I assume that is how we 
would do it if we had a "clean content" hash in the index. The question 
is how much we will let the lack of it influence the architecture and 
how value it adds to be able to defer it.

/Mads

Patch

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -537,6 +537,30 @@ 
 
     return actions, diverge, renamedelete
 
+def _resolvetrivial(repo, wctx, mctx, ancestor, actions):
+    """Resolves false conflicts where the nodeid changed but the content
+       remained the same."""
+
+    cdactions = []
+    for action in actions['cd']:
+        f = action[0]
+        if f in ancestor and not wctx[f].cmp(ancestor[f]):
+            # local did change but ended up with same content
+            actions['r'].append((f, None, "prompt same"))
+        else:
+            cdactions.append(action)
+    actions['cd'] = cdactions
+
+    dcactions = []
+    for action in actions['dc']:
+        f = action[0]
+        if f in ancestor and not mctx[f].cmp(ancestor[f]):
+            # remote did change but ended up with same content
+            pass # don't get = keep local deleted
+        else:
+            dcactions.append(action)
+    actions['dc'] = dcactions
+
 def calculateupdates(repo, wctx, mctx, ancestors, branchmerge, force, partial,
                      acceptremote, followcopies):
     "Calculate the actions needed to merge mctx into wctx using ancestors"
@@ -614,12 +638,11 @@ 
             continue
         repo.ui.note(_('end of auction\n\n'))
 
+    _resolvetrivial(repo, wctx, mctx, ancestors[0], actions)
+
     # Prompt and create actions. TODO: Move this towards resolve phase.
     for f, args, msg in sorted(actions['cd']):
-        if f in ancestors[0] and not wctx[f].cmp(ancestors[0][f]):
-            # local did change but ended up with same content
-            actions['r'].append((f, None, "prompt same"))
-        elif repo.ui.promptchoice(
+        if repo.ui.promptchoice(
             _("local changed %s which remote deleted\n"
               "use (c)hanged version or (d)elete?"
               "$$ &Changed $$ &Delete") % f, 0):
@@ -630,10 +653,7 @@ 
 
     for f, args, msg in sorted(actions['dc']):
         flags, = args
-        if f in ancestors[0] and not mctx[f].cmp(ancestors[0][f]):
-            # remote did change but ended up with same content
-            pass # don't get = keep local deleted
-        elif repo.ui.promptchoice(
+        if repo.ui.promptchoice(
             _("remote changed %s which local deleted\n"
               "use (c)hanged version or leave (d)eleted?"
               "$$ &Changed $$ &Deleted") % f, 0) == 0: