Patchwork [2,of,2,V2] revset: new predicates to find key merge revisions

login
register
mail settings
Submitter Simon Farnsworth
Date March 7, 2016, 6:27 p.m.
Message ID <faf1fd7f707980136831.1457375248@simonfar-macbookpro.local>
Download mbox | patch
Permalink /patch/13651/
State Changes Requested
Delegated to: Pierre-Yves David
Headers show

Comments

Simon Farnsworth - March 7, 2016, 6:27 p.m.
# HG changeset patch
# User Simon Farnsworth <simonfar@fb.com>
# Date 1457374657 0
#      Mon Mar 07 18:17:37 2016 +0000
# Node ID faf1fd7f7079801368317a681f55dbc46e878de0
# Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d
revset: new predicates to find key merge revisions

When you have tricky merge conflicts, it's useful to look at the history of
the conflict. In a fast moving repository, you're drowned in revisions that
are in the merge but that can't impact on a conflict.

Provide a new revset predicate to help out; conflict(type, pat) allows you
to find the "base", "other" or "local" revision types for the merge
conflicts identified by pat. This, in turn, enables you to write revsets to
find all revisions that are of interest in a conflict situation.
Pierre-Yves David - March 10, 2016, 12:17 a.m.
On 03/07/2016 06:27 PM, Simon Farnsworth wrote:
> # HG changeset patch
> # User Simon Farnsworth <simonfar@fb.com>
> # Date 1457374657 0
> #      Mon Mar 07 18:17:37 2016 +0000
> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0
> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d
> revset: new predicates to find key merge revisions
>
> When you have tricky merge conflicts, it's useful to look at the history of
> the conflict. In a fast moving repository, you're drowned in revisions that
> are in the merge but that can't impact on a conflict.
>
> Provide a new revset predicate to help out; conflict(type, pat) allows you
> to find the "base", "other" or "local" revision types for the merge
> conflicts identified by pat. This, in turn, enables you to write revsets to
> find all revisions that are of interest in a conflict situation.

I'm a bit confused about this.

Is there situation where "local" is not p1() and "other" is not p2() 
(except for "update" merge were "local" does not exist)? If not, I'm not 
sure why we need this predicate for them.

What does "base" return exactly, I could not really infer it from the 
documentation.

[more comment below]

> diff --git a/mercurial/mergestate.py b/mercurial/mergestate.py
> --- a/mercurial/mergestate.py
> +++ b/mercurial/mergestate.py
> @@ -73,6 +73,7 @@
>           Do not use this directly! Instead call read() or clean()."""
>           self._repo = repo
>           self._dirty = False
> +        self._otherinferred = False

Can we get a small comment about what this attribut means?

>
>       statepathv1 = 'merge/state'
>       statepathv2 = 'merge/state2'
> @@ -153,6 +154,7 @@
>               # we cannot do better than that with v1 of the format
>               mctx = self._repo[None].parents()[-1]
>               v1records.append(('O', mctx.hex()))
> +            self._otherinferred = True
>               # add place holder "other" file node information
>               # nobody is using it yet so we do no need to fetch the data
>               # if mctx was wrong `mctx[bits[-2]]` may fails.
> @@ -304,6 +306,26 @@
>               if entry[0] == 'u':
>                   yield f
>
> +    def ancestorchangectx(self, dfile):
> +        extras = self.extras(dfile)
> +        anccommitnode = extras.get('ancestorlinknode')
> +        if anccommitnode:
> +            return self._repo[anccommitnode]
> +        else:
> +            return None
> +
> +    def otherfilectx(self, dfile):
> +        if self._otherinferred:
> +            return None
> +        entry = self._state[dfile]
> +        return self.otherctx.filectx(entry[5], fileid=entry[6])
> +
> +    def localfilectx(self, dfile):
> +        if self._local is None:
> +            return None
> +        entry = self._state[dfile]
> +        return self.localctx.filectx(entry[2])
> +
>       def driverresolved(self):
>           """Obtain the paths of driver-resolved files."""
>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -17,6 +17,7 @@
>       error,
>       hbisect,
>       match as matchmod,
> +    mergestate as mergestatemod,
>       node,
>       obsolete as obsmod,
>       parser,
> @@ -816,6 +817,68 @@
>       getargs(x, 0, 0, _("closed takes no arguments"))
>       return subset.filter(lambda r: repo[r].closesbranch())
>
> +@predicate('conflict(type,[pattern])')
> +def conflict(repo, subset, x):
> +    """The type revision for any merge conflict matching pattern.
> +    See :hg:`help patterns` for information about file patterns.
> +
> +    type should be one of "base", "other" or "local", for the three-way
> +    merge base, the "other" tree, or the "local" tree.
> +
> +    The pattern without explicit kind like ``glob:`` is expected to be
> +    relative to the current directory and match against a file exactly
> +    for efficiency. If pattern is omitted, all conflicts are included.
> +    """
> +    # i18n: "conflict" is a keyword
> +    l = getargs(x, 1, 2, _('conflict takes one or two arguments'))
> +    t = getstring(l[0], _("conflict requires a type"))
> +    if t not in ["base", "other", "local"]:
> +        msg = _('conflict file types are "base", "other" or "local"')
> +        raise error.Abort(msg)
> +    if len(l) == 2:
> +        pat = getstring(l[1], _("conflict takes a string pattern"))
> +    else:
> +        pat = None
> +
> +    ms = mergestatemod.mergestatereadonly.read(repo)
> +
> +    if pat is None:
> +        files = ms.files()
> +    elif not matchmod.patkind(pat):
> +        f = pathutil.canonpath(repo.root, repo.getcwd(), pat)
> +        if f in ms:
> +            files = [f]
> +        else:
> +            files = []
> +    else:
> +        m = matchmod.match(repo.root, repo.getcwd(), [pat], ctx=repo[None])
> +        files = (f for f in ms if m(f))
> +
> +    s = set()
> +    warning = _("merge {t} commit of file {f} not known - ignoring\n")
> +    for f in files:
> +        warn = False
> +        if t == "base":
> +            base = ms.ancestorchangectx(f)
> +            if base is None:
> +                repo.ui.warn(warning.format(t=t, f=f))

We use % in mercurial. especially because Python3.5 supports it for bytes.

Can you elaborate about the sitation where we end up issuing this warning?

> +            else:
> +                s.add(base.rev())
> +        elif t == "local":
> +            local = ms.localfilectx(f)
> +            if local is None:
> +                repo.ui.warn(warning.format(t=t, f=f))
> +            else:
> +                s.add(local.introrev())
> +        elif t == "other":
> +            other = ms.otherfilectx(f)
> +            if other is None:
> +                repo.ui.warn(warning.format(t=t, f=f))
> +            else:
> +                s.add(other.introrev())
> +
> +    return subset & s
> +
>   @predicate('contains(pattern)')
>   def contains(repo, subset, x):
>       """The revision's manifest contains a file matching pattern (but might not
> diff --git a/tests/test-revset.t b/tests/test-revset.t
> --- a/tests/test-revset.t
> +++ b/tests/test-revset.t
> @@ -2230,3 +2230,105 @@
>     2
>
>     $ cd ..
> +
> +Test merge conflict predicates
> +
> +  $ hg init conflictrepo
> +  $ cd conflictrepo
> +  $ echo file1 > file1
> +  $ echo file2 > file2
> +  $ hg commit -qAm first
> +  $ echo line2 >> file1
> +  $ hg commit -qAm second
> +  $ hg bookmark base
> +  $ hg bookmark tree1
> +  $ echo line1 > file1
> +  $ hg commit -qAm tree1-file1
> +  $ echo tree1-file2 > file2
> +  $ hg commit -qAm tree1-file2
> +  $ echo file3 > file3
> +  $ hg commit -qAm tree1-file3
> +  $ hg update base
> +  2 files updated, 0 files merged, 1 files removed, 0 files unresolved
> +  (activating bookmark base)
> +  $ hg bookmark tree2
> +  $ echo lineA > file1
> +  $ echo line2 >> file1
> +  $ hg commit -qAm tree2
> +  $ hg bookmark tree2
> +  $ echo tree2-file2 > file2
> +  $ hg commit -qAm tree2-file2
> +  $ echo file4 > file4
> +  $ hg commit -qAm tree2-file4
> +
> +There are no markers before a merge conflict exists
> +
> +  $ hg debugrevspec 'conflict("base","glob:*")'
> +  $ hg debugrevspec 'conflict("local","glob:*")'
> +  $ hg debugrevspec 'conflict("other","glob:*")'
> +
> +Merge and test that the expected set of markers exist and work with patterns
> +
> +  $ hg merge --rev tree1 --tool :fail
> +  1 files updated, 0 files merged, 0 files removed, 2 files unresolved
> +  use 'hg resolve' to retry unresolved file merges or 'hg update -C .' to abandon
> +  [1]
> +  $ hg resolve --list
> +  U file1
> +  U file2
> +  $ hg debugrevspec 'conflict("base","glob:*")'
> +  1
> +  $ hg debugrevspec 'conflict("local","glob:*")'
> +  5
> +  6
> +  $ hg debugrevspec 'conflict("other","glob:*")'
> +  2
> +  3
> +  $ hg debugrevspec 'conflict("base", "file1")'
> +  1
> +  $ hg debugrevspec 'conflict("local", "file1")'
> +  5
> +  $ hg debugrevspec 'conflict("other", "file1")'
> +  2
> +  $ hg debugrevspec 'conflict("base", "file2")'
> +  1
> +  $ hg debugrevspec 'conflict("local", "file2")'
> +  6
> +  $ hg debugrevspec 'conflict("other", "file2")'
> +  3
> +
> +There are no markers on files not in conflict
> +  $ hg debugrevspec 'conflict("base", "file4")'
> +  $ hg debugrevspec 'conflict("local", "file4")'
> +  $ hg debugrevspec 'conflict("other","file4")'
> +
> +It's possible to get interesting sets from markers
> +
> +  $ hg debugrevspec 'conflict("base", "glob:*"):conflict("local", "file1")'
> +  1
> +  2
> +  3
> +  4
> +  5
> +  $ hg debugrevspec 'conflict("base", "file2"):conflict("other", "re:file[1234]")'
> +  1
> +  2
> +  3
> +
> +If we only have a V1 state file, "base" and "other" markers can't be found,
> +but local can.
> +
> +  $ rm .hg/merge/state2
> +  $ hg debugrevspec 'conflict("base", "file1")'
> +  merge base commit of file file1 not known - ignoring
> +  $ hg debugrevspec 'conflict("local", "file1")'
> +  5
> +  $ hg debugrevspec 'conflict("other", "file1")'
> +  merge other commit of file file1 not known - ignoring
> +
> +We get a nice error if we ask for a non-existent marker
> +
> +  $ hg debugrevspec 'conflict("mercurial-versus-sccs", "glob:*")'
> +  abort: conflict file types are "base", "other" or "local"
> +  [255]
> +  $ cd ..
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Simon Farnsworth - March 10, 2016, 10:12 a.m.
On 10/03/2016, 00:17, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:


>On 03/07/2016 06:27 PM, Simon Farnsworth wrote:

>> # HG changeset patch

>> # User Simon Farnsworth <simonfar@fb.com>

>> # Date 1457374657 0

>> #      Mon Mar 07 18:17:37 2016 +0000

>> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0

>> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d

>> revset: new predicates to find key merge revisions

>>

>> When you have tricky merge conflicts, it's useful to look at the history of

>> the conflict. In a fast moving repository, you're drowned in revisions that

>> are in the merge but that can't impact on a conflict.

>>

>> Provide a new revset predicate to help out; conflict(type, pat) allows you

>> to find the "base", "other" or "local" revision types for the merge

>> conflicts identified by pat. This, in turn, enables you to write revsets to

>> find all revisions that are of interest in a conflict situation.

>

>I'm a bit confused about this.

>

>Is there situation where "local" is not p1() and "other" is not p2() 

>(except for "update" merge were "local" does not exist)? If not, I'm not 

>sure why we need this predicate for them.


These two predicates point you to the introrev of the conflict, not the merge points. I can clarify the commit message in a V3 submission. The goal is that between base, local, other, p1() and p2(), you have all the points you need to write revsets that find you interesting commits from the point of view of a human resolving the commit; for example, 'conflict("base", "file"):conflict("other", "file")' is all commits from the three way merge base to the latest change on the other side that could have caused the conflict.

>

>What does "base" return exactly, I could not really infer it from the 

>documentation.


It's the common ancestor of the two commits being merged - we call it "base" elsewhere in merge documentation, so I've kept that name rather than ancestor.

>

>[more comment below]

>

>> diff --git a/mercurial/mergestate.py b/mercurial/mergestate.py

>> --- a/mercurial/mergestate.py

>> +++ b/mercurial/mergestate.py

>> @@ -73,6 +73,7 @@

>>           Do not use this directly! Instead call read() or clean()."""

>>           self._repo = repo

>>           self._dirty = False

>> +        self._otherinferred = False

>

>Can we get a small comment about what this attribut means?


Will do. It tells you that we've guessed at self._other when we read a v1 state file, rather than reading it in from a v2 state file.

>

>>

>>       statepathv1 = 'merge/state'

>>       statepathv2 = 'merge/state2'

>> @@ -153,6 +154,7 @@

>>               # we cannot do better than that with v1 of the format

>>               mctx = self._repo[None].parents()[-1]

>>               v1records.append(('O', mctx.hex()))

>> +            self._otherinferred = True

>>               # add place holder "other" file node information

>>               # nobody is using it yet so we do no need to fetch the data

>>               # if mctx was wrong `mctx[bits[-2]]` may fails.

>> @@ -304,6 +306,26 @@

>>               if entry[0] == 'u':

>>                   yield f

>>

>> +    def ancestorchangectx(self, dfile):

>> +        extras = self.extras(dfile)

>> +        anccommitnode = extras.get('ancestorlinknode')

>> +        if anccommitnode:

>> +            return self._repo[anccommitnode]

>> +        else:

>> +            return None

>> +

>> +    def otherfilectx(self, dfile):

>> +        if self._otherinferred:

>> +            return None

>> +        entry = self._state[dfile]

>> +        return self.otherctx.filectx(entry[5], fileid=entry[6])

>> +

>> +    def localfilectx(self, dfile):

>> +        if self._local is None:

>> +            return None

>> +        entry = self._state[dfile]

>> +        return self.localctx.filectx(entry[2])

>> +

>>       def driverresolved(self):

>>           """Obtain the paths of driver-resolved files."""

>>

>> diff --git a/mercurial/revset.py b/mercurial/revset.py

>> --- a/mercurial/revset.py

>> +++ b/mercurial/revset.py

>> @@ -17,6 +17,7 @@

>>       error,

>>       hbisect,

>>       match as matchmod,

>> +    mergestate as mergestatemod,

>>       node,

>>       obsolete as obsmod,

>>       parser,

>> @@ -816,6 +817,68 @@

>>       getargs(x, 0, 0, _("closed takes no arguments"))

>>       return subset.filter(lambda r: repo[r].closesbranch())

>>

>> +@predicate('conflict(type,[pattern])')

>> +def conflict(repo, subset, x):

>> +    """The type revision for any merge conflict matching pattern.

>> +    See :hg:`help patterns` for information about file patterns.

>> +

>> +    type should be one of "base", "other" or "local", for the three-way

>> +    merge base, the "other" tree, or the "local" tree.

>> +


I will also enhance this bit of documentation to make it clearer what I'm expecting when you feed in "base", "other", or "local".

>> +    The pattern without explicit kind like ``glob:`` is expected to be

>> +    relative to the current directory and match against a file exactly

>> +    for efficiency. If pattern is omitted, all conflicts are included.

>> +    """

>> +    # i18n: "conflict" is a keyword

>> +    l = getargs(x, 1, 2, _('conflict takes one or two arguments'))

>> +    t = getstring(l[0], _("conflict requires a type"))

>> +    if t not in ["base", "other", "local"]:

>> +        msg = _('conflict file types are "base", "other" or "local"')

>> +        raise error.Abort(msg)

>> +    if len(l) == 2:

>> +        pat = getstring(l[1], _("conflict takes a string pattern"))

>> +    else:

>> +        pat = None

>> +

>> +    ms = mergestatemod.mergestatereadonly.read(repo)

>> +

>> +    if pat is None:

>> +        files = ms.files()

>> +    elif not matchmod.patkind(pat):

>> +        f = pathutil.canonpath(repo.root, repo.getcwd(), pat)

>> +        if f in ms:

>> +            files = [f]

>> +        else:

>> +            files = []

>> +    else:

>> +        m = matchmod.match(repo.root, repo.getcwd(), [pat], ctx=repo[None])

>> +        files = (f for f in ms if m(f))

>> +

>> +    s = set()

>> +    warning = _("merge {t} commit of file {f} not known - ignoring\n")

>> +    for f in files:

>> +        warn = False

>> +        if t == "base":

>> +            base = ms.ancestorchangectx(f)

>> +            if base is None:

>> +                repo.ui.warn(warning.format(t=t, f=f))

>

>We use % in mercurial. especially because Python3.5 supports it for bytes.


Will fix.

>

>Can you elaborate about the sitation where we end up issuing this warning?


You end up with this warning if we've only got a v1 merge state - if you can think of a better way to word it to indicate that you need to undo and redo the merge to get a v2 merge state, I'll happily change it.

>

>> +            else:

>> +                s.add(base.rev())

>> +        elif t == "local":

>> +            local = ms.localfilectx(f)

>> +            if local is None:

>> +                repo.ui.warn(warning.format(t=t, f=f))

>> +            else:

>> +                s.add(local.introrev())

>> +        elif t == "other":

>> +            other = ms.otherfilectx(f)

>> +            if other is None:

>> +                repo.ui.warn(warning.format(t=t, f=f))

>> +            else:

>> +                s.add(other.introrev())

>> +

>> +    return subset & s

>> +

>>   @predicate('contains(pattern)')

>>   def contains(repo, subset, x):

>>       """The revision's manifest contains a file matching pattern (but might not

>> diff --git a/tests/test-revset.t b/tests/test-revset.t

<snip>

-- 
Simon Farnsworth
Pierre-Yves David - March 10, 2016, 1:51 p.m.
On 03/10/2016 10:12 AM, Simon Farnsworth wrote:
> On 10/03/2016, 00:17, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>
>
>> On 03/07/2016 06:27 PM, Simon Farnsworth wrote:
>>> # HG changeset patch
>>> # User Simon Farnsworth <simonfar@fb.com>
>>> # Date 1457374657 0
>>> #      Mon Mar 07 18:17:37 2016 +0000
>>> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0
>>> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d
>>> revset: new predicates to find key merge revisions
>>>
>>> When you have tricky merge conflicts, it's useful to look at the history of
>>> the conflict. In a fast moving repository, you're drowned in revisions that
>>> are in the merge but that can't impact on a conflict.
>>>
>>> Provide a new revset predicate to help out; conflict(type, pat) allows you
>>> to find the "base", "other" or "local" revision types for the merge
>>> conflicts identified by pat. This, in turn, enables you to write revsets to
>>> find all revisions that are of interest in a conflict situation.
>>
>> I'm a bit confused about this.
>>
>> Is there situation where "local" is not p1() and "other" is not p2()
>> (except for "update" merge were "local" does not exist)? If not, I'm not
>> sure why we need this predicate for them.
>
> These two predicates point you to the introrev of the conflict, not the merge points. I can clarify the commit message in a V3 submission. The goal is that between base, local, other, p1() and p2(), you have all the points you need to write revsets that find you interesting commits from the point of view of a human resolving the commit; for example, 'conflict("base", "file"):conflict("other", "file")' is all commits from the three way merge base to the latest change on the other side that could have caused the conflict.

As said in another thread, I'm not super eager to promote "introrev" as 
the one relevant for the "conflict" too much. especially because it is 
already accessible using last(::p1() and file(FILENAME)).

If having access introrev is really useful (for speed?) we could 
probably add a keyword argument to the 'file()' revset to restrict the 
search to a specific file revision.

>> What does "base" return exactly, I could not really infer it from the
>> documentation.
>
> It's the common ancestor of the two commits being merged - we call it "base" elsewhere in merge documentation, so I've kept that name rather than ancestor.

I know what's the general meaning of "base" in a merge context, but what 
are you returning here? the changeset common ancestor of the introrev of 
that version? Also, there might be more than one common ancestors so It 
is unclear to me how you behave in this case.
Simon Farnsworth - March 10, 2016, 2:26 p.m.
On 10/03/2016, 13:51, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:


>On 03/10/2016 10:12 AM, Simon Farnsworth wrote:

>> On 10/03/2016, 00:17, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:

>>

>>

>>> On 03/07/2016 06:27 PM, Simon Farnsworth wrote:

>>>> # HG changeset patch

>>>> # User Simon Farnsworth <simonfar@fb.com>

>>>> # Date 1457374657 0

>>>> #      Mon Mar 07 18:17:37 2016 +0000

>>>> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0

>>>> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d

>>>> revset: new predicates to find key merge revisions

>>>>

>>>> When you have tricky merge conflicts, it's useful to look at the history of

>>>> the conflict. In a fast moving repository, you're drowned in revisions that

>>>> are in the merge but that can't impact on a conflict.

>>>>

>>>> Provide a new revset predicate to help out; conflict(type, pat) allows you

>>>> to find the "base", "other" or "local" revision types for the merge

>>>> conflicts identified by pat. This, in turn, enables you to write revsets to

>>>> find all revisions that are of interest in a conflict situation.

>>>

>>> I'm a bit confused about this.

>>>

>>> Is there situation where "local" is not p1() and "other" is not p2()

>>> (except for "update" merge were "local" does not exist)? If not, I'm not

>>> sure why we need this predicate for them.

>>

>> These two predicates point you to the introrev of the conflict, not the merge points. I can clarify the commit message in a V3 submission. The goal is that between base, local, other, p1() and p2(), you have all the points you need to write revsets that find you interesting commits from the point of view of a human resolving the commit; for example, 'conflict("base", "file"):conflict("other", "file")' is all commits from the three way merge base to the latest change on the other side that could have caused the conflict.

>

>As said in another thread, I'm not super eager to promote "introrev" as 

>the one relevant for the "conflict" too much. especially because it is 

>already accessible using last(::p1() and file(FILENAME)).


The main reason for looking at the introrev is that it's the last revision that's likely to be relevant, especially in a monolithic repository - commits after this one that did not touch this file are probably to other areas of the project, and we've already got p1() and p2() for the merge parents (albeit under horrible names).

>

>If having access introrev is really useful (for speed?) we could 

>probably add a keyword argument to the 'file()' revset to restrict the 

>search to a specific file revision.


I'm not sure how the keyword argument is meant to help - could you explain this a bit further?

The introrev is cheaply available at this point, because I have a filectx in hand, hence going for it.

>

>>> What does "base" return exactly, I could not really infer it from the

>>> documentation.

>>

>> It's the common ancestor of the two commits being merged - we call it "base" elsewhere in merge documentation, so I've kept that name rather than ancestor.

>

>I know what's the general meaning of "base" in a merge context, but what 

>are you returning here? the changeset common ancestor of the introrev of 

>that version? Also, there might be more than one common ancestors so It 

>is unclear to me how you behave in this case.


In current code, there is only one common ancestor used as the base of the three-way merge for a file; that's the commit I'm returning. In this case, going back to the introrev seems unnecessary - we should have chosen an older node if that was a better merge base, so that's the node I'm returning.

-- 
Simon
Pierre-Yves David - March 10, 2016, 2:32 p.m.
On 03/10/2016 02:26 PM, Simon Farnsworth wrote:
> On 10/03/2016, 13:51, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>
>
>> On 03/10/2016 10:12 AM, Simon Farnsworth wrote:
>>> On 10/03/2016, 00:17, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>>>
>>>
>>>> On 03/07/2016 06:27 PM, Simon Farnsworth wrote:
>>>>> # HG changeset patch
>>>>> # User Simon Farnsworth <simonfar@fb.com>
>>>>> # Date 1457374657 0
>>>>> #      Mon Mar 07 18:17:37 2016 +0000
>>>>> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0
>>>>> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d
>>>>> revset: new predicates to find key merge revisions
>>>>>
>>>>> When you have tricky merge conflicts, it's useful to look at the history of
>>>>> the conflict. In a fast moving repository, you're drowned in revisions that
>>>>> are in the merge but that can't impact on a conflict.
>>>>>
>>>>> Provide a new revset predicate to help out; conflict(type, pat) allows you
>>>>> to find the "base", "other" or "local" revision types for the merge
>>>>> conflicts identified by pat. This, in turn, enables you to write revsets to
>>>>> find all revisions that are of interest in a conflict situation.
>>>>
>>>> I'm a bit confused about this.
>>>>
>>>> Is there situation where "local" is not p1() and "other" is not p2()
>>>> (except for "update" merge were "local" does not exist)? If not, I'm not
>>>> sure why we need this predicate for them.
>>>
>>> These two predicates point you to the introrev of the conflict, not the merge points. I can clarify the commit message in a V3 submission. The goal is that between base, local, other, p1() and p2(), you have all the points you need to write revsets that find you interesting commits from the point of view of a human resolving the commit; for example, 'conflict("base", "file"):conflict("other", "file")' is all commits from the three way merge base to the latest change on the other side that could have caused the conflict.
>>
>> As said in another thread, I'm not super eager to promote "introrev" as
>> the one relevant for the "conflict" too much. especially because it is
>> already accessible using last(::p1() and file(FILENAME)).
>
> The main reason for looking at the introrev is that it's the last revision that's likely to be relevant, especially in a monolithic repository - commits after this one that did not touch this file are probably to other areas of the project, and we've already got p1() and p2() for the merge parents (albeit under horrible names).
>
>>
>> If having access introrev is really useful (for speed?) we could
>> probably add a keyword argument to the 'file()' revset to restrict the
>> search to a specific file revision.
>
> I'm not sure how the keyword argument is meant to help - could you explain this a bit further?

Something like file(README, from=7916edd30816) would give you the 
introrev of README as in 7916edd30816

> The introrev is cheaply available at this point, because I have a filectx in hand, hence going for it.
>
>>
>>>> What does "base" return exactly, I could not really infer it from the
>>>> documentation.
>>>
>>> It's the common ancestor of the two commits being merged - we call it "base" elsewhere in merge documentation, so I've kept that name rather than ancestor.
>>
>> I know what's the general meaning of "base" in a merge context, but what
>> are you returning here? the changeset common ancestor of the introrev of
>> that version? Also, there might be more than one common ancestors so It
>> is unclear to me how you behave in this case.
>
> In current code, there is only one common ancestor used as the base of the three-way merge for a file; that's the commit I'm returning. In this case, going back to the introrev seems unnecessary - we should have chosen an older node if that was a better merge base, so that's the node I'm returning.

So we return the one initially picked by the merge code. What if there 
is multiple file match by pattern.
Simon Farnsworth - March 10, 2016, 2:35 p.m.
On 10/03/2016, 14:32, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:


>On 03/10/2016 02:26 PM, Simon Farnsworth wrote:

>> On 10/03/2016, 13:51, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:

>>

>>

>>> On 03/10/2016 10:12 AM, Simon Farnsworth wrote:

>>>> On 10/03/2016, 00:17, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:

>>>>

>>>>

>>>>> On 03/07/2016 06:27 PM, Simon Farnsworth wrote:

>>>>>> # HG changeset patch

>>>>>> # User Simon Farnsworth <simonfar@fb.com>

>>>>>> # Date 1457374657 0

>>>>>> #      Mon Mar 07 18:17:37 2016 +0000

>>>>>> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0

>>>>>> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d

>>>>>> revset: new predicates to find key merge revisions

>>>>>>

>>>>>> When you have tricky merge conflicts, it's useful to look at the history of

>>>>>> the conflict. In a fast moving repository, you're drowned in revisions that

>>>>>> are in the merge but that can't impact on a conflict.

>>>>>>

>>>>>> Provide a new revset predicate to help out; conflict(type, pat) allows you

>>>>>> to find the "base", "other" or "local" revision types for the merge

>>>>>> conflicts identified by pat. This, in turn, enables you to write revsets to

>>>>>> find all revisions that are of interest in a conflict situation.

>>>>>

>>>>> I'm a bit confused about this.

>>>>>

>>>>> Is there situation where "local" is not p1() and "other" is not p2()

>>>>> (except for "update" merge were "local" does not exist)? If not, I'm not

>>>>> sure why we need this predicate for them.

>>>>

>>>> These two predicates point you to the introrev of the conflict, not the merge points. I can clarify the commit message in a V3 submission. The goal is that between base, local, other, p1() and p2(), you have all the points you need to write revsets that find you interesting commits from the point of view of a human resolving the commit; for example, 'conflict("base", "file"):conflict("other", "file")' is all commits from the three way merge base to the latest change on the other side that could have caused the conflict.

>>>

>>> As said in another thread, I'm not super eager to promote "introrev" as

>>> the one relevant for the "conflict" too much. especially because it is

>>> already accessible using last(::p1() and file(FILENAME)).

>>

>> The main reason for looking at the introrev is that it's the last revision that's likely to be relevant, especially in a monolithic repository - commits after this one that did not touch this file are probably to other areas of the project, and we've already got p1() and p2() for the merge parents (albeit under horrible names).

>>

>>>

>>> If having access introrev is really useful (for speed?) we could

>>> probably add a keyword argument to the 'file()' revset to restrict the

>>> search to a specific file revision.

>>

>> I'm not sure how the keyword argument is meant to help - could you explain this a bit further?

>

>Something like file(README, from=7916edd30816) would give you the 

>introrev of README as in 7916edd30816


So I'd get conflict("local", "README") with file("README", from=p1())?

>

>> The introrev is cheaply available at this point, because I have a filectx in hand, hence going for it.

>>

>>>

>>>>> What does "base" return exactly, I could not really infer it from the

>>>>> documentation.

>>>>

>>>> It's the common ancestor of the two commits being merged - we call it "base" elsewhere in merge documentation, so I've kept that name rather than ancestor.

>>>

>>> I know what's the general meaning of "base" in a merge context, but what

>>> are you returning here? the changeset common ancestor of the introrev of

>>> that version? Also, there might be more than one common ancestors so It

>>> is unclear to me how you behave in this case.

>>

>> In current code, there is only one common ancestor used as the base of the three-way merge for a file; that's the commit I'm returning. In this case, going back to the introrev seems unnecessary - we should have chosen an older node if that was a better merge base, so that's the node I'm returning.

>

>So we return the one initially picked by the merge code. What if there 

>is multiple file match by pattern.


I return the set of base revisions - if there are multiple files, I can return multiple revisions.

-- 
Simon Farnsworth
Pierre-Yves David - March 10, 2016, 2:36 p.m.
On 03/10/2016 02:35 PM, Simon Farnsworth wrote:
> On 10/03/2016, 14:32, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>
>
>> On 03/10/2016 02:26 PM, Simon Farnsworth wrote:
>>> On 10/03/2016, 13:51, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>>>
>>>
>>>> On 03/10/2016 10:12 AM, Simon Farnsworth wrote:
>>>>> On 10/03/2016, 00:17, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>>>>>
>>>>>
>>>>>> On 03/07/2016 06:27 PM, Simon Farnsworth wrote:
>>>>>>> # HG changeset patch
>>>>>>> # User Simon Farnsworth <simonfar@fb.com>
>>>>>>> # Date 1457374657 0
>>>>>>> #      Mon Mar 07 18:17:37 2016 +0000
>>>>>>> # Node ID faf1fd7f7079801368317a681f55dbc46e878de0
>>>>>>> # Parent  d92485c84727d67bcbf462b26dcfc1b7c746d07d
>>>>>>> revset: new predicates to find key merge revisions
>>>>>>>
>>>>>>> When you have tricky merge conflicts, it's useful to look at the history of
>>>>>>> the conflict. In a fast moving repository, you're drowned in revisions that
>>>>>>> are in the merge but that can't impact on a conflict.
>>>>>>>
>>>>>>> Provide a new revset predicate to help out; conflict(type, pat) allows you
>>>>>>> to find the "base", "other" or "local" revision types for the merge
>>>>>>> conflicts identified by pat. This, in turn, enables you to write revsets to
>>>>>>> find all revisions that are of interest in a conflict situation.
>>>>>>
>>>>>> I'm a bit confused about this.
>>>>>>
>>>>>> Is there situation where "local" is not p1() and "other" is not p2()
>>>>>> (except for "update" merge were "local" does not exist)? If not, I'm not
>>>>>> sure why we need this predicate for them.
>>>>>
>>>>> These two predicates point you to the introrev of the conflict, not the merge points. I can clarify the commit message in a V3 submission. The goal is that between base, local, other, p1() and p2(), you have all the points you need to write revsets that find you interesting commits from the point of view of a human resolving the commit; for example, 'conflict("base", "file"):conflict("other", "file")' is all commits from the three way merge base to the latest change on the other side that could have caused the conflict.
>>>>
>>>> As said in another thread, I'm not super eager to promote "introrev" as
>>>> the one relevant for the "conflict" too much. especially because it is
>>>> already accessible using last(::p1() and file(FILENAME)).
>>>
>>> The main reason for looking at the introrev is that it's the last revision that's likely to be relevant, especially in a monolithic repository - commits after this one that did not touch this file are probably to other areas of the project, and we've already got p1() and p2() for the merge parents (albeit under horrible names).
>>>
>>>>
>>>> If having access introrev is really useful (for speed?) we could
>>>> probably add a keyword argument to the 'file()' revset to restrict the
>>>> search to a specific file revision.
>>>
>>> I'm not sure how the keyword argument is meant to help - could you explain this a bit further?
>>
>> Something like file(README, from=7916edd30816) would give you the
>> introrev of README as in 7916edd30816
>
> So I'd get conflict("local", "README") with file("README", from=p1())?

Yep

>
>>
>>> The introrev is cheaply available at this point, because I have a filectx in hand, hence going for it.
>>>
>>>>
>>>>>> What does "base" return exactly, I could not really infer it from the
>>>>>> documentation.
>>>>>
>>>>> It's the common ancestor of the two commits being merged - we call it "base" elsewhere in merge documentation, so I've kept that name rather than ancestor.
>>>>
>>>> I know what's the general meaning of "base" in a merge context, but what
>>>> are you returning here? the changeset common ancestor of the introrev of
>>>> that version? Also, there might be more than one common ancestors so It
>>>> is unclear to me how you behave in this case.
>>>
>>> In current code, there is only one common ancestor used as the base of the three-way merge for a file; that's the commit I'm returning. In this case, going back to the introrev seems unnecessary - we should have chosen an older node if that was a better merge base, so that's the node I'm returning.
>>
>> So we return the one initially picked by the merge code. What if there
>> is multiple file match by pattern.
>
> I return the set of base revisions - if there are multiple files, I can return multiple revisions.

Okay, If you think of a way to improve the documentation to make this 
clear that would probably help.

Patch

diff --git a/mercurial/mergestate.py b/mercurial/mergestate.py
--- a/mercurial/mergestate.py
+++ b/mercurial/mergestate.py
@@ -73,6 +73,7 @@ 
         Do not use this directly! Instead call read() or clean()."""
         self._repo = repo
         self._dirty = False
+        self._otherinferred = False
 
     statepathv1 = 'merge/state'
     statepathv2 = 'merge/state2'
@@ -153,6 +154,7 @@ 
             # we cannot do better than that with v1 of the format
             mctx = self._repo[None].parents()[-1]
             v1records.append(('O', mctx.hex()))
+            self._otherinferred = True
             # add place holder "other" file node information
             # nobody is using it yet so we do no need to fetch the data
             # if mctx was wrong `mctx[bits[-2]]` may fails.
@@ -304,6 +306,26 @@ 
             if entry[0] == 'u':
                 yield f
 
+    def ancestorchangectx(self, dfile):
+        extras = self.extras(dfile)
+        anccommitnode = extras.get('ancestorlinknode')
+        if anccommitnode:
+            return self._repo[anccommitnode]
+        else:
+            return None
+
+    def otherfilectx(self, dfile):
+        if self._otherinferred:
+            return None
+        entry = self._state[dfile]
+        return self.otherctx.filectx(entry[5], fileid=entry[6])
+
+    def localfilectx(self, dfile):
+        if self._local is None:
+            return None
+        entry = self._state[dfile]
+        return self.localctx.filectx(entry[2])
+
     def driverresolved(self):
         """Obtain the paths of driver-resolved files."""
 
diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -17,6 +17,7 @@ 
     error,
     hbisect,
     match as matchmod,
+    mergestate as mergestatemod,
     node,
     obsolete as obsmod,
     parser,
@@ -816,6 +817,68 @@ 
     getargs(x, 0, 0, _("closed takes no arguments"))
     return subset.filter(lambda r: repo[r].closesbranch())
 
+@predicate('conflict(type,[pattern])')
+def conflict(repo, subset, x):
+    """The type revision for any merge conflict matching pattern.
+    See :hg:`help patterns` for information about file patterns.
+
+    type should be one of "base", "other" or "local", for the three-way
+    merge base, the "other" tree, or the "local" tree.
+
+    The pattern without explicit kind like ``glob:`` is expected to be
+    relative to the current directory and match against a file exactly
+    for efficiency. If pattern is omitted, all conflicts are included.
+    """
+    # i18n: "conflict" is a keyword
+    l = getargs(x, 1, 2, _('conflict takes one or two arguments'))
+    t = getstring(l[0], _("conflict requires a type"))
+    if t not in ["base", "other", "local"]:
+        msg = _('conflict file types are "base", "other" or "local"')
+        raise error.Abort(msg)
+    if len(l) == 2:
+        pat = getstring(l[1], _("conflict takes a string pattern"))
+    else:
+        pat = None
+
+    ms = mergestatemod.mergestatereadonly.read(repo)
+
+    if pat is None:
+        files = ms.files()
+    elif not matchmod.patkind(pat):
+        f = pathutil.canonpath(repo.root, repo.getcwd(), pat)
+        if f in ms:
+            files = [f]
+        else:
+            files = []
+    else:
+        m = matchmod.match(repo.root, repo.getcwd(), [pat], ctx=repo[None])
+        files = (f for f in ms if m(f))
+
+    s = set()
+    warning = _("merge {t} commit of file {f} not known - ignoring\n")
+    for f in files:
+        warn = False
+        if t == "base":
+            base = ms.ancestorchangectx(f)
+            if base is None:
+                repo.ui.warn(warning.format(t=t, f=f))
+            else:
+                s.add(base.rev())
+        elif t == "local":
+            local = ms.localfilectx(f)
+            if local is None:
+                repo.ui.warn(warning.format(t=t, f=f))
+            else:
+                s.add(local.introrev())
+        elif t == "other":
+            other = ms.otherfilectx(f)
+            if other is None:
+                repo.ui.warn(warning.format(t=t, f=f))
+            else:
+                s.add(other.introrev())
+
+    return subset & s
+
 @predicate('contains(pattern)')
 def contains(repo, subset, x):
     """The revision's manifest contains a file matching pattern (but might not
diff --git a/tests/test-revset.t b/tests/test-revset.t
--- a/tests/test-revset.t
+++ b/tests/test-revset.t
@@ -2230,3 +2230,105 @@ 
   2
 
   $ cd ..
+
+Test merge conflict predicates
+
+  $ hg init conflictrepo
+  $ cd conflictrepo
+  $ echo file1 > file1
+  $ echo file2 > file2
+  $ hg commit -qAm first
+  $ echo line2 >> file1
+  $ hg commit -qAm second
+  $ hg bookmark base
+  $ hg bookmark tree1
+  $ echo line1 > file1
+  $ hg commit -qAm tree1-file1
+  $ echo tree1-file2 > file2
+  $ hg commit -qAm tree1-file2
+  $ echo file3 > file3
+  $ hg commit -qAm tree1-file3
+  $ hg update base
+  2 files updated, 0 files merged, 1 files removed, 0 files unresolved
+  (activating bookmark base)
+  $ hg bookmark tree2
+  $ echo lineA > file1
+  $ echo line2 >> file1
+  $ hg commit -qAm tree2
+  $ hg bookmark tree2
+  $ echo tree2-file2 > file2
+  $ hg commit -qAm tree2-file2
+  $ echo file4 > file4
+  $ hg commit -qAm tree2-file4
+
+There are no markers before a merge conflict exists
+
+  $ hg debugrevspec 'conflict("base","glob:*")'
+  $ hg debugrevspec 'conflict("local","glob:*")'
+  $ hg debugrevspec 'conflict("other","glob:*")'
+
+Merge and test that the expected set of markers exist and work with patterns
+
+  $ hg merge --rev tree1 --tool :fail
+  1 files updated, 0 files merged, 0 files removed, 2 files unresolved
+  use 'hg resolve' to retry unresolved file merges or 'hg update -C .' to abandon
+  [1]
+  $ hg resolve --list
+  U file1
+  U file2
+  $ hg debugrevspec 'conflict("base","glob:*")'
+  1
+  $ hg debugrevspec 'conflict("local","glob:*")'
+  5
+  6
+  $ hg debugrevspec 'conflict("other","glob:*")'
+  2
+  3
+  $ hg debugrevspec 'conflict("base", "file1")'
+  1
+  $ hg debugrevspec 'conflict("local", "file1")'
+  5
+  $ hg debugrevspec 'conflict("other", "file1")'
+  2
+  $ hg debugrevspec 'conflict("base", "file2")'
+  1
+  $ hg debugrevspec 'conflict("local", "file2")'
+  6
+  $ hg debugrevspec 'conflict("other", "file2")'
+  3
+
+There are no markers on files not in conflict
+  $ hg debugrevspec 'conflict("base", "file4")'
+  $ hg debugrevspec 'conflict("local", "file4")'
+  $ hg debugrevspec 'conflict("other","file4")'
+
+It's possible to get interesting sets from markers
+
+  $ hg debugrevspec 'conflict("base", "glob:*"):conflict("local", "file1")'
+  1
+  2
+  3
+  4
+  5
+  $ hg debugrevspec 'conflict("base", "file2"):conflict("other", "re:file[1234]")'
+  1
+  2
+  3
+
+If we only have a V1 state file, "base" and "other" markers can't be found,
+but local can.
+
+  $ rm .hg/merge/state2
+  $ hg debugrevspec 'conflict("base", "file1")'
+  merge base commit of file file1 not known - ignoring
+  $ hg debugrevspec 'conflict("local", "file1")'
+  5
+  $ hg debugrevspec 'conflict("other", "file1")'
+  merge other commit of file file1 not known - ignoring
+
+We get a nice error if we ask for a non-existent marker
+
+  $ hg debugrevspec 'conflict("mercurial-versus-sccs", "glob:*")'
+  abort: conflict file types are "base", "other" or "local"
+  [255]
+  $ cd ..