Patchwork D5495: revset: add "branch" positional arguments to the merge revset

login
register
mail settings
Submitter phabricator
Date Jan. 6, 2019, 7:03 p.m.
Message ID <differential-rev-PHID-DREV-6lublzvib3hwvmxuwnsy-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/37495/
State New
Headers show

Comments

phabricator - Jan. 6, 2019, 7:03 p.m.
angel.ezquerra created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Make it possible to only include those merge revisions that are merges with one
  or more specific branches (passed as positional arguments to the merge revset
  function). All merge revisions are shown by default (i.e. if no "merge with"
  branches are specified).

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

AFFECTED FILES
  mercurial/revset.py

CHANGE DETAILS




To: angel.ezquerra, #hg-reviewers
Cc: mercurial-devel
Yuya Nishihara - Jan. 7, 2019, 2:15 p.m.
> +@predicate('merge(*withbranch)', safe=True)
>  def merge(repo, subset, x):
> -    """Changeset is a merge changeset.
> +    """Changeset is a merge changeset
> +
> +    All merge revisions are returned by default. If one or more "withbranch"
> +    names are provided only merges with those branches (i.e. whose
> +    second parent belongs to one of those branches) will be returned.

I understand this will be useful in a certain branch strategy, but the
proposed syntax is hardly extensible. Maybe it can be a non-wildcard argument
of `stringmatcher` type so we can at least add another option later.

Any thoughts? Do anyone love this feature?

If we had a syntax to filter revisions by sub expression, this and the
samebranch option could be expressed as follows:

```
merge() & filter($a, p2($a) & branch("..."))
merge() & filter($a, samebranch(parents($a)))

where filter(argname, boolean-expr)
```

This is much more expressive (or verbose) and can support other types of
branches, but is hard to implement.
phabricator - Jan. 7, 2019, 2:17 p.m.
yuja added a comment.


  > +@predicate('merge(*withbranch)', safe=True)
  > 
  >   def merge(repo, subset, x):
  > 
  > - """Changeset is a merge changeset. +    """Changeset is a merge changeset + +    All merge revisions are returned by default. If one or more "withbranch" +    names are provided only merges with those branches (i.e. whose +    second parent belongs to one of those branches) will be returned.
  
  I understand this will be useful in a certain branch strategy, but the
  proposed syntax is hardly extensible. Maybe it can be a non-wildcard argument
  of `stringmatcher` type so we can at least add another option later.
  
  Any thoughts? Do anyone love this feature?
  
  If we had a syntax to filter revisions by sub expression, this and the
  samebranch option could be expressed as follows:
  
    merge() & filter($a, p2($a) & branch("..."))
    merge() & filter($a, samebranch(parents($a)))
    
    where filter(argname, boolean-expr)
  
  This is much more expressive (or verbose) and can support other types of
  branches, but is hard to implement.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: yuja, mercurial-devel
phabricator - Jan. 8, 2019, 12:05 p.m.
pulkit added a comment.


  In https://phab.mercurial-scm.org/D5495#81562, @yuja wrote:
  
  > > +@predicate('merge(*withbranch)', safe=True)
  > > 
  > >   def merge(repo, subset, x):
  > > 
  > > - """Changeset is a merge changeset. +    """Changeset is a merge changeset + +    All merge revisions are returned by default. If one or more "withbranch" +    names are provided only merges with those branches (i.e. whose +    second parent belongs to one of those branches) will be returned.
  >
  > I understand this will be useful in a certain branch strategy, but the
  >  proposed syntax is hardly extensible. Maybe it can be a non-wildcard argument
  >  of `stringmatcher` type so we can at least add another option later.
  >
  > Any thoughts? Do anyone love this feature?
  
  
  I like this feature as it will be quite useful for us. Also I like your filter() suggestion more than the syntax this series introduces.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: pulkit, yuja, mercurial-devel
phabricator - Jan. 9, 2019, 11:21 p.m.
angel.ezquerra added a comment.


  In https://phab.mercurial-scm.org/D5495#81562, @yuja wrote:
  
  > > +@predicate('merge(*withbranch)', safe=True)
  > > 
  > >   def merge(repo, subset, x):
  > > 
  > > - """Changeset is a merge changeset. +    """Changeset is a merge changeset + +    All merge revisions are returned by default. If one or more "withbranch" +    names are provided only merges with those branches (i.e. whose +    second parent belongs to one of those branches) will be returned.
  >
  > I understand this will be useful in a certain branch strategy, but the
  >  proposed syntax is hardly extensible. Maybe it can be a non-wildcard argument
  >  of `stringmatcher` type so we can at least add another option later.
  >
  > Any thoughts? Do anyone love this feature?
  >
  > If we had a syntax to filter revisions by sub expression, this and the
  >  samebranch option could be expressed as follows:
  >
  >   merge() & filter($a, p2($a) & branch("..."))
  >   merge() & filter($a, samebranch(parents($a)))
  >  
  >   where filter(argname, boolean-expr)
  >
  >
  > This is much more expressive (or verbose) and can support other types of
  >  branches, but is hard to implement.
  
  
  Thank you for the review, @yuja
  
  I think it would be a good idea to make the "branch" arguments more flexible. One option could be to use a stringmatcher to add support for regular expressions as you suggest. I can look into that. However there may be some other options worth exploring. The one you suggest is very interesting although I find the syntax a bit complicated for the common use cases that I want to enable which are:
  
  1. Ignore merges from the same branch, which in a named-branch based branching strategy are usually irrelevant
  2. Look into merges with a specific branch (e.g. which branches have been merged with the default branch)?
  
  In my experience those two are the ones that are the most common and I think we should try to make the easy to use. That is, I think that even if mercurial had a filter function like the one you propose I would still want to be able to express those 2 common merge properties in a simple way.
  
  That being said, I really like your idea since I often find myself being unable to express what I want with a revset (as powerful as those are) because of the lack of a filtering mechanism. Adding a generic filter function would be very useful indeed. I'm not sure if the syntax you propose would work as is though. It seems that it would need a new "&" operator? In any case I believe that it is out of the scope of this particular set of patches. Do you agree? If so I can focus on improving this patch by adding the stringmatcher as you suggest (as it seems I'm not the only one who thinks this would be useful). Is that ok?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: pulkit, yuja, mercurial-devel
Yuya Nishihara - Jan. 10, 2019, 1:14 p.m.
>   I think it would be a good idea to make the "branch" arguments more flexible. One option could be to use a stringmatcher to add support for regular expressions as you suggest. I can look into that. However there may be some other options worth exploring. The one you suggest is very interesting although I find the syntax a bit complicated for the common use cases that I want to enable which are:
>   
>   1. Ignore merges from the same branch, which in a named-branch based branching strategy are usually irrelevant
>   2. Look into merges with a specific branch (e.g. which branches have been merged with the default branch)?
>   
>   In my experience those two are the ones that are the most common and I think we should try to make the easy to use. That is, I think that even if mercurial had a filter function like the one you propose I would still want to be able to express those 2 common merge properties in a simple way.

Yep, I agree with that.

>   That being said, I really like your idea since I often find myself being unable to express what I want with a revset (as powerful as those are) because of the lack of a filtering mechanism. Adding a generic filter function would be very useful indeed. I'm not sure if the syntax you propose would work as is though. It seems that it would need a new "&" operator? In any case I believe that it is out of the scope of this particular set of patches. Do you agree?

Yes. Actually I have a PoC-level implementation of generic filtering predicate,
which can be reviewed separately.

> If so I can focus on improving this patch by adding the stringmatcher as you suggest (as it seems I'm not the only one who thinks this would be useful). Is that ok?

Sounds good to me. To be clear, I want `'withbranch'` instead of
`'*withbranch'`, because the withbranch option doesn't look like a first-class
parameter of the `merge()` predicate.
phabricator - Jan. 10, 2019, 1:16 p.m.
yuja added a comment.


  >   I think it would be a good idea to make the "branch" arguments more flexible. One option could be to use a stringmatcher to add support for regular expressions as you suggest. I can look into that. However there may be some other options worth exploring. The one you suggest is very interesting although I find the syntax a bit complicated for the common use cases that I want to enable which are:
  >   
  >   1. Ignore merges from the same branch, which in a named-branch based branching strategy are usually irrelevant
  >   2. Look into merges with a specific branch (e.g. which branches have been merged with the default branch)?
  >   
  >   In my experience those two are the ones that are the most common and I think we should try to make the easy to use. That is, I think that even if mercurial had a filter function like the one you propose I would still want to be able to express those 2 common merge properties in a simple way.
  
  Yep, I agree with that.
  
  >   That being said, I really like your idea since I often find myself being unable to express what I want with a revset (as powerful as those are) because of the lack of a filtering mechanism. Adding a generic filter function would be very useful indeed. I'm not sure if the syntax you propose would work as is though. It seems that it would need a new "&" operator? In any case I believe that it is out of the scope of this particular set of patches. Do you agree?
  
  Yes. Actually I have a PoC-level implementation of generic filtering predicate,
  which can be reviewed separately.
  
  > If so I can focus on improving this patch by adding the stringmatcher as you suggest (as it seems I'm not the only one who thinks this would be useful). Is that ok?
  
  Sounds good to me. To be clear, I want `'withbranch'` instead of
  `'*withbranch'`, because the withbranch option doesn't look like a first-class
  parameter of the `merge()` predicate.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: pulkit, yuja, mercurial-devel
phabricator - Jan. 11, 2019, 11:10 p.m.
angel.ezquerra added a comment.


  In https://phab.mercurial-scm.org/D5495#82036, @yuja wrote:
  
  > >   I think it would be a good idea to make the "branch" arguments more flexible. One option could be to use a stringmatcher to add support for regular expressions as you suggest. I can look into that. However there may be some other options worth exploring. The one you suggest is very interesting although I find the syntax a bit complicated for the common use cases that I want to enable which are:
  > >   
  > >   1. Ignore merges from the same branch, which in a named-branch based branching strategy are usually irrelevant
  > >   2. Look into merges with a specific branch (e.g. which branches have been merged with the default branch)?
  > >   
  > >   In my experience those two are the ones that are the most common and I think we should try to make the easy to use. That is, I think that even if mercurial had a filter function like the one you propose I would still want to be able to express those 2 common merge properties in a simple way.
  >
  > Yep, I agree with that.
  >
  > >   That being said, I really like your idea since I often find myself being unable to express what I want with a revset (as powerful as those are) because of the lack of a filtering mechanism. Adding a generic filter function would be very useful indeed. I'm not sure if the syntax you propose would work as is though. It seems that it would need a new "&" operator? In any case I believe that it is out of the scope of this particular set of patches. Do you agree?
  >
  > Yes. Actually I have a PoC-level implementation of generic filtering predicate,
  >  which can be reviewed separately.
  >
  > > If so I can focus on improving this patch by adding the stringmatcher as you suggest (as it seems I'm not the only one who thinks this would be useful). Is that ok?
  >
  > Sounds good to me. To be clear, I want `'withbranch'` instead of
  >  `'*withbranch'`, because the withbranch option doesn't look like a first-class
  >  parameter of the `merge()` predicate.
  
  
  This would not make it possible to select multiple "merged with" branches by doing: hg log -r "merge(feature1, feature2)"
  Instead I guess you are proposing that for that use case we force the user to do: hg log -r "merge('re:(feature1|feature2)')
  
  Did I understand you correctly?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: pulkit, yuja, mercurial-devel
Yuya Nishihara - Jan. 12, 2019, 2:22 a.m.
>   This would not make it possible to select multiple "merged with" branches by doing: hg log -r "merge(feature1, feature2)"
>   Instead I guess you are proposing that for that use case we force the user to do: hg log -r "merge('re:(feature1|feature2)')
>   
>   Did I understand you correctly?

Yes. And we can introduce other prefixes of string matcher if `re:` seemed
unnecessarily complex.
phabricator - Jan. 12, 2019, 2:23 a.m.
yuja added a comment.


  >   This would not make it possible to select multiple "merged with" branches by doing: hg log -r "merge(feature1, feature2)"
  >   Instead I guess you are proposing that for that use case we force the user to do: hg log -r "merge('re:(feature1|feature2)')
  >   
  >   Did I understand you correctly?
  
  Yes. And we can introduce other prefixes of string matcher if `re:` seemed
  unnecessarily complex.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: pulkit, yuja, mercurial-devel
phabricator - Jan. 14, 2019, 12:21 p.m.
lothiraldan added a comment.


  What about making the argument a revset instead of a branch name. You can get the same result `merge(branch("foo")` but have a more expressive result `merge(only(4.8, 4.7))` ?
  
  (Sorry to be a bit late to the party)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: lothiraldan, pulkit, yuja, mercurial-devel
Yuya Nishihara - Jan. 14, 2019, 1:28 p.m.
>   What about making the argument a revset instead of a branch name. You can get the same result `merge(branch("foo")` but have a more expressive result `merge(only(4.8, 4.7))` ?

That's basically a simpler form of my `filter()` proposal.

The problem of `merge(branch("foo"))` is that it's ambiguous which revision
the expression will be tested against. It could be expressed as
`merge(p2=branch("foo"))` to disambiguate, but this syntax isn't generic
enough to express the `samebranch=True` constraint. So if we want a truly
expressive syntax, we'll need something like a lambda function.

```
filtereach(merge(), p2(_) & branch("foo"))
filtereach(merge(), samebranch(parents(_))
```

I agreed with Angel in that `merge(withbranch=name)` would be useful enough
to have a dedicated syntax, but I'm open to other ideas like
`merge(p1=expr, p2=expr)`.
phabricator - Jan. 14, 2019, 1:30 p.m.
yuja added a comment.


  >   What about making the argument a revset instead of a branch name. You can get the same result `merge(branch("foo")` but have a more expressive result `merge(only(4.8, 4.7))` ?
  
  That's basically a simpler form of my `filter()` proposal.
  
  The problem of `merge(branch("foo"))` is that it's ambiguous which revision
  the expression will be tested against. It could be expressed as
  `merge(p2=branch("foo"))` to disambiguate, but this syntax isn't generic
  enough to express the `samebranch=True` constraint. So if we want a truly
  expressive syntax, we'll need something like a lambda function.
  
    filtereach(merge(), p2(_) & branch("foo"))
    filtereach(merge(), samebranch(parents(_))
  
  I agreed with Angel in that `merge(withbranch=name)` would be useful enough
  to have a dedicated syntax, but I'm open to other ideas like
  `merge(p1=expr, p2=expr)`.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: lothiraldan, pulkit, yuja, mercurial-devel
phabricator - Jan. 14, 2019, 11:36 p.m.
angel.ezquerra added a comment.


  In https://phab.mercurial-scm.org/D5495#82397, @yuja wrote:
  
  > Generally looks good.
  >
  > Can you fix a couple of nits? And if possible, fold the tests from https://phab.mercurial-scm.org/D5577
  >  into this and the next patch. We prefer including relevant test in each
  >  commit.
  >
  > > -@predicate('merge()', safe=True)
  > >  +@predicate('merge(withbranch)', safe=True)
  >
  > `merge([withbranch])` as it is an optional parameter.
  >
  > >   def merge(repo, subset, x):
  > > 
  > > - """Changeset is a merge changeset. +    """Changeset is a merge changeset + +    All merge revisions are returned by default. If a "withbranch" +    pattern is provided only merges with (i.e. whose second parent +    belongs to) those branches that match the pattern will be returned. +    The simplest pattern is the name of a single branch. It is also +    possible to specify a regular expression by starting the pattern +    with "re:". This can be used to match more than one branch +    (e.g. "re:branch1|branch2"). """
  > >   1. i18n: "merge" is a keyword
  > > - getargs(x, 0, 0, _("merge takes no arguments")) +    args = getargsdict(x, 'merge', 'withbranch') +    withbranch = '' +    if 'withbranch' in args: +        withbranch = getstring(args['withbranch'], +                               _('withbranch argument must be a string')) +        kind, branchname, branchmatcher = stringutil.stringmatcher(withbranch)
  >
  > Can you merge this with the next `if withbranch:` block to reduce the number
  >  of conditionally defined variables referenced later?
  >
  > >   cl = repo.changelog
  > > 
  > > - return subset.filter(lambda r: cl.parentrevs(r)[1] != -1,
  > > - condrepr='<merge>') +    # create the function that will be used to filter the subset +    if withbranch:
  >
  >
  >
  > > +        # matchfn is a function that returns true when a revision
  > >  +        # is a merge and the second parent belongs to a branch that
  > >  +        # matches the withbranch pattern (which can be a literal or a regex)
  >
  > Nit: these comments seem a bit verbose. It's documented in stringmatcher().
  >
  > > +        if kind == 'literal':
  > >  +            matchfn = lambda r: (cl.parentrevs(r)[1] != -1
  > >  +                                 and repo[r].p2().branch() == withbranch)
  > >  +        else:
  > >  +            matchfn = lambda r: (cl.parentrevs(r)[1] != -1
  > >  +                                 and branchmatcher(repo[r].p2().branch()))
  >
  > If we don't have anything special about the `literal` kind, we can always use
  >  the `branchmatcher()`. `if kind == 'literal'` isn't needed.
  
  
  
  
  In https://phab.mercurial-scm.org/D5495#82407, @yuja wrote:
  
  > >   What about making the argument a revset instead of a branch name. You can get the same result `merge(branch("foo")` but have a more expressive result `merge(only(4.8, 4.7))` ?
  >
  > That's basically a simpler form of my `filter()` proposal.
  >
  > The problem of `merge(branch("foo"))` is that it's ambiguous which revision
  >  the expression will be tested against. It could be expressed as
  >  `merge(p2=branch("foo"))` to disambiguate, but this syntax isn't generic
  >  enough to express the `samebranch=True` constraint. So if we want a truly
  >  expressive syntax, we'll need something like a lambda function.
  >
  >   filtereach(merge(), p2(_) & branch("foo"))
  >   filtereach(merge(), samebranch(parents(_))
  >
  >
  > I agreed with Angel in that `merge(withbranch=name)` would be useful enough
  >  to have a dedicated syntax, but I'm open to other ideas like
  >  `merge(p1=expr, p2=expr)`.
  
  
  @lothiraldan , your proposal is interesting but as @yuja said it is less generic that introducing a full blown filter function. Also, to be really flexible it would require some way to refer to the p2 revision (i.e. you'd need to introduce an implicit or explicit reference to each of the p1 or p2 revisions). I think this would make the syntax for the most common use cases that I want to cover too complex. That being said, the proposed syntax leaves open the door for introducing such revset based filtering in the future if needed (although I think that a generic filter function would be more useful).
  
  @yuja  , I've sent an updated set of patches, following your recommendations. There are 2 patches now, since each includes its own tests. This means that the 3rd patch on the original patch set is no longer needed. However I don't know what is the best way to tell that to phabricator...

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: mjpieters, lothiraldan, pulkit, yuja, mercurial-devel
phabricator - Jan. 16, 2019, 6 a.m.
mharbison72 added a comment.


  In https://phab.mercurial-scm.org/D5495#82416, @angel.ezquerra wrote:
  
  > I've sent an updated set of patches, following your recommendations. There are 2 patches now, since each includes its own tests. This means that the 3rd patch on the original patch set is no longer needed. However I don't know what is the best way to tell that to phabricator...
  
  
  I think you should be able to go to that 3rd patch, and set it to abandoned in the actions at the bottom.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5495

To: angel.ezquerra, #hg-reviewers
Cc: mharbison72, mjpieters, lothiraldan, pulkit, yuja, mercurial-devel
Angel Ezquerra - Jan. 16, 2019, 7:58 a.m.
Thank you Matt. I just did that.

Cheers,

Angel

On Wed, Jan 16, 2019 at 7:01 AM mharbison72 (Matt Harbison)
<phabricator@mercurial-scm.org> wrote:
>
> mharbison72 added a comment.
>
>
>   In https://phab.mercurial-scm.org/D5495#82416, @angel.ezquerra wrote:
>
>   > I've sent an updated set of patches, following your recommendations. There are 2 patches now, since each includes its own tests. This means that the 3rd patch on the original patch set is no longer needed. However I don't know what is the best way to tell that to phabricator...
>
>
>   I think you should be able to go to that 3rd patch, and set it to abandoned in the actions at the bottom.
>
> REPOSITORY
>   rHG Mercurial
>
> REVISION DETAIL
>   https://phab.mercurial-scm.org/D5495
>
> To: angel.ezquerra, #hg-reviewers
> Cc: mharbison72, mjpieters, lothiraldan, pulkit, yuja, mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -1249,15 +1249,33 @@ 
         pass
     return baseset(datarepr=('<max %r, %r>', subset, os))
 
-@predicate('merge()', safe=True)
+@predicate('merge(*withbranch)', safe=True)
 def merge(repo, subset, x):
-    """Changeset is a merge changeset.
+    """Changeset is a merge changeset
+
+    All merge revisions are returned by default. If one or more "withbranch"
+    names are provided only merges with those branches (i.e. whose
+    second parent belongs to one of those branches) will be returned.
     """
     # i18n: "merge" is a keyword
-    getargs(x, 0, 0, _("merge takes no arguments"))
+    args = getargsdict(x, 'merge', '*withbranch')
+    withbranches = []
+    if 'withbranch' in args:
+        for el in args['withbranch']:
+            # i18n: "withbranch" is a keyword
+            withbranches.append(getstring(el,
+                _('withbranch argument must be a string')))
     cl = repo.changelog
-    return subset.filter(lambda r: cl.parentrevs(r)[1] != -1,
-                         condrepr='<merge>')
+    if withbranches:
+        # basematch is a function that returns true when a revision
+        # is a merge and the second parent belongs to one of the
+        # selected "merge with branches"
+        matches = lambda r: (cl.parentrevs(r)[1] != -1
+                             and repo[r].p2().branch() in withbranches)
+    else:
+        # basematch is a function that returns true when a revision is a merge
+        matches = lambda r: cl.parentrevs(r)[1] != -1
+    return subset.filter(matches, condrepr='<merge>')
 
 @predicate('branchpoint()', safe=True)
 def branchpoint(repo, subset, x):