Patchwork [STABLE] log: fix order of revisions filtered by multiple OR options (issue5100)

login
register
mail settings
Submitter Yuya Nishihara
Date Feb. 15, 2016, 2:34 p.m.
Message ID <ae36fdcb20ae3b7b0d8b.1455546859@mimosa>
Download mbox | patch
Permalink /patch/13212/
State Superseded
Delegated to: Pierre-Yves David
Headers show

Comments

Yuya Nishihara - Feb. 15, 2016, 2:34 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1455543967 -32400
#      Mon Feb 15 22:46:07 2016 +0900
# Branch stable
# Node ID ae36fdcb20ae3b7b0d8b2e7f868a1c3eb104e34e
# Parent  8da94662afe51a836eda500652097772c34002e8
log: fix order of revisions filtered by multiple OR options (issue5100)

This is the simplest workaround for the issue of the ordering of revset, which
is that the expression "x or y" takes over the ordering specified by the input
set (or the left-hand-side expression.) For example, the following expression

  A & (x | y)

will be evaluated as if

  (A & x) | (A & y)

It is wrong because revset has ordering. Ideally, we should fix the revset
computation, but that would require a long patch series. So, for now, this
patch just works around the common log cases.

Since this change might have some impact on performance, it is enabled only
if the expression seems to have ' or ' expression.
Pierre-Yves David - Feb. 23, 2016, 2:05 p.m.
On 02/15/2016 03:34 PM, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1455543967 -32400
> #      Mon Feb 15 22:46:07 2016 +0900
> # Branch stable
> # Node ID ae36fdcb20ae3b7b0d8b2e7f868a1c3eb104e34e
> # Parent  8da94662afe51a836eda500652097772c34002e8
> log: fix order of revisions filtered by multiple OR options (issue5100)
>
> This is the simplest workaround for the issue of the ordering of revset, which
> is that the expression "x or y" takes over the ordering specified by the input
> set (or the left-hand-side expression.) For example, the following expression
>
>    A & (x | y)
>
> will be evaluated as if
>
>    (A & x) | (A & y)
>
> It is wrong because revset has ordering. Ideally, we should fix the revset
> computation, but that would require a long patch series. So, for now, this
> patch just works around the common log cases.
>
> Since this change might have some impact on performance, it is enabled only
> if the expression seems to have ' or ' expression.
>
> diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> --- a/mercurial/cmdutil.py
> +++ b/mercurial/cmdutil.py
> @@ -2139,9 +2139,14 @@ def getlogrevs(repo, pats, opts):
>           # Revset matches can reorder revisions. "A or B" typically returns
>           # returns the revision matching A then the revision matching B. Sort
>           # again to fix that.
> +        oldrevs = revs
>           revs = matcher(repo, revs)
>           if not opts.get('rev'):
>               revs.sort(reverse=True)
> +        elif ' or ' in expr:
> +            # XXX "A or B" is known to change the order; fix it by filtering
> +            # matched set again (issue5100)
> +            revs = oldrevs & revs

uurg, this is hacky even if I understand why we need it. Should 'addset' 
have some logic to handle this case?

You are checking for ' or ' but what about '+' ?

Also, what about user input containing " or " string (eg: in 'desc(…)'), 
over sorting will not introduce behavior issue but still seems a bit 
unfortunabte,

finally. this seems to be fixing the issue for log only. Is log the only 
affected command (because it build revset in a strange way?) or the bug 
more widespread?
Yuya Nishihara - Feb. 23, 2016, 3:38 p.m.
On Tue, 23 Feb 2016 15:05:58 +0100, Pierre-Yves David wrote:
> On 02/15/2016 03:34 PM, Yuya Nishihara wrote:
> > # HG changeset patch
> > # User Yuya Nishihara <yuya@tcha.org>
> > # Date 1455543967 -32400
> > #      Mon Feb 15 22:46:07 2016 +0900
> > # Branch stable
> > # Node ID ae36fdcb20ae3b7b0d8b2e7f868a1c3eb104e34e
> > # Parent  8da94662afe51a836eda500652097772c34002e8
> > log: fix order of revisions filtered by multiple OR options (issue5100)
> >
> > This is the simplest workaround for the issue of the ordering of revset, which
> > is that the expression "x or y" takes over the ordering specified by the input
> > set (or the left-hand-side expression.) For example, the following expression
> >
> >    A & (x | y)
> >
> > will be evaluated as if
> >
> >    (A & x) | (A & y)
> >
> > It is wrong because revset has ordering. Ideally, we should fix the revset
> > computation, but that would require a long patch series. So, for now, this
> > patch just works around the common log cases.
> >
> > Since this change might have some impact on performance, it is enabled only
> > if the expression seems to have ' or ' expression.
> >
> > diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> > --- a/mercurial/cmdutil.py
> > +++ b/mercurial/cmdutil.py
> > @@ -2139,9 +2139,14 @@ def getlogrevs(repo, pats, opts):
> >           # Revset matches can reorder revisions. "A or B" typically returns
> >           # returns the revision matching A then the revision matching B. Sort
> >           # again to fix that.
> > +        oldrevs = revs
> >           revs = matcher(repo, revs)
> >           if not opts.get('rev'):
> >               revs.sort(reverse=True)
> > +        elif ' or ' in expr:
> > +            # XXX "A or B" is known to change the order; fix it by filtering
> > +            # matched set again (issue5100)
> > +            revs = oldrevs & revs
> 
> uurg, this is hacky even if I understand why we need it. Should 'addset'
> have some logic to handle this case?

I don't think 'addset' can have something, but optimize() can do. Basic idea
is to tell optimize() if the current expression should define its ordering or
follow the existing ordering.

  A & (x | y)  # A defines ordering, (x | y) follows A
  =   -------

  x | y | z    # x, y, z, and (x | y | z) define ordering
  =========

  A & x(B)     # A defines ordering, x(B) follows A, but B may define ordering
  =   --=      # if B is a set expression (e.g. first(x:y))

If an expression is known to have its ordering (e.g. x:y, sort(), orset(), ..),
but if it should follow the existing ordering, wrap it.

  A & (x | y) -> A & _reorder((x | y))
                     --------
                     A.filter(addset(x, y))

> You are checking for ' or ' but what about '+' ?

_makelogrevsets() doesn't use '+'. I agree it's a hack, but I wanted to keep
a stable patch simple. But as the issue is old regression, I'm okay to drop
this patch and do a complete fix in default branch.

> Also, what about user input containing " or " string (eg: in 'desc(…)'), 
> over sorting will not introduce behavior issue but still seems a bit 
> unfortunabte,

Yep. I can change it to investigate the parsed tree.

> finally. this seems to be fixing the issue for log only. Is log the only 
> affected command (because it build revset in a strange way?) or the bug 
> more widespread?

graphlog isn't affected because it sorts the result set, but there are many ways
to get strange ordering as revset is expressive.

% hg debugrevspec '0: & 2:1' -v --optimize
(and
  (rangepost
    ('symbol', '0'))
  (range
    ('symbol', '2')
    ('symbol', '1')))
* optimized:
(and
  (range
    ('symbol', '0')
    ('string', 'tip'))
  (range
    ('symbol', '2')
    ('symbol', '1')))
* set:
<filteredset
  <spanset- 1:2>>
2
1

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -2139,9 +2139,14 @@  def getlogrevs(repo, pats, opts):
         # Revset matches can reorder revisions. "A or B" typically returns
         # returns the revision matching A then the revision matching B. Sort
         # again to fix that.
+        oldrevs = revs
         revs = matcher(repo, revs)
         if not opts.get('rev'):
             revs.sort(reverse=True)
+        elif ' or ' in expr:
+            # XXX "A or B" is known to change the order; fix it by filtering
+            # matched set again (issue5100)
+            revs = oldrevs & revs
     if limit is not None:
         limitedrevs = []
         for idx, r in enumerate(revs):
diff --git a/tests/test-log.t b/tests/test-log.t
--- a/tests/test-log.t
+++ b/tests/test-log.t
@@ -918,6 +918,44 @@  log -r tip --stat
    1 files changed, 1 insertions(+), 0 deletions(-)
   
 
+Test that log should respect the order of -rREV even if multiple OR conditions
+are specified (issue5100):
+
+ summary of revisions:
+
+  $ hg log -r0:6 -T '{rev} {pad(files, 6)} {desc|firstline}\n'
+  0 base   base
+  1 base   r1
+  2 base   r2
+  3 b1     b1
+  4 b2     b2
+  5        m12
+  6 b1     b1.1
+
+ log FILE in ascending order:
+
+  $ hg log -r0:6 -T '{rev} {files}\n' b1 b2
+  3 b1
+  4 b2
+  6 b1
+  $ hg log -r0:6 -T '{rev} {files}\n' b2 b1
+  3 b1
+  4 b2
+  6 b1
+
+ log -k TEXT in descending order:
+
+  $ hg log -r6:0 -T '{rev} {desc|firstline}\n' -k r1 -k r2 -k b1
+  6 b1.1
+  3 b1
+  2 r2
+  1 r1
+  $ hg log -r6:0 -T '{rev} {desc|firstline}\n' -k r2 -k b1 -k r1
+  6 b1.1
+  3 b1
+  2 r2
+  1 r1
+
   $ cd ..