Patchwork [2,of,2,V2] revset: add only() fastpath for minus operations

login
register
mail settings
Submitter Durham Goode
Date Feb. 18, 2016, 7:38 p.m.
Message ID <a2a62c7df76324d49046.1455824288@dev8486.prn1.facebook.com>
Download mbox | patch
Permalink /patch/13256/
State Changes Requested
Delegated to: Martin von Zweigbergk
Headers show

Comments

Durham Goode - Feb. 18, 2016, 7:38 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1455823717 28800
#      Thu Feb 18 11:28:37 2016 -0800
# Node ID a2a62c7df76324d49046a2893c971195507ee64b
# Parent  dc8d72c3df7dfcd7b67fa71d679cc7c3ad90f25c
revset: add only() fastpath for minus operations

Previously, '::X - ::Y' was translated to '::X and not ::Y' and the 'and'
optimizer turned this into 'only(X, Y)'. Since we no longer transform minus into
and-not we lost this optimization.

This patch adds it back. This results in no performance change actually (as
determined by revsetbenchmark.py), but it does undo the test changes caused by
the use of the minus operator.
Martin von Zweigbergk - Feb. 20, 2016, 7:46 a.m.
On Thu, Feb 18, 2016 at 11:38 AM, Durham Goode <durham@fb.com> wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1455823717 28800
> #      Thu Feb 18 11:28:37 2016 -0800
> # Node ID a2a62c7df76324d49046a2893c971195507ee64b
> # Parent  dc8d72c3df7dfcd7b67fa71d679cc7c3ad90f25c
> revset: add only() fastpath for minus operations
>
> Previously, '::X - ::Y' was translated to '::X and not ::Y' and the 'and'
> optimizer turned this into 'only(X, Y)'. Since we no longer transform minus into
> and-not we lost this optimization.
>
> This patch adds it back. This results in no performance change actually (as
> determined by revsetbenchmark.py)

Any idea why? IIUC, that means that smartset's minus operator is as
fast as the specialized only() predicate on the tests we have. Does
that mean that we are just missing some relevant tests? Or that only()
is not yet optimized as well as it could be? Or, if the minus operator
really is as fast, that we should optimize and simplify the only()
predicate by calling to the minus operator?
Durham Goode - Feb. 23, 2016, 8:32 p.m.
On 2/19/16, 11:46 PM, "Martin von Zweigbergk" <martinvonz@google.com> wrote:

>On Thu, Feb 18, 2016 at 11:38 AM, Durham Goode <durham@fb.com> wrote:

>> # HG changeset patch

>> # User Durham Goode <durham@fb.com>

>> # Date 1455823717 28800

>> #      Thu Feb 18 11:28:37 2016 -0800

>> # Node ID a2a62c7df76324d49046a2893c971195507ee64b

>> # Parent  dc8d72c3df7dfcd7b67fa71d679cc7c3ad90f25c

>> revset: add only() fastpath for minus operations

>>

>> Previously, '::X - ::Y' was translated to '::X and not ::Y' and the 'and'

>> optimizer turned this into 'only(X, Y)'. Since we no longer transform minus into

>> and-not we lost this optimization.

>>

>> This patch adds it back. This results in no performance change actually (as

>> determined by revsetbenchmark.py)

>

>Any idea why? IIUC, that means that smartset's minus operator is as

>fast as the specialized only() predicate on the tests we have. Does

>that mean that we are just missing some relevant tests? Or that only()

>is not yet optimized as well as it could be? Or, if the minus operator

>really is as fast, that we should optimize and simplify the only()

>predicate by calling to the minus operator?


I think the minus operator is just generally fast enough in these cases.  Further investigation might show we don't need to special case only(), but only() is a pretty unique variant that we probably want to keep so we can add further optimizations later.

Patch

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -2170,6 +2170,19 @@  def optimize(x, small):
         wa, ta = optimize(x[1], True)
         wb, tb = optimize(x[2], True)
 
+        # (::x - ::y) has a fast path
+        def isonly(revs, bases):
+            return (
+                revs is not None
+                and revs[0] == 'func'
+                and getstring(revs[1], _('not a symbol')) == 'ancestors'
+                and bases is not None
+                and bases[0] == 'func'
+                and getstring(bases[1], _('not a symbol')) == 'ancestors')
+
+        if isonly(ta, tb):
+            return wa, ('func', ('symbol', 'only'), ('list', ta[2], tb[2]))
+
         return wa, (op, ta, tb)
     elif op == 'only':
         return optimize(('func', ('symbol', 'only'),
diff --git a/tests/test-revset.t b/tests/test-revset.t
--- a/tests/test-revset.t
+++ b/tests/test-revset.t
@@ -1248,16 +1248,13 @@  check that conversion to only works
     (dagrangepre
       ('symbol', '1')))
   * optimized:
-  (minus
-    (func
-      ('symbol', 'ancestors')
-      ('symbol', '3'))
-    (func
-      ('symbol', 'ancestors')
+  (func
+    ('symbol', 'only')
+    (list
+      ('symbol', '3')
       ('symbol', '1')))
   * set:
-  <filteredset
-    <generatorset+>>
+  <baseset+ [3]>
   3
   $ try --optimize 'ancestors(1) - ancestors(3)'
   (minus
@@ -1268,16 +1265,13 @@  check that conversion to only works
       ('symbol', 'ancestors')
       ('symbol', '3')))
   * optimized:
-  (minus
-    (func
-      ('symbol', 'ancestors')
-      ('symbol', '1'))
-    (func
-      ('symbol', 'ancestors')
+  (func
+    ('symbol', 'only')
+    (list
+      ('symbol', '1')
       ('symbol', '3')))
   * set:
-  <filteredset
-    <generatorset+>>
+  <baseset+ []>
   $ try --optimize 'not ::2 and ::6'
   (and
     (not