Patchwork [6,of,7] fileset: reorder 'and' expression to evaluate basic patterns first

login
register
mail settings
Submitter Yuya Nishihara
Date Aug. 3, 2018, 3:01 p.m.
Message ID <6a0bf2f620f877d60989.1533308503@mimosa>
Download mbox | patch
Permalink /patch/33156/
State Accepted
Headers show

Comments

Yuya Nishihara - Aug. 3, 2018, 3:01 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1532158905 -32400
#      Sat Jul 21 16:41:45 2018 +0900
# Node ID 6a0bf2f620f877d609896f649c0feae09c3f5d2d
# Parent  1ec115b2c5525f0507ff7c4dc2019b57ae391911
fileset: reorder 'and' expression to evaluate basic patterns first

Timing of a crafted example (when disk cache is warm):

  $ hg files set:'binary() and path:contrib'
  (orig) time: real 0.140 secs (user 0.120+0.000 sys 0.020+0.000)
  (new)  time: real 0.040 secs (user 0.030+0.000 sys 0.010+0.000)
Pulkit Goyal - Aug. 3, 2018, 5:56 p.m.
On Fri 3 Aug, 2018, 6:02 PM Yuya Nishihara, <yuya@tcha.org> wrote:

> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1532158905 -32400
> #      Sat Jul 21 16:41:45 2018 +0900
> # Node ID 6a0bf2f620f877d609896f649c0feae09c3f5d2d
> # Parent  1ec115b2c5525f0507ff7c4dc2019b57ae391911
> fileset: reorder 'and' expression to evaluate basic patterns first
>
> Timing of a crafted example (when disk cache is warm):
>
>   $ hg files set:'binary() and path:contrib'
>   (orig) time: real 0.140 secs (user 0.120+0.000 sys 0.020+0.000)
>   (new)  time: real 0.040 secs (user 0.030+0.000 sys 0.010+0.000)
>

This looks like performance improvement. How about adding this to 4.8
releasenotes page on wiki?

Patch

diff --git a/mercurial/filesetlang.py b/mercurial/filesetlang.py
--- a/mercurial/filesetlang.py
+++ b/mercurial/filesetlang.py
@@ -184,7 +184,14 @@  def _optimize(x):
     if op == 'not':
         w, t = _optimize(x[1])
         return w, (op, t)
-    if op in {'and', 'minus'}:
+    if op == 'and':
+        wa, ta = _optimize(x[1])
+        wb, tb = _optimize(x[2])
+        if wa <= wb:
+            return wa, (op, ta, tb)
+        else:
+            return wb, (op, tb, ta)
+    if op == 'minus':
         wa, ta = _optimize(x[1])
         wb, tb = _optimize(x[2])
         return max(wa, wb), (op, ta, tb)
diff --git a/tests/test-fileset.t b/tests/test-fileset.t
--- a/tests/test-fileset.t
+++ b/tests/test-fileset.t
@@ -186,18 +186,18 @@  Show parsed tree at stages:
     (symbol 'a2')
     (and
       (func
-        (symbol 'grep')
-        (string 'b'))
+        (symbol 'clean')
+        None)
       (func
-        (symbol 'clean')
-        None)))
+        (symbol 'grep')
+        (string 'b'))))
   * matcher:
   <unionmatcher matchers=[
     <patternmatcher patterns='(?:a1$)'>,
     <patternmatcher patterns='(?:a2$)'>,
     <intersectionmatcher
-      m1=<predicatenmatcher pred=grep('b')>,
-      m2=<predicatenmatcher pred=clean>>]>
+      m1=<predicatenmatcher pred=clean>,
+      m2=<predicatenmatcher pred=grep('b')>>]>
   a1
   a2
   b1
@@ -283,6 +283,19 @@  Test files properties
   $ fileset 'binary()'
   bin
 
+  $ fileset -p optimized -s 'binary() and b*'
+  * optimized:
+  (and
+    (symbol 'b*')
+    (func
+      (symbol 'binary')
+      None))
+  * matcher:
+  <intersectionmatcher
+    m1=<patternmatcher patterns='(?:b[^/]*$)'>,
+    m2=<predicatenmatcher pred=binary>>
+  bin
+
   $ fileset 'grep("b{1}")'
   .hgignore
   b1