Patchwork [1,of,2,v2] revset: introduce optional 'while' predicate for ancestors()

login
register
mail settings
Submitter Mads Kiilerich
Date Oct. 16, 2014, 9:53 p.m.
Message ID <1cd08d67e748ca66a270.1413496439@ssl.google-analytics.com>
Download mbox | patch
Permalink /patch/6349/
State Deferred
Headers show

Comments

Mads Kiilerich - Oct. 16, 2014, 9:53 p.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1413496239 -7200
#      Thu Oct 16 23:50:39 2014 +0200
# Node ID 1cd08d67e748ca66a270ea5f67346ef262f39c58
# Parent  a42d67c16fb9f17a50eb3068f16c35204ec240f2
revset: introduce optional 'while' predicate for ancestors()

When specifying a 'while' set, ancestors() will now only visit parents that are
in that set. This makes it possible to prune while doing an ancestor traversal
and reduce the number of membership tests.  Such a pruning is very convenient
when expensive checks are involved or there just are a lot of them.

The primary initial use case for this feature is that filtering on branch name
was so expensive. Often it is just as relevant to prune everything not on the
branch.

This makes some new queries possible and there will be some queries that pretty
much will go from N s to 0 s. But for entertainment, here are some numbers on a
big repo on a recently created and contiguous release branch:

revset #0: branch(.)
  wall 9.744125 comb 9.730000 user 9.570000 sys 0.160000 (best of 3)

revset #1: ancestors(.)&branch(.)
  wall 10.476355 comb 10.460000 user 10.390000 sys 0.070000 (best of 3)

revset #2: ancestors(., branch(.))
  wall 0.170288 comb 0.170000 user 0.170000 sys 0.000000 (best of 57)

This touches the hot parts of ancestors but not have any measurable impact on
ancestors when not using while:
Before: ancestors(head())
  wall 0.801175 comb 0.800000 user 0.800000 sys 0.000000 (best of 13)
After: ancestors(head())
  wall 0.794651 comb 0.800000 user 0.790000 sys 0.010000 (best of 13)

Using this feature, only(X,Y) could be implemented as ancestors(X,!::Y). It do
however seem like this generic code, despite using pruning, has a significantly
higher overhead than the handcoded one. There might be further room for
improvement.

ancestors("3.1",!::"3.0")
  wall 0.027154 comb 0.030000 user 0.030000 sys 0.000000 (best of 107)

only("3.1","3.0")
  wall 0.001673 comb 0.010000 user 0.010000 sys 0.000000 (best of 1738)

Similarly, X::Y could be implemented as ancestors(Y, X::). That is however also
no win.

ancestors("2.0","1.0"::)
  wall 0.075563 comb 0.070000 user 0.070000 sys 0.000000 (best of 100)

"1.0"::"2.0"
  wall 0.011683 comb 0.010000 user 0.010000 sys 0.000000 (best of 248)

Patch

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -17,7 +17,7 @@  import obsolete as obsmod
 import pathutil
 import repoview
 
-def _revancestors(repo, revs, followfirst):
+def _revancestors(repo, revs, followfirst, while_=None):
     """Like revlog.ancestors(), but supports followfirst."""
     cut = followfirst and 1 or None
     cl = repo.changelog
@@ -41,10 +41,11 @@  def _revancestors(repo, revs, followfirs
                         revsnode = revqueue.popleft()
                         heapq.heappush(h, -revsnode)
                 seen.add(current)
-                yield current
-                for parent in cl.parentrevs(current)[:cut]:
-                    if parent != node.nullrev:
-                        heapq.heappush(h, -parent)
+                if while_ is None or current in while_:
+                    yield current
+                    for parent in cl.parentrevs(current)[:cut]:
+                        if parent != node.nullrev:
+                            heapq.heappush(h, -parent)
 
     return generatorset(iterate(), iterasc=False)
 
@@ -343,15 +344,22 @@  def ancestor(repo, subset, x):
     return baseset()
 
 def _ancestors(repo, subset, x, followfirst=False):
-    heads = getset(repo, spanset(repo), x)
+    args = getargs(x, 0, 2, _('ancestors takes no, one or two arguments'))
+    if not args:
+        return baseset()
+    heads = getset(repo, spanset(repo), args[0])
     if not heads:
         return baseset()
-    s = _revancestors(repo, heads, followfirst)
+    while_ = None
+    if len(args) > 1:
+        while_ = getset(repo, spanset(repo), args[1])
+    s = _revancestors(repo, heads, followfirst, while_=while_)
     return subset.filter(s.__contains__)
 
 def ancestors(repo, subset, x):
-    """``ancestors(set)``
+    """``ancestors(set, [while])``
     Changesets that are ancestors of a changeset in set.
+    If a while set is specified, only follow ancestors in that set.
     """
     return _ancestors(repo, subset, x)
 
diff --git a/tests/test-revset.t b/tests/test-revset.t
--- a/tests/test-revset.t
+++ b/tests/test-revset.t
@@ -246,6 +246,10 @@  ancestor can accept 0 or more arguments
   5
   $ log 'ancestor(ancestors(5))'
   0
+  $ log 'ancestors(5+9, 1:7)'
+  1
+  3
+  5
   $ log 'author(bob)'
   2
   $ log 'author("re:bob|test")'