From patchwork Thu Oct 16 21:53:59 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: [1, of, 2, v2] revset: introduce optional 'while' predicate for ancestors() From: Mads Kiilerich X-Patchwork-Id: 6349 Message-Id: <1cd08d67e748ca66a270.1413496439@ssl.google-analytics.com> To: mercurial-devel@selenic.com Date: Thu, 16 Oct 2014 23:53:59 +0200 # HG changeset patch # User Mads Kiilerich # Date 1413496239 -7200 # Thu Oct 16 23:50:39 2014 +0200 # Node ID 1cd08d67e748ca66a270ea5f67346ef262f39c58 # Parent a42d67c16fb9f17a50eb3068f16c35204ec240f2 revset: introduce optional 'while' predicate for ancestors() When specifying a 'while' set, ancestors() will now only visit parents that are in that set. This makes it possible to prune while doing an ancestor traversal and reduce the number of membership tests. Such a pruning is very convenient when expensive checks are involved or there just are a lot of them. The primary initial use case for this feature is that filtering on branch name was so expensive. Often it is just as relevant to prune everything not on the branch. This makes some new queries possible and there will be some queries that pretty much will go from N s to 0 s. But for entertainment, here are some numbers on a big repo on a recently created and contiguous release branch: revset #0: branch(.) wall 9.744125 comb 9.730000 user 9.570000 sys 0.160000 (best of 3) revset #1: ancestors(.)&branch(.) wall 10.476355 comb 10.460000 user 10.390000 sys 0.070000 (best of 3) revset #2: ancestors(., branch(.)) wall 0.170288 comb 0.170000 user 0.170000 sys 0.000000 (best of 57) This touches the hot parts of ancestors but not have any measurable impact on ancestors when not using while: Before: ancestors(head()) wall 0.801175 comb 0.800000 user 0.800000 sys 0.000000 (best of 13) After: ancestors(head()) wall 0.794651 comb 0.800000 user 0.790000 sys 0.010000 (best of 13) Using this feature, only(X,Y) could be implemented as ancestors(X,!::Y). It do however seem like this generic code, despite using pruning, has a significantly higher overhead than the handcoded one. There might be further room for improvement. ancestors("3.1",!::"3.0") wall 0.027154 comb 0.030000 user 0.030000 sys 0.000000 (best of 107) only("3.1","3.0") wall 0.001673 comb 0.010000 user 0.010000 sys 0.000000 (best of 1738) Similarly, X::Y could be implemented as ancestors(Y, X::). That is however also no win. ancestors("2.0","1.0"::) wall 0.075563 comb 0.070000 user 0.070000 sys 0.000000 (best of 100) "1.0"::"2.0" wall 0.011683 comb 0.010000 user 0.010000 sys 0.000000 (best of 248) diff --git a/mercurial/revset.py b/mercurial/revset.py --- a/mercurial/revset.py +++ b/mercurial/revset.py @@ -17,7 +17,7 @@ import obsolete as obsmod import pathutil import repoview -def _revancestors(repo, revs, followfirst): +def _revancestors(repo, revs, followfirst, while_=None): """Like revlog.ancestors(), but supports followfirst.""" cut = followfirst and 1 or None cl = repo.changelog @@ -41,10 +41,11 @@ def _revancestors(repo, revs, followfirs revsnode = revqueue.popleft() heapq.heappush(h, -revsnode) seen.add(current) - yield current - for parent in cl.parentrevs(current)[:cut]: - if parent != node.nullrev: - heapq.heappush(h, -parent) + if while_ is None or current in while_: + yield current + for parent in cl.parentrevs(current)[:cut]: + if parent != node.nullrev: + heapq.heappush(h, -parent) return generatorset(iterate(), iterasc=False) @@ -343,15 +344,22 @@ def ancestor(repo, subset, x): return baseset() def _ancestors(repo, subset, x, followfirst=False): - heads = getset(repo, spanset(repo), x) + args = getargs(x, 0, 2, _('ancestors takes no, one or two arguments')) + if not args: + return baseset() + heads = getset(repo, spanset(repo), args[0]) if not heads: return baseset() - s = _revancestors(repo, heads, followfirst) + while_ = None + if len(args) > 1: + while_ = getset(repo, spanset(repo), args[1]) + s = _revancestors(repo, heads, followfirst, while_=while_) return subset.filter(s.__contains__) def ancestors(repo, subset, x): - """``ancestors(set)`` + """``ancestors(set, [while])`` Changesets that are ancestors of a changeset in set. + If a while set is specified, only follow ancestors in that set. """ return _ancestors(repo, subset, x) diff --git a/tests/test-revset.t b/tests/test-revset.t --- a/tests/test-revset.t +++ b/tests/test-revset.t @@ -246,6 +246,10 @@ ancestor can accept 0 or more arguments 5 $ log 'ancestor(ancestors(5))' 0 + $ log 'ancestors(5+9, 1:7)' + 1 + 3 + 5 $ log 'author(bob)' 2 $ log 'author("re:bob|test")'