From patchwork Tue Apr 1 02:49:52 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: [3,of,3] revset: improve roots revset performance From: Durham Goode X-Patchwork-Id: 4177 Message-Id: <157d96db8eaf11d34301.1396320592@dev2000.prn2.facebook.com> To: mercurial-devel@selenic.com Date: Mon, 31 Mar 2014 19:49:52 -0700 # HG changeset patch # User Durham Goode # Date 1396307014 25200 # Mon Mar 31 16:03:34 2014 -0700 # Node ID 157d96db8eaf11d343017477847ba8719d52eeca # Parent aa93084956fd536092f56eb2347dcb723e153d94 revset: improve roots revset performance Previously we would iterate over every item in the subset, checking if it was in the provided args. This often meant iterating over every rev in the repo. Now we iterate over the args provided, checking if they exist in the subset. On a large repo this brings setting phase boundaries (which use this revset roots(X:: - X::Y)) down from 0.8 seconds to 0.4 seconds. The "roots((tip~100::) - (tip~100::tip))" revset in revsetbenchmarks shows it going from 0.12s to 0.10s, so we should be able to catch regressions here in the future. This actually introduces a regression in 'roots(all())' (0.2s to 0.26s) since we're now using spansets, which are slightly slower to do containment checks on. I believe this trade off is worth it, since it makes the revset O(number of args) instead of O(size of repo). diff --git a/mercurial/revset.py b/mercurial/revset.py --- a/mercurial/revset.py +++ b/mercurial/revset.py @@ -1480,8 +1480,8 @@ """``roots(set)`` Changesets in set with no parent changeset in set. """ - s = getset(repo, baseset(repo.changelog), x).set() - subset = baseset([r for r in subset if r in s]) + s = getset(repo, spanset(repo), x).set() + subset = baseset([r for r in s if r in subset.set()]) cs = _children(repo, subset, s) return subset - cs