Patchwork [2,of,2] revset: optimize not public revset

login
register
mail settings
Submitter Laurent Charignon
Date May 4, 2015, 8:06 p.m.
Message ID <c332e7ca0fc826d5626c.1430769966@lcharignon-mbp.local>
Download mbox | patch
Permalink /patch/8874/
State Superseded
Commit 08d1ef09ed371bfa6982ea67f6c92b495d42c520
Delegated to: Augie Fackler
Headers show

Comments

Laurent Charignon - May 4, 2015, 8:06 p.m.
# HG changeset patch
# User Laurent Charignon <lcharignon@fb.com>
# Date 1429911030 25200
#      Fri Apr 24 14:30:30 2015 -0700
# Node ID c332e7ca0fc826d5626cd743a3f9e9ccf8b8c865
# Parent  73bd80fd6a2f41805e81f981a6fd48c04a496300
revset: optimize not public revset

This patvh speeds up the computation of the not public() changeset
and incidentally speed up the computation of divergents() changeset on our big
repo by 100x from 50% to 0.5% of the time spent in smartlog with evolve.

In this patch we optimize not public() to _notpublic() (new revset) and use
the work on phaseset (from the previous commit) to be able to compute
_notpublic() quickly.

We use a non-lazy approach making the assumption the number of notpublic
change will not be in the order of magnitude of the repo size. Adopting a
lazy approach gives a speedup of 5x (vs 100x) only due to the overhead of the
code for lazy generation.

Patch

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -1483,6 +1483,22 @@ 
     except error.RepoLookupError:
         return baseset()
 
+def _notpublic(repo, subset, x):
+    """``_notpublic()``
+    Changeset not in public phase."""
+    # i18n: "public" is a keyword
+    getargs(x, 0, 0, _("_notpublic takes no arguments"))
+    if repo._phasecache._phasesets:
+        s = set()
+        for u in repo._phasecache._phasesets[1:]:
+            s.update(u)
+        return subset & s
+    else:
+        phase = repo._phasecache.phase
+        target = phases.public
+        condition = lambda r: phase(repo, r) != target
+        return subset.filter(condition, cache=False)
+
 def public(repo, subset, x):
     """``public()``
     Changeset in public phase."""
@@ -1989,6 +2005,7 @@ 
     "parents": parents,
     "present": present,
     "public": public,
+    "_notpublic": _notpublic,
     "remote": remote,
     "removes": removes,
     "rev": rev,
@@ -2063,6 +2080,7 @@ 
     "parents",
     "present",
     "public",
+    "_notpublic",
     "remote",
     "removes",
     "rev",
@@ -2154,8 +2172,14 @@ 
             wb, wa = wa, wb
         return max(wa, wb), (op, ta, tb)
     elif op == 'not':
-        o = optimize(x[1], not small)
-        return o[0], (op, o[1])
+        # Optimize not public() to _notpublic() because we have a fast version
+        if x[1] == ('func', ('symbol', 'public'), None):
+            newsym =  ('func', ('symbol', '_notpublic'), None)
+            o = optimize(newsym, not small)
+            return o[0], o[1]
+        else:
+            o = optimize(x[1], not small)
+            return o[0], (op, o[1])
     elif op == 'parentpost':
         o = optimize(x[1], small)
         return o[0], (op, o[1])