Patchwork [3,of,6] obsolete: populate successors(), precursors(), children() lazily

login
register
mail settings
Submitter Martin von Zweigbergk
Date Feb. 4, 2015, 4:01 a.m.
Message ID <cd65758885ab3cc66490.1423022470@martinvonz.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/7651/
State Superseded
Headers show

Comments

Martin von Zweigbergk - Feb. 4, 2015, 4:01 a.m.
# HG changeset patch
# User Martin von Zweigbergk <martinvonz@google.com>
# Date 1421822097 28800
#      Tue Jan 20 22:34:57 2015 -0800
# Node ID cd65758885ab3cc6649001fa9aa68105d818323a
# Parent  518d3caf2307a7699ec078afe9b1e6991640941c
obsolete: populate successors(), precursors(), children() lazily

The precursors and children dictionaries are not used by many
commands. By making them lazily populated, plain 'hg status' is sped
up from 0.561s to 0.438s on my hg.hg repo with 73k markers. Also make
sucessors lazily populated, mostly for consistency with the others.
Martin von Zweigbergk - Feb. 4, 2015, 11:02 p.m.
On Tue Feb 03 2015 at 8:01:14 PM Martin von Zweigbergk <
martinvonz@google.com> wrote:

> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz@google.com>
> # Date 1421822097 28800
> #      Tue Jan 20 22:34:57 2015 -0800
> # Node ID cd65758885ab3cc6649001fa9aa68105d818323a
> # Parent  518d3caf2307a7699ec078afe9b1e6991640941c
> obsolete: populate successors(), precursors(), children() lazily
>
> The precursors and children dictionaries are not used by many
> commands. By making them lazily populated, plain 'hg status' is sped
> up from 0.561s to 0.438s on my hg.hg repo with 73k markers. Also make
> sucessors lazily populated, mostly for consistency with the others.
>

An alternative to the "plain 'hg status'..." that isn't vulnerable to the
argument that 'hg status' shouldn't even read markers is this:

"By making them lazily populated, 'hg log --limit 10' is sped up from
0.538s to 0.413s on my hg.hg repo with 73k markers."

Feel free to use that sentence instead if applying this patch.
Matt Mackall - Feb. 5, 2015, 12:06 a.m.
On Tue, 2015-02-03 at 20:01 -0800, Martin von Zweigbergk wrote:
> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz@google.com>
> # Date 1421822097 28800
> #      Tue Jan 20 22:34:57 2015 -0800
> # Node ID cd65758885ab3cc6649001fa9aa68105d818323a
> # Parent  518d3caf2307a7699ec078afe9b1e6991640941c
> obsolete: populate successors(), precursors(), children() lazily

>      def successors(self):
> +        if self._successors is None:
> +            self._successors = {}
> +            self._addsuccessors(self._all)
>          return self._successors

You probably want to look at how util.propertycache is used to make a
faster version of this pattern.
Martin von Zweigbergk - Feb. 5, 2015, 6:50 a.m.
On Wed Feb 04 2015 at 4:06:36 PM Matt Mackall <mpm@selenic.com> wrote:

> On Tue, 2015-02-03 at 20:01 -0800, Martin von Zweigbergk wrote:
> > # HG changeset patch
> > # User Martin von Zweigbergk <martinvonz@google.com>
> > # Date 1421822097 28800
> > #      Tue Jan 20 22:34:57 2015 -0800
> > # Node ID cd65758885ab3cc6649001fa9aa68105d818323a
> > # Parent  518d3caf2307a7699ec078afe9b1e6991640941c
> > obsolete: populate successors(), precursors(), children() lazily
>
> >      def successors(self):
> > +        if self._successors is None:
> > +            self._successors = {}
> > +            self._addsuccessors(self._all)
> >          return self._successors
>
> You probably want to look at how util.propertycache is used to make a
> faster version of this pattern.


Thanks. I'm not sure "faster" is important here, but it does seem to
simplify as well. I'll send V2 of patches 1/6-3/6 soon, with patch 2
replaced because it's no longer need with @propertycache.

Patch

diff -r 518d3caf2307 -r cd65758885ab mercurial/obsolete.py
--- a/mercurial/obsolete.py	Tue Jan 20 22:33:04 2015 -0800
+++ b/mercurial/obsolete.py	Tue Jan 20 22:34:57 2015 -0800
@@ -470,15 +470,16 @@ 
         """The flags field of the marker"""
         return self._data[2]
 
-def _checkinvalidmarkers(obsstore):
+def _checkinvalidmarkers(obsstore, markers):
     """search for marker with invalid data and raise error if needed
 
     Exist as a separated function to allow the evolve extension for a more
     subtle handling.
     """
-    if node.nullid in obsstore._precursors:
-        raise util.Abort(_('bad obsolescence marker detected: '
-                           'invalid successors nullid'))
+    for mark in markers:
+        if node.nullid in mark[1]:
+            raise util.Abort(_('bad obsolescence marker detected: '
+                               'invalid successors nullid'))
 
 class obsstore(object):
     """Store obsolete markers
@@ -502,16 +503,16 @@ 
         # caches for various obsolescence related cache
         self.caches = {}
         self._all = []
-        self._precursors = {}
-        self._successors = {}
-        self._children = {}
+        self._precursors = None
+        self._successors = None
+        self._children = None
         self.sopener = sopener
         data = sopener.tryread('obsstore')
         self._version = defaultformat
         self._readonly = readonly
         if data:
             self._version, markers = _readmarkers(data)
-            self._load(markers)
+            self._setmarkers(markers)
 
     def __iter__(self):
         return iter(self._all)
@@ -589,7 +590,7 @@ 
                 # XXX: f.close() == filecache invalidation == obsstore rebuilt.
                 # call 'filecacheentry.refresh()'  here
                 f.close()
-            self._load(new)
+            self._addmarkers(new)
             # new marker *may* have changed several set. invalidate the cache.
             self.caches.clear()
         # records the number of new markers for the transaction hooks
@@ -624,21 +625,39 @@ 
                     self._children.setdefault(p, set()).add(mark)
 
     def successors(self):
+        if self._successors is None:
+            self._successors = {}
+            self._addsuccessors(self._all)
         return self._successors
 
     def precursors(self):
+        if self._precursors is None:
+            self._precursors = {}
+            self._addprecursors(self._all)
         return self._precursors
 
     def children(self):
+        if self._children is None:
+            self._children = {}
+            self._addchildren(self._all)
         return self._children
 
-    def _load(self, markers):
+    def _setmarkers(self, markers):
+        self._all = list(markers)
+        self._successors = None
+        self._precursors = None
+        self._children = None
+
+    def _addmarkers(self, markers):
         markers = list(markers) # to allow repeated iteration
         self._all.extend(markers)
-        self._addsuccessors(markers)
-        self._addprecursors(markers)
-        self._addchildren(markers)
-        _checkinvalidmarkers(self)
+        if self._successors is not None:
+            self._addsuccessors(markers)
+        if self._precursors is not None:
+            self._addprecursors(markers)
+        if self._children is not None:
+            self._addchildren(markers)
+        _checkinvalidmarkers(self, markers)
 
     def relevantmarkers(self, nodes):
         """return a set of all obsolescence markers relevant to a set of nodes.