Patchwork [4,of,6,V2] obsstore: disable garbage collection during initialisation (issue4456)

Submitter Pierre-Yves David
Date Dec. 4, 2014, 3:16 p.m.
Message ID <dcf878af15d0eb3a7997.1417706177@marginatus.alto.octopoid.net>
Permalink /patch/7000/
State Accepted

Comments

Pierre-Yves David - Dec. 4, 2014, 3:16 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1417049911 28800
#      Wed Nov 26 16:58:31 2014 -0800
# Node ID dcf878af15d0eb3a7997331faac73a0fa5ca736b
# Parent  d046747f367d29efd5fc1802d7f37be8bc416125
obsstore: disable garbage collection during initialisation (issue4456)

Python garbage collection is triggered by container creation, so code that
creates a lot of tuples tends to trigger the GC frequently. We disable the GC
during obsolescence marker parsing and the associated initialization. This
provides a noticeable speedup (about 25%).

Marker loading function on my repo with 58758 markers:
before: 0.468247 seconds
after:  0.344362 seconds

The benefit is a bit less visible overall. With python2.6 on my system I see:
before: 0.60
after:  0.53

The smaller overall gain is probably explained by a costly GC run being delayed
rather than avoided (but there is still a win). Marking the involved tuples,
lists and dicts as ignorable by the garbage collector should give us more
benefit, but that is another adventure.

Thanks go to Siddharth Agarwal for the lead.
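
The util.nogc decorator used by this patch is not part of the diff shown here.
As a rough illustration only, a decorator of this kind could be written as
below. The names nogc and build_markers are hypothetical stand-ins and this is
not Mercurial's actual implementation; the sketch only assumes the standard
library gc module.

```python
import functools
import gc


def nogc(func):
    """Hypothetical sketch: run func with the cyclic GC switched off."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Remember the current state so nested or already-disabled use is safe.
        was_enabled = gc.isenabled()
        gc.disable()
        try:
            return func(*args, **kwargs)
        finally:
            # Restore the collector only if it was on when we entered.
            if was_enabled:
                gc.enable()
    return wrapper


@nogc
def build_markers(n):
    # Stand-in for marker parsing: allocates many tuples, the pattern that
    # would otherwise cause repeated collections while the list is built.
    return [(i, (i + 1,), 0) for i in range(n)]
```

Reference counting still frees objects while the collector is disabled; only
cycle detection is postponed, which fits the observation above that part of
the cost may simply move to a later collection.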

Patch

diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
--- a/mercurial/obsolete.py
+++ b/mercurial/obsolete.py
@@ -375,10 +375,11 @@  def _fm1encodeonemarker(marker):
 # mapping to read/write various marker formats
 # <version> -> (decoder, encoder)
 formats = {_fm0version: (_fm0readmarkers, _fm0encodeonemarker),
            _fm1version: (_fm1readmarkers, _fm1encodeonemarker)}
 
+@util.nogc
 def _readmarkers(data):
     """Read and enumerate markers from raw data"""
     off = 0
     diskversion = _unpack('>B', data[off:off + 1])[0]
     off += 1
@@ -560,10 +561,11 @@  class obsstore(object):
 
         Returns the number of new markers added."""
         version, markers = _readmarkers(data)
         return self.add(transaction, markers)
 
+    @util.nogc
     def _load(self, markers):
         for mark in markers:
             self._all.append(mark)
             pre, sucs = mark[:2]
             self.successors.setdefault(pre, set()).add(mark)