From patchwork Sun Nov 30 01:57:42 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [2, of, 4] obsstore: disable garbage collection during initialisation (issue4456)
From: Pierre-Yves David
X-Patchwork-Id: 6890
Message-Id:
To: mercurial-devel@selenic.com
Cc: Pierre-Yves David
Date: Sat, 29 Nov 2014 17:57:42 -0800

# HG changeset patch
# User Pierre-Yves David
# Date 1417049911 28800
#      Wed Nov 26 16:58:31 2014 -0800
# Node ID d0f3dac4ea2b4aff51946c7db0834aa4e5c3e82a
# Parent  04eb7e49d2b6f90f71aa85de9ad0b4d70670d688
obsstore: disable garbage collection during initialisation (issue4456)

Python garbage collection is triggered by container creation, so code that
creates a lot of tuples tends to trigger the GC a lot. We disable the GC during
obsolescence marker parsing and the associated initialisation. This provides an
interesting speedup (25%).

On my 58758 markers repo:
before: 0.468247 seconds
after:  0.344362 seconds

Thanks go to Siddharth Agarwal for the lead.

diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
--- a/mercurial/obsolete.py
+++ b/mercurial/obsolete.py
@@ -66,10 +66,11 @@ The file starts with a version header:
 The header is followed by the markers. Marker format depend of the version. See
 comment associated with each format for details.
 
 """
 import struct
+import gc
 import util, base85, node
 import phases
 from i18n import _
 
 _pack = struct.pack
@@ -466,12 +467,28 @@ class obsstore(object):
         self.sopener = sopener
         data = sopener.tryread('obsstore')
         self._version = defaultformat
         self._readonly = readonly
         if data:
-            self._version, markers = _readmarkers(data)
-            self._load(markers)
+            # Python's garbage collector triggers a GC each time a certain
+            # number of container objects (the number being defined by
+            # gc.get_threshold()) are allocated. Marker parsing creates
+            # multiple tuples for each marker, so the GC is triggered a lot
+            # when parsing a large number of markers. As a workaround,
+            # disable GC during initialisation.
+            #
+            # This would probably also help marker parsing during exchange, but
+            # I do not expect the order of magnitude to matter outside of the
+            # initialisation case.
+            gcenabled = gc.isenabled()
+            gc.disable()
+            try:
+                self._version, markers = _readmarkers(data)
+                self._load(markers)
+            finally:
+                if gcenabled:
+                    gc.enable()
 
     def __iter__(self):
         return iter(self._all)
 
     def __len__(self):
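
The save, disable, restore pattern used in the patch can be sketched outside Mercurial as a small context manager. This is only an illustration under stated assumptions: `gc_paused` and `build_markers` are hypothetical names that do not exist in Mercurial, and `build_markers` merely stands in for marker parsing by allocating many small tuples.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic GC inside the block, then restore its prior state.

    Hypothetical helper for illustration; not part of Mercurial.
    """
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Re-enable only if the collector was on when we entered, so a
        # caller that had already disabled the GC is left undisturbed.
        if was_enabled:
            gc.enable()


def build_markers(count):
    """Stand-in for marker parsing: allocates many small tuples."""
    return [(i, (i + 1, i + 2), 0, ()) for i in range(count)]


with gc_paused():
    markers = build_markers(100000)
```

Recording `gc.isenabled()` before disabling matters for the same reason as in the patch: the block should not turn the collector on for a caller that had deliberately turned it off.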