Patchwork [1,of,6] revlog: add a fast method for getting a list of chunks

Submitter Siddharth Agarwal
Date Sept. 7, 2013, 8:40 p.m.
Message ID <338cb41f98bf07596ba4.1378586435@dev1091.prn1.facebook.com>
Permalink /patch/2405/
State Accepted
Commit c2e27e57d250703aa386c3d638087da1f17169d2

Comments

Siddharth Agarwal - Sept. 7, 2013, 8:40 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1378510295 25200
#      Fri Sep 06 16:31:35 2013 -0700
# Node ID 338cb41f98bf07596ba4b3b29d624ed77b4f18a1
# Parent  b36dabbc3a3aab387df0a56e432e067bc1c14861
revlog: add a fast method for getting a list of chunks

This adds _chunks, which inlines the work of _chunkraw into a single loop over
the requested revisions. Doing that speeds up revlog decompression, manifest
decompression in particular. For a 20 MB manifest which is the result of a
delta chain longer than 40k, hg perfmanifest improves from 0.55 seconds to
0.49 seconds.
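
For readers who want to try the pattern outside Mercurial, here is a minimal
sketch of the technique _chunks relies on: bind the methods used in the hot
loop to local names once, then loop. ToyLog and its storage layout are
hypothetical stand-ins, not Mercurial's revlog; the sketch ignores the inline
index offset that the real method handles and uses zlib directly in place of
revlog's decompress.

```python
import zlib


class ToyLog(object):
    """Hypothetical stand-in for a revlog: compressed chunks in one buffer."""

    def __init__(self, texts):
        self._data = b''
        self._index = []  # one (offset, length) pair per revision
        for text in texts:
            comp = zlib.compress(text)
            self._index.append((len(self._data), len(comp)))
            self._data += comp

    def start(self, rev):
        return self._index[rev][0]

    def length(self, rev):
        return self._index[rev][1]

    def _getchunk(self, offset, length):
        return self._data[offset:offset + length]

    def _chunk(self, rev):
        # One revision at a time: each call repeats the attribute lookups.
        return zlib.decompress(
            self._getchunk(self.start(rev), self.length(rev)))

    def _chunks(self, revs):
        # Batch version: look up the bound methods once, outside the loop.
        start = self.start
        length = self.length
        getchunk = self._getchunk
        decompress = zlib.decompress

        out = []
        add = out.append
        for rev in revs:
            add(decompress(getchunk(start(rev), length(rev))))
        return out


log = ToyLog([b'alpha', b'beta', b'gamma'])
revs = [0, 1, 2]
# Both forms return the same list; only the per-item overhead differs.
assert log._chunks(revs) == [log._chunk(r) for r in revs]
```

The two methods are interchangeable in results, so any gain comes purely from
skipping repeated attribute lookups and one level of method call per revision.
How much that matters depends on chain length and interpreter; the timings
quoted above are for the real revlog, not for this toy.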

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -853,6 +853,28 @@ 
     def _chunk(self, rev):
         return decompress(self._chunkraw(rev, rev))
 
+    def _chunks(self, revs):
+        '''faster version of [self._chunk(rev) for rev in revs]
+
+        Assumes that revs is in ascending order.'''
+        start = self.start
+        length = self.length
+        inline = self._inline
+        iosize = self._io.size
+        getchunk = self._getchunk
+
+        l = []
+        ladd = l.append
+
+        for rev in revs:
+            chunkstart = start(rev)
+            if inline:
+                chunkstart += (rev + 1) * iosize
+            chunklength = length(rev)
+            ladd(decompress(getchunk(chunkstart, chunklength)))
+
+        return l
+
     def _chunkbase(self, rev):
         return self._chunk(rev)
 
@@ -933,7 +955,7 @@ 
         if text is None:
             text = str(self._chunkbase(base))
 
-        bins = [self._chunk(r) for r in chain]
+        bins = self._chunks(chain)
         text = mdiff.patches(text, bins)
 
         text = self._checkhash(text, node, rev)