Patchwork [2,of,3] revlog: inline start() and end() for perf reasons

Submitter Gregory Szorc
Date Nov. 2, 2016, 1:25 a.m.
Message ID <0c41c0cb9b1ef7df40a3.1478049946@ubuntu-vm-main>
Permalink /patch/17260/
State Accepted

Comments

Gregory Szorc - Nov. 2, 2016, 1:25 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1477176083 25200
#      Sat Oct 22 15:41:23 2016 -0700
# Node ID 0c41c0cb9b1ef7df40a30672927229ac195b1c92
# Parent  c0fa82e1ba8031de5e2c8fe03e56bea66924c9fa
revlog: inline start() and end() for perf reasons

When I implemented `hg perfrevlogchunks`, one of the things that
stood out was that N separate _chunk() calls were ~38x slower than a
single _chunks() call. Specifically, on the mozilla-unified repo:

N*_chunk:  0.528997s
1*_chunks: 0.013735s

This repo has 352,097 changesets. So the average time per changeset
comes out to:

N*_chunk:  1.502us
1*_chunks: 0.039us

If you extrapolate these numbers to a repository with 1M changesets,
that comes out to 1.502s versus 0.039s, which is significant.
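The extrapolation is straightforward arithmetic; a quick sketch using the timings and repo size quoted above:

```python
# Timings from `hg perfrevlogchunks` on mozilla-unified (quoted above).
n_chunk_total = 0.528997   # seconds for N separate _chunk() calls
chunks_total = 0.013735    # seconds for a single _chunks() call
num_revs = 352097          # changesets in the repo

# Average cost per changeset, in microseconds.
per_rev_chunk = n_chunk_total / num_revs * 1e6    # ~1.502 us
per_rev_chunks = chunks_total / num_revs * 1e6    # ~0.039 us

# Extrapolate to a hypothetical repository with 1M changesets.
million_chunk = per_rev_chunk * 1000000 / 1e6     # ~1.502 s
million_chunks = per_rev_chunks * 1000000 / 1e6   # ~0.039 s
```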

At these latencies, Python attribute lookups and function calls
matter. So, this patch inlines some code to cut down on that overhead.
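The kind of overhead being inlined away can be illustrated with a toy sketch (this `Revlog` class is a hypothetical stand-in, not Mercurial's code; the index here just maps rev -> (offset_flags, length)):

```python
class Revlog:
    """Toy stand-in: index entries pack offset/flags and a length."""
    def __init__(self, n):
        self.index = [(r << 16, 10) for r in range(n)]

    def start(self, rev):
        return int(self.index[rev][0] >> 16)

    def end(self, rev):
        e = self.index[rev]
        return int(e[0] >> 16) + e[1]

rl = Revlog(100000)

def via_methods():
    # One attribute lookup plus a method call per access.
    return [rl.end(r) - rl.start(r) for r in range(100000)]

def inlined():
    # Hoist the index into a local once; do the arithmetic in place.
    index = rl.index
    out = []
    for r in range(100000):
        e = index[r]
        out.append((int(e[0] >> 16) + e[1]) - int(e[0] >> 16))
    return out

# Both produce identical results; on CPython the inlined loop avoids
# per-iteration attribute lookups and call-frame setup, so it is
# typically measurably faster (compare with the timeit module).
assert via_methods() == inlined()
```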

The impact of this patch on N*_chunk() calls is clear:

! wall 0.528997 comb 0.520000 user 0.500000 sys 0.020000 (best of 19)
! wall 0.367723 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)

So, we go from ~38x slower to ~27x. A nice improvement. But there's
still a long way to go.

It's worth noting that functionality like revsets performs changelog
lookups one revision at a time, so this code path is worth optimizing.

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1109,8 +1109,14 @@  class revlog(object):
         Callers will need to call ``self.start(rev)`` and ``self.length(rev)``
         to determine where each revision's data begins and ends.
         """
-        start = self.start(startrev)
-        end = self.end(endrev)
+        # Inlined self.start(startrev) & self.end(endrev) for perf reasons
+        # (functions are expensive).
+        index = self.index
+        istart = index[startrev]
+        iend = index[endrev]
+        start = int(istart[0] >> 16)
+        end = int(iend[0] >> 16) + iend[1]
+
         if self._inline:
             start += (startrev + 1) * self._io.size
             end += (endrev + 1) * self._io.size
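For context on the `>> 16` in the inlined code: a revlog index entry packs the data offset and 16 bits of flags into its first field, with the offset in the high bits. A minimal sketch of that packing scheme (helper names here are illustrative):

```python
def offset_type(offset, flags):
    # Pack a byte offset and 16 flag bits into a single integer,
    # mirroring the layout of the first field of a revlog index entry.
    return (offset << 16) | (flags & 0xFFFF)

def unpack_offset(offset_flags):
    # Recover the byte offset by dropping the 16 flag bits.
    return offset_flags >> 16

# Toy index entry: (packed offset/flags, compressed length).
entry = (offset_type(12345, 0), 678)

start = unpack_offset(entry[0])   # what self.start(rev) computes
end = start + entry[1]            # what self.end(rev) computes
```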