Patchwork [3,of,3] revlog: optimize _chunkraw when startrev==endrev

Submitter Gregory Szorc
Date Nov. 2, 2016, 1:25 a.m.
Message ID <4c96a177e4ff6a63156e.1478049947@ubuntu-vm-main>
Permalink /patch/17261/
State Accepted

Comments

Gregory Szorc - Nov. 2, 2016, 1:25 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1477244433 25200
#      Sun Oct 23 10:40:33 2016 -0700
# Node ID 4c96a177e4ff6a63156ec8cf49483a78e4437756
# Parent  0c41c0cb9b1ef7df40a30672927229ac195b1c92
revlog: optimize _chunkraw when startrev==endrev

In many cases, _chunkraw() is called with startrev==endrev. When
this is true, we can avoid an extra index lookup and some other
minor operations.

On the mozilla-unified repo, `hg perfrevlogchunks -c` says this
has the following impact:

! read w/ reused fd
! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)

! read batch w/ reused fd
! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)

So, we've gone from ~25x slower than batch to ~22.5% slower.

At this point, there's probably not much else we can do except
implement an optimized function in the index itself, including in C.
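The "best of N" figures above come from repeated timing runs that keep the fastest wall-clock result. A rough, hypothetical sketch of that measurement style (the real command lives in Mercurial's contrib/perf.py and is more elaborate):

```python
import time

def best_of(fn, runs=30):
    """Call fn() `runs` times and return the best wall-clock time,
    mimicking the "wall ... (best of N)" lines in the output above."""
    best = float('inf')
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best

# Example with stand-in workloads (not actual revlog reads).
data = list(range(100000))
per_item = best_of(lambda: [x + 1 for x in data], runs=5)
batch = best_of(lambda: sum(data), runs=5)
print('per-item best: %.6fs, batch best: %.6fs' % (per_item, batch))
```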
Yuya Nishihara - Nov. 4, 2016, 8:27 a.m.
On Tue, 01 Nov 2016 18:25:47 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1477244433 25200
> #      Sun Oct 23 10:40:33 2016 -0700
> # Node ID 4c96a177e4ff6a63156ec8cf49483a78e4437756
> # Parent  0c41c0cb9b1ef7df40a30672927229ac195b1c92
> revlog: optimize _chunkraw when startrev==endrev

LGTM, queued the series, thanks.

> In many cases, _chunkraw() is called with startrev==endrev. When
> this is true, we can avoid an extra index lookup and some other
> minor operations.
> 
> On the mozilla-unified repo, `hg perfrevlogchunks -c` says this
> has the following impact:
> 
> ! read w/ reused fd
> ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
> ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)
> 
> ! read batch w/ reused fd
> ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
> ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)
> 
> So, we've gone from ~25x slower than batch to ~22.5% slower.
> 
> At this point, there's probably not much else we can do except
> implement an optimized function in the index itself, including in C.

or in Rust :-)
Pierre-Yves David - Nov. 6, 2016, 3:16 p.m.
On 11/04/2016 09:27 AM, Yuya Nishihara wrote:
> On Tue, 01 Nov 2016 18:25:47 -0700, Gregory Szorc wrote:
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1477244433 25200
>> #      Sun Oct 23 10:40:33 2016 -0700
>> # Node ID 4c96a177e4ff6a63156ec8cf49483a78e4437756
>> # Parent  0c41c0cb9b1ef7df40a30672927229ac195b1c92
>> revlog: optimize _chunkraw when startrev==endrev
>
> LGTM, queued the series, thanks.
>
>> In many cases, _chunkraw() is called with startrev==endrev. When
>> this is true, we can avoid an extra index lookup and some other
>> minor operations.
>>
>> On the mozilla-unified repo, `hg perfrevlogchunks -c` says this
>> has the following impact:
>>
>> ! read w/ reused fd
>> ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
>> ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)
>>
>> ! read batch w/ reused fd
>> ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
>> ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)
>>
>> So, we've gone from ~25x slower than batch to ~22.5% slower.

I'm double-reviewing this and I do not understand these numbers. I see 
where your 25x number comes from, but I fail to get what this ~22.5% 
slower one is about. Can you clarify?

>> At this point, there's probably not much else we can do except
>> implement an optimized function in the index itself, including in C.

+1, if we get a significant high-level performance boost, it makes sense 
to move more of the revlog code into C (or another compiled language).

Cheers,
Gregory Szorc - Nov. 6, 2016, 5:31 p.m.
On Sun, Nov 6, 2016 at 7:16 AM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 11/04/2016 09:27 AM, Yuya Nishihara wrote:
>
>> On Tue, 01 Nov 2016 18:25:47 -0700, Gregory Szorc wrote:
>>
>>> # HG changeset patch
>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>> # Date 1477244433 25200
>>> #      Sun Oct 23 10:40:33 2016 -0700
>>> # Node ID 4c96a177e4ff6a63156ec8cf49483a78e4437756
>>> # Parent  0c41c0cb9b1ef7df40a30672927229ac195b1c92
>>> revlog: optimize _chunkraw when startrev==endrev
>>>
>>
>> LGTM, queued the series, thanks.
>>
>> In many cases, _chunkraw() is called with startrev==endrev. When
>>> this is true, we can avoid an extra index lookup and some other
>>> minor operations.
>>>
>>> On the mozilla-unified repo, `hg perfrevlogchunks -c` says this
>>> has the following impact:
>>>
>>> ! read w/ reused fd
>>> ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
>>> ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)
>>>
>>> ! read batch w/ reused fd
>>> ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
>>> ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)
>>>
>>> So, we've gone from ~25x slower than batch to ~22.5% slower.
>>>
>>
> I'm double-reviewing this and I do not understand these numbers. I see
> where your 25x number comes from, but I fail to get what this ~22.5% slower
> one is about. Can you clarify?
>

That "22.5%" is supposed to be "22.5x."
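For reference, the arithmetic behind the corrected figure, using the wall times from the measurements quoted above:

```python
# Wall times from the perfrevlogchunks output quoted in this thread.
read_before, read_after = 0.371846, 0.337930
batch_before, batch_after = 0.014952, 0.014866

# Ratio of the individual-read path to the batch path, before and after.
print('before: %.1fx' % (read_before / batch_before))  # ~24.9x (the "~25x")
print('after:  %.1fx' % (read_after / batch_after))    # ~22.7x (the "22.5x")
```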


>
> At this point, there's probably not much else we can do except
>>> implement an optimized function in the index itself, including in C.
>>>
>>
> +1, if we get a significant high-level performance boost, it makes sense
> to move more of the revlog code into C (or another compiled language).
>
> Cheers,
>
> --
> Pierre-Yves David
>
Pierre-Yves David - Nov. 6, 2016, 6:52 p.m.
okay, I've fixed this in flight.

On 11/06/2016 06:31 PM, Gregory Szorc wrote:
> On Sun, Nov 6, 2016 at 7:16 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
> wrote:
>
>
>
>     On 11/04/2016 09:27 AM, Yuya Nishihara wrote:
>
>         On Tue, 01 Nov 2016 18:25:47 -0700, Gregory Szorc wrote:
>
>             # HG changeset patch
>             # User Gregory Szorc <gregory.szorc@gmail.com
>             <mailto:gregory.szorc@gmail.com>>
>             # Date 1477244433 25200
>             #      Sun Oct 23 10:40:33 2016 -0700
>             # Node ID 4c96a177e4ff6a63156ec8cf49483a78e4437756
>             # Parent  0c41c0cb9b1ef7df40a30672927229ac195b1c92
>             revlog: optimize _chunkraw when startrev==endrev
>
>
>         LGTM, queued the series, thanks.
>
>             In many cases, _chunkraw() is called with startrev==endrev. When
>             this is true, we can avoid an extra index lookup and some other
>             minor operations.
>
>             On the mozilla-unified repo, `hg perfrevlogchunks -c` says this
>             has the following impact:
>
>             ! read w/ reused fd
>             ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000
>             (best of 27)
>             ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000
>             (best of 30)
>
>             ! read batch w/ reused fd
>             ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000
>             (best of 197)
>             ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000
>             (best of 196)
>
>             So, we've gone from ~25x slower than batch to ~22.5% slower.
>
>
>     I'm double-reviewing this and I do not understand these numbers. I
>     see where your 25x number comes from, but I fail to get what this
>     ~22.5% slower one is about. Can you clarify?
>
>
> That "22.5%" is supposed to be "22.5x."
>
>
>
>             At this point, there's probably not much else we can do except
>             implement an optimized function in the index itself,
>             including in C.
>
>
>     +1, if we get a significant high-level performance boost, it makes
>     sense to move more of the revlog code into C (or another compiled
>     language).
>
>     Cheers,
>
>     --
>     Pierre-Yves David
>
>

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1113,9 +1113,12 @@  class revlog(object):
         # (functions are expensive).
         index = self.index
         istart = index[startrev]
-        iend = index[endrev]
         start = int(istart[0] >> 16)
-        end = int(iend[0] >> 16) + iend[1]
+        if startrev == endrev:
+            end = start + istart[1]
+        else:
+            iend = index[endrev]
+            end = int(iend[0] >> 16) + iend[1]
 
         if self._inline:
             start += (startrev + 1) * self._io.size
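Stripped of the surrounding revlog machinery, the bounds computation in this hunk can be sketched as follows. The index entries here are hypothetical tuples of (offset_and_flags, compressed_length) mimicking revlog's layout, and the inline-revlog offset adjustment is omitted:

```python
def chunk_bounds(index, startrev, endrev):
    """Return (start, end) byte offsets covering startrev..endrev.

    Mirrors the patched logic: when startrev == endrev, the second
    index lookup is skipped entirely.
    """
    istart = index[startrev]
    start = int(istart[0] >> 16)       # offset is packed in the high bits
    if startrev == endrev:
        # Fast path: the end is just start plus this entry's length.
        end = start + istart[1]
    else:
        iend = index[endrev]
        end = int(iend[0] >> 16) + iend[1]
    return start, end

# Example: two revisions of 100 and 50 compressed bytes at offsets 0 and 100.
index = [(0 << 16, 100), (100 << 16, 50)]
print(chunk_bounds(index, 0, 0))   # (0, 100)
print(chunk_bounds(index, 1, 1))   # (100, 150)
print(chunk_bounds(index, 0, 1))   # (0, 150)
```

Either branch yields the same bounds for a single revision; the fast path simply avoids the extra `index[endrev]` lookup that dominated the single-revision case.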