Submitter | Gregory Szorc |
---|---|
Date | Nov. 2, 2016, 1:25 a.m. |
Message ID | <4c96a177e4ff6a63156e.1478049947@ubuntu-vm-main> |
Download | mbox | patch |
Permalink | /patch/17261/ |
State | Accepted |
Comments
On Tue, 01 Nov 2016 18:25:47 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1477244433 25200
> #      Sun Oct 23 10:40:33 2016 -0700
> # Node ID 4c96a177e4ff6a63156ec8cf49483a78e4437756
> # Parent  0c41c0cb9b1ef7df40a30672927229ac195b1c92
> revlog: optimize _chunkraw when startrev==endrev

LGTM, queued the series, thanks.

> In many cases, _chunkraw() is called with startrev==endrev. When
> this is true, we can avoid an extra index lookup and some other
> minor operations.
>
> On the mozilla-unified repo, `hg perfrevlogchunks -c` says this
> has the following impact:
>
> ! read w/ reused fd
> ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
> ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)
>
> ! read batch w/ reused fd
> ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
> ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)
>
> So, we've gone from ~25x slower than batch to ~22.5% slower.
>
> At this point, there's probably not much else we can do except
> implement an optimized function in the index itself, including in C.

or in Rust :-)
On 11/04/2016 09:27 AM, Yuya Nishihara wrote:
> On Tue, 01 Nov 2016 18:25:47 -0700, Gregory Szorc wrote:
>> revlog: optimize _chunkraw when startrev==endrev
>
> LGTM, queued the series, thanks.
>
>> [...]
>>
>> ! read w/ reused fd
>> ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
>> ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)
>>
>> ! read batch w/ reused fd
>> ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
>> ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)
>>
>> So, we've gone from ~25x slower than batch to ~22.5% slower.

I'm double-reviewing this and I do not understand these numbers. I see
where your 25x number comes from, but I fail to get what this ~22.5%
slower one is about. Can you clarify?

>> At this point, there's probably not much else we can do except
>> implement an optimized function in the index itself, including in C.

+1, if we get a significant high-level performance boost it makes sense
to move more of the revlog code into C (or another compiled language).

Cheers,
On Sun, Nov 6, 2016 at 7:16 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
> [...]
>
> I'm double-reviewing this and I do not understand these numbers. I see
> where your 25x number comes from, but I fail to get what this ~22.5%
> slower one is about. Can you clarify?

That "22.5%" is supposed to be "22.5x."
okay, I've fixed this in flight.

On 11/06/2016 06:31 PM, Gregory Szorc wrote:
> [...]
>
> That "22.5%" is supposed to be "22.5x."
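For reference, the figures being discussed fall straight out of the wall times
quoted in the commit message; a quick sanity check in Python (numbers copied
from the benchmark output above):

    # "read w/ reused fd" wall times before and after the patch
    read_before, read_after = 0.371846, 0.337930
    # "read batch w/ reused fd" wall times before and after the patch
    batch_before, batch_after = 0.014952, 0.014866

    print(read_before / batch_before)  # ~24.9 -> the "~25x slower" figure
    print(read_after / batch_after)    # ~22.7 -> the "~22.5x slower" figure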
Patch
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1113,9 +1113,12 @@ class revlog(object):
         # (functions are expensive).
         index = self.index
         istart = index[startrev]
-        iend = index[endrev]
         start = int(istart[0] >> 16)
-        end = int(iend[0] >> 16) + iend[1]
+        if startrev == endrev:
+            end = start + istart[1]
+        else:
+            iend = index[endrev]
+            end = int(iend[0] >> 16) + iend[1]
 
         if self._inline:
             start += (startrev + 1) * self._io.size
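To make the hunk easier to follow, here is a minimal, self-contained sketch of
the byte-range computation it changes. This is not Mercurial's actual code;
the toy index below models only the two fields the diff touches: the chunk's
byte offset packed into the high bits of entry[0] (hence the >> 16) and its
compressed length in entry[1].

    def chunk_range(index, startrev, endrev):
        """Return (start, end) byte offsets covering revs startrev..endrev."""
        istart = index[startrev]
        start = int(istart[0] >> 16)
        if startrev == endrev:
            # Fast path added by the patch: reuse the entry we already
            # fetched instead of doing a second index lookup.
            end = start + istart[1]
        else:
            iend = index[endrev]
            end = int(iend[0] >> 16) + iend[1]
        return start, end

    # Toy index: (offset << 16, compressed length) per revision.
    toy_index = [(0 << 16, 100), (100 << 16, 50), (150 << 16, 80)]
    print(chunk_range(toy_index, 1, 1))  # (100, 150) -- single-rev fast path
    print(chunk_range(toy_index, 0, 2))  # (0, 230)   -- multi-rev path

The saving per call is tiny, but _chunkraw() is hot enough on large
repositories that skipping one index lookup shows up in the benchmark quoted
above.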