Patchwork localrepo: avoid unnecessary conversion from node to rev

login
register
mail settings
Submitter Stanislau Hlebik
Date Feb. 2, 2017, 10:57 a.m.
Message ID <13a528b72173b1228f6e.1486033044@devvm1840.lla2.facebook.com>
Download mbox | patch
Permalink /patch/18292/
State Accepted
Headers show

Comments

Stanislau Hlebik - Feb. 2, 2017, 10:57 a.m.
# HG changeset patch
# User Stanislau Hlebik <stash@fb.com>
# Date 1486032998 28800
#      Thu Feb 02 02:56:38 2017 -0800
# Node ID 13a528b72173b1228f6eeb0ffc2346e7b78d1d78
# Parent  abf029200e198878a4576a87e095bd8d77d9cea9
localrepo: avoid unnecessary conversion from node to rev

changelog.heads() first calls headrevs then converts them to nodes.
localrepo.heads() then sorts them using self.changelog.rev function and makes
useless conversion back to revs. Instead let's call changelog.headrevs() from
localrepo.heads(), sort the output and then convert to nodes. Because headrevs
does not support start parameter this optimization only works if start is None.
Jun Wu - Feb. 2, 2017, 3:06 p.m.
This patch looks good to me. See inline comment about how to make it faster.
That could probably be fixed in flight.

Excerpts from Stanislau Hlebik's message of 2017-02-02 02:57:24 -0800:
> # HG changeset patch
> # User Stanislau Hlebik <stash@fb.com>
> # Date 1486032998 28800
> #      Thu Feb 02 02:56:38 2017 -0800
> # Node ID 13a528b72173b1228f6eeb0ffc2346e7b78d1d78
> # Parent  abf029200e198878a4576a87e095bd8d77d9cea9
> localrepo: avoid unnecessary conversion from node to rev
> 
> changelog.heads() first calls headrevs then converts them to nodes.
> localrepo.heads() then sorts them using self.changelog.rev function and makes
> useless conversion back to revs. Instead let's call changelog.headrevs() from
> localrepo.heads(), sort the output and then convert to nodes. Because headrevs
> does not support start parameter this optimization only works if start is None.
> 
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1852,6 +1852,10 @@
>                                    listsubrepos)
>  
>      def heads(self, start=None):
> +        if start is None:
> +            headrevs = sorted(self.changelog.headrevs(), reverse=True)

headrevs() is already sorted in both C and Python implementations, so
"sorted(..., reverse=True)" could be replaced by "reversed(...)".

> +            return [self.changelog.node(rev) for rev in headrevs]
> +
>          heads = self.changelog.heads(start)
>          # sort the output in rev descending order
>          return sorted(heads, key=self.changelog.rev, reverse=True)
Yuya Nishihara - Feb. 3, 2017, 1:16 p.m.
On Thu, 2 Feb 2017 15:06:46 +0000, Jun Wu wrote:
> This patch looks good to me. See inline comment about how to make it faster.
> That could probably be fixed in flight.
> 
> Excerpts from Stanislau Hlebik's message of 2017-02-02 02:57:24 -0800:
> > # HG changeset patch
> > # User Stanislau Hlebik <stash@fb.com>
> > # Date 1486032998 28800
> > #      Thu Feb 02 02:56:38 2017 -0800
> > # Node ID 13a528b72173b1228f6eeb0ffc2346e7b78d1d78
> > # Parent  abf029200e198878a4576a87e095bd8d77d9cea9
> > localrepo: avoid unnecessary conversion from node to rev
> > 
> > changelog.heads() first calls headrevs then converts them to nodes.
> > localrepo.heads() then sorts them using self.changelog.rev function and makes
> > useless conversion back to revs. Instead let's call changelog.headrevs() from
> > localrepo.heads(), sort the output and then convert to nodes. Because headrevs
> > does not support start parameter this optimization only works if start is None.
> > 
> > diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> > --- a/mercurial/localrepo.py
> > +++ b/mercurial/localrepo.py
> > @@ -1852,6 +1852,10 @@
> >                                    listsubrepos)
> >  
> >      def heads(self, start=None):
> > +        if start is None:
> > +            headrevs = sorted(self.changelog.headrevs(), reverse=True)
> 
> headrevs() is already sorted in both C and Python implementations, so
> "sorted(..., reverse=True)" could be replaced by "reversed(...)".

Yeah.

> > +            return [self.changelog.node(rev) for rev in headrevs]

The function call "self.changelog" wasn't instant because of repoview, but I
don't remember if that's still true. Pierre-Yves?
Gregory Szorc - Feb. 11, 2017, 6:44 p.m.
> On Feb 3, 2017, at 05:16, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Thu, 2 Feb 2017 15:06:46 +0000, Jun Wu wrote:
>> This patch looks good to me. See inline comment about how to make it faster.
>> That could probably be fixed in flight.
>> 
>> Excerpts from Stanislau Hlebik's message of 2017-02-02 02:57:24 -0800:
>>> # HG changeset patch
>>> # User Stanislau Hlebik <stash@fb.com>
>>> # Date 1486032998 28800
>>> #      Thu Feb 02 02:56:38 2017 -0800
>>> # Node ID 13a528b72173b1228f6eeb0ffc2346e7b78d1d78
>>> # Parent  abf029200e198878a4576a87e095bd8d77d9cea9
>>> localrepo: avoid unnecessary conversion from node to rev
>>> 
>>> changelog.heads() first calls headrevs then converts them to nodes.
>>> localrepo.heads() then sorts them using self.changelog.rev function and makes
>>> useless conversion back to revs. Instead let's call changelog.headrevs() from
>>> localrepo.heads(), sort the output and then convert to nodes. Because headrevs
>>> does not support start parameter this optimization only works if start is None.
>>> 
>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>> --- a/mercurial/localrepo.py
>>> +++ b/mercurial/localrepo.py
>>> @@ -1852,6 +1852,10 @@
>>>                                   listsubrepos)
>>> 
>>>     def heads(self, start=None):
>>> +        if start is None:
>>> +            headrevs = sorted(self.changelog.headrevs(), reverse=True)
>> 
>> headrevs() is already sorted in both C and Python implementations, so
>> "sorted(..., reverse=True)" could be replaced by "reversed(...)".
> 
> Yeah.
> 
>>> +            return [self.changelog.node(rev) for rev in headrevs]
> 
> The function call "self.changelog" wasn't instant because of repoview, but I
> don't remember if that's still true. Pierre-Yves?

Pierre-Yves made it substantially faster a few releases ago. But there is still enough overhead for it to be problematic.

FWIW, just doing a plain attribute lookup in a simple loop like this can introduce measurable overhead. We go so far as to alias list.append in a local to avoid this in some cases!
Pierre-Yves David - March 7, 2017, 11:30 a.m.
On 02/11/2017 07:44 PM, Gregory Szorc wrote:
>
>
>> On Feb 3, 2017, at 05:16, Yuya Nishihara <yuya@tcha.org> wrote:
>>
>>> On Thu, 2 Feb 2017 15:06:46 +0000, Jun Wu wrote:
>>> This patch looks good to me. See inline comment about how to make it faster.
>>> That could probably be fixed in flight.
>>>
>>> Excerpts from Stanislau Hlebik's message of 2017-02-02 02:57:24 -0800:
>>>> # HG changeset patch
>>>> # User Stanislau Hlebik <stash@fb.com>
>>>> # Date 1486032998 28800
>>>> #      Thu Feb 02 02:56:38 2017 -0800
>>>> # Node ID 13a528b72173b1228f6eeb0ffc2346e7b78d1d78
>>>> # Parent  abf029200e198878a4576a87e095bd8d77d9cea9
>>>> localrepo: avoid unnecessary conversion from node to rev
>>>>
>>>> changelog.heads() first calls headrevs then converts them to nodes.
>>>> localrepo.heads() then sorts them using self.changelog.rev function and makes
>>>> useless conversion back to revs. Instead let's call changelog.headrevs() from
>>>> localrepo.heads(), sort the output and then convert to nodes. Because headrevs
>>>> does not support start parameter this optimization only works if start is None.
>>>>
>>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>>> --- a/mercurial/localrepo.py
>>>> +++ b/mercurial/localrepo.py
>>>> @@ -1852,6 +1852,10 @@
>>>>                                   listsubrepos)
>>>>
>>>>     def heads(self, start=None):
>>>> +        if start is None:
>>>> +            headrevs = sorted(self.changelog.headrevs(), reverse=True)
>>>
>>> headrevs() is already sorted in both C and Python implementations, so
>>> "sorted(..., reverse=True)" could be replaced by "reversed(...)".
>>
>> Yeah.
>>
>>>> +            return [self.changelog.node(rev) for rev in headrevs]
>>
>> The function call "self.changelog" wasn't instant because of repoview, but I
>> don't remember if that's still true. Pierre-Yves?
>
> Pierre-Yves made it substantially faster a few releases ago. But there is still enough overhead for it to be problematic.

I'm a bit late to the party, but I confirm that the overhead is 
significant enough that you do not want it to be in loops.

> FWIW, just doing a plain attribute lookup in a simple loop like this can introduce measurable overhead. We go so far as to alias list.append in a local to avoid this in some cases!

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1852,6 +1852,10 @@ 
                                   listsubrepos)
 
     def heads(self, start=None):
+        if start is None:
+            headrevs = sorted(self.changelog.headrevs(), reverse=True)
+            return [self.changelog.node(rev) for rev in headrevs]
+
         heads = self.changelog.heads(start)
         # sort the output in rev descending order
         return sorted(heads, key=self.changelog.rev, reverse=True)