Patchwork hgweb: build the "entries" list directly in filelog command

login
register
mail settings
Submitter Denis Laxalde
Date Jan. 13, 2017, 3:31 p.m.
Message ID <ad41cc2eb799156b0de2.1484321478@sh77.tls.logilab.fr>
Download mbox | patch
Permalink /patch/18208/
State Accepted
Headers show

Comments

Denis Laxalde - Jan. 13, 2017, 3:31 p.m.
# HG changeset patch
# User Denis Laxalde <denis.laxalde@logilab.fr>
# Date 1484299345 -3600
#      Fri Jan 13 10:22:25 2017 +0100
# Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
# Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
hgweb: build the "entries" list directly in filelog command

There's no apparent reason to have this "entries" generator function that
builds a list and then yields its elements in reverse order and which is only
called to build the "entries" list. So just build the list directly, in
reverse order.

Adjust "parity" generator's offset to keep rendering the same.
Pierre-Yves David - Jan. 13, 2017, 4:08 p.m.
On 01/13/2017 04:31 PM, Denis Laxalde wrote:
> # HG changeset patch
> # User Denis Laxalde <denis.laxalde@logilab.fr>
> # Date 1484299345 -3600
> #      Fri Jan 13 10:22:25 2017 +0100
> # Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
> # Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
> hgweb: build the "entries" list directly in filelog command
>
> There's no apparent reason to have this "entries" generator function that
> builds a list and then yields its elements in reverse order and which is only
> called to build the "entries" list. So just build the list directly, in
> reverse order.

revlog are usually read from 0 to max for performance reason (data in 
N+1 might rely on data on N and last access might be cached).
This might explain why the list was build in one direction and then 
reversed.
Can you double check if such mechanism is in play in this case?

(I fully support the removal of the closure)

> Adjust "parity" generator's offset to keep rendering the same.
>
> diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py
> --- a/mercurial/hgweb/webcommands.py
> +++ b/mercurial/hgweb/webcommands.py
> @@ -974,26 +974,20 @@ def filelog(web, req, tmpl):
>      count = fctx.filerev() + 1
>      start = max(0, fctx.filerev() - revcount + 1) # first rev on this page
>      end = min(count, start + revcount) # last rev on this page
> -    parity = paritygen(web.stripecount, offset=start - end)
> -
> -    def entries():
> -        l = []
> -
> -        repo = web.repo
> -        revs = fctx.filelog().revs(start, end - 1)
> -        for i in revs:
> -            iterfctx = fctx.filectx(i)
> +    parity = paritygen(web.stripecount, offset=start - end + 1)
>
> -            l.append(dict(
> -                parity=next(parity),
> -                filerev=i,
> -                file=f,
> -                rename=webutil.renamelink(iterfctx),
> -                **webutil.commonentry(repo, iterfctx)))
> -        for e in reversed(l):
> -            yield e
> +    repo = web.repo
> +    revs = fctx.filelog().revs(start, end - 1)
> +    entries = []
> +    for i in reversed(revs):
> +        iterfctx = fctx.filectx(i)
> +        entries.append(dict(
> +            parity=next(parity),
> +            filerev=i,
> +            file=f,
> +            rename=webutil.renamelink(iterfctx),
> +            **webutil.commonentry(repo, iterfctx)))
>
> -    entries = list(entries())
>      latestentry = entries[:1]
>
>      revnav = webutil.filerevnav(web.repo, fctx.path())
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Gregory Szorc - Jan. 14, 2017, 5:14 a.m.
On Fri, Jan 13, 2017 at 8:08 AM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 01/13/2017 04:31 PM, Denis Laxalde wrote:
>
>> # HG changeset patch
>> # User Denis Laxalde <denis.laxalde@logilab.fr>
>> # Date 1484299345 -3600
>> #      Fri Jan 13 10:22:25 2017 +0100
>> # Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
>> # Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
>> hgweb: build the "entries" list directly in filelog command
>>
>> There's no apparent reason to have this "entries" generator function that
>> builds a list and then yields its elements in reverse order and which is
>> only
>> called to build the "entries" list. So just build the list directly, in
>> reverse order.
>>
>
> revlog are usually read from 0 to max for performance reason (data in N+1
> might rely on data on N and last access might be cached).
> This might explain why the list was build in one direction and then
> reversed.
> Can you double check if such mechanism is in play in this case?
>
>
Your observation about desiring to iterate revlogs from 0 to max is spot on
and this patch would normally be wrong. However, in this code `start > end`
(as can be seen in the context just above the first changed line), so we're
already iterating in the suboptimal direction. This patch actually fixes
that by throwing in a reversed() of a descending fctx.filelog().revs()!

(This confused me on first read too.)



> (I fully support the removal of the closure)
>
>
> Adjust "parity" generator's offset to keep rendering the same.
>>
>> diff --git a/mercurial/hgweb/webcommands.py
>> b/mercurial/hgweb/webcommands.py
>> --- a/mercurial/hgweb/webcommands.py
>> +++ b/mercurial/hgweb/webcommands.py
>> @@ -974,26 +974,20 @@ def filelog(web, req, tmpl):
>>      count = fctx.filerev() + 1
>>      start = max(0, fctx.filerev() - revcount + 1) # first rev on this
>> page
>>      end = min(count, start + revcount) # last rev on this page
>> -    parity = paritygen(web.stripecount, offset=start - end)
>> -
>> -    def entries():
>> -        l = []
>> -
>> -        repo = web.repo
>> -        revs = fctx.filelog().revs(start, end - 1)
>> -        for i in revs:
>> -            iterfctx = fctx.filectx(i)
>> +    parity = paritygen(web.stripecount, offset=start - end + 1)
>>
>> -            l.append(dict(
>> -                parity=next(parity),
>> -                filerev=i,
>> -                file=f,
>> -                rename=webutil.renamelink(iterfctx),
>> -                **webutil.commonentry(repo, iterfctx)))
>> -        for e in reversed(l):
>> -            yield e
>> +    repo = web.repo
>> +    revs = fctx.filelog().revs(start, end - 1)
>> +    entries = []
>> +    for i in reversed(revs):
>> +        iterfctx = fctx.filectx(i)
>> +        entries.append(dict(
>> +            parity=next(parity),
>> +            filerev=i,
>> +            file=f,
>> +            rename=webutil.renamelink(iterfctx),
>> +            **webutil.commonentry(repo, iterfctx)))
>>
>> -    entries = list(entries())
>>      latestentry = entries[:1]
>>
>>      revnav = webutil.filerevnav(web.repo, fctx.path())
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>>
>>
> --
> Pierre-Yves David
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Pierre-Yves David - Jan. 15, 2017, 7:52 a.m.
On 01/14/2017 06:14 AM, Gregory Szorc wrote:
> On Fri, Jan 13, 2017 at 8:08 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
> wrote:
>
>
>
>     On 01/13/2017 04:31 PM, Denis Laxalde wrote:
>
>         # HG changeset patch
>         # User Denis Laxalde <denis.laxalde@logilab.fr
>         <mailto:denis.laxalde@logilab.fr>>
>         # Date 1484299345 -3600
>         #      Fri Jan 13 10:22:25 2017 +0100
>         # Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
>         # Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
>         hgweb: build the "entries" list directly in filelog command
>
>         There's no apparent reason to have this "entries" generator
>         function that
>         builds a list and then yields its elements in reverse order and
>         which is only
>         called to build the "entries" list. So just build the list
>         directly, in
>         reverse order.
>
>
>     revlog are usually read from 0 to max for performance reason (data
>     in N+1 might rely on data on N and last access might be cached).
>     This might explain why the list was build in one direction and then
>     reversed.
>     Can you double check if such mechanism is in play in this case?
>
>
> Your observation about desiring to iterate revlogs from 0 to max is spot
> on and this patch would normally be wrong. However, in this code `start
>> end` (as can be seen in the context just above the first changed
> line), so we're already iterating in the suboptimal direction. This
> patch actually fixes that by throwing in a reversed() of a descending
> fctx.filelog().revs()!
>
> (This confused me on first read too.)

oh, right. Thanks for pointing that out.

I've pushed that patch Thanks!

Cheers,
via Mercurial-devel - Jan. 15, 2017, 6:09 p.m.
On Sat, Jan 14, 2017, 23:52 Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 01/14/2017 06:14 AM, Gregory Szorc wrote:
> > On Fri, Jan 13, 2017 at 8:08 AM, Pierre-Yves David
> > <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
> > wrote:
> >
> >
> >
> >     On 01/13/2017 04:31 PM, Denis Laxalde wrote:
> >
> >         # HG changeset patch
> >         # User Denis Laxalde <denis.laxalde@logilab.fr
> >         <mailto:denis.laxalde@logilab.fr>>
> >         # Date 1484299345 -3600
> >         #      Fri Jan 13 10:22:25 2017 +0100
> >         # Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
> >         # Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
> >         hgweb: build the "entries" list directly in filelog command
> >
> >         There's no apparent reason to have this "entries" generator
> >         function that
> >         builds a list and then yields its elements in reverse order and
> >         which is only
> >         called to build the "entries" list. So just build the list
> >         directly, in
> >         reverse order.
> >
> >
> >     revlog are usually read from 0 to max for performance reason (data
> >     in N+1 might rely on data on N and last access might be cached).
> >     This might explain why the list was build in one direction and then
> >     reversed.
> >     Can you double check if such mechanism is in play in this case?
> >
> >
> > Your observation about desiring to iterate revlogs from 0 to max is spot
> > on and this patch would normally be wrong. However, in this code `start
> >> end` (as can be seen in the context just above the first changed
> > line), so we're already iterating in the suboptimal direction. This
> > patch actually fixes that by throwing in a reversed() of a descending
> > fctx.filelog().revs()!
>

These are the lines from the context:

count = fctx.filerev() + 1
start = max(0, fctx.filerev() - revcount + 1) # first rev on this page
end = min(count, start + revcount) # last rev on this page

Let's say count=100, revcount=10. Then start=90 and end=100. Since you two
(Greg and Pierre-Yves) agree that start > end, I suspect I'm just missing
something, but I couldn't figure out what.

(Btw, "start = max(0, count - revcount)" would be slightly simpler.)


>
> > (This confused me on first read too.)
>
> oh, right. Thanks for pointing that out.
>
> I've pushed that patch Thanks!
>
> Cheers,
>
> --
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Denis Laxalde - Jan. 16, 2017, 10:59 a.m.
Martin von Zweigbergk via Mercurial-devel a écrit :
>
>
> On Sat, Jan 14, 2017, 23:52 Pierre-Yves David
> <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
> wrote:
>
>
>
>     On 01/14/2017 06:14 AM, Gregory Szorc wrote:
>     > On Fri, Jan 13, 2017 at 8:08 AM, Pierre-Yves David
>     > <pierre-yves.david@ens-lyon.org
>     <mailto:pierre-yves.david@ens-lyon.org>
>     <mailto:pierre-yves.david@ens-lyon.org
>     <mailto:pierre-yves.david@ens-lyon.org>>>
>     > wrote:
>     >
>     >
>     >
>     >     On 01/13/2017 04:31 PM, Denis Laxalde wrote:
>     >
>     >         # HG changeset patch
>     >         # User Denis Laxalde <denis.laxalde@logilab.fr
>     <mailto:denis.laxalde@logilab.fr>
>     >         <mailto:denis.laxalde@logilab.fr
>     <mailto:denis.laxalde@logilab.fr>>>
>     >         # Date 1484299345 -3600
>     >         #      Fri Jan 13 10:22:25 2017 +0100
>     >         # Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
>     >         # Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
>     >         hgweb: build the "entries" list directly in filelog command
>     >
>     >         There's no apparent reason to have this "entries" generator
>     >         function that
>     >         builds a list and then yields its elements in reverse
>     order and
>     >         which is only
>     >         called to build the "entries" list. So just build the list
>     >         directly, in
>     >         reverse order.
>     >
>     >
>     >     revlog are usually read from 0 to max for performance reason (data
>     >     in N+1 might rely on data on N and last access might be cached).
>     >     This might explain why the list was build in one direction and
>     then
>     >     reversed.
>     >     Can you double check if such mechanism is in play in this case?
>     >
>     >
>     > Your observation about desiring to iterate revlogs from 0 to max
>     is spot
>     > on and this patch would normally be wrong. However, in this code
>     `start
>     >> end` (as can be seen in the context just above the first changed
>     > line), so we're already iterating in the suboptimal direction. This
>     > patch actually fixes that by throwing in a reversed() of a descending
>     > fctx.filelog().revs()!
>
>
> These are the lines from the context:
>
> count = fctx.filerev() + 1
> start = max(0, fctx.filerev() - revcount + 1) # first rev on this page
> end = min(count, start + revcount) # last rev on this page
>
> Let's say count=100, revcount=10. Then start=90 and end=100. Since you
> two (Greg and Pierre-Yves) agree that start > end, I suspect I'm just
> missing something, but I couldn't figure out what.


That's also my impression. Previously revisions were iterated over in
ascending order, now they are in descending order.

If this causes performance regressions (I could not observe any), I can
send a patch restoring iteration in ascending order.
Pierre-Yves David - Jan. 16, 2017, 5:22 p.m.
On 01/16/2017 11:59 AM, Denis Laxalde wrote:
> Martin von Zweigbergk via Mercurial-devel a écrit :
>>
>>
>> On Sat, Jan 14, 2017, 23:52 Pierre-Yves David
>> <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
>> wrote:
>>
>>
>>
>>     On 01/14/2017 06:14 AM, Gregory Szorc wrote:
>>     > On Fri, Jan 13, 2017 at 8:08 AM, Pierre-Yves David
>>     > <pierre-yves.david@ens-lyon.org
>>     <mailto:pierre-yves.david@ens-lyon.org>
>>     <mailto:pierre-yves.david@ens-lyon.org
>>     <mailto:pierre-yves.david@ens-lyon.org>>>
>>     > wrote:
>>     >
>>     >
>>     >
>>     >     On 01/13/2017 04:31 PM, Denis Laxalde wrote:
>>     >
>>     >         # HG changeset patch
>>     >         # User Denis Laxalde <denis.laxalde@logilab.fr
>>     <mailto:denis.laxalde@logilab.fr>
>>     >         <mailto:denis.laxalde@logilab.fr
>>     <mailto:denis.laxalde@logilab.fr>>>
>>     >         # Date 1484299345 -3600
>>     >         #      Fri Jan 13 10:22:25 2017 +0100
>>     >         # Node ID ad41cc2eb799156b0de2dd71adeed88ec42a98d9
>>     >         # Parent  e882c7bb5a0ba2589a44108c9a87b300a13e08df
>>     >         hgweb: build the "entries" list directly in filelog command
>>     >
>>     >         There's no apparent reason to have this "entries" generator
>>     >         function that
>>     >         builds a list and then yields its elements in reverse
>>     order and
>>     >         which is only
>>     >         called to build the "entries" list. So just build the list
>>     >         directly, in
>>     >         reverse order.
>>     >
>>     >
>>     >     revlog are usually read from 0 to max for performance reason
>> (data
>>     >     in N+1 might rely on data on N and last access might be
>> cached).
>>     >     This might explain why the list was build in one direction and
>>     then
>>     >     reversed.
>>     >     Can you double check if such mechanism is in play in this case?
>>     >
>>     >
>>     > Your observation about desiring to iterate revlogs from 0 to max
>>     is spot
>>     > on and this patch would normally be wrong. However, in this code
>>     `start
>>     >> end` (as can be seen in the context just above the first changed
>>     > line), so we're already iterating in the suboptimal direction. This
>>     > patch actually fixes that by throwing in a reversed() of a
>> descending
>>     > fctx.filelog().revs()!
>>
>>
>> These are the lines from the context:
>>
>> count = fctx.filerev() + 1
>> start = max(0, fctx.filerev() - revcount + 1) # first rev on this page
>> end = min(count, start + revcount) # last rev on this page
>>
>> Let's say count=100, revcount=10. Then start=90 and end=100. Since you
>> two (Greg and Pierre-Yves) agree that start > end, I suspect I'm just
>> missing something, but I couldn't figure out what.
>
>
> That's also my impression. Previously revisions were iterated over in
> ascending order, now they are in descending order.

I think Martin and Denis are right. I just got fooled by the start = 
max(…); stop = min(…) (that is embarrassing). Thanks for spotting that.

> If this causes performance regressions (I could not observe any), I can
> send a patch restoring iteration in ascending order.

Can you provide use with some detail about the performance testing you 
did? If possible with associated number?

In all cases it is probably worthwhile to keep ascending order here as I 
don't see any reason it would hurt.

Cheers,

Patch

diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py
--- a/mercurial/hgweb/webcommands.py
+++ b/mercurial/hgweb/webcommands.py
@@ -974,26 +974,20 @@  def filelog(web, req, tmpl):
     count = fctx.filerev() + 1
     start = max(0, fctx.filerev() - revcount + 1) # first rev on this page
     end = min(count, start + revcount) # last rev on this page
-    parity = paritygen(web.stripecount, offset=start - end)
-
-    def entries():
-        l = []
-
-        repo = web.repo
-        revs = fctx.filelog().revs(start, end - 1)
-        for i in revs:
-            iterfctx = fctx.filectx(i)
+    parity = paritygen(web.stripecount, offset=start - end + 1)
 
-            l.append(dict(
-                parity=next(parity),
-                filerev=i,
-                file=f,
-                rename=webutil.renamelink(iterfctx),
-                **webutil.commonentry(repo, iterfctx)))
-        for e in reversed(l):
-            yield e
+    repo = web.repo
+    revs = fctx.filelog().revs(start, end - 1)
+    entries = []
+    for i in reversed(revs):
+        iterfctx = fctx.filectx(i)
+        entries.append(dict(
+            parity=next(parity),
+            filerev=i,
+            file=f,
+            rename=webutil.renamelink(iterfctx),
+            **webutil.commonentry(repo, iterfctx)))
 
-    entries = list(entries())
     latestentry = entries[:1]
 
     revnav = webutil.filerevnav(web.repo, fctx.path())