Patchwork [STABLE,V2] hgweb: cache fctx.parents() in annotate command (issue5414)

login
register
mail settings
Submitter Gregory Szorc
Date Nov. 5, 2016, 4:41 p.m.
Message ID <1258a74819e5c2b4904d.1478364074@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/17350/
State Accepted
Headers show

Comments

Gregory Szorc - Nov. 5, 2016, 4:41 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1478363887 25200
#      Sat Nov 05 09:38:07 2016 -0700
# Branch stable
# Node ID 1258a74819e5c2b4904d11c71231e3b3f33a0aa1
# Parent  e5cc44ea12de681d971fcbebb65a7fb71fd1c3c7
hgweb: cache fctx.parents() in annotate command (issue5414)

9c37df347485 introduced a call to fctx.parents() for each line in
annotate output. This function call isn't cheap, as it requires
linkrev adjustment.

Since multiple lines in annotate output tend to belong to the same
file revision, a cache of fctx.parents() lookups for each input
should be effective in the common case. So we implement one.

Since the cache has to precompute parents so an aborted generator
doesn't leave an incomplete cache, we could just return a list.
However, we preserve the generator for backwards compatibility.

The effect of this change when requesting /annotate/96ca0ecdcfa/
browser/locales/en-US/chrome/browser/downloads/downloads.dtd on
the mozilla-aurora repo is significant:

p1(9c37df347485)  5.5s
9c37df347485:    66.3s
this patch:      10.8s

We're still slower than before. But only by ~2x instead of ~12x.

On the tip revisions of layout/base/nsCSSFrameConstructor.cpp file in
the mozilla-unified repo, time went from 12.5s to 14.5s and back to
12.5s. I'm not sure why the mozilla-aurora repo is so slow.

Looking at the code of basefilectx.parents(), there is room for
further improvements. Notably, we still perform redundant calls to
filelog.renamed() and basefilectx._parentfilectx(). And
basefilectx.annotate() also makes similar calls, so there is potential
for object reuse. However, introducing caches here are not appropriate
for the stable branch.
Yuya Nishihara - Nov. 6, 2016, 2:31 a.m.
On Sat, 05 Nov 2016 09:41:14 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1478363887 25200
> #      Sat Nov 05 09:38:07 2016 -0700
> # Branch stable
> # Node ID 1258a74819e5c2b4904d11c71231e3b3f33a0aa1
> # Parent  e5cc44ea12de681d971fcbebb65a7fb71fd1c3c7
> hgweb: cache fctx.parents() in annotate command (issue5414)

Queued this, thanks.

> Looking at the code of basefilectx.parents(), there is room for
> further improvements. Notably, we still perform redundant calls to
> filelog.renamed() and basefilectx._parentfilectx(). And
> basefilectx.annotate() also makes similar calls, so there is potential
> for object reuse. However, introducing caches here are not appropriate
> for the stable branch.

Perhaps fctx.parents() can be property-cached, but we'll need to drop
uninteresting chains of parents in fctx.annotate().
Jun Wu - Nov. 6, 2016, 5:01 p.m.
Excerpts from Yuya Nishihara's message of 2016-11-06 11:31:04 +0900:
> Perhaps fctx.parents() can be property-cached, but we'll need to drop
> uninteresting chains of parents in fctx.annotate().

If we go the property-cache approach, I think it's better to cache
"_adjustedlinkrev". It's at a lower level and covers both "parents"
and "introrev". Caching "parents" may increase memory usage unintentionally.

I don't fully get what "uninteresting chains of parents" means here.
In the annotate case, let's say f1, f2 = f0.parents().
Both f1 and f2 have _descendantrev set to f0's adjusted linkrev.

Suppose there is a global cache dict: {(path, filenode, srcrev): linkrev}, I
think if srcrev=_descendantrev (it's true for f1, f2) and _descendantrev is
adjusted from the direct child (f0), then it is "interesting" and can be
cached. This is similar to what marmoute said during the sprint - for the
log -f or annotate case, once the first fctx's introrev is known, the cache
can be used to calculate the ancestors' adjusted linkrevs.
Yuya Nishihara - Nov. 7, 2016, 12:42 p.m.
On Sun, 6 Nov 2016 17:01:05 +0000, Jun Wu wrote:
> Excerpts from Yuya Nishihara's message of 2016-11-06 11:31:04 +0900:
> > Perhaps fctx.parents() can be property-cached, but we'll need to drop
> > uninteresting chains of parents in fctx.annotate().
> 
> If we go the property-cache approach, I think it's better to cache
> "_adjustedlinkrev". It's at a lower level and covers both "parents"
> and "introrev". Caching "parents" may increase memory usage unintentionally.
> 
> I don't fully get what "uninteresting chains of parents" means here.
> In the annotate case, let's say f1, f2 = f0.parents().
> Both f1 and f2 have _descendantrev set to f0's adjusted linkrev.

As you said, what's in my mind was the memory usage. Caching fctx.parents()
would mean annotate() builds a full link from self to root nodes. Some of
these intermediate nodes aren't useful for hgweb.

> Suppose there is a global cache dict: {(path, filenode, srcrev): linkrev}, I
> think if srcrev=_descendantrev (it's true for f1, f2) and _descendantrev is
> adjusted from the direct child (f0), then it is "interesting" and can be
> cached. This is similar to what marmoute said during the sprint - for the
> log -f or annotate case, once the first fctx's introrev is known, the cache
> can be used to calculate the ancestors' adjusted linkrevs.

Given we have ugly hacks to pass ancestry data around fctx objects, a global
cache might be useful.
Gregory Szorc - Nov. 7, 2016, 5:29 p.m.
On Mon, Nov 7, 2016 at 4:42 AM, Yuya Nishihara <yuya@tcha.org> wrote:

> On Sun, 6 Nov 2016 17:01:05 +0000, Jun Wu wrote:
> > Excerpts from Yuya Nishihara's message of 2016-11-06 11:31:04 +0900:
> > > Perhaps fctx.parents() can be property-cached, but we'll need to drop
> > > uninteresting chains of parents in fctx.annotate().
> >
> > If we go the property-cache approach, I think it's better to cache
> > "_adjustedlinkrev". It's at a lower level and covers both "parents"
> > and "introrev". Caching "parents" may increase memory usage
> unintentionally.
> >
> > I don't fully get what "uninteresting chains of parents" means here.
> > In the annotate case, let's say f1, f2 = f0.parents().
> > Both f1 and f2 have _descendantrev set to f0's adjusted linkrev.
>
> As you said, what's in my mind was the memory usage. Caching fctx.parents()
> would mean annotate() builds a full link from self to root nodes. Some of
> these intermediate nodes aren't useful for hgweb.
>
> > Suppose there is a global cache dict: {(path, filenode, srcrev):
> linkrev}, I
> > think if srcrev=_descendantrev (it's true for f1, f2) and _descendantrev
> is
> > adjusted from the direct child (f0), then it is "interesting" and can be
> > cached. This is similar to what marmoute said during the sprint - for the
> > log -f or annotate case, once the first fctx's introrev is known, the
> cache
> > can be used to calculate the ancestors' adjusted linkrevs.
>
> Given we have ugly hacks to pass ancestry data around fctx objects, a
> global
> cache might be useful.
>

Could we change basefilectx.annotate() to return a rich data structure
instead of a list of tuples? That data structure could have the cached
parents and other reusable cache data (which could be passed into
subsequent calls if needed).
Jun Wu - Nov. 7, 2016, 5:32 p.m.
Excerpts from Gregory Szorc's message of 2016-11-07 09:29:40 -0800:
> Could we change basefilectx.annotate() to return a rich data structure
> instead of a list of tuples? That data structure could have the cached
> parents and other reusable cache data (which could be passed into
> subsequent calls if needed).

It's already returning "fctx", which is rich...
Yuya Nishihara - Nov. 8, 2016, 2:21 p.m.
On Mon, 7 Nov 2016 17:32:40 +0000, Jun Wu wrote:
> Excerpts from Gregory Szorc's message of 2016-11-07 09:29:40 -0800:
> > Could we change basefilectx.annotate() to return a rich data structure
> > instead of a list of tuples? That data structure could have the cached
> > parents and other reusable cache data (which could be passed into
> > subsequent calls if needed).
> 
> It's already returning "fctx", which is rich...

True. To be clear, I don't like the idea to make the API more complicated
just because our rich data structure "fctx" has a poor performance.

Patch

diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py
--- a/mercurial/hgweb/webcommands.py
+++ b/mercurial/hgweb/webcommands.py
@@ -861,12 +861,24 @@  def annotate(web, req, tmpl):
     f = fctx.path()
     parity = paritygen(web.stripecount)
 
+    # parents() is called once per line and several lines likely belong to
+    # same revision. So it is worth caching.
+    # TODO there are still redundant operations within basefilectx.parents()
+    # and from the fctx.annotate() call itself that could be cached.
+    parentscache = {}
     def parents(f):
-        for p in f.parents():
-            yield {
-                "node": p.hex(),
-                "rev": p.rev(),
-            }
+        rev = f.rev()
+        if rev not in parentscache:
+            parentscache[rev] = []
+            for p in f.parents():
+                entry = {
+                    'node': p.hex(),
+                    'rev': p.rev(),
+                }
+                parentscache[rev].append(entry)
+
+        for p in parentscache[rev]:
+            yield p
 
     def annotate(**map):
         if util.binary(fctx.data()):