Patchwork annotate: pre-calculate the "needed" dictionary

login
register
mail settings
Submitter Jun Wu
Date Sept. 2, 2016, 3:10 p.m.
Message ID <c9a9040c95add098eb8e.1472829030@x1c>
Download mbox | patch
Permalink /patch/16533/
State Changes Requested
Headers show

Comments

Jun Wu - Sept. 2, 2016, 3:10 p.m.
# HG changeset patch
# User Jun Wu <quark@fb.com>
# Date 1472826059 -3600
#      Fri Sep 02 15:20:59 2016 +0100
# Node ID c9a9040c95add098eb8e280fa60edaa766154141
# Parent  d130a38ef41f3c9e2d2f26df3535d89a93f87301
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r c9a9040c95ad
annotate: pre-calculate the "needed" dictionary

The "needed" dict is used as a reference counter to free items in the giant
"hist" dict. However, currently it is not very accurate and can lead to
dropping "hist" items unnecessarily, for example, with the following DAG,

      -3-
     /   \
  --1--2--4--

The current algorithm will visit and calculate rev 1 twice, undesired.

However, simply removing "needed" is not okay, because it will consume 10x
memory:

  # without any change
  % HGRCPATH= lrun ./hg annotate mercurial/commands.py -r d130a38 3>&2 [1]
  MEMORY   49074176
  CPUTIME  9.213
  REALTIME 9.270

  # with "needed" removed
  MEMORY   637673472
  CPUTIME  8.164
  REALTIME 8.249

This patch moves "needed" (and "pcache") calculation to a separate DFS to
address the issue. It improves perf (reuses hist) while maintaining low
memory usage:

  # with this patch applied
  MEMORY   48164864
  CPUTIME  7.871
  REALTIME 7.956

[1]: lrun is a lightweight sandbox built on Linux cgroup and namespace. It's
     used to measure CPU and memory usage here. Source code is available at
     github.com/quark-zju/lrun.
Yuya Nishihara - Sept. 4, 2016, 3:21 p.m.
On Fri, 2 Sep 2016 16:10:30 +0100, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark@fb.com>
> # Date 1472826059 -3600
> #      Fri Sep 02 15:20:59 2016 +0100
> # Node ID c9a9040c95add098eb8e280fa60edaa766154141
> # Parent  d130a38ef41f3c9e2d2f26df3535d89a93f87301
> # Available At https://bitbucket.org/quark-zju/hg-draft
> #              hg pull https://bitbucket.org/quark-zju/hg-draft -r c9a9040c95ad
> annotate: pre-calculate the "needed" dictionary
> 
> The "needed" dict is used as a reference counter to free items in the giant
> "hist" dict. However, currently it is not very accurate and can lead to
> dropping "hist" items unnecessarily, for example, with the following DAG,
> 
>       -3-
>      /   \
>   --1--2--4--
> 
> The current algorithm will visit and calculate rev 1 twice, undesired.

Also that would generate wrong result probably because hist[f] was deleted too
early and pcache[f] was emptied. Compare "hg annotate mercurial/commands.py"
for example.

Can you add a test case for that? This patch appears to fix old bug, so I think
it's good for stable.

> +        # 1st DFS pre-calculates pcache and needed
>          visit = [base]
> -        hist = {}
>          pcache = {}
>          needed = {base: 1}
>          while visit:
>              f = visit[-1]
> -            pcached = f in pcache
> -            if not pcached:
> -                pcache[f] = parents(f)
> +            visit.pop()

Nit: could be f = visit.pop()

> +            pl = parents(f)
> +            pcache[f] = pl
> +            for p in pl:
> +                if p not in needed:
> +                    needed[p] = 1
> +                else:
> +                    needed[p] += 1
> +                if p not in pcache:
> +                    visit.append(p)

There can be more than one route to 'p' from 'visit[]', so 'p not in pcache'
isn't enough to avoid unneeded calculation of 'parents(f)', and needed[p] can
be over-counted.
Jun Wu - Sept. 4, 2016, 8:50 p.m.
Good catch about both issues! I will prepare a V2.

Excerpts from Yuya Nishihara's message of 2016-09-05 00:21:45 +0900:
> On Fri, 2 Sep 2016 16:10:30 +0100, Jun Wu wrote:
> > # HG changeset patch
> > # User Jun Wu <quark@fb.com>
> > # Date 1472826059 -3600
> > #      Fri Sep 02 15:20:59 2016 +0100
> > # Node ID c9a9040c95add098eb8e280fa60edaa766154141
> > # Parent  d130a38ef41f3c9e2d2f26df3535d89a93f87301
> > # Available At https://bitbucket.org/quark-zju/hg-draft 
> > #              hg pull https://bitbucket.org/quark-zju/hg-draft  -r c9a9040c95ad
> > annotate: pre-calculate the "needed" dictionary
> > 
> > The "needed" dict is used as a reference counter to free items in the giant
> > "hist" dict. However, currently it is not very accurate and can lead to
> > dropping "hist" items unnecessarily, for example, with the following DAG,
> > 
> >       -3-
> >      /   \
> >   --1--2--4--
> > 
> > The current algorithm will visit and calculate rev 1 twice, undesired.
> 
> Also that would generate wrong result probably because hist[f] was deleted too
> early and pcache[f] was emptied. Compare "hg annotate mercurial/commands.py"
> for example.
> 
> Can you add a test case for that? This patch appears to fix old bug, so I think
> it's good for stable.
> 
> > +        # 1st DFS pre-calculates pcache and needed
> >          visit = [base]
> > -        hist = {}
> >          pcache = {}
> >          needed = {base: 1}
> >          while visit:
> >              f = visit[-1]
> > -            pcached = f in pcache
> > -            if not pcached:
> > -                pcache[f] = parents(f)
> > +            visit.pop()
> 
> Nit: could be f = visit.pop()
> 
> > +            pl = parents(f)
> > +            pcache[f] = pl
> > +            for p in pl:
> > +                if p not in needed:
> > +                    needed[p] = 1
> > +                else:
> > +                    needed[p] += 1
> > +                if p not in pcache:
> > +                    visit.append(p)
> 
> There can be more than one route to 'p' from 'visit[]', so 'p not in pcache'
> isn't enough to avoid unneeded calculation of 'parents(f)', and needed[p] can
> be over-counted.

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -991,13 +991,26 @@  class basefilectx(object):
         # depth-first search.
 
+        # 1st DFS pre-calculates pcache and needed
         visit = [base]
-        hist = {}
         pcache = {}
         needed = {base: 1}
         while visit:
             f = visit[-1]
-            pcached = f in pcache
-            if not pcached:
-                pcache[f] = parents(f)
+            visit.pop()
+            pl = parents(f)
+            pcache[f] = pl
+            for p in pl:
+                if p not in needed:
+                    needed[p] = 1
+                else:
+                    needed[p] += 1
+                if p not in pcache:
+                    visit.append(p)
+
+        # 2nd DFS does the actual annotate
+        visit[:] = [base]
+        hist = {}
+        while visit:
+            f = visit[-1]
 
             ready = True
@@ -1007,6 +1020,4 @@  class basefilectx(object):
                     ready = False
                     visit.append(p)
-                if not pcached:
-                    needed[p] = needed.get(p, 0) + 1
             if ready:
                 visit.pop()