Patchwork [1,of,5] revlog: return lazy set from findcommonmissing

login
register
mail settings
Submitter Durham Goode
Date Nov. 12, 2013, 4:14 a.m.
Message ID <41801f3f1968e702e23d.1384229663@dev350.prn1.facebook.com>
Download mbox | patch
Permalink /patch/2918/
State Superseded
Commit eeba4eaf0716842f48beec32148b60ed17885500
Headers show

Comments

Durham Goode - Nov. 12, 2013, 4:14 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1384216802 28800
#      Mon Nov 11 16:40:02 2013 -0800
# Node ID 41801f3f1968e702e23d1f91bbe83e1f2b9ca133
# Parent  aa80446aacc3b1574211649cd8f190250b6b04b3
revlog: return lazy set from findcommonmissing

When computing the commonmissing, it greedily computes the entire set
immediately. On a large repo where the majority of history is irrelevant, this
causes a significant slow down.

Replacing it with a lazy set makes amend go from 11 seconds to 8.7 seconds.
Augie Fackler - Nov. 12, 2013, 7:26 p.m.
On Mon, Nov 11, 2013 at 08:14:23PM -0800, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1384216802 28800
> #      Mon Nov 11 16:40:02 2013 -0800
> # Node ID 41801f3f1968e702e23d1f91bbe83e1f2b9ca133
> # Parent  aa80446aacc3b1574211649cd8f190250b6b04b3
> revlog: return lazy set from findcommonmissing
>
> When computing the commonmissing, it greedily computes the entire set
> immediately. On a large repo where the majority of history is irrelevant, this
> causes a significant slow down.
>
> Replacing it with a lazy set makes amend go from 11 seconds to 8.7 seconds.
>
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -401,7 +401,29 @@
>          heads = [self.rev(n) for n in heads]
>
>          # we want the ancestors, but inclusive
> -        has = set(self.ancestors(common))
> +        class lazyset(object):
> +            def __init__(self, lazyset):
> +                self.set = set()
> +                self.lazyset = lazyset

these names are at best a sin against scoping and readability. Can you
name this parameter and attribute more descriptively?

> +
> +            def __contains__(self, value):
> +                return value in self.set or value in self.lazyset
> +
> +            def __iter__(self):
> +                set = self.set
> +                for r in set:
> +                    yield r
> +                for r in self.lazyset:
> +                    if not r in set:
> +                        yield r
> +
> +            def add(self, value):
> +                self.set.add(value)
> +
> +            def update(self, values):
> +                self.set.update(values)
> +
> +        has = lazyset(self.ancestors(common))
>          has.add(nullrev)
>          has.update(common)
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -401,7 +401,29 @@ 
         heads = [self.rev(n) for n in heads]
 
         # we want the ancestors, but inclusive
-        has = set(self.ancestors(common))
+        class lazyset(object):
+            def __init__(self, lazyset):
+                self.set = set()
+                self.lazyset = lazyset
+
+            def __contains__(self, value):
+                return value in self.set or value in self.lazyset
+
+            def __iter__(self):
+                set = self.set
+                for r in set:
+                    yield r
+                for r in self.lazyset:
+                    if not r in set:
+                        yield r
+
+            def add(self, value):
+                self.set.add(value)
+
+            def update(self, values):
+                self.set.update(values)
+
+        has = lazyset(self.ancestors(common))
         has.add(nullrev)
         has.update(common)