Patchwork [1,of,3,V3] revlog: return lazy set from findcommonmissing

Submitter Durham Goode
Date Nov. 16, 2013, 8:29 p.m.
Message ID <e243df7f91498e4423c5.1384633784@dev350.prn1.facebook.com>
Permalink /patch/2992/
State Accepted
Commit eeba4eaf0716842f48beec32148b60ed17885500

Comments

Durham Goode - Nov. 16, 2013, 8:29 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1384216802 28800
#      Mon Nov 11 16:40:02 2013 -0800
# Node ID e243df7f91498e4423c5854c3f38322854304760
# Parent  aa80446aacc3b1574211649cd8f190250b6b04b3
revlog: return lazy set from findcommonmissing

findcommonmissing greedily computes the entire set of ancestors of common
up front. On a large repo where the majority of history is irrelevant, this
causes a significant slowdown.

Replacing the eager set with a lazy set makes amend go from 11 seconds to
8.7 seconds.
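
To illustrate why the lazy set helps, here is a minimal standalone sketch. It
is not Mercurial code: LazyAncestors is a hypothetical stand-in for whatever
lazy object ancestors() returns, the parents dict and the linear 1000-revision
history are made up for the demo, and the visited counter exists only to show
how much history gets walked. The sketch assumes parents always have lower
revision numbers than their children. lazyset mirrors the class in the patch.

```python
import heapq


class LazyAncestors:
    """Hypothetical stand-in for the lazy object that ancestors() returns."""

    def __init__(self, parents, revs):
        self._parents = parents          # rev -> tuple of parent revs
        self._seen = set()               # ancestors produced so far
        self._walk = self._generate(list(revs))
        self.visited = 0                 # revisions walked, for the demo only

    def _generate(self, revs):
        # Pop the highest pending revision first. Parents have lower numbers
        # than their children, so revisions come out in descending order.
        heap, queued = [], set()

        def push_parents(rev):
            for p in self._parents[rev]:
                if p not in queued:
                    queued.add(p)
                    heapq.heappush(heap, -p)

        for rev in revs:
            push_parents(rev)
        while heap:
            rev = -heapq.heappop(heap)
            self.visited += 1
            self._seen.add(rev)
            push_parents(rev)
            yield rev

    def __contains__(self, target):
        if target in self._seen:
            return True
        for rev in self._walk:
            if rev == target:
                return True
            if rev < target:
                return False             # walked past it: not an ancestor
        return False

    def __iter__(self):
        yield from sorted(self._seen, reverse=True)
        yield from self._walk


class lazyset(object):
    """Same shape as the class in the patch."""

    def __init__(self, lazyvalues):
        self.addedvalues = set()
        self.lazyvalues = lazyvalues

    def __contains__(self, value):
        return value in self.addedvalues or value in self.lazyvalues

    def __iter__(self):
        added = self.addedvalues
        for r in added:
            yield r
        for r in self.lazyvalues:
            if r not in added:
                yield r

    def add(self, value):
        self.addedvalues.add(value)

    def update(self, values):
        self.addedvalues.update(values)


# Made-up linear history: rev r has the single parent r - 1.
N = 1000
parents = {r: ((r - 1,) if r > 0 else ()) for r in range(N)}
nullrev = -1
common = [N - 1]

ancestors = LazyAncestors(parents, common)
has = lazyset(ancestors)
has.add(nullrev)
has.update(common)

print(990 in has)         # True
print(ancestors.visited)  # 9

# The eager version walks all of history before answering anything.
eager = LazyAncestors(parents, common)
print(len(set(eager)), eager.visited)  # 999 999
```

In this toy history the membership test touches 9 revisions where the eager
set touches 999. The real savings depend on how much of the history the
caller actually ends up querying.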

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -401,7 +401,29 @@ 
         heads = [self.rev(n) for n in heads]
 
         # we want the ancestors, but inclusive
-        has = set(self.ancestors(common))
+        class lazyset(object):
+            def __init__(self, lazyvalues):
+                self.addedvalues = set()
+                self.lazyvalues = lazyvalues
+
+            def __contains__(self, value):
+                return value in self.addedvalues or value in self.lazyvalues
+
+            def __iter__(self):
+                added = self.addedvalues
+                for r in added:
+                    yield r
+                for r in self.lazyvalues:
+                    if r not in added:
+                        yield r
+
+            def add(self, value):
+                self.addedvalues.add(value)
+
+            def update(self, values):
+                self.addedvalues.update(values)
+
+        has = lazyset(self.ancestors(common))
         has.add(nullrev)
         has.update(common)