Patchwork [4,of,6] revlog: add functions that use ancestor.lazyancestorset

login
register
mail settings
Submitter Siddharth Agarwal
Date Dec. 14, 2012, 7:35 p.m.
Message ID <530ebaedf4047fbcddef.1355513754@sid0x220>
Download mbox | patch
Permalink /patch/109/
State Accepted
Headers show

Comments

Siddharth Agarwal - Dec. 14, 2012, 7:35 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0 at fb.com>
# Date 1355510978 28800
# Node ID 530ebaedf4047fbcddef10d98dbb0fb0e68dc7a5
# Parent  74b0e1f732cf868681ace3e2bc44d5eb54e1729a
revlog: add functions that use ancestor.lazyancestorset

In a linear repository with over 400,000 commits, hg perfancestorset takes
0.80 seconds no matter how far behind we're looking. With this patch, hg
perfancestorset -- X takes:

    Rev X       Time
       -1      0.00s
    -4000      0.01s
   -20000      0.04s
   -80000      0.17s
  -200000      0.43s
  -300000      0.69s
        0      0.88s

Thus, for revisions close to tip, we're up to several orders of magnitude
faster. At 0 we're around 10% slower.
Kevin Bullock - Dec. 14, 2012, 8:58 p.m.
On Dec 14, 2012, at 1:35 PM, Siddharth Agarwal wrote:

> # HG changeset patch
> # User Siddharth Agarwal <sid0 at fb.com>
> # Date 1355510978 28800
> # Node ID 530ebaedf4047fbcddef10d98dbb0fb0e68dc7a5
> # Parent  74b0e1f732cf868681ace3e2bc44d5eb54e1729a
> revlog: add functions that use ancestor.lazyancestorset
> 
> [...]
> diff -r 74b0e1f732cf -r 530ebaedf404 mercurial/revlog.py
> --- a/mercurial/revlog.py	Fri Dec 14 11:31:23 2012 -0800
> +++ b/mercurial/revlog.py	Fri Dec 14 10:49:38 2012 -0800
> @@ -369,6 +369,21 @@
>         for rev in self.ancestors(revs, stoprev):
>             yield rev
> 
> +    def ancestorset(self, revs, stoprev=0):
> +        """Return an object that resembles the set of ancestors of revs.
> +
> +        The only operation currently supported is membership (in). The set is
> +        computed lazily starting from tip and working in descending rev
> +        order."""
> +        return ancestor.lazyancestorset(revs, self.parentrevs, False,
> +                                        stoprev=stoprev)
> +
> +    def incancestorset(self, revs, stoprev=0):
> +        """Identical to ancestorset() except it also generates the revisions,
> +        'revs'"""
> +        return ancestor.lazyancestorset(revs, self.parentrevs, True,
> +                                        stoprev=stoprev)
> +

Are these actually going to be useful on _all_ revlogs, or should they go on changelog? So far the only uses I see are on changelog.

pacem in terris / ??? / ?????? / ????????? / ??
Kevin R. Bullock
Kevin Bullock - Dec. 14, 2012, 9:05 p.m.
On Dec 14, 2012, at 2:58 PM, Kevin Bullock wrote:

> On Dec 14, 2012, at 1:35 PM, Siddharth Agarwal wrote:
> 
>> # HG changeset patch
>> # User Siddharth Agarwal <sid0 at fb.com>
>> # Date 1355510978 28800
>> # Node ID 530ebaedf4047fbcddef10d98dbb0fb0e68dc7a5
>> # Parent  74b0e1f732cf868681ace3e2bc44d5eb54e1729a
>> revlog: add functions that use ancestor.lazyancestorset
>> 
>> [...]
>> diff -r 74b0e1f732cf -r 530ebaedf404 mercurial/revlog.py
>> --- a/mercurial/revlog.py	Fri Dec 14 11:31:23 2012 -0800
>> +++ b/mercurial/revlog.py	Fri Dec 14 10:49:38 2012 -0800
>> @@ -369,6 +369,21 @@
>>        for rev in self.ancestors(revs, stoprev):
>>            yield rev
>> 
>> +    def ancestorset(self, revs, stoprev=0):
>> +        """Return an object that resembles the set of ancestors of revs.
>> +
>> +        The only operation currently supported is membership (in). The set is
>> +        computed lazily starting from tip and working in descending rev
>> +        order."""
>> +        return ancestor.lazyancestorset(revs, self.parentrevs, False,
>> +                                        stoprev=stoprev)
>> +
>> +    def incancestorset(self, revs, stoprev=0):
>> +        """Identical to ancestorset() except it also generates the revisions,
>> +        'revs'"""
>> +        return ancestor.lazyancestorset(revs, self.parentrevs, True,
>> +                                        stoprev=stoprev)
>> +
> 
> Are these actually going to be useful on _all_ revlogs, or should they go on changelog? So far the only uses I see are on changelog.

On further reflection and seeing the usage of these in later patches, I guess I'd prefer if callers constructed ancestorsets themselves rather than adding these "convenience" methods into revlog. (They're the sort of methods that are likely to become in-convenient on later refactorings.)

pacem in terris / ??? / ?????? / ????????? / ??
Kevin R. Bullock
Siddharth Agarwal - Dec. 14, 2012, 9:11 p.m.
On 12/14/2012 12:58 PM, Kevin Bullock wrote:
> Are these actually going to be useful on _all_ revlogs, or should they 
> go on changelog? So far the only uses I see are on changelog.

I added them right below the ancestors and incancestors functions that 
already exist. (After discussing this with Bryan, I guess that that's 
really an argument for moving everything out to changelog.)

Patch

diff -r 74b0e1f732cf -r 530ebaedf404 contrib/perf.py
--- a/contrib/perf.py	Fri Dec 14 11:31:23 2012 -0800
+++ b/contrib/perf.py	Fri Dec 14 10:49:38 2012 -0800
@@ -82,7 +82,7 @@ 
     revs = repo.revs(revset)
     heads = repo.changelog.headrevs()
     def d():
-        s = set(repo.changelog.ancestors(heads))
+        s = repo.changelog.ancestorset(heads)
         for rev in revs:
             rev in s
     timer(d)
diff -r 74b0e1f732cf -r 530ebaedf404 mercurial/revlog.py
--- a/mercurial/revlog.py	Fri Dec 14 11:31:23 2012 -0800
+++ b/mercurial/revlog.py	Fri Dec 14 10:49:38 2012 -0800
@@ -369,6 +369,21 @@ 
         for rev in self.ancestors(revs, stoprev):
             yield rev
 
+    def ancestorset(self, revs, stoprev=0):
+        """Return an object that resembles the set of ancestors of revs.
+
+        The only operation currently supported is membership (in). The set is
+        computed lazily starting from tip and working in descending rev
+        order."""
+        return ancestor.lazyancestorset(revs, self.parentrevs, False,
+                                        stoprev=stoprev)
+
+    def incancestorset(self, revs, stoprev=0):
+        """Identical to ancestorset() except it also generates the revisions,
+        'revs'"""
+        return ancestor.lazyancestorset(revs, self.parentrevs, True,
+                                        stoprev=stoprev)
+
     def descendants(self, revs):
         """Generate the descendants of 'revs' in revision order.