Patchwork [STABLE] revset: add an optimised baseset.__contains__ (issue4371)

login
register
mail settings
Submitter Pierre-Yves David
Date Sept. 18, 2014, 6 a.m.
Message ID <2d35709521152e0da815.1411020051@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/5866/
State Accepted
Headers show

Comments

Pierre-Yves David - Sept. 18, 2014, 6 a.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1410937169 25200
#      Tue Sep 16 23:59:29 2014 -0700
# Branch stable
# Node ID 2d35709521152e0da815f59c37daa0032704adaa
# Parent  4bbcee186fc64e39c51f90e8256466885a3a4d65
revset: add an optimised baseset.__contains__ (issue4371)


The baseset class is based on a python list. This means that base.__contains__
was absolutely as crappy as list.__contains__. We now rely on __contains__ from
the underlying set.

This will avoid having to explicitly convert the baseset to a set (using
baseset.set()) whenever one want fast membership test.

Apparently there is already code that forgot to do such conversion since we
observe a massive speedup in some test.

revset #25: roots((0::) - (0::tip))
0) wall 2.079454 comb 2.080000 user 2.080000 sys 0.000000 (best of 5)
1) wall 0.132970 comb 0.130000 user 0.130000 sys 0.000000 (best of 65)

No regression is observed in benchmarks.

This change improve the issue4371 back to acceptable situation (but are still
slower than manual substraction)
Pierre-Yves David - Sept. 19, 2014, 4:34 p.m.
On 09/17/2014 11:00 PM, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1410937169 25200
> #      Tue Sep 16 23:59:29 2014 -0700
> # Branch stable
> # Node ID 2d35709521152e0da815f59c37daa0032704adaa
> # Parent  4bbcee186fc64e39c51f90e8256466885a3a4d65
> revset: add an optimised baseset.__contains__ (issue4371)
>
>
> The baseset class is based on a python list. This means that base.__contains__
> was absolutely as crappy as list.__contains__. We now rely on __contains__ from
> the underlying set.
>
> This will avoid having to explicitly convert the baseset to a set (using
> baseset.set()) whenever one want fast membership test.
>
> Apparently there is already code that forgot to do such conversion since we
> observe a massive speedup in some test.

The said code is `baseset.filter` that creates a `lazyset(self, 
filter)`. The lazyset then do `if x in self._subset`.

We could use self.set() there but it would cause issues with action 
related to ordering on lazyset. In any case I still believe that having 
a fast __contains__ method on all smartset is the way to go.

> revset #25: roots((0::) - (0::tip))
> 0) wall 2.079454 comb 2.080000 user 2.080000 sys 0.000000 (best of 5)
> 1) wall 0.132970 comb 0.130000 user 0.130000 sys 0.000000 (best of 65)
>
> No regression is observed in benchmarks.
>
> This change improve the issue4371 back to acceptable situation (but are still
> slower than manual substraction)
>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -2226,10 +2226,14 @@ class baseset(list):
>           This is part of the mandatory API for smartset."""
>           if not self._set:
>               self._set = set(self)
>           return self._set
>
> +    @util.propertycache
> +    def __contains__(self):
> +        return self.set().__contains__
> +
>       def __sub__(self, other):
>           """Returns a new object with the substraction of the two collections.
>
>           This is part of the mandatory API for smartset."""
>           # If we are operating on 2 baseset, do the computation now since all
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
Durham Goode - Sept. 22, 2014, 11:38 p.m.
On 9/17/14, 11:00 PM, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1410937169 25200
> #      Tue Sep 16 23:59:29 2014 -0700
> # Branch stable
> # Node ID 2d35709521152e0da815f59c37daa0032704adaa
> # Parent  4bbcee186fc64e39c51f90e8256466885a3a4d65
> revset: add an optimised baseset.__contains__ (issue4371)
>
>
Pushed this to stable in the clowncopter repository.

Patch

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -2226,10 +2226,14 @@  class baseset(list):
         This is part of the mandatory API for smartset."""
         if not self._set:
             self._set = set(self)
         return self._set
 
+    @util.propertycache
+    def __contains__(self):
+        return self.set().__contains__
+
     def __sub__(self, other):
         """Returns a new object with the substraction of the two collections.
 
         This is part of the mandatory API for smartset."""
         # If we are operating on 2 baseset, do the computation now since all