Patchwork revset: add exclusive revset

login
register
mail settings
Submitter Durham Goode
Date Nov. 16, 2013, 4:36 a.m.
Message ID <029191a9eb775c586ed0.1384576570@dev350.prn1.facebook.com>
Download mbox | patch
Permalink /patch/2990/
State Superseded
Headers show

Comments

Durham Goode - Nov. 16, 2013, 4:36 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1384575318 28800
#      Fri Nov 15 20:15:18 2013 -0800
# Node ID 029191a9eb775c586ed02c201d0c555053ea44da
# Parent  aa80446aacc3b1574211649cd8f190250b6b04b3
revset: add exclusive revset

Adds a exclusive revset that has two forms:

exclusive(<set>) is equivalent to "::<set> - ::(heads() - <set>)"

exclusive(<include>,<exclude>) is equivalent to "::<include> - ::<exclude>"

On a large repo, this implementation can process/traverse 50,000 revs in 0.7
seconds, versus 4.2 seconds using "::<include> - ::<exclude>".
Durham Goode - Nov. 16, 2013, 5:07 p.m.
On 11/15/13 11:36 PM, "Durham Goode" <durham@fb.com> wrote:

># HG changeset patch
># User Durham Goode <durham@fb.com>
># Date 1384575318 28800
>#      Fri Nov 15 20:15:18 2013 -0800
># Node ID 029191a9eb775c586ed02c201d0c555053ea44da
># Parent  aa80446aacc3b1574211649cd8f190250b6b04b3
>revset: add exclusive revset
>
>Adds a exclusive revset that has two forms:
>
>exclusive(<set>) is equivalent to "::<set> - ::(heads() - <set>)"
>
>exclusive(<include>,<exclude>) is equivalent to "::<include> -
>::<exclude>"
>
>On a large repo, this implementation can process/traverse 50,000 revs in
>0.7
>seconds, versus 4.2 seconds using "::<include> - ::<exclude>".

Talked in person with people about this.

- going to tweak the first form to be "::<set> - ::(heads() -
heads(<set>::))"
- going to use the built in ancestor.missingancestor. It's algorithmically
a little worse, but in practice is faster and less code.
- going to find a better name
Martin Geisler - Nov. 20, 2013, 9:59 a.m.
Durham Goode <durham@fb.com> writes:

> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1384575318 28800
> #      Fri Nov 15 20:15:18 2013 -0800
> # Node ID 029191a9eb775c586ed02c201d0c555053ea44da
> # Parent  aa80446aacc3b1574211649cd8f190250b6b04b3
> revset: add exclusive revset
>
> Adds a exclusive revset that has two forms:
>
> exclusive(<set>) is equivalent to "::<set> - ::(heads() - <set>)"
>
> exclusive(<include>,<exclude>) is equivalent to "::<include> - ::<exclude>"
>
> On a large repo, this implementation can process/traverse 50,000 revs in 0.7
> seconds, versus 4.2 seconds using "::<include> - ::<exclude>".

Very nice!

Since it sounds like the new revset computes the exact same result as
"::<include> - ::<exclude>", do you think we could let the revset
optimizer recognize "::<include> - ::<exclude>" and turn it into the new
revset automatically?

Patch

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -5,7 +5,7 @@ 
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
-import re
+import re, heapq
 import parser, util, error, discovery, hbisect, phases
 import node
 import match as matchmod
@@ -346,6 +346,60 @@ 
     kind, pattern, matcher = _substringmatcher(n)
     return [r for r in subset if matcher(encoding.lower(repo[r].user()))]
 
+def exclusive(repo, subset, x):
+    """``exclusive(set, [set])
+    Changesets that are ancestors of the first set, but not the second.
+    If no second set is specified, it is assumed to be the heads not
+    in the first set.
+    """
+    cl = repo.changelog
+    args = getargs(x, 1, 2, _('exclusive takes one or two arguments'))
+    include = set(getset(repo, list(repo), args[0]))
+    if len(args) == 1:
+        exclude = [rev for rev in cl.headrevs() if not rev in include]
+    else:
+        exclude = getset(repo, list(repo), args[1])
+
+    def buildheap(revs):
+        # heapq is only a min heap, so use negative revs
+        heap = list([-rev for rev in revs])
+        heapq.heapify(heap)
+        return heap
+
+    def pushparents(heap, rev):
+        for parentrev in cl.parentrevs(rev):
+            if parentrev != node.nullrev:
+                heapq.heappush(heap, -parentrev)
+
+    def pop(heap):
+        rev = heapq.heappop(heap)
+        # the heap can contain duplicates, so skip them
+        while heap and heap[0] == rev:
+            heapq.heappop(heap)
+
+    includeheap = buildheap(include)
+    excludeheap = buildheap(exclude)
+
+    # Walk down the include/exclude tree, highest rev first.
+    # Accept included revs, as long as they aren't in exclude.
+    results = []
+    while includeheap:
+        maxinclude = -includeheap[0]
+        maxexclude = -excludeheap[0] if excludeheap else nullrev
+        if maxexclude == maxinclude:
+            pop(includeheap)
+            pop(excludeheap)
+            pushparents(excludeheap, maxexclude)
+        elif maxinclude > maxexclude:
+            results.append(maxinclude)
+            pop(includeheap)
+            pushparents(includeheap, maxinclude)
+        else:
+            pop(excludeheap)
+            pushparents(excludeheap, maxexclude)
+
+    return results
+
 def bisect(repo, subset, x):
     """``bisect(string)``
     Changesets marked in the specified bisect status:
@@ -1538,6 +1592,7 @@ 
     "ancestors": ancestors,
     "_firstancestors": _firstancestors,
     "author": author,
+    "exclusive": exclusive,
     "bisect": bisect,
     "bisected": bisected,
     "bookmark": bookmark,
diff --git a/tests/test-revset.t b/tests/test-revset.t
--- a/tests/test-revset.t
+++ b/tests/test-revset.t
@@ -284,6 +284,20 @@ 
   7
   8
   9
+  $ log 'exclusive(9)'
+  9
+  8
+  $ log 'exclusive(9, 5)'
+  9
+  8
+  4
+  2
+  $ log 'exclusive(7 + 9, 5 + 2)'
+  9
+  8
+  7
+  6
+  4
   $ log 'file("b*")'
   1
   4