Patchwork [6,of,7,V2] revset: detect integer list on parsing

Submitter Boris Feld
Date Jan. 14, 2019, 12:13 p.m.
Message ID <1def212ed730ce024963.1547467996@localhost.localdomain>
Permalink /patch/37726/
State Superseded


Boris Feld - Jan. 14, 2019, 12:13 p.m.
# HG changeset patch
# User Boris Feld <>
# Date 1546575973 -3600
#      Fri Jan 04 05:26:13 2019 +0100
# Node ID 1def212ed730ce024963fd41f2d341f584521bbf
# Parent  2c5c8e76f2a95b10f367e50c17ae903471337b69
# EXP-Topic revs-efficiency
# Available At
#              hg pull -r 1def212ed730
revset: detect integer list on parsing

Right now, using "%ld" with `repo.revs("…%ld…", somerevs)` is very
inefficient: every item in `somerevs` is serialized to ASCII and then
reparsed as an integer. If `somerevs` contains just a handful of entries this
is fine; however, once you get to thousands or hundreds of thousands of
revisions, it becomes very slow.
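
As a rough illustration of the round trip described above, here is a minimal
standalone sketch. It does not use Mercurial's code: `serialize` and `reparse`
are hypothetical stand-ins for the formatting and parsing steps, and the
space-separated wire format is an assumption made only for this example.

```python
import timeit


def serialize(revs):
    # Hypothetical stand-in: turn each integer into ASCII digits.
    return " ".join("%d" % r for r in revs).encode("ascii")


def reparse(data):
    # Hypothetical stand-in: parse the ASCII digits back into integers.
    return [int(tok) for tok in data.split()]


revs = list(range(100000))

# The round trip returns exactly what we started with, so all the
# formatting and parsing work is overhead.
assert reparse(serialize(revs)) == revs

roundtrip = timeit.timeit(lambda: reparse(serialize(revs)), number=5)
direct = timeit.timeit(lambda: list(revs), number=5)
print("round trip: %.4fs, direct copy: %.4fs" % (roundtrip, direct))
```

The absolute timings depend on the machine; the point is only that the round
trip does work proportional to the number of revisions twice over, while
passing the integers through untouched does not.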

To avoid this serialization we first need to detect this situation. The code
involved in the whole process is quite complex, so we start small and focus on
a few "simple" but widespread cases.

So far we only detect the situation and don't do anything special about it.
The singled-out lists are still serialized in `formatspec` the same way as
before.
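
The detect-then-serialize split can be sketched in isolation as follows. This
is a simplified model, not the patched code: `parse_args`, `format_spec` and
`serialize_int_list` are hypothetical names, only "%d" and "%ld" are handled,
and the `intlist(...)` output syntax is invented for the example.

```python
# Container types considered safe to single out (a subset, for the sketch).
SAFE_INPUT_TYPES = (list, tuple, set)


def serialize_int_list(revs):
    # Made-up textual form; not Mercurial's revset syntax.
    return "intlist(%s)" % " ".join("%d" % int(r) for r in revs)


def parse_args(expr, args):
    """Split expr into (arg-type, value) pairs.

    arg-type is None for ready-to-concatenate text, or 'baseset' for an
    integer collection that was detected and left unserialized.
    """
    argiter = iter(args)
    ret = []
    pos = 0
    while pos < len(expr):
        q = expr.find("%", pos)
        if q < 0:
            ret.append((None, expr[pos:]))
            break
        ret.append((None, expr[pos:q]))
        pos = q + 1
        if expr[pos:pos + 2] == "ld":
            arg = next(argiter)
            if arg and isinstance(arg, SAFE_INPUT_TYPES):
                # Detected: keep the collection as-is for now.
                ret.append(("baseset", arg))
            else:
                ret.append((None, serialize_int_list(list(arg))))
            pos += 2
        elif expr[pos:pos + 1] == "d":
            ret.append((None, "%d" % int(next(argiter))))
            pos += 1
        else:
            raise ValueError("unsupported format character")
    return ret


def format_spec(expr, *args):
    out = []
    for t, arg in parse_args(expr, args):
        if t is None:
            out.append(arg)
        elif t == "baseset":
            # Nothing special yet: serialize exactly as before.
            if isinstance(arg, set):
                arg = sorted(arg)
            out.append(serialize_int_list(list(arg)))
        else:
            raise ValueError("unknown item type: %r" % (t,))
    return "".join(out)


print(format_spec("heads(%ld) and %d", {3, 1, 2}, 5))
```

Keeping the detection in the parsing step means a later change could hand the
'baseset' items to a consumer that never serializes them, without touching the
formatting path.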


diff --git a/mercurial/ b/mercurial/
--- a/mercurial/
+++ b/mercurial/
@@ -15,6 +15,7 @@  from . import (
+    smartset,
 from .utils import (
@@ -682,6 +683,10 @@  def formatspec(expr, *args):
     for t, arg in parsed:
         if t is None:
+        elif t == 'baseset':
+            if isinstance(arg, set):
+                arg = sorted(arg)
+            ret.append(_formatintlist(list(arg)))
             raise error.ProgrammingError("unknown revspec item type: %r" % t)
     return b''.join(ret)
@@ -692,7 +697,8 @@  def _parseargs(expr, args):
     return a list of tuple [(arg-type, arg-value)]
     Arg-type can be:
-    * None: a string ready to be concatenated into a final spec
+    * None:      a string ready to be concatenated into a final spec
+    * 'baseset': an iterable of revisions
     expr = pycompat.bytestr(expr)
     argiter = iter(args)
@@ -722,10 +728,25 @@  def _parseargs(expr, args):
         if f:
             # a list of some type, might be expensive, do not replace
             pos += 1
+            islist = (d == 'l')
                 d = expr[pos]
             except IndexError:
                 raise error.ParseError(_('incomplete revspec format character'))
+            if islist and d == 'd' and arg:
+                # special case, we might be able to speedup the list of int case
+                #
+                # We have been very conservative here for the first version.
+                # Other types (eg: generator) are probably fine, but we did not
+                # want to take any risk.
+                safeinputtype = (list, tuple, set, smartset.abstractsmartset)
+                if isinstance(arg, safeinputtype):
+                    # we don't create a baseset yet, because it comes with an
+                    # extra cost. If we are going to serialize it we better
+                    # skip it.
+                    ret.append(('baseset', arg))
+                    pos += 1
+                    continue
                 ret.append((None, f(list(arg), d)))
             except (TypeError, ValueError):