Patchwork [2,of,4] revset: avoid "list(repo)" for efficiency on large repository

Submitter Katsunori FUJIWARA
Date March 29, 2013, 4:16 p.m.
Message ID <bede15731ca703fdf8ca.1364573792@feefifofum>
Permalink /patch/1220/
State Rejected

Comments

Katsunori FUJIWARA - March 29, 2013, 4:16 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1364573132 -32400
# Node ID bede15731ca703fdf8ca8815793029766c3fc893
# Parent  4cf0465cd64ff196ad83ec2d10b6e13ed89c2913
revset: avoid "list(repo)" for efficiency on large repository

Before this patch, "list(repo)" (or an equivalent) is passed as the
"subset" argument of the function returned by "revset.match()" to mean
"all revisions in the repository". This immediately creates a list
object containing one integer per revision, even though the repository
may contain many revisions.
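
As a rough illustration of the cost described above, the following
sketch compares an eagerly built list with a lazy sequence. "FakeRepo"
is a hypothetical stand-in, not Mercurial's repository class, and
Python 3's "range" is used only as an example of a lazy sequence (in
Python 2, "xrange" plays that role). It does not show how
"_safesubset" is implemented.

```python
import sys


class FakeRepo:
    """Hypothetical stand-in for a repository; iterating yields revision numbers."""

    def __init__(self, nrevs):
        self._nrevs = nrevs

    def __len__(self):
        return self._nrevs

    def __iter__(self):
        return iter(range(self._nrevs))


repo = FakeRepo(40000)

# Eager: one list slot per revision is allocated immediately.
eager = list(repo)

# Lazy: a small, constant-size object that computes members on demand.
lazy = range(len(repo))

print(len(eager), len(lazy))
print(sys.getsizeof(eager) > sys.getsizeof(lazy))
```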

Both to avoid this immediate list creation and for convenience, this
patch makes the "subset" argument optional, instead of replacing
"list(repo)" with "revset._safesubset(repo)" on the caller side.

Before this patch, no caller could pass None as the "subset" argument,
because "len()", "x in subset" and similar operations must be
applicable to it. None is therefore unused by existing callers, so
this patch can safely make it the default value of "subset".

This patch also accepts "repo" itself as "subset". On the caller side,
this should be more readable than "subset=None" as an explicit way to
mean "all revisions in the repository".
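
The calling convention described above can be sketched as follows.
This is a simplified illustration under stated assumptions: "FakeRepo"
and this "match" are hypothetical stand-ins, a plain predicate replaces
the parsed revset tree, and "range(len(repo))" stands in for
"_safesubset(repo)". The sketch checks identity with "is", whereas the
patch compares with "==".

```python
class FakeRepo:
    """Hypothetical stand-in for a repository; iterating yields revision numbers."""

    def __init__(self, nrevs):
        self._nrevs = nrevs

    def __len__(self):
        return self._nrevs

    def __iter__(self):
        return iter(range(self._nrevs))


def match(predicate):
    """Simplified analogue of a matcher factory; predicate replaces the revset tree."""

    def mfunc(repo, subset=None):
        # Both None and the repo itself mean "all revisions in the repository".
        if subset is None or subset is repo:
            subset = range(len(repo))  # assumption: stand-in for _safesubset(repo)
        return [r for r in subset if predicate(r)]

    return mfunc


repo = FakeRepo(10)
even = match(lambda r: r % 2 == 0)

print(even(repo))             # subset omitted
print(even(repo, repo))       # repo passed explicitly as subset
print(even(repo, [1, 2, 3]))  # explicit subset still works
```

Existing callers that pass an explicit subset keep working unchanged,
while callers that want every revision can simply omit the argument.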

Results of "hg perfrevset" (before/after this patch) on a repository
containing 40000 revisions are shown below:

  - "max(tip)":
    ! wall 0.000000 comb 0.000000 user 0.000000 sys 0.000000 (best of 1969)
    ! wall 0.000000 comb 0.000000 user 0.000000 sys 0.000000 (best of 30000)

Patch

diff -r 4cf0465cd64f -r bede15731ca7 mercurial/commands.py
--- a/mercurial/commands.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/commands.py	Sat Mar 30 01:05:32 2013 +0900
@@ -2367,7 +2367,7 @@ 
         if newtree != tree:
             ui.note(revset.prettyformat(newtree), "\n")
     func = revset.match(ui, expr)
-    for c in func(repo, range(len(repo))):
+    for c in func(repo):
         ui.write("%s\n" % c)
 
 @command('debugsetparents', [], _('REV1 [REV2]'))
diff -r 4cf0465cd64f -r bede15731ca7 mercurial/localrepo.py
--- a/mercurial/localrepo.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/localrepo.py	Sat Mar 30 01:05:32 2013 +0900
@@ -403,7 +403,7 @@ 
         '''Return a list of revisions matching the given revset'''
         expr = revset.formatspec(expr, *args)
         m = revset.match(None, expr)
-        return [r for r in m(self, list(self))]
+        return [r for r in m(self)]
 
     def set(self, expr, *args):
         '''
diff -r 4cf0465cd64f -r bede15731ca7 mercurial/revset.py
--- a/mercurial/revset.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/revset.py	Sat Mar 30 01:05:32 2013 +0900
@@ -1838,7 +1838,9 @@ 
     if ui:
         tree = findaliases(ui, tree)
     weight, tree = optimize(tree, True)
-    def mfunc(repo, subset):
+    def mfunc(repo, subset=None):
+        if subset is None or repo == subset:
+            subset = _safesubset(repo)
         return getset(repo, subset, tree)
     return mfunc
 
diff -r 4cf0465cd64f -r bede15731ca7 mercurial/scmutil.py
--- a/mercurial/scmutil.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/scmutil.py	Sat Mar 30 01:05:32 2013 +0900
@@ -618,7 +618,7 @@ 
 
         # fall through to new-style queries if old-style fails
         m = revset.match(repo.ui, spec)
-        dl = [r for r in m(repo, list(repo)) if r not in seen]
+        dl = [r for r in m(repo) if r not in seen]
         l.extend(dl)
         seen.update(dl)