Patchwork [STABLE] scmutil: avoid quadratic membership testing (issue5969)

Submitter Gregory Szorc
Date Aug. 25, 2018, 1:23 a.m.
Message ID <db8e86a65460c9bc4794.1535160185@ubuntu-vm-main>
Permalink /patch/34041/
State New

Comments

Gregory Szorc - Aug. 25, 2018, 1:23 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1535160115 25200
#      Fri Aug 24 18:21:55 2018 -0700
# Branch stable
# Node ID db8e86a65460c9bc4794afc4cd6c7e4bb69e3b0b
# Parent  bd63ada7e1f838d7a579edcbd8e3c8ff7ec46a43
scmutil: avoid quadratic membership testing (issue5969)

tr.changes['revs'] is an xrange, and membership testing on a Python 2
xrange is O(n) because `in` falls back to iterating the range. The
`rev not in newrevs` lookup a few lines below will therefore be O(n^2)
if all incoming changesets are public.

This issue isn't present on @ because 45e05d39d9ce introduced
a custom type implementing an xrange primitive with O(1) contains
and switched tr.changes['revs'] to be an instance of that type.
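
A minimal sketch of what a range type with constant-time membership could look like. The class name `ConstantTimeRange` is hypothetical and this is not the code from 45e05d39d9ce; it only illustrates answering `in` with a bounds check instead of iteration, assuming a step of 1.

```python
class ConstantTimeRange(object):
    """Hypothetical stand-in for a range with O(1) membership (step 1 only)."""

    def __init__(self, start, stop):
        self.start = start
        # Treat an inverted range as empty, like range(5, 0).
        self.stop = max(start, stop)

    def __len__(self):
        return self.stop - self.start

    def __iter__(self):
        value = self.start
        while value < self.stop:
            yield value
            value += 1

    def __contains__(self, value):
        # Two comparisons, independent of the length of the range.
        return self.start <= value < self.stop


revs = ConstantTimeRange(10, 500000)
found = 499999 in revs
missing = 5 in revs
```

With a type like this, `rev not in newrevs` costs the same regardless of how many revisions were pulled, so no conversion to a set is needed.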

We work around the problem on the stable branch by casting the
xrange to a set. This is a bit hacky because it requires allocating
memory to hold each integer in the range. But we are already
holding the full set of pulled revision numbers in memory
multiple times (such as in `tr.changes['phases']`). So this is
a relatively minor problem.
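
The cost difference can be illustrated with a small sketch. `IterOnlyRange` is a hypothetical class that mimics a sequence whose `in` operator falls back to iteration, and `count_missing` is an illustrative helper, not code from scmutil.py. The sketch counts how many elements are visited while answering membership queries, and shows that converting to a set once gives the same answers with a single pass over the range.

```python
class IterOnlyRange(object):
    """Hypothetical range whose membership test walks the elements."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.visited = 0  # elements touched while iterating

    def __iter__(self):
        for value in range(self.start, self.stop):
            self.visited += 1
            yield value


def count_missing(candidates, newrevs):
    """Count candidates that are not part of newrevs."""
    return sum(1 for rev in candidates if rev not in newrevs)


n = 1000

# Every `not in` check iterates the range until it finds the value.
slow = IterOnlyRange(0, n)
slow_result = count_missing(range(n), slow)
slow_visits = slow.visited

# Converting to a set iterates the range exactly once.
fast = IterOnlyRange(0, n)
fast_set = set(fast)
fast_result = count_missing(range(n), fast_set)
fast_visits = fast.visited
```

The number of visited elements grows quadratically with `n` in the first variant and linearly in the second, at the price of holding `n` integers in the set.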

This issue has been present since the phases reporting code was
introduced in the 4.7 cycle by eb9835014d20.

This change should be reverted/ignored when stable is merged into
default.

On the mozilla-unified repository with 483492 changesets, `hg clone`
time improves substantially:

before: 1843.700s user; 29.810s sys
after:   461.170s user; 29.360s sys
via Mercurial-devel - Aug. 25, 2018, 5:18 a.m.
On Fri, Aug 24, 2018 at 6:23 PM Gregory Szorc <gregory.szorc@gmail.com>
wrote:

> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1535160115 25200
> #      Fri Aug 24 18:21:55 2018 -0700
> # Branch stable
> # Node ID db8e86a65460c9bc4794afc4cd6c7e4bb69e3b0b
> # Parent  bd63ada7e1f838d7a579edcbd8e3c8ff7ec46a43
> scmutil: avoid quadratic membership testing (issue5969)
>

Queueing this, thanks.

Patch

diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -1565,7 +1565,10 @@  def registersummarycallback(repo, otr, t
             """Report statistics of phase changes for changesets pre-existing
             pull/unbundle.
             """
-            newrevs = tr.changes.get('revs', xrange(0, 0))
+            # TODO set() is only appropriate for 4.7 since revs post
+            # 45e05d39d9ce is a pycompat.membershiprange, which has O(1)
+            # membership testing.
+            newrevs = set(tr.changes.get('revs', xrange(0, 0)))
             phasetracking = tr.changes.get('phases', {})
             if not phasetracking:
                 return