Patchwork [3,of,3] changegroup: switch to manifestctx.readstoragenewnodes() (issue5367)

login
register
mail settings
Submitter Gregory Szorc
Date March 14, 2017, 7:33 p.m.
Message ID <58091ccc1d551152d4c1.1489519988@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/19337/
State Accepted
Headers show

Comments

Gregory Szorc - March 14, 2017, 7:33 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1489519968 25200
#      Tue Mar 14 12:32:48 2017 -0700
# Node ID 58091ccc1d551152d4c135db1a394c1763bb24c4
# Parent  c130f8c7496a823c92d3d71880e5beb9fb60c0f7
changegroup: switch to manifestctx.readstoragenewnodes() (issue5367)

readdelta() could return a delta against an empty manifest, which
would result in scanning every single file in the manifest.

We now have an API for returning a more minimal delta. So we
switch to it.
Durham Goode - March 15, 2017, 5:39 p.m.
On 3/14/17 12:33 PM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1489519968 25200
> #      Tue Mar 14 12:32:48 2017 -0700
> # Node ID 58091ccc1d551152d4c135db1a394c1763bb24c4
> # Parent  c130f8c7496a823c92d3d71880e5beb9fb60c0f7
> changegroup: switch to manifestctx.readstoragenewnodes() (issue5367)
>
> readdelta() could return a delta against an empty manifest, which
> would result in scanning every single file in the manifest.
>
> We now have an API for returning a more minimal delta. So we
> switch to it.
>

The series looks good to me.  I wish there was a better way to allow 
efficient access to the manifest differences, but I can't really think 
of one.

For treemanifest, I think we can just implement it as mf1.diff(mf2) at 
all times, since that should be relatively fast, especially if things 
are in memory.
via Mercurial-devel - March 15, 2017, 7:42 p.m.
On Tue, Mar 14, 2017 at 12:33 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1489519968 25200
> #      Tue Mar 14 12:32:48 2017 -0700
> # Node ID 58091ccc1d551152d4c135db1a394c1763bb24c4
> # Parent  c130f8c7496a823c92d3d71880e5beb9fb60c0f7
> changegroup: switch to manifestctx.readstoragenewnodes() (issue5367)

Could you include some perf numbers here? You said on IRC that it
could take minutes before, which is much worse than I expected. How
long would such a case with this patch applied?

Patch

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -338,9 +338,9 @@  class cg1unpacker(object):
                     # validate incoming csets have their manifests
                     for cset in xrange(clstart, clend):
                         mfnode = cl.changelogrevision(cset).manifest
-                        mfest = ml[mfnode].readdelta()
-                        # store file nodes we must see
-                        for f, n in mfest.iteritems():
+                        # Find new nodes referenced by the manifest so we
+                        # can verify they exist later.
+                        for f, n in ml[mfnode].readstoragenewnodes():
                             needfiles.setdefault(f, set()).add(n)
 
                 # process the files