Patchwork discovery: run discovery on filtered repository

Submitter Pierre-Yves David
Date Jan. 14, 2015, 10:37 p.m.
Message ID <db1724352a691367ecef.1421275041@marginatus.alto.octopoid.net>
Permalink /patch/7455/
State Accepted
Commit c5456b64eb07c37a4e891d641617f42fb8182dc1

Comments

Pierre-Yves David - Jan. 14, 2015, 10:37 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1420618049 28800
#      Wed Jan 07 00:07:29 2015 -0800
# Node ID db1724352a691367ecef6fc5e3275c41e74f9676
# Parent  9b1d3bac61a772e73846bf92746d0ce0213e3ad3
discovery: run discovery on filtered repository

We have been running discovery on the unfiltered repository for quite some time.
This was aimed at two things:

- saving some bandwidth by preventing the repushing of common but hidden changesets
- allowing phase changes on secret/hidden changesets on bare push.

The cost of this unfiltered discovery combined with evolution is actually really
high. Evolution is likely to create thousands of hidden heads, and discovery
will try to determine whether each of them is common or not. For example,
pushing from my development mercurial repository implies 17 discovery
round-trips.

The benefits are rare corner cases while the drawbacks are massive. So we run
discovery on a filtered repository again.

We add a hack to detect remote heads that are known locally and add them to
the common set anyway, so the good behavior in most of the corner cases should
remain. But this will not work in all cases.

This brings my discovery phase back from 17 round-trips to 1 or 2.
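
The remote-head hack described above can be sketched outside of Mercurial as
follows. This is only an illustration of the idea, not the code from the patch:
the function name reclassify_remote_heads and the known_nodes argument are
hypothetical, and nodes are plain strings here rather than binary node ids.
known_nodes stands in for a membership test against the unfiltered changelog.

```python
def reclassify_remote_heads(common, fetch, rheads, known_nodes):
    """Treat remote heads that exist locally (even if hidden) as common.

    common, fetch, rheads: results of discovery run on the filtered repo.
    known_nodes: hypothetical stand-in for every node present in the
    unfiltered local repository.
    """
    common = list(common)  # do not mutate the caller's list
    if fetch and rheads:
        already_common = set(common)
        unknown_heads = []
        for node in rheads:
            if node in known_nodes and node not in already_common:
                # Known locally but hidden: nothing to fetch for this head.
                common.append(node)
            else:
                unknown_heads.append(node)
        if not unknown_heads:
            # Every remote head was reclassified, so there is nothing to fetch.
            fetch = []
        rheads = unknown_heads
    return common, fetch, rheads


# One remote head is hidden locally ("h1"), the other is unknown ("h2").
print(reclassify_remote_heads(["a"], ["x"], ["h1", "h2"], {"a", "h1"}))
```

As noted above, this only catches hidden changesets that happen to be remote
heads. Hidden common changesets that sit below a remote head are not detected.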
Matt Mackall - Jan. 14, 2015, 11:16 p.m.
On Wed, 2015-01-14 at 14:37 -0800, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1420618049 28800
> #      Wed Jan 07 00:07:29 2015 -0800
> # Node ID db1724352a691367ecef6fc5e3275c41e74f9676
> # Parent  9b1d3bac61a772e73846bf92746d0ce0213e3ad3
> discovery: run discovery on filtered repository

Queued for default, thanks.

Patch

diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -269,16 +269,15 @@  def _pushdiscovery(pushop):
         step(pushop)
 
 @pushdiscovery('changeset')
 def _pushdiscoverychangeset(pushop):
     """discover the changeset that need to be pushed"""
-    unfi = pushop.repo.unfiltered()
     fci = discovery.findcommonincoming
-    commoninc = fci(unfi, pushop.remote, force=pushop.force)
+    commoninc = fci(pushop.repo, pushop.remote, force=pushop.force)
     common, inc, remoteheads = commoninc
     fco = discovery.findcommonoutgoing
-    outgoing = fco(unfi, pushop.remote, onlyheads=pushop.revs,
+    outgoing = fco(pushop.repo, pushop.remote, onlyheads=pushop.revs,
                    commoninc=commoninc, force=pushop.force)
     pushop.outgoing = outgoing
     pushop.remoteheads = remoteheads
     pushop.incoming = inc
 
@@ -925,15 +924,40 @@  def _pulldiscovery(pullop):
 def _pulldiscoverychangegroup(pullop):
     """discovery phase for the pull
 
     Current handle changeset discovery only, will change handle all discovery
     at some point."""
-    tmp = discovery.findcommonincoming(pullop.repo.unfiltered(),
+    tmp = discovery.findcommonincoming(pullop.repo,
                                        pullop.remote,
                                        heads=pullop.heads,
                                        force=pullop.force)
-    pullop.common, pullop.fetch, pullop.rheads = tmp
+    common, fetch, rheads = tmp
+    nm = pullop.repo.unfiltered().changelog.nodemap
+    if fetch and rheads:
+        # If a remote head is filtered locally, let's drop it from the unknown
+        # remote heads and put it back in common.
+        #
+        # This is a hackish solution to catch most "common but locally
+        # hidden" situations.  We do not perform discovery on the unfiltered
+        # repository because it ends up doing a pathological number of round
+        # trips for a huge number of changesets we do not care about.
+        #
+        # If such "common but filtered" changesets exist on the server but
+        # do not include a remote head, we'll not be able to detect it.
+        scommon = set(common)
+        filteredrheads = []
+        for n in rheads:
+            if n in nm and n not in scommon:
+                common.append(n)
+            else:
+                filteredrheads.append(n)
+        if not filteredrheads:
+            fetch = []
+        rheads = filteredrheads
+    pullop.common = common
+    pullop.fetch = fetch
+    pullop.rheads = rheads
 
 def _pullbundle2(pullop):
     """pull data using bundle2
 
     For now, the only supported data are changegroup."""
diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py
--- a/mercurial/wireproto.py
+++ b/mercurial/wireproto.py
@@ -170,11 +170,15 @@  def decodelist(l, sep=' '):
     if l:
         return map(bin, l.split(sep))
     return []
 
 def encodelist(l, sep=' '):
-    return sep.join(map(hex, l))
+    try:
+        return sep.join(map(hex, l))
+    except TypeError:
+        print l
+        raise
 
 # batched call argument encoding
 
 def escapearg(plain):
     return (plain