Patchwork [2,of,2] bundle2: add a part to import external bundles

login
register
mail settings
Submitter Mike Hommey
Date Oct. 11, 2014, 8:38 a.m.
Message ID <538073e6c7f6ae680cbd.1413016701@zenigata.glandium.org>
Download mbox | patch
Permalink /patch/6217/
State Changes Requested
Headers show

Comments

Mike Hommey - Oct. 11, 2014, 8:38 a.m.
# HG changeset patch
# User Mike Hommey <mh@glandium.org>
# Date 1413016657 -32400
#      Sat Oct 11 17:37:37 2014 +0900
# Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
# Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
bundle2: add a part to import external bundles

Bundle2 opens doors to advanced features allowing to reduce load on
mercurial servers, and improve clone experience for users on unstable or
slow networks.

For instance, it could be possible to pre-generate a bundle of a
repository, and give a pointer to it to clients cloning the repository,
followed by another changegroup with the remainder. For significantly
big repositories, this could come as several base bundles with e.g. 10k
changesets, which, combined with checkpoints (not part of this change),
would prevent users with flaky networks from starting over any time
their connection fails.

While the server-side support for those features doesn't exist yet, it
is preferable to have client-side support for this early-on.

Note, I'm not sure the exchange.py change is right. Also, I'm not sure what
pullop.fetch means. Is it possible to have pullop.fetch set, but to have e.g.
only bookmark updates? In that case, the exchange.py change might be right.

I'm also unsure if it makes sense to do so many tests. Clone and pull are not
so different from each other, and the different test scenarios are more
testing the multi-part infrastructure than the feature itself. It however
tests all the use-cases I do plan to use.
Mike Hommey - Oct. 11, 2014, 1:53 p.m.
On Sat, Oct 11, 2014 at 05:38:21PM +0900, Mike Hommey wrote:
>  capabilities = {'HG2X': (),
>                  'b2x:listkeys': (),
>                  'b2x:pushkey': (),
>                  'b2x:changegroup': (),
> +                'digests': tuple(digester.DIGESTS.keys()),

Note this lacks a capability for the new part type.

Mike
Gregory Szorc - Oct. 11, 2014, 6:46 p.m.
On 10/11/2014 1:38 AM, Mike Hommey wrote:
> # HG changeset patch
> # User Mike Hommey <mh@glandium.org>
> # Date 1413016657 -32400
> #      Sat Oct 11 17:37:37 2014 +0900
> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
> # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
> bundle2: add a part to import external bundles

Great patch. I can't wait to start utilizing this!

Comments inline.

> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> --- a/mercurial/bundle2.py
> +++ b/mercurial/bundle2.py
> +@parthandler('b2x:import-bundle')
> +def handleimportbundle(op, inpart):
> +    """apply a bundle on the repo, given an url and validation information
> +
> +    The part contains a bundle url, as well as hash digest(s) and size
> +    information to validate the downloaded content.
> +    The expected format for this part is:
> +        <URL>
> +        <SIZE in bytes>
> +        <DIGEST-TYPE>: <DIGEST-VALUE>
> +        ...
> +    Multiple digest types are supported in the same part, in which case they
> +    are all validated.
> +    This currently only supports bundle10, but is intended to support bundle20
> +    eventually.
> +    """

Instead of an order-dependent format, how do you feel about the HTTP
header style "K: V" format for all metadata? Or is the bundle2 style of
expanding functionality to introduce a new part with similar but
enhanced functionality?

Do you think we should make room for the host fingerprint here so
clients following URLs won't see "unknown fingerprint" warnings? There
is conceptually a security risk here, but I think if a client trusts the
remote that is handing out this data, it can temporarily trust
fingerprints via proxy of trust.

> diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-bundle2-import.t

> +Start a simple HTTP server to serve bundles (stolen from test-static-http.t)
> +
> +  $ cat > dumb.py <<EOF
> +  > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys
> +  > 
> +  > def run(server_class=BaseHTTPServer.HTTPServer,
> +  >         handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler):
> +  >     server_address = ('localhost', int(os.environ['HGPORT']))
> +  >     httpd = server_class(server_address, handler_class)
> +  >     httpd.serve_forever()
> +  > 
> +  > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0))
> +  > fp = file('dumb.pid', 'wb')
> +  > fp.write(str(os.getpid()) + '\n')
> +  > fp.close()
> +  > run()
> +  > EOF
> +  $ python dumb.py 2>/dev/null &
> +
> +Cannot just read $!, it will not be set to the right value on Windows/MinGW
> +
> +  $ cat > wait.py <<EOF
> +  > import time
> +  > while True:
> +  >     try:
> +  >         if '\n' in file('dumb.pid', 'rb').read():
> +  >             break
> +  >     except IOError:
> +  >         pass
> +  >     time.sleep(0.2)
> +  > EOF
> +  $ python wait.py
> +  $ cat dumb.pid >> $DAEMON_PIDS

Why can't dumb.py append to the os.environ['DAEMON_PIDS'] file directly?

> +Test a clone with a single import-bundle.
> +
> +  $ hg bundle -R repo -a bundle.hg
> +  8 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg
> +  > EOF
> +  $ hg clone ssh://user@dummy/repo clone
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 8 changesets with 7 changes to 7 files (+2 heads)
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved

This test doesn't actually verify import-bundle is used. We need some
kind of UI output or confirmation from the HTTP/bundle server to verify
the client is actually fetching a remote URL/bundle. This deficiency is
present in subsequent tests.

I'm not sure if we want the client to print import-bundle details by
default or if it should be hidden behind --debug. Large repositories
presumably generating many bundles could lead to console spam, which is
a point in --debug's favor. More on this later.

> +Test a clone with an import-bundle and a following changegroup
> +
> +  $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg
> +  5 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg
> +  > changegroup 0:4 5:7
> +  > EOF
> +  $ hg clone ssh://user@dummy/repo clone
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 5 changesets with 5 changes to 5 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 2 changes to 2 files (+1 heads)
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved

Pretending I'm an average user, this output is kinda verbose and
confusing. Maybe we *do* need some kind of normal output saying
"fetching changes from URL" to make this clearer. Still, I can't help
but think of the Firefox use case where clones are fetching 20+ bundles
(of 10k changesets each).

> +Corruption tests
> +
> +  $ hg clone orig clone -r 2
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
> +  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg
> +  > changegroup 0:6 7
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 1 changes to 1 files
> +  transaction abort!
> +  rollback completed
> +  abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted
> +  [255]

From an end-user perspective, it's not entirely clear what the state of
the local store is after this abort. Was the rollback full or partial?
Is there any way we could sneak a partial pull summary message in here?
(You'll probably want to do this as a separate patch. And knowing the
transaction code, this could be "fun" to implement.)

> +No content
> +Missing size
> +Size mismatch
> +Garbage data
> +Unknown digest
> +Not an HTTP url

The comprehensive testing of failure cases is quite nice!
Mike Hommey - Oct. 11, 2014, 10:27 p.m.
On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote:
> On 10/11/2014 1:38 AM, Mike Hommey wrote:
> > # HG changeset patch
> > # User Mike Hommey <mh@glandium.org>
> > # Date 1413016657 -32400
> > #      Sat Oct 11 17:37:37 2014 +0900
> > # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
> > # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
> > bundle2: add a part to import external bundles
> 
> Great patch. I can't wait to start utilizing this!
> 
> Comments inline.
> 
> > diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> > --- a/mercurial/bundle2.py
> > +++ b/mercurial/bundle2.py
> > +@parthandler('b2x:import-bundle')
> > +def handleimportbundle(op, inpart):
> > +    """apply a bundle on the repo, given an url and validation information
> > +
> > +    The part contains a bundle url, as well as hash digest(s) and size
> > +    information to validate the downloaded content.
> > +    The expected format for this part is:
> > +        <URL>
> > +        <SIZE in bytes>
> > +        <DIGEST-TYPE>: <DIGEST-VALUE>
> > +        ...
> > +    Multiple digest types are supported in the same part, in which case they
> > +    are all validated.
> > +    This currently only supports bundle10, but is intended to support bundle20
> > +    eventually.
> > +    """
> 
> Instead of an order-dependent format, how do you feel about the HTTP
> header style "K: V" format for all metadata? Or is the bundle2 style of
> expanding functionality to introduce a new part with similar but
> enhanced functionality?

I'm not sure there's much functionality to add to this particular part.
If you want some extra functionality, you're probably better covered
with a new part type.
 
> Do you think we should make room for the host fingerprint here so
> clients following URLs won't see "unknown fingerprint" warnings?
> There is conceptually a security risk here, but I think if a client
> trusts the remote that is handing out this data, it can temporarily
> trust fingerprints via proxy of trust.

What do you mean about "unknown fingerprint" warnings? Are you talking
about possible ssl certificates problems on the remote pointed at for
external bundles? Would you want the import-bundle part to possibly
contain certificate material so that the client could identify
self-signed certificates? How useful would that be in practice?
Note that the reason there are digests for the remote bundle is that
they may be served by foreign servers, which trust level is uncertain.
The client is already trusting the mercurial server with changegroups
it sends directly, so the same level of trust can be given to those
digests, but it would be unsafe to blindly trust the foreign server
giving the right bundles.
(Although, the digests are not mandatory, someone can very well
implement import-bundle parts without them, if they don't care)

> > diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t
> > new file mode 100644
> > --- /dev/null
> > +++ b/tests/test-bundle2-import.t
> 
> > +Start a simple HTTP server to serve bundles (stolen from test-static-http.t)
> > +
> > +  $ cat > dumb.py <<EOF
> > +  > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys
> > +  > 
> > +  > def run(server_class=BaseHTTPServer.HTTPServer,
> > +  >         handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler):
> > +  >     server_address = ('localhost', int(os.environ['HGPORT']))
> > +  >     httpd = server_class(server_address, handler_class)
> > +  >     httpd.serve_forever()
> > +  > 
> > +  > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0))
> > +  > fp = file('dumb.pid', 'wb')
> > +  > fp.write(str(os.getpid()) + '\n')
> > +  > fp.close()
> > +  > run()
> > +  > EOF
> > +  $ python dumb.py 2>/dev/null &
> > +
> > +Cannot just read $!, it will not be set to the right value on Windows/MinGW
> > +
> > +  $ cat > wait.py <<EOF
> > +  > import time
> > +  > while True:
> > +  >     try:
> > +  >         if '\n' in file('dumb.pid', 'rb').read():
> > +  >             break
> > +  >     except IOError:
> > +  >         pass
> > +  >     time.sleep(0.2)
> > +  > EOF
> > +  $ python wait.py
> > +  $ cat dumb.pid >> $DAEMON_PIDS
> 
> Why can't dumb.py append to the os.environ['DAEMON_PIDS'] file directly?

I just copied that whole thing from tests/test-static-http.t

> > +Test a clone with a single import-bundle.
> > +
> > +  $ hg bundle -R repo -a bundle.hg
> > +  8 changesets found
> > +  $ cat > repo/.hg/bundle2maker << EOF
> > +  > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg
> > +  > EOF
> > +  $ hg clone ssh://user@dummy/repo clone
> > +  requesting all changes
> > +  adding changesets
> > +  adding manifests
> > +  adding file changes
> > +  added 8 changesets with 7 changes to 7 files (+2 heads)
> > +  updating to branch default
> > +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> 
> This test doesn't actually verify import-bundle is used. We need some
> kind of UI output or confirmation from the HTTP/bundle server to verify
> the client is actually fetching a remote URL/bundle. This deficiency is
> present in subsequent tests.
> 
> I'm not sure if we want the client to print import-bundle details by
> default or if it should be hidden behind --debug. Large repositories
> presumably generating many bundles could lead to console spam, which is
> a point in --debug's favor. More on this later.
> 
> > +Test a clone with an import-bundle and a following changegroup
> > +
> > +  $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg
> > +  5 changesets found
> > +  $ cat > repo/.hg/bundle2maker << EOF
> > +  > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg
> > +  > changegroup 0:4 5:7
> > +  > EOF
> > +  $ hg clone ssh://user@dummy/repo clone
> > +  requesting all changes
> > +  adding changesets
> > +  adding manifests
> > +  adding file changes
> > +  added 5 changesets with 5 changes to 5 files (+1 heads)
> > +  adding changesets
> > +  adding manifests
> > +  adding file changes
> > +  added 3 changesets with 2 changes to 2 files (+1 heads)
> > +  updating to branch default
> > +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> 
> Pretending I'm an average user, this output is kinda verbose and
> confusing. Maybe we *do* need some kind of normal output saying
> "fetching changes from URL" to make this clearer. Still, I can't help
> but think of the Firefox use case where clones are fetching 20+ bundles
> (of 10k changesets each).

Considering the time it takes to clone Firefox, and how many bundles
might be involved, it might make sense to show their url as some kind of
progress meter that show that things are _not_ looping out of control.

> > +Corruption tests
> > +
> > +  $ hg clone orig clone -r 2
> > +  adding changesets
> > +  adding manifests
> > +  adding file changes
> > +  added 3 changesets with 3 changes to 3 files
> > +  updating to branch default
> > +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> > +
> > +  $ cat > repo/.hg/bundle2maker << EOF
> > +  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
> > +  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg
> > +  > changegroup 0:6 7
> > +  > EOF
> > +  $ hg pull -R clone ssh://user@dummy/repo
> > +  pulling from ssh://user@dummy/repo
> > +  searching for changes
> > +  adding changesets
> > +  adding manifests
> > +  adding file changes
> > +  added 2 changesets with 2 changes to 2 files (+1 heads)
> > +  adding changesets
> > +  adding manifests
> > +  adding file changes
> > +  added 2 changesets with 1 changes to 1 files
> > +  transaction abort!
> > +  rollback completed
> > +  abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted
> > +  [255]
> 
> From an end-user perspective, it's not entirely clear what the state of
> the local store is after this abort. Was the rollback full or partial?

Full. There will be a possibility of partial roolbacks in the future,
with checkpoints (which, in fact, if we could get them in 3.2, that
would be awesome)

> Is there any way we could sneak a partial pull summary message in here?
> (You'll probably want to do this as a separate patch. And knowing the
> transaction code, this could be "fun" to implement.)

I think all the UX part can be a separate patch, possibly even landing
well after the functionality itself (since bundle2 is still going to be
experimental)

Mike
Mads Kiilerich - Oct. 11, 2014, 11:54 p.m.
On 10/11/2014 10:38 AM, Mike Hommey wrote:
> # HG changeset patch
> # User Mike Hommey <mh@glandium.org>
> # Date 1413016657 -32400
> #      Sat Oct 11 17:37:37 2014 +0900
> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
> # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
> bundle2: add a part to import external bundles

It took me some time to figure out what this was all about.

What do "a part" mean here? "Functionality"/"support"/"infrastructure"? 
Or more like "multi part"?

It also seems essential that it is client side support only - that 
deserves mentioning in the short summary in the first line.

Perhaps:
bundle2: client side support for bundles containing references to other 
bundles

> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> --- a/mercurial/bundle2.py
> +++ b/mercurial/bundle2.py

> +class digester(object):
> +    """helper to compute multiple digests for the same data."""
> +
> +    DIGESTS = {
> +        'md5': util.md5,

Why care about md5 at all?

Why support multiple digests? That seems like unnecessary complexity. 
The rest of Mercurial might be moving away from sha1 one day, but then 
this should be handled like everything else.

Digests and validation could perhaps be put in a different patch than 
the multi bundle handling itself so we got smaller parts that were 
easier to review.

>   capabilities = {'HG2X': (),
>                   'b2x:listkeys': (),
>                   'b2x:pushkey': (),
>                   'b2x:changegroup': (),
> +                'digests': tuple(digester.DIGESTS.keys()),

As mentioned, I don't see the point in supporting multiple digests, but 
I would expect these references to external bundles requires a new 
capability flag?

> +@parthandler('b2x:import-bundle')
> +def handleimportbundle(op, inpart):
> +    """apply a bundle on the repo, given an url and validation information
> +
> +    The part contains a bundle url, as well as hash digest(s) and size
> +    information to validate the downloaded content.
> +    The expected format for this part is:
> +        <URL>
> +        <SIZE in bytes>
> +        <DIGEST-TYPE>: <DIGEST-VALUE>
> +        ...
> +    Multiple digest types are supported in the same part, in which case they
> +    are all validated.
> +    This currently only supports bundle10, but is intended to support bundle20
> +    eventually.
> +    """
> +    lines = inpart.read().splitlines()
> +    try:
> +        url = lines.pop(0)
> +    except IndexError:
> +        raise util.Abort(_('import-bundle format error'))
> +    parsed_url = urlparse.urlparse(url)
> +    if parsed_url.scheme not in ('http', 'https'):
> +        raise util.Abort(_('import-bundle only supports http/https urls'))
> +
> +    try:
> +        size = int(lines.pop(0))
> +    except (ValueError, IndexError):
> +        raise util.Abort(_('import-bundle format error'))
> +    digests = {}
> +    for line in lines:
> +        if ':' not in line:
> +            raise util.Abort(_('import-bundle format error'))
> +        k, v = line.split(':', 1)
> +        digests[k.strip()] = v.strip()
> +
> +    real_part = digestchecker(urllib2.urlopen(url), size, digests)

How will this work with authentication and server certificate validation 
etc? I would expect it should use (and perhaps extend) the existing URL 
abstraction layers in url.py and elsewhere.

I also wonder why it only supports http(s). I would expect the most 
common case would be to get all bundles from the same location. Other 
protocols can of course be added later, but it seems wrong to have any 
protocol knowledge at all at this level.

Breaking bundles up in multiple parts do of course raise the question of 
whether the reference between them should use "relative" or "absolute" 
"paths". There are pros and cons, but I think I would prefer relative in 
most cases.

> +
> +    # Make sure we trigger a transaction creation
> +    #
> +    # The addchangegroup function will get a transaction object by itself, but
> +    # we need to make sure we trigger the creation of a transaction object used
> +    # for the whole processing scope.
> +    op.gettransaction()
> +    import exchange
> +    cg = exchange.readbundle(op.repo.ui, real_part, None)
> +    if isinstance(cg, unbundle20):
> +        raise util.Abort(_('import-bundle only supports bundle10'))

Why is that? (Might deserve a comment.)

It seems wrong to have an architecture where we don't just automagically 
support all bundle formats.

> +    ret = changegroup.addchangegroup(op.repo, cg, 'bundle2', 'bundle2')
> +    op.records.add('import-bundle', {'return': ret})
> +    if op.reply is not None:
> +        # This is definitly not the final form of this
> +        # return. But one need to start somewhere.
> +        part = op.reply.newpart('b2x:reply:import-bundle')
> +        part.addparam('in-reply-to', str(inpart.id), mandatory=False)
> +        part.addparam('return', '%i' % ret, mandatory=False)
> +    if not real_part.validate():
> +        raise util.Abort(_('bundle at %s is corrupted') % url)
> +    assert not inpart.read()
> +
>   @parthandler('b2x:reply:changegroup', ('return', 'in-reply-to'))
>   def handlereplychangegroup(op, inpart):
>       ret = int(inpart.params['return'])
>       replyto = int(inpart.params['in-reply-to'])
>       op.records.add('changegroup', {'return': ret}, replyto)
>   
>   @parthandler('b2x:check:heads')
>   def handlecheckheads(op, inpart):
>

/Mads
Mike Hommey - Oct. 12, 2014, 12:42 a.m.
On Sun, Oct 12, 2014 at 01:54:09AM +0200, Mads Kiilerich wrote:
> On 10/11/2014 10:38 AM, Mike Hommey wrote:
> ># HG changeset patch
> ># User Mike Hommey <mh@glandium.org>
> ># Date 1413016657 -32400
> >#      Sat Oct 11 17:37:37 2014 +0900
> ># Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
> ># Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
> >bundle2: add a part to import external bundles
> 
> It took me some time to figure out what this was all about.
> 
> What do "a part" mean here? "Functionality"/"support"/"infrastructure"? Or
> more like "multi part"?

bundle2 contains parts that are treated independently.

> It also seems essential that it is client side support only - that deserves
> mentioning in the short summary in the first line.
> 
> Perhaps:
> bundle2: client side support for bundles containing references to other
> bundles
> 
> >diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> >--- a/mercurial/bundle2.py
> >+++ b/mercurial/bundle2.py
> 
> >+class digester(object):
> >+    """helper to compute multiple digests for the same data."""
> >+
> >+    DIGESTS = {
> >+        'md5': util.md5,
> 
> Why care about md5 at all?
> 
> Why support multiple digests? That seems like unnecessary complexity. The
> rest of Mercurial might be moving away from sha1 one day, but then this
> should be handled like everything else.

Because both md5 and sha1 are already weakened and they are the only
digests we can support as long as mercurial sticks to supporting python
2.4. The bundles don't /need/ to contain all digest types, but having
multiple ones when the only available ones are md5 and sha1 kind of
increases security (it's harder to create collisions for *both* digests
at the same time)

> Digests and validation could perhaps be put in a different patch than the
> multi bundle handling itself so we got smaller parts that were easier to
> review.
> 
> >  capabilities = {'HG2X': (),
> >                  'b2x:listkeys': (),
> >                  'b2x:pushkey': (),
> >                  'b2x:changegroup': (),
> >+                'digests': tuple(digester.DIGESTS.keys()),
> 
> As mentioned, I don't see the point in supporting multiple digests

The point here is that the client advertizes what digests it supports,
for the future case where it might support something else like sha512 or
sha-3. Note we could settle to use hashlib if the python running
mercurial supports it, adding support for other hashes, but that doesn't
have to be in this patch.

> but I would expect these references to external bundles requires a new
> capability flag?

that's an omission I noticed already. cf.
http://www.selenic.com/pipermail/mercurial-devel/2014-October/062507.html

> >+@parthandler('b2x:import-bundle')
> >+def handleimportbundle(op, inpart):
> >+    """apply a bundle on the repo, given an url and validation information
> >+
> >+    The part contains a bundle url, as well as hash digest(s) and size
> >+    information to validate the downloaded content.
> >+    The expected format for this part is:
> >+        <URL>
> >+        <SIZE in bytes>
> >+        <DIGEST-TYPE>: <DIGEST-VALUE>
> >+        ...
> >+    Multiple digest types are supported in the same part, in which case they
> >+    are all validated.
> >+    This currently only supports bundle10, but is intended to support bundle20
> >+    eventually.
> >+    """
> >+    lines = inpart.read().splitlines()
> >+    try:
> >+        url = lines.pop(0)
> >+    except IndexError:
> >+        raise util.Abort(_('import-bundle format error'))
> >+    parsed_url = urlparse.urlparse(url)
> >+    if parsed_url.scheme not in ('http', 'https'):
> >+        raise util.Abort(_('import-bundle only supports http/https urls'))
> >+
> >+    try:
> >+        size = int(lines.pop(0))
> >+    except (ValueError, IndexError):
> >+        raise util.Abort(_('import-bundle format error'))
> >+    digests = {}
> >+    for line in lines:
> >+        if ':' not in line:
> >+            raise util.Abort(_('import-bundle format error'))
> >+        k, v = line.split(':', 1)
> >+        digests[k.strip()] = v.strip()
> >+
> >+    real_part = digestchecker(urllib2.urlopen(url), size, digests)
> 
> How will this work with authentication and server certificate validation
> etc? I would expect it should use (and perhaps extend) the existing URL
> abstraction layers in url.py and elsewhere.

I thought mercurial's url opener was "installed" as the urllib handler.
Isn't it the case?

> I also wonder why it only supports http(s). I would expect the most common
> case would be to get all bundles from the same location.

You're assuming those references would come in a bundle stored on some
server, when the use-case behind this is that you connect to a mercurial
server for a pull or a clone, and get those references from there. Those
references would *not* point to the same server. Also, at the point
where the bundle data is handled, there is not access to where it came
from (afaik)

> Other protocols can
> of course be added later, but it seems wrong to have any protocol knowledge
> at all at this level.

I'm not sure. The first reason I limited the protocols is that it didn't
seem reasonable for a random request to a mercurial server to tell you
to connect to some ssh server or use a file on your own local file
system.

> Breaking bundles up in multiple parts do of course raise the question of
> whether the reference between them should use "relative" or "absolute"
> "paths". There are pros and cons, but I think I would prefer relative in
> most cases.
> 
> >+
> >+    # Make sure we trigger a transaction creation
> >+    #
> >+    # The addchangegroup function will get a transaction object by itself, but
> >+    # we need to make sure we trigger the creation of a transaction object used
> >+    # for the whole processing scope.
> >+    op.gettransaction()
> >+    import exchange
> >+    cg = exchange.readbundle(op.repo.ui, real_part, None)
> >+    if isinstance(cg, unbundle20):
> >+        raise util.Abort(_('import-bundle only supports bundle10'))
> 
> Why is that? (Might deserve a comment.)

The comment is there in the function description, but it's not "exposed" to the
user.

> It seems wrong to have an architecture where we don't just automagically
> support all bundle formats.

The intent *is* to support all bundle formats. It requires more work to
support bundle2 bundles. Also, bundle2 bundles don't currently exist
outside of push/pull/clone. There is no bundle/unbundle support for
creating and unbundling them yet.

Mike
Pierre-Yves David - Oct. 12, 2014, 1:17 a.m.
On 10/11/2014 01:38 AM, Mike Hommey wrote:
> # HG changeset patch
> # User Mike Hommey <mh@glandium.org>
> # Date 1413016657 -32400
> #      Sat Oct 11 17:37:37 2014 +0900
> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
> # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
> bundle2: add a part to import external bundles
>
> Bundle2 opens doors to advanced features allowing to reduce load on
> mercurial servers, and improve clone experience for users on unstable or
> slow networks.
>
> For instance, it could be possible to pre-generate a bundle of a
> repository, and give a pointer to it to clients cloning the repository,
> followed by another changegroup with the remainder. For significantly
> big repositories, this could come as several base bundles with e.g. 10k
> changesets, which, combined with checkpoints (not part of this change),
> would prevent users with flaky networks from starting over any time
> their connection fails.
>
> While the server-side support for those features doesn't exist yet, it
> is preferable to have client-side support for this early-on.
>
> Note, I'm not sure the exchange.py change is right. Also, I'm not sure what
> pullop.fetch means. Is it possible to have pullop.fetch set, but to have e.g.
> only bookmark updates? In that case, the exchange.py change might be right.
>
> I'm also unsure if it makes sense to do so many tests. Clone and pull are not
> so different from each other, and the different test scenarios are more
> testing the multi-part infrastructure than the feature itself. It however
> tests all the use-cases I do plan to use.
>
> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> --- a/mercurial/bundle2.py
> +++ b/mercurial/bundle2.py
> @@ -143,16 +143,18 @@ preserve.
>   """
>
>   import util
>   import struct
>   import urllib
>   import string
>   import obsolete
>   import pushkey
> +import urllib2
> +import urlparse
>
>   import changegroup, error
>   from i18n import _
>
>   _pack = struct.pack
>   _unpack = struct.unpack
>
>   _magicstring = 'HG2X'
> @@ -771,20 +773,81 @@ class unbundlepart(unpackermixin):
>           if size is None:
>               data = self._payloadstream.read()
>           else:
>               data = self._payloadstream.read(size)
>           if size is None or len(data) < size:
>               self.consumed = True
>           return data
>
> +class digester(object):
> +    """helper to compute multiple digests for the same data."""
> +
> +    DIGESTS = {
> +        'md5': util.md5,
> +        'sha1': util.sha1,
> +    }
> +
> +    def __init__(self, digests, s=''):
> +        self._hashes = {}
> +        for k in digests:
> +            if k not in self.DIGESTS:
> +                raise util.Abort(_('unknown digest type: %s') % k)
> +            self._hashes[k] = self.DIGESTS[k]()
> +        if s:
> +            self.update(s)
> +
> +    def update(self, data):
> +        for h in self._hashes.values():
> +            h.update(data)
> +
> +    def __getitem__(self, key):
> +        if key not in self.DIGESTS:
> +            raise util.Abort(_('unknown digest type: %s') % k)
> +        return self._hashes[key].hexdigest()
> +
> +    def __iter__(self):
> +        return iter(self._hashes)
> +
> +

Move the class in another patch to make this one lighter. Another module 
maybe be more appropriate too.

> +class digestchecker(object):
> +    """file handle wrapper that additionally checks content against a given
> +    size and hash digests.
> +
> +        d = digestchecker(fh, size, {'md5': '...'})
> +
> +    When multiple digests are given, all of them are verified.
> +    """
> +
> +    def __init__(self, fh, size, digests):
> +        self._fh = fh
> +        self._size = size
> +        self._digests = dict(digests)
> +        self._digester = digester(self._digests.keys())
> +
> +    def read(self, length=-1):
> +        content = self._fh.read(length)
> +        self._digester.update(content)
> +        self._size -= len(content)
> +        return content
> +
> +    def validate(self):
> +        if self._size != 0:
> +            return False
> +        for k, v in self._digests.items():
> +            if v != self._digester[k]:
> +                return False
> +        return True
> +
> +

Same here, move the class in another patch to make this one lighter. 
Another module maybe be more appropriate too.


>   capabilities = {'HG2X': (),
>                   'b2x:listkeys': (),
>                   'b2x:pushkey': (),
>                   'b2x:changegroup': (),
> +                'digests': tuple(digester.DIGESTS.keys()),

You need to sort this otherwise this the order will be unstable. Also, 
this may be more appropriate for capabilities in general (and not just 
bundle2 capabilities) but this area is a but fuzzy.

>                  }
>
>   def getrepocaps(repo):
>       """return the bundle2 capabilities for a given repo
>
>       Exists to allow extensions (like evolution) to mutate the capabilities.
>       """
>       caps = capabilities.copy()
> @@ -826,16 +889,76 @@ def handlechangegroup(op, inpart):
>       if op.reply is not None:
>           # This is definitly not the final form of this
>           # return. But one need to start somewhere.
>           part = op.reply.newpart('b2x:reply:changegroup')
>           part.addparam('in-reply-to', str(inpart.id), mandatory=False)
>           part.addparam('return', '%i' % ret, mandatory=False)
>       assert not inpart.read()
>
> +@parthandler('b2x:import-bundle')
> +def handleimportbundle(op, inpart):
> +    """apply a bundle on the repo, given an url and validation information
> +
> +    The part contains a bundle url, as well as hash digest(s) and size
> +    information to validate the downloaded content.
> +    The expected format for this part is:
> +        <URL>
> +        <SIZE in bytes>
> +        <DIGEST-TYPE>: <DIGEST-VALUE>

Part can have parameters. You should use such parameter instead of 
encoding the information into the part payload.

> +        ...
> +    Multiple digest types are supported in the same part, in which case they
> +    are all validated.
> +    This currently only supports bundle10, but is intended to support bundle20
> +    eventually.
> +    """

Lets play simple and call this b2x:remote-changegroup and support 
changegroup for now. We will need something wider in the future but this 
will have to be more complicated.

> +    lines = inpart.read().splitlines()
> +    try:
> +        url = lines.pop(0)
> +    except IndexError:
> +        raise util.Abort(_('import-bundle format error'))
> +    parsed_url = urlparse.urlparse(url)
> +    if parsed_url.scheme not in ('http', 'https'):
> +        raise util.Abort(_('import-bundle only supports http/https urls'))
> +
> +    try:
> +        size = int(lines.pop(0))
> +    except (ValueError, IndexError):
> +        raise util.Abort(_('import-bundle format error'))
> +    digests = {}
> +    for line in lines:
> +        if ':' not in line:
> +            raise util.Abort(_('import-bundle format error'))
> +        k, v = line.split(':', 1)
> +        digests[k.strip()] = v.strip()

As pointed before, all this code can be replaced by something using 
parameter.

> +    real_part = digestchecker(urllib2.urlopen(url), size, digests)
> +
> +    # Make sure we trigger a transaction creation
> +    #
> +    # The addchangegroup function will get a transaction object by itself, but
> +    # we need to make sure we trigger the creation of a transaction object used
> +    # for the whole processing scope.
> +    op.gettransaction()
> +    import exchange
> +    cg = exchange.readbundle(op.repo.ui, real_part, None)

If we focus on changegroup only, we can use the changegroup module 
directly. Avoiding dancing around circular import issue.

> +    if isinstance(cg, unbundle20):
> +        raise util.Abort(_('import-bundle only supports bundle10'))

You should do the opposite (explicitly checking it is a bundle10. 
otherwise bundle30 bypass this. (we do not have bundle30, but this is a 
philosophical point, use while list, not blacklist)

> +    ret = changegroup.addchangegroup(op.repo, cg, 'bundle2', 'bundle2')
> +    op.records.add('import-bundle', {'return': ret})

This part is wrong (and is why we should focus on changegroup only at 
the beginning).

You are adding a changegroup so you should record a "changegroup" 
addition. This "changegroup" record is used by pull to report what have 
been added to the repository (1 new head messages etc).

This also mean that you have to patch the exchange code to properly 
merge multiple "changegroup" reply into a valid result.

If we were to support all of bundle2, we would need to merge all the 
record from the subunbundling into the current unbundling operation.


> +    if op.reply is not None:
> +        # This is definitly not the final form of this
> +        # return. But one need to start somewhere.
> +        part = op.reply.newpart('b2x:reply:import-bundle')
> +        part.addparam('in-reply-to', str(inpart.id), mandatory=False)
> +        part.addparam('return', '%i' % ret, mandatory=False)
> +    if not real_part.validate():
> +        raise util.Abort(_('bundle at %s is corrupted') % url)

We probably want a more verbose message.

> +    assert not inpart.read()
> +
>   @parthandler('b2x:reply:changegroup', ('return', 'in-reply-to'))
>   def handlereplychangegroup(op, inpart):
>       ret = int(inpart.params['return'])
>       replyto = int(inpart.params['in-reply-to'])
>       op.records.add('changegroup', {'return': ret}, replyto)
>
>   @parthandler('b2x:check:heads')
>   def handlecheckheads(op, inpart):
> diff --git a/mercurial/exchange.py b/mercurial/exchange.py
> --- a/mercurial/exchange.py
> +++ b/mercurial/exchange.py
> @@ -908,17 +908,17 @@ def _pullbundle2(pullop):
>       if kwargs.keys() == ['format']:
>           return # nothing to pull
>       bundle = pullop.remote.getbundle('pull', **kwargs)
>       try:
>           op = bundle2.processbundle(pullop.repo, bundle, pullop.gettransaction)
>       except error.BundleValueError, exc:
>           raise util.Abort('missing support for %s' % exc)
>
> -    if pullop.fetch:
> +    if pullop.fetch and op.records['changegroup']:

As all part will be using changegroup, this can be reverted

>           assert len(op.records['changegroup']) == 1
>           pullop.cgresult = op.records['changegroup'][0]['return']

This is the part that requires an upgrade to handle multiple changegroup.

>
>       # processing phases change
>       for namespace, value in op.records['listkeys']:
>           if namespace == 'phases':
>               _pullapplyphases(pullop, value)
>
> diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-bundle2-import.t
> @@ -0,0 +1,535 @@
> +#require killdaemons
> +
> +Create an extension to test bundle2 import-bundle parts
> +
> +  $ cat > bundle2.py << EOF
> +  > """A small extension to test bundle2 import-bundle parts.
> +  >
> +  > Current bundle2 implementation doesn't provide a way to generate those
> +  > parts, so they must be created by extensions.
> +  > """
> +  > from mercurial import bundle2, changegroup, commands, exchange
> +  > from mercurial import extensions, scmutil
> +  > import os
> +  >
> +  > def _getbundlechangegrouppart(bundler, repo, source, bundlecaps=None,
> +  >                               b2caps=None, heads=None, common=None,
> +  >                               **kwargs):

This could use some documentation. It is fairly hard to follow what is 
happening here.

> +  >     for line in open(repo.join('bundle2maker'), 'r'):
> +  >         line = line.strip()
> +  >         try:
> +  >             verb, args = line.split(None, 1)
> +  >         except ValueError:
> +  >             verb, args = line, ''
> +  >         if verb == 'import-bundle':
> +  >            url, file = args.split()
> +  >            bundledata = open(file, 'rb').read()
> +  >            data = '%s\n%d\n' % (url, len(bundledata))
> +  >            d = bundle2.digester(b2caps['digests'], bundledata)
> +  >            for k in b2caps['digests']:
> +  >                data += '%s: %s\n' % (k, d[k])
> +  >            bundler.newpart('b2x:import-bundle', data=data)
> +  >         elif verb == 'raw-import-bundle':
> +  >            bundler.newpart('b2x:import-bundle',
> +  >                data=args.decode('string-escape'))
> +  >         elif verb == 'changegroup':
> +  >             _common, heads = args.split()
> +  >             common.extend(repo.lookup(r) for r in repo.revs(_common))
> +  >             heads = [repo.lookup(r) for r in repo.revs(heads)]
> +  >             cg = changegroup.getchangegroup(repo, 'changegroup',
> +  >                 heads=heads, common=common)
> +  >             bundler.newpart('b2x:changegroup', data=cg.getchunks())
> +  >         else:
> +  >             raise Exception('unknown verb')
> +  >
> +  > exchange.getbundle2partsmapping['changegroup'] = _getbundlechangegrouppart
> +  > EOF
> +
> +Start a simple HTTP server to serve bundles (stolen from test-static-http.t)

maybe it is time to promote dumb.py into an explicit dumbhttp.py in the 
tests directory.

> +
> +  $ cat > dumb.py <<EOF
> +  > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys
> +  >
> +  > def run(server_class=BaseHTTPServer.HTTPServer,
> +  >         handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler):
> +  >     server_address = ('localhost', int(os.environ['HGPORT']))
> +  >     httpd = server_class(server_address, handler_class)
> +  >     httpd.serve_forever()
> +  >
> +  > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0))
> +  > fp = file('dumb.pid', 'wb')
> +  > fp.write(str(os.getpid()) + '\n')
> +  > fp.close()
> +  > run()
> +  > EOF
> +  $ python dumb.py 2>/dev/null &
> +
> +Cannot just read $!, it will not be set to the right value on Windows/MinGW
> +
> +  $ cat > wait.py <<EOF
> +  > import time
> +  > while True:
> +  >     try:
> +  >         if '\n' in file('dumb.pid', 'rb').read():
> +  >             break
> +  >     except IOError:
> +  >         pass
> +  >     time.sleep(0.2)
> +  > EOF
> +  $ python wait.py
> +  $ cat dumb.pid >> $DAEMON_PIDS
> +
> +  $ cat >> $HGRCPATH << EOF
> +  > [experimental]
> +  > bundle2-exp=True
> +  > [ui]
> +  > ssh=python "$TESTDIR/dummyssh"
> +  > logtemplate={rev}:{node|short} {phase} {author} {bookmarks} {desc|firstline}
> +  > EOF
> +
> +  $ hg init repo
> +
> +  $ hg -R repo unbundle $TESTDIR/bundles/rebase.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 8 changesets with 7 changes to 7 files (+2 heads)
> +  (run 'hg heads' to see heads, 'hg merge' to merge)

You should probably use a simpler bundle or create some changesets yourself.

> +
> +  $ hg -R repo log -G
> +  o  7:02de42196ebe draft Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a draft Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c draft Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc draft Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | o  2:5fddd98957c8 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ hg clone repo orig
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +  $ cat > repo/.hg/hgrc << EOF
> +  > [extensions]
> +  > bundle2=$TESTTMP/bundle2.py
> +  > EOF
> +
> +Test a clone with a single import-bundle.
> +
> +  $ hg bundle -R repo -a bundle.hg
> +  8 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg
> +  > EOF
> +  $ hg clone ssh://user@dummy/repo clone
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 8 changesets with 7 changes to 7 files (+2 heads)
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg -R clone log -G
> +  @  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Test a clone with an import-bundle and a following changegroup
> +
> +  $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg
> +  5 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg
> +  > changegroup 0:4 5:7
> +  > EOF
> +  $ hg clone ssh://user@dummy/repo clone
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 5 changesets with 5 changes to 5 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 2 changes to 2 files (+1 heads)
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg -R clone log -G
> +  @  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Test a clone with a changegroup followed by an import-bundle
> +
> +  $ hg bundle -R repo --base '0:4' -r '5:7' bundle3.hg
> +  3 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > changegroup 000000000000 :4
> +  > import-bundle http://localhost:$HGPORT/bundle3.hg bundle3.hg
> +  > EOF
> +  $ hg clone ssh://user@dummy/repo clone
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 5 changesets with 5 changes to 5 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 2 changes to 2 files (+1 heads)
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg -R clone log -G
> +  @  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Test a pull with an import-bundle
> +
> +  $ hg bundle -R repo --base '0:4' -r '5:7' bundle4.hg
> +  3 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle4.hg bundle4.hg
> +  > EOF
> +  $ hg clone orig clone -r 3 -r 4
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 5 changesets with 5 changes to 5 files (+1 heads)
> +  updating to branch default
> +  4 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 2 changes to 2 files (+1 heads)
> +  (run 'hg update' to get a working copy)
> +  $ hg -R clone log -G
> +  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | @  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Test a pull with an import-bundle and a following changegroup
> +
> +  $ hg bundle -R repo --base 2 -r '3:4' bundle5.hg
> +  2 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle5.hg bundle5.hg
> +  > changegroup 0:4 5:7
> +  > EOF
> +  $ hg clone orig clone -r 2
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 2 changes to 2 files (+1 heads)
> +  (run 'hg heads .' to see heads, 'hg merge' to merge)
> +  $ hg -R clone log -G
> +  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Test a pull with a changegroup followed by an import-bundle
> +
> +  $ hg bundle -R repo --base '0:4' -r '5:7' bundle6.hg
> +  3 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > changegroup 000000000000 :4
> +  > import-bundle http://localhost:$HGPORT/bundle6.hg bundle6.hg
> +  > EOF
> +  $ hg clone orig clone -r 2
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 2 changes to 2 files (+1 heads)
> +  (run 'hg heads .' to see heads, 'hg merge' to merge)
> +  $ hg -R clone log -G
> +  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Test a pull with two import-bundles and a changegroup
> +
> +  $ hg bundle -R repo --base 2 -r '3:4' bundle7.hg
> +  2 changesets found
> +  $ hg bundle -R repo --base '3:4' -r '5:6' bundle8.hg
> +  2 changesets found
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
> +  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle8.hg
> +  > changegroup 0:6 7
> +  > EOF
> +  $ hg clone orig clone -r 2
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 1 changes to 1 files
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 1 changesets with 1 changes to 1 files (+1 heads)
> +  (run 'hg heads .' to see heads, 'hg merge' to merge)
> +  $ hg -R clone log -G
> +  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
> +  |
> +  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
> +  |/|
> +  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
> +  | |
> +  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
> +  |/
> +  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
> +  | |
> +  | @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  | |
> +  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |/
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +Corruption tests
> +
> +  $ hg clone orig clone -r 2
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
> +  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg
> +  > changegroup 0:6 7
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files (+1 heads)
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 1 changes to 1 files
> +  transaction abort!
> +  rollback completed
> +  abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted
> +  [255]
> +
> +No content
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > raw-import-bundle
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  abort: import-bundle format error
> +  [255]
> +
> +Missing size
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  abort: import-bundle format error
> +  [255]
> +
> +Size mismatch
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n42
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files (+1 heads)
> +  transaction abort!
> +  rollback completed
> +  abort: bundle at http://localhost:$HGPORT/bundle7.hg is corrupted
> +  [255]
> +
> +Garbage data
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\ngarbage
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  abort: import-bundle format error
> +  [255]
> +
> +Unknown digest
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\nfoo: bar
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  abort: unknown digest type: foo
> +  [255]
> +
> +Not an HTTP url
> +
> +  $ cat > repo/.hg/bundle2maker << EOF
> +  > raw-import-bundle ssh://localhost:$HGPORT/bundle7.hg\n581
> +  > EOF
> +  $ hg pull -R clone ssh://user@dummy/repo
> +  pulling from ssh://user@dummy/repo
> +  searching for changes
> +  abort: import-bundle only supports http/https urls
> +  [255]
> +
> +  $ hg -R clone log -G
> +  @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
> +  |
> +  o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
> +  |
> +  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
> +
> +  $ rm -rf clone
> +
> +  $ "$TESTDIR/killdaemons.py" $DAEMON_PIDS
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
Pierre-Yves David - Oct. 12, 2014, 1:26 a.m.
On 10/11/2014 03:27 PM, Mike Hommey wrote:
> On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote:
>> On 10/11/2014 1:38 AM, Mike Hommey wrote:
>>> # HG changeset patch
>>> # User Mike Hommey <mh@glandium.org>
>>> # Date 1413016657 -32400
>>> #      Sat Oct 11 17:37:37 2014 +0900
>>> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
>>> # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
>>> bundle2: add a part to import external bundles
>>
>> Great patch. I can't wait to start utilizing this!
>>
>> Comments inline.
>>
>>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
>>> --- a/mercurial/bundle2.py
>>> +++ b/mercurial/bundle2.py
>>> +@parthandler('b2x:import-bundle')
>>> +def handleimportbundle(op, inpart):
>>> +    """apply a bundle on the repo, given an url and validation information
>>> +
>>> +    The part contains a bundle url, as well as hash digest(s) and size
>>> +    information to validate the downloaded content.
>>> +    The expected format for this part is:
>>> +        <URL>
>>> +        <SIZE in bytes>
>>> +        <DIGEST-TYPE>: <DIGEST-VALUE>
>>> +        ...
>>> +    Multiple digest types are supported in the same part, in which case they
>>> +    are all validated.
>>> +    This currently only supports bundle10, but is intended to support bundle20
>>> +    eventually.
>>> +    """
>>
>> Instead of an order-dependent format, how do you feel about the HTTP
>> header style "K: V" format for all metadata? Or is the bundle2 style of
>> expanding functionality to introduce a new part with similar but
>> enhanced functionality?
>
> I'm not sure there's much functionality to add to this particular part.
> If you want some extra functionality, you're probably better covered
> with a new part type.

cf my other email about using part parameters instead.
 > […]
>>> +Start a simple HTTP server to serve bundles (stolen from test-static-http.t)
>>> +
>>> +  $ cat > dumb.py <<EOF
>>> +  > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys
>>> +  >
>>> +  > def run(server_class=BaseHTTPServer.HTTPServer,
>>> +  >         handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler):
>>> +  >     server_address = ('localhost', int(os.environ['HGPORT']))
>>> +  >     httpd = server_class(server_address, handler_class)
>>> +  >     httpd.serve_forever()
>>> +  >
>>> +  > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0))
>>> +  > fp = file('dumb.pid', 'wb')
>>> +  > fp.write(str(os.getpid()) + '\n')
>>> +  > fp.close()
>>> +  > run()
>>> +  > EOF
>>> +  $ python dumb.py 2>/dev/null &
>>> +
>>> +Cannot just read $!, it will not be set to the right value on Windows/MinGW
>>> +
>>> +  $ cat > wait.py <<EOF
>>> +  > import time
>>> +  > while True:
>>> +  >     try:
>>> +  >         if '\n' in file('dumb.pid', 'rb').read():
>>> +  >             break
>>> +  >     except IOError:
>>> +  >         pass
>>> +  >     time.sleep(0.2)
>>> +  > EOF
>>> +  $ python wait.py
>>> +  $ cat dumb.pid >> $DAEMON_PIDS
>>
>> Why can't dumb.py append to the os.environ['DAEMON_PIDS'] file directly?
>
> I just copied that whole thing from tests/test-static-http.t

cf other email about promoting that to a real script (if applicable)

>
>>> +Test a clone with a single import-bundle.
>>> +
>>> +  $ hg bundle -R repo -a bundle.hg
>>> +  8 changesets found
>>> +  $ cat > repo/.hg/bundle2maker << EOF
>>> +  > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg
>>> +  > EOF
>>> +  $ hg clone ssh://user@dummy/repo clone
>>> +  requesting all changes
>>> +  adding changesets
>>> +  adding manifests
>>> +  adding file changes
>>> +  added 8 changesets with 7 changes to 7 files (+2 heads)
>>> +  updating to branch default
>>> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>
>> This test doesn't actually verify import-bundle is used. We need some
>> kind of UI output or confirmation from the HTTP/bundle server to verify
>> the client is actually fetching a remote URL/bundle. This deficiency is
>> present in subsequent tests.
>>
>> I'm not sure if we want the client to print import-bundle details by
>> default or if it should be hidden behind --debug. Large repositories
>> presumably generating many bundles could lead to console spam, which is
>> a point in --debug's favor. More on this later.

I believe that a --debug or --verbose out would be enough (and useful in 
the wild)

>>> +Test a clone with an import-bundle and a following changegroup
>>> +
>>> +  $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg
>>> +  5 changesets found
>>> +  $ cat > repo/.hg/bundle2maker << EOF
>>> +  > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg
>>> +  > changegroup 0:4 5:7
>>> +  > EOF
>>> +  $ hg clone ssh://user@dummy/repo clone
>>> +  requesting all changes
>>> +  adding changesets
>>> +  adding manifests
>>> +  adding file changes
>>> +  added 5 changesets with 5 changes to 5 files (+1 heads)
>>> +  adding changesets
>>> +  adding manifests
>>> +  adding file changes
>>> +  added 3 changesets with 2 changes to 2 files (+1 heads)
>>> +  updating to branch default
>>> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>
>> Pretending I'm an average user, this output is kinda verbose and
>> confusing. Maybe we *do* need some kind of normal output saying
>> "fetching changes from URL" to make this clearer. Still, I can't help
>> but think of the Firefox use case where clones are fetching 20+ bundles
>> (of 10k changesets each).
>
> Considering the time it takes to clone Firefox, and how many bundles
> might be involved, it might make sense to show their url as some kind of
> progress meter that show that things are _not_ looping out of control.

We need an overall rework of exchange output since this is going to be 
an issue for kind of data. My bet is that the server will have to add 
some parts at the beginning of the bundle with hint of its whole 
content. We can also have the server sense progress and output related 
information.

In our very case. I thing that the server can insert a part that output 
"prefetching X of Y changesets from CDN" right before this part.


>
>>> +Corruption tests
>>> +
>>> +  $ hg clone orig clone -r 2
>>> +  adding changesets
>>> +  adding manifests
>>> +  adding file changes
>>> +  added 3 changesets with 3 changes to 3 files
>>> +  updating to branch default
>>> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>> +
>>> +  $ cat > repo/.hg/bundle2maker << EOF
>>> +  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
>>> +  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg
>>> +  > changegroup 0:6 7
>>> +  > EOF
>>> +  $ hg pull -R clone ssh://user@dummy/repo
>>> +  pulling from ssh://user@dummy/repo
>>> +  searching for changes
>>> +  adding changesets
>>> +  adding manifests
>>> +  adding file changes
>>> +  added 2 changesets with 2 changes to 2 files (+1 heads)
>>> +  adding changesets
>>> +  adding manifests
>>> +  adding file changes
>>> +  added 2 changesets with 1 changes to 1 files
>>> +  transaction abort!
>>> +  rollback completed
>>> +  abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted
>>> +  [255]
>>
>>  From an end-user perspective, it's not entirely clear what the state of
>> the local store is after this abort. Was the rollback full or partial?
>
> Full. There will be a possibility of partial roolbacks in the future,
> with checkpoints (which, in fact, if we could get them in 3.2, that
> would be awesome)

We do not have partial pull now. And all failure are full for now.

There is no way such thing could happen for 3.2. I'll be happy to help 
someone to work on it during the 3.3 cycle.

>> Is there any way we could sneak a partial pull summary message in here?
>> (You'll probably want to do this as a separate patch. And knowing the
>> transaction code, this could be "fun" to implement.)
>
> I think all the UX part can be a separate patch, possibly even landing
> well after the functionality itself (since bundle2 is still going to be
> experimental)

I'm confused abotu this since we do not have partial pull
Pierre-Yves David - Oct. 12, 2014, 1:37 a.m.
On 10/11/2014 04:54 PM, Mads Kiilerich wrote:
> On 10/11/2014 10:38 AM, Mike Hommey wrote:
>> # HG changeset patch
>> # User Mike Hommey <mh@glandium.org>
>> # Date 1413016657 -32400
>> #      Sat Oct 11 17:37:37 2014 +0900
>> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
>> # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
>> bundle2: add a part to import external bundles
>
> It took me some time to figure out what this was all about.
>
> What do "a part" mean here? "Functionality"/"support"/"infrastructure"?
> Or more like "multi part"?\

It refer to a bundle2 part. This is a central concept of bundle2 and 
fairly clear in this context.

> It also seems essential that it is client side support only - that
> deserves mentioning in the short summary in the first line.

That is a good point.

>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
>> --- a/mercurial/bundle2.py
>> +++ b/mercurial/bundle2.py
>
>> +class digester(object):
>> +    """helper to compute multiple digests for the same data."""
>> +
>> +    DIGESTS = {
>> +        'md5': util.md5,
>
> Why care about md5 at all?
>
> Why support multiple digests? That seems like unnecessary complexity.
> The rest of Mercurial might be moving away from sha1 one day, but then
> this should be handled like everything else.
>
> Digests and validation could perhaps be put in a different patch than
> the multi bundle handling itself so we got smaller parts that were
> easier to review.
>
>>   capabilities = {'HG2X': (),
>>                   'b2x:listkeys': (),
>>                   'b2x:pushkey': (),
>>                   'b2x:changegroup': (),
>> +                'digests': tuple(digester.DIGESTS.keys()),
>
> As mentioned, I don't see the point in supporting multiple digests, but
> I would expect these references to external bundles requires a new
> capability flag?
>
>> +@parthandler('b2x:import-bundle')
>> +def handleimportbundle(op, inpart):
>> +    """apply a bundle on the repo, given an url and validation
>> information
>> +
>> +    The part contains a bundle url, as well as hash digest(s) and size
>> +    information to validate the downloaded content.
>> +    The expected format for this part is:
>> +        <URL>
>> +        <SIZE in bytes>
>> +        <DIGEST-TYPE>: <DIGEST-VALUE>
>> +        ...
>> +    Multiple digest types are supported in the same part, in which
>> case they
>> +    are all validated.
>> +    This currently only supports bundle10, but is intended to support
>> bundle20
>> +    eventually.
>> +    """
>> +    lines = inpart.read().splitlines()
>> +    try:
>> +        url = lines.pop(0)
>> +    except IndexError:
>> +        raise util.Abort(_('import-bundle format error'))
>> +    parsed_url = urlparse.urlparse(url)
>> +    if parsed_url.scheme not in ('http', 'https'):
>> +        raise util.Abort(_('import-bundle only supports http/https
>> urls'))
>> +
>> +    try:
>> +        size = int(lines.pop(0))
>> +    except (ValueError, IndexError):
>> +        raise util.Abort(_('import-bundle format error'))
>> +    digests = {}
>> +    for line in lines:
>> +        if ':' not in line:
>> +            raise util.Abort(_('import-bundle format error'))
>> +        k, v = line.split(':', 1)
>> +        digests[k.strip()] = v.strip()
>> +
>> +    real_part = digestchecker(urllib2.urlopen(url), size, digests)
>
> How will this work with authentication and server certificate validation
> etc? I would expect it should use (and perhaps extend) the existing URL
> abstraction layers in url.py and elsewhere.
>
> I also wonder why it only supports http(s). I would expect the most
> common case would be to get all bundles from the same location. Other
> protocols can of course be added later, but it seems wrong to have any
> protocol knowledge at all at this level.

Probably simplicity? What would be the other protocol you want 
supported and how would that work?

I can see a case for local file system bundle  in some case.

The list of supported protocol should probably be advertise so we can 
change it later (eg: a client may acces the server through ssh behind a 
firewall withou http access).

> Breaking bundles up in multiple parts do of course raise the question of
> whether the reference between them should use "relative" or "absolute"
> "paths". There are pros and cons, but I think I would prefer relative in
> most cases.

I do not get what your are trying to say here
Mike Hommey - Oct. 12, 2014, 1:39 a.m.
On Sat, Oct 11, 2014 at 06:17:57PM -0700, Pierre-Yves David wrote:
> >  capabilities = {'HG2X': (),
> >                  'b2x:listkeys': (),
> >                  'b2x:pushkey': (),
> >                  'b2x:changegroup': (),
> >+                'digests': tuple(digester.DIGESTS.keys()),
> 
> You need to sort this otherwise this the order will be unstable. Also, this
> may be more appropriate for capabilities in general (and not just bundle2
> capabilities) but this area is a but fuzzy.

Why would order matter?

> >                 }
> >
> >  def getrepocaps(repo):
> >      """return the bundle2 capabilities for a given repo
> >
> >      Exists to allow extensions (like evolution) to mutate the capabilities.
> >      """
> >      caps = capabilities.copy()
> >@@ -826,16 +889,76 @@ def handlechangegroup(op, inpart):
> >      if op.reply is not None:
> >          # This is definitly not the final form of this
> >          # return. But one need to start somewhere.
> >          part = op.reply.newpart('b2x:reply:changegroup')
> >          part.addparam('in-reply-to', str(inpart.id), mandatory=False)
> >          part.addparam('return', '%i' % ret, mandatory=False)
> >      assert not inpart.read()
> >
> >+@parthandler('b2x:import-bundle')
> >+def handleimportbundle(op, inpart):
> >+    """apply a bundle on the repo, given an url and validation information
> >+
> >+    The part contains a bundle url, as well as hash digest(s) and size
> >+    information to validate the downloaded content.
> >+    The expected format for this part is:
> >+        <URL>
> >+        <SIZE in bytes>
> >+        <DIGEST-TYPE>: <DIGEST-VALUE>
> 
> Part can have parameters. You should use such parameter instead of encoding
> the information into the part payload.

I'll have to check how that works.

> >+        ...
> >+    Multiple digest types are supported in the same part, in which case they
> >+    are all validated.
> >+    This currently only supports bundle10, but is intended to support bundle20
> >+    eventually.
> >+    """
> 
> Lets play simple and call this b2x:remote-changegroup and support
> changegroup for now. We will need something wider in the future but this
> will have to be more complicated.
> 
> >+    lines = inpart.read().splitlines()
> >+    try:
> >+        url = lines.pop(0)
> >+    except IndexError:
> >+        raise util.Abort(_('import-bundle format error'))
> >+    parsed_url = urlparse.urlparse(url)
> >+    if parsed_url.scheme not in ('http', 'https'):
> >+        raise util.Abort(_('import-bundle only supports http/https urls'))
> >+
> >+    try:
> >+        size = int(lines.pop(0))
> >+    except (ValueError, IndexError):
> >+        raise util.Abort(_('import-bundle format error'))
> >+    digests = {}
> >+    for line in lines:
> >+        if ':' not in line:
> >+            raise util.Abort(_('import-bundle format error'))
> >+        k, v = line.split(':', 1)
> >+        digests[k.strip()] = v.strip()
> 
> As pointed before, all this code can be replaced by something using
> parameter.
> 
> >+    real_part = digestchecker(urllib2.urlopen(url), size, digests)
> >+
> >+    # Make sure we trigger a transaction creation
> >+    #
> >+    # The addchangegroup function will get a transaction object by itself, but
> >+    # we need to make sure we trigger the creation of a transaction object used
> >+    # for the whole processing scope.
> >+    op.gettransaction()
> >+    import exchange
> >+    cg = exchange.readbundle(op.repo.ui, real_part, None)
> 
> If we focus on changegroup only, we can use the changegroup module directly.
> Avoiding dancing around circular import issue.

... except the code to handle the various HG10* headers and setting up
the corresponding decompression is in exchange.readbundle.

Mike
Pierre-Yves David - Oct. 12, 2014, 2:07 a.m.
On 10/11/2014 06:39 PM, Mike Hommey wrote:
> On Sat, Oct 11, 2014 at 06:17:57PM -0700, Pierre-Yves David wrote:
>>>   capabilities = {'HG2X': (),
>>>                   'b2x:listkeys': (),
>>>                   'b2x:pushkey': (),
>>>                   'b2x:changegroup': (),
>>> +                'digests': tuple(digester.DIGESTS.keys()),
>>
>> You need to sort this otherwise this the order will be unstable. Also, this
>> may be more appropriate for capabilities in general (and not just bundle2
>> capabilities) but this area is a but fuzzy.
>
> Why would order matter?

because otherwise, the test will be unstable and the testing goat will 
find you.

>>>   def getrepocaps(repo):
>>>       """return the bundle2 capabilities for a given repo
>>>
>>>       Exists to allow extensions (like evolution) to mutate the capabilities.
>>>       """
>>>       caps = capabilities.copy()
>>> @@ -826,16 +889,76 @@ def handlechangegroup(op, inpart):
>>>       if op.reply is not None:
>>>           # This is definitly not the final form of this
>>>           # return. But one need to start somewhere.
>>>           part = op.reply.newpart('b2x:reply:changegroup')
>>>           part.addparam('in-reply-to', str(inpart.id), mandatory=False)
>>>           part.addparam('return', '%i' % ret, mandatory=False)
>>>       assert not inpart.read()
>>>
>>> +@parthandler('b2x:import-bundle')
>>> +def handleimportbundle(op, inpart):
>>> +    """apply a bundle on the repo, given an url and validation information
>>> +
>>> +    The part contains a bundle url, as well as hash digest(s) and size
>>> +    information to validate the downloaded content.
>>> +    The expected format for this part is:
>>> +        <URL>
>>> +        <SIZE in bytes>
>>> +        <DIGEST-TYPE>: <DIGEST-VALUE>
>>
>> Part can have parameters. You should use such parameter instead of encoding
>> the information into the part payload.
>
> I'll have to check how that works.

outgoing_part.addparam(key, value)

value = incoming_part.params[key]

>
>>> +        ...
>>> +    Multiple digest types are supported in the same part, in which case they
>>> +    are all validated.
>>> +    This currently only supports bundle10, but is intended to support bundle20
>>> +    eventually.
>>> +    """
>>
>> Lets play simple and call this b2x:remote-changegroup and support
>> changegroup for now. We will need something wider in the future but this
>> will have to be more complicated.
>>
>>> +    lines = inpart.read().splitlines()
>>> +    try:
>>> +        url = lines.pop(0)
>>> +    except IndexError:
>>> +        raise util.Abort(_('import-bundle format error'))
>>> +    parsed_url = urlparse.urlparse(url)
>>> +    if parsed_url.scheme not in ('http', 'https'):
>>> +        raise util.Abort(_('import-bundle only supports http/https urls'))
>>> +
>>> +    try:
>>> +        size = int(lines.pop(0))
>>> +    except (ValueError, IndexError):
>>> +        raise util.Abort(_('import-bundle format error'))
>>> +    digests = {}
>>> +    for line in lines:
>>> +        if ':' not in line:
>>> +            raise util.Abort(_('import-bundle format error'))
>>> +        k, v = line.split(':', 1)
>>> +        digests[k.strip()] = v.strip()
>>
>> As pointed before, all this code can be replaced by something using
>> parameter.
>>
>>> +    real_part = digestchecker(urllib2.urlopen(url), size, digests)
>>> +
>>> +    # Make sure we trigger a transaction creation
>>> +    #
>>> +    # The addchangegroup function will get a transaction object by itself, but
>>> +    # we need to make sure we trigger the creation of a transaction object used
>>> +    # for the whole processing scope.
>>> +    op.gettransaction()
>>> +    import exchange
>>> +    cg = exchange.readbundle(op.repo.ui, real_part, None)
>>
>> If we focus on changegroup only, we can use the changegroup module directly.
>> Avoiding dancing around circular import issue.
>
> ... except the code to handle the various HG10* headers and setting up
> the corresponding decompression is in exchange.readbundle.

hé, true. The rest (about handling record keep applying).

>
> Mike
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
Martin Geisler - Oct. 12, 2014, 12:24 p.m.
Mike Hommey <mh@glandium.org> writes:

> On Sun, Oct 12, 2014 at 01:54:09AM +0200, Mads Kiilerich wrote:
>> 
>> Why care about md5 at all?
>> 
>> Why support multiple digests? That seems like unnecessary complexity.
>> The rest of Mercurial might be moving away from sha1 one day, but
>> then this should be handled like everything else.
>
> Because both md5 and sha1 are already weakened and they are the only
> digests we can support as long as mercurial sticks to supporting
> python 2.4. The bundles don't /need/ to contain all digest types, but
> having multiple ones when the only available ones are md5 and sha1
> kind of increases security (it's harder to create collisions for
> *both* digests at the same time)

Intuitively, that sounds correct.

However, I think there are some assumptions behind this claim that may
not apply here. One assumption is that the collisions you find for one
hash function is independent of the collisions you find for the other.
That need not be the case -- indeed, I believe it won't be the case if
you're trying to find two simultaneous collisions.


This reminds me a little about an exercise I saw as a TA for an
introductory course in cryptography. The question was if DES encrypting
a message twice would be equivalent to using a key of twice the length?

The answer is no. Given both the message and the ciphertext, you can
recover the key with just 2 * 2^56 DES invocations: you make a table
with all 2^56 possible encryptions of the the message and another table
with all 2^56 possible decryptions of the ciphertext. Finding a match
between these two tables reveals the key. This is one reason why there
is no "double DES" standard.

I know this is not the same situation, but it shows that one has to be
careful when expecting some kind of benefit from applying crypto
primitives twice.
Mike Hommey - Oct. 15, 2014, 8:04 a.m.
On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote:
> This test doesn't actually verify import-bundle is used. We need some
> kind of UI output or confirmation from the HTTP/bundle server to verify
> the client is actually fetching a remote URL/bundle. This deficiency is
> present in subsequent tests.
> 
> I'm not sure if we want the client to print import-bundle details by
> default or if it should be hidden behind --debug. Large repositories
> presumably generating many bundles could lead to console spam, which is
> a point in --debug's favor. More on this later.

--debug is *very* verbose, though :(

  $ hg --debug clone ssh://user@dummy/repo clone
  running python "dummyssh" user@dummy 'hg -R repo serve --stdio'
  sending hello command
  sending between command
  remote: 283
  remote: capabilities: lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch stream bundle2-exp=HG2X%0Ab2x%253Achangegroup%0Ab2x%253Aimport-bundle%3Dhttp%2Chttps%0Ab2x%253Alistkeys%0Ab2x%253Apushkey%0Adigests%3Dmd5%2Csha1 unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024
  remote: 1
  preparing listkeys for "bookmarks"
  sending listkeys command
  query 1; heads
  sending batch command
  requesting all changes
  sending getbundle command
  start processing of HG2X stream
  reading bundle2 stream parameters
  start extraction of bundle2 parts
  part header size: 24
  part type: "b2x:import-bundle"
  part id: "0"
  part parameters: 0
  found a handler for part 'b2x:import-bundle'
  payload chunk size: 123
  payload chunk size: 0
  adding changesets
  changesets: 1 chunks
  add changeset cd010b8cd998
  changesets: 2 chunks
  add changeset 42ccdea3bb16
  changesets: 3 chunks
  add changeset 5fddd98957c8
  changesets: 4 chunks
  add changeset 32af7686d403
  changesets: 5 chunks
  add changeset 9520eea781bc
  changesets: 6 chunks
  add changeset 24b6387c8c8c
  changesets: 7 chunks
  add changeset eea13746799a
  changesets: 8 chunks
  add changeset 02de42196ebe
  adding manifests
  manifests: 1/8 chunks (12.50%)
  manifests: 2/8 chunks (25.00%)
  manifests: 3/8 chunks (37.50%)
  manifests: 4/8 chunks (50.00%)
  manifests: 5/8 chunks (62.50%)
  manifests: 6/8 chunks (75.00%)
  manifests: 7/8 chunks (87.50%)
  manifests: 8/8 chunks (100.00%)
  adding file changes
  adding A revisions
  files: 1/7 chunks (14.29%)
  adding B revisions
  files: 2/7 chunks (28.57%)
  adding C revisions
  files: 3/7 chunks (42.86%)
  adding D revisions
  files: 4/7 chunks (57.14%)
  adding E revisions
  files: 5/7 chunks (71.43%)
  adding F revisions
  files: 6/7 chunks (85.71%)
  adding H revisions
  files: 7/7 chunks (100.00%)
  added 8 changesets with 7 changes to 7 files (+2 heads)
  part header size: 35
  part type: "b2x:listkeys"
  part id: "1"
  part parameters: 1
  found a handler for part 'b2x:listkeys'
  payload chunk size: 0
  part header size: 39
  part type: "b2x:listkeys"
  part id: "2"
  part parameters: 1
  found a handler for part 'b2x:listkeys'
  payload chunk size: 0
  part header size: 0
  end of bundle2 stream
  checking for updated bookmarks
  preparing listkeys for "phases"
  sending listkeys command
  updating the branch cache
  updating to branch default
  resolving manifests
   branchmerge: False, force: False, partial: False
   ancestor: 000000000000, local: 000000000000+, remote: 02de42196ebe
   A: remote created -> g
  getting A
   F: remote created -> g
  getting F
   H: remote created -> g
  getting H
  updating: H 3/3 files (100.00%)
  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
Pierre-Yves David - Oct. 15, 2014, 9:40 a.m.
On 10/15/2014 01:04 AM, Mike Hommey wrote:
> On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote:
>> This test doesn't actually verify import-bundle is used. We need some
>> kind of UI output or confirmation from the HTTP/bundle server to verify
>> the client is actually fetching a remote URL/bundle. This deficiency is
>> present in subsequent tests.
>>
>> I'm not sure if we want the client to print import-bundle details by
>> default or if it should be hidden behind --debug. Large repositories
>> presumably generating many bundles could lead to console spam, which is
>> a point in --debug's favor. More on this later.
>
> --debug is *very* verbose, though :(

adding and extra output to your test should be enought (with an output 
part).

>
>    $ hg --debug clone ssh://user@dummy/repo clone
>    running python "dummyssh" user@dummy 'hg -R repo serve --stdio'
>    sending hello command
>    sending between command
>    remote: 283
>    remote: capabilities: lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch stream bundle2-exp=HG2X%0Ab2x%253Achangegroup%0Ab2x%253Aimport-bundle%3Dhttp%2Chttps%0Ab2x%253Alistkeys%0Ab2x%253Apushkey%0Adigests%3Dmd5%2Csha1 unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024
>    remote: 1
>    preparing listkeys for "bookmarks"
>    sending listkeys command
>    query 1; heads
>    sending batch command
>    requesting all changes
>    sending getbundle command
>    start processing of HG2X stream
>    reading bundle2 stream parameters
>    start extraction of bundle2 parts
>    part header size: 24
>    part type: "b2x:import-bundle"
>    part id: "0"
>    part parameters: 0
>    found a handler for part 'b2x:import-bundle'
>    payload chunk size: 123
>    payload chunk size: 0
>    adding changesets
>    changesets: 1 chunks
>    add changeset cd010b8cd998
>    changesets: 2 chunks
>    add changeset 42ccdea3bb16
>    changesets: 3 chunks
>    add changeset 5fddd98957c8
>    changesets: 4 chunks
>    add changeset 32af7686d403
>    changesets: 5 chunks
>    add changeset 9520eea781bc
>    changesets: 6 chunks
>    add changeset 24b6387c8c8c
>    changesets: 7 chunks
>    add changeset eea13746799a
>    changesets: 8 chunks
>    add changeset 02de42196ebe
>    adding manifests
>    manifests: 1/8 chunks (12.50%)
>    manifests: 2/8 chunks (25.00%)
>    manifests: 3/8 chunks (37.50%)
>    manifests: 4/8 chunks (50.00%)
>    manifests: 5/8 chunks (62.50%)
>    manifests: 6/8 chunks (75.00%)
>    manifests: 7/8 chunks (87.50%)
>    manifests: 8/8 chunks (100.00%)
>    adding file changes
>    adding A revisions
>    files: 1/7 chunks (14.29%)
>    adding B revisions
>    files: 2/7 chunks (28.57%)
>    adding C revisions
>    files: 3/7 chunks (42.86%)
>    adding D revisions
>    files: 4/7 chunks (57.14%)
>    adding E revisions
>    files: 5/7 chunks (71.43%)
>    adding F revisions
>    files: 6/7 chunks (85.71%)
>    adding H revisions
>    files: 7/7 chunks (100.00%)
>    added 8 changesets with 7 changes to 7 files (+2 heads)
>    part header size: 35
>    part type: "b2x:listkeys"
>    part id: "1"
>    part parameters: 1
>    found a handler for part 'b2x:listkeys'
>    payload chunk size: 0
>    part header size: 39
>    part type: "b2x:listkeys"
>    part id: "2"
>    part parameters: 1
>    found a handler for part 'b2x:listkeys'
>    payload chunk size: 0
>    part header size: 0
>    end of bundle2 stream
>    checking for updated bookmarks
>    preparing listkeys for "phases"
>    sending listkeys command
>    updating the branch cache
>    updating to branch default
>    resolving manifests
>     branchmerge: False, force: False, partial: False
>     ancestor: 000000000000, local: 000000000000+, remote: 02de42196ebe
>     A: remote created -> g
>    getting A
>     F: remote created -> g
>    getting F
>     H: remote created -> g
>    getting H
>    updating: H 3/3 files (100.00%)
>    3 files updated, 0 files merged, 0 files removed, 0 files unresolved
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>

Patch

diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
--- a/mercurial/bundle2.py
+++ b/mercurial/bundle2.py
@@ -143,16 +143,18 @@  preserve.
 """
 
 import util
 import struct
 import urllib
 import string
 import obsolete
 import pushkey
+import urllib2
+import urlparse
 
 import changegroup, error
 from i18n import _
 
 _pack = struct.pack
 _unpack = struct.unpack
 
 _magicstring = 'HG2X'
@@ -771,20 +773,81 @@  class unbundlepart(unpackermixin):
         if size is None:
             data = self._payloadstream.read()
         else:
             data = self._payloadstream.read(size)
         if size is None or len(data) < size:
             self.consumed = True
         return data
 
+class digester(object):
+    """helper to compute multiple digests for the same data."""
+
+    DIGESTS = {
+        'md5': util.md5,
+        'sha1': util.sha1,
+    }
+
+    def __init__(self, digests, s=''):
+        self._hashes = {}
+        for k in digests:
+            if k not in self.DIGESTS:
+                raise util.Abort(_('unknown digest type: %s') % k)
+            self._hashes[k] = self.DIGESTS[k]()
+        if s:
+            self.update(s)
+
+    def update(self, data):
+        for h in self._hashes.values():
+            h.update(data)
+
+    def __getitem__(self, key):
+        if key not in self.DIGESTS:
+            raise util.Abort(_('unknown digest type: %s') % k)
+        return self._hashes[key].hexdigest()
+
+    def __iter__(self):
+        return iter(self._hashes)
+
+
+class digestchecker(object):
+    """file handle wrapper that additionally checks content against a given
+    size and hash digests.
+
+        d = digestchecker(fh, size, {'md5': '...'})
+
+    When multiple digests are given, all of them are verified.
+    """
+
+    def __init__(self, fh, size, digests):
+        self._fh = fh
+        self._size = size
+        self._digests = dict(digests)
+        self._digester = digester(self._digests.keys())
+
+    def read(self, length=-1):
+        content = self._fh.read(length)
+        self._digester.update(content)
+        self._size -= len(content)
+        return content
+
+    def validate(self):
+        if self._size != 0:
+            return False
+        for k, v in self._digests.items():
+            if v != self._digester[k]:
+                return False
+        return True
+
+
 capabilities = {'HG2X': (),
                 'b2x:listkeys': (),
                 'b2x:pushkey': (),
                 'b2x:changegroup': (),
+                'digests': tuple(digester.DIGESTS.keys()),
                }
 
 def getrepocaps(repo):
     """return the bundle2 capabilities for a given repo
 
     Exists to allow extensions (like evolution) to mutate the capabilities.
     """
     caps = capabilities.copy()
@@ -826,16 +889,76 @@  def handlechangegroup(op, inpart):
     if op.reply is not None:
         # This is definitly not the final form of this
         # return. But one need to start somewhere.
         part = op.reply.newpart('b2x:reply:changegroup')
         part.addparam('in-reply-to', str(inpart.id), mandatory=False)
         part.addparam('return', '%i' % ret, mandatory=False)
     assert not inpart.read()
 
+@parthandler('b2x:import-bundle')
+def handleimportbundle(op, inpart):
+    """apply a bundle on the repo, given an url and validation information
+
+    The part contains a bundle url, as well as hash digest(s) and size
+    information to validate the downloaded content.
+    The expected format for this part is:
+        <URL>
+        <SIZE in bytes>
+        <DIGEST-TYPE>: <DIGEST-VALUE>
+        ...
+    Multiple digest types are supported in the same part, in which case they
+    are all validated.
+    This currently only supports bundle10, but is intended to support bundle20
+    eventually.
+    """
+    lines = inpart.read().splitlines()
+    try:
+        url = lines.pop(0)
+    except IndexError:
+        raise util.Abort(_('import-bundle format error'))
+    parsed_url = urlparse.urlparse(url)
+    if parsed_url.scheme not in ('http', 'https'):
+        raise util.Abort(_('import-bundle only supports http/https urls'))
+
+    try:
+        size = int(lines.pop(0))
+    except (ValueError, IndexError):
+        raise util.Abort(_('import-bundle format error'))
+    digests = {}
+    for line in lines:
+        if ':' not in line:
+            raise util.Abort(_('import-bundle format error'))
+        k, v = line.split(':', 1)
+        digests[k.strip()] = v.strip()
+
+    real_part = digestchecker(urllib2.urlopen(url), size, digests)
+
+    # Make sure we trigger a transaction creation
+    #
+    # The addchangegroup function will get a transaction object by itself, but
+    # we need to make sure we trigger the creation of a transaction object used
+    # for the whole processing scope.
+    op.gettransaction()
+    import exchange
+    cg = exchange.readbundle(op.repo.ui, real_part, None)
+    if isinstance(cg, unbundle20):
+        raise util.Abort(_('import-bundle only supports bundle10'))
+    ret = changegroup.addchangegroup(op.repo, cg, 'bundle2', 'bundle2')
+    op.records.add('import-bundle', {'return': ret})
+    if op.reply is not None:
+        # This is definitly not the final form of this
+        # return. But one need to start somewhere.
+        part = op.reply.newpart('b2x:reply:import-bundle')
+        part.addparam('in-reply-to', str(inpart.id), mandatory=False)
+        part.addparam('return', '%i' % ret, mandatory=False)
+    if not real_part.validate():
+        raise util.Abort(_('bundle at %s is corrupted') % url)
+    assert not inpart.read()
+
 @parthandler('b2x:reply:changegroup', ('return', 'in-reply-to'))
 def handlereplychangegroup(op, inpart):
     ret = int(inpart.params['return'])
     replyto = int(inpart.params['in-reply-to'])
     op.records.add('changegroup', {'return': ret}, replyto)
 
 @parthandler('b2x:check:heads')
 def handlecheckheads(op, inpart):
diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -908,17 +908,17 @@  def _pullbundle2(pullop):
     if kwargs.keys() == ['format']:
         return # nothing to pull
     bundle = pullop.remote.getbundle('pull', **kwargs)
     try:
         op = bundle2.processbundle(pullop.repo, bundle, pullop.gettransaction)
     except error.BundleValueError, exc:
         raise util.Abort('missing support for %s' % exc)
 
-    if pullop.fetch:
+    if pullop.fetch and op.records['changegroup']:
         assert len(op.records['changegroup']) == 1
         pullop.cgresult = op.records['changegroup'][0]['return']
 
     # processing phases change
     for namespace, value in op.records['listkeys']:
         if namespace == 'phases':
             _pullapplyphases(pullop, value)
 
diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t
new file mode 100644
--- /dev/null
+++ b/tests/test-bundle2-import.t
@@ -0,0 +1,535 @@ 
+#require killdaemons
+
+Create an extension to test bundle2 import-bundle parts
+
+  $ cat > bundle2.py << EOF
+  > """A small extension to test bundle2 import-bundle parts.
+  > 
+  > Current bundle2 implementation doesn't provide a way to generate those
+  > parts, so they must be created by extensions.
+  > """
+  > from mercurial import bundle2, changegroup, commands, exchange
+  > from mercurial import extensions, scmutil
+  > import os
+  > 
+  > def _getbundlechangegrouppart(bundler, repo, source, bundlecaps=None,
+  >                               b2caps=None, heads=None, common=None,
+  >                               **kwargs):
+  >     for line in open(repo.join('bundle2maker'), 'r'):
+  >         line = line.strip()
+  >         try:
+  >             verb, args = line.split(None, 1)
+  >         except ValueError:
+  >             verb, args = line, ''
+  >         if verb == 'import-bundle':
+  >            url, file = args.split()
+  >            bundledata = open(file, 'rb').read()
+  >            data = '%s\n%d\n' % (url, len(bundledata))
+  >            d = bundle2.digester(b2caps['digests'], bundledata)
+  >            for k in b2caps['digests']:
+  >                data += '%s: %s\n' % (k, d[k])
+  >            bundler.newpart('b2x:import-bundle', data=data)
+  >         elif verb == 'raw-import-bundle':
+  >            bundler.newpart('b2x:import-bundle',
+  >                data=args.decode('string-escape'))
+  >         elif verb == 'changegroup':
+  >             _common, heads = args.split()
+  >             common.extend(repo.lookup(r) for r in repo.revs(_common))
+  >             heads = [repo.lookup(r) for r in repo.revs(heads)]
+  >             cg = changegroup.getchangegroup(repo, 'changegroup',
+  >                 heads=heads, common=common)
+  >             bundler.newpart('b2x:changegroup', data=cg.getchunks())
+  >         else:
+  >             raise Exception('unknown verb')
+  > 
+  > exchange.getbundle2partsmapping['changegroup'] = _getbundlechangegrouppart
+  > EOF
+
+Start a simple HTTP server to serve bundles (stolen from test-static-http.t)
+
+  $ cat > dumb.py <<EOF
+  > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys
+  > 
+  > def run(server_class=BaseHTTPServer.HTTPServer,
+  >         handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler):
+  >     server_address = ('localhost', int(os.environ['HGPORT']))
+  >     httpd = server_class(server_address, handler_class)
+  >     httpd.serve_forever()
+  > 
+  > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0))
+  > fp = file('dumb.pid', 'wb')
+  > fp.write(str(os.getpid()) + '\n')
+  > fp.close()
+  > run()
+  > EOF
+  $ python dumb.py 2>/dev/null &
+
+Cannot just read $!, it will not be set to the right value on Windows/MinGW
+
+  $ cat > wait.py <<EOF
+  > import time
+  > while True:
+  >     try:
+  >         if '\n' in file('dumb.pid', 'rb').read():
+  >             break
+  >     except IOError:
+  >         pass
+  >     time.sleep(0.2)
+  > EOF
+  $ python wait.py
+  $ cat dumb.pid >> $DAEMON_PIDS
+
+  $ cat >> $HGRCPATH << EOF
+  > [experimental]
+  > bundle2-exp=True
+  > [ui]
+  > ssh=python "$TESTDIR/dummyssh"
+  > logtemplate={rev}:{node|short} {phase} {author} {bookmarks} {desc|firstline}
+  > EOF
+
+  $ hg init repo
+
+  $ hg -R repo unbundle $TESTDIR/bundles/rebase.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 8 changesets with 7 changes to 7 files (+2 heads)
+  (run 'hg heads' to see heads, 'hg merge' to merge)
+
+  $ hg -R repo log -G
+  o  7:02de42196ebe draft Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a draft Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c draft Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc draft Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | o  2:5fddd98957c8 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 draft Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ hg clone repo orig
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ cat > repo/.hg/hgrc << EOF
+  > [extensions]
+  > bundle2=$TESTTMP/bundle2.py
+  > EOF
+
+Test a clone with a single import-bundle.
+
+  $ hg bundle -R repo -a bundle.hg
+  8 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg
+  > EOF
+  $ hg clone ssh://user@dummy/repo clone
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 8 changesets with 7 changes to 7 files (+2 heads)
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg -R clone log -G
+  @  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Test a clone with an import-bundle and a following changegroup
+
+  $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg
+  5 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg
+  > changegroup 0:4 5:7
+  > EOF
+  $ hg clone ssh://user@dummy/repo clone
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 5 changesets with 5 changes to 5 files (+1 heads)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 2 changes to 2 files (+1 heads)
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg -R clone log -G
+  @  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Test a clone with a changegroup followed by an import-bundle
+
+  $ hg bundle -R repo --base '0:4' -r '5:7' bundle3.hg
+  3 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > changegroup 000000000000 :4
+  > import-bundle http://localhost:$HGPORT/bundle3.hg bundle3.hg
+  > EOF
+  $ hg clone ssh://user@dummy/repo clone
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 5 changesets with 5 changes to 5 files (+1 heads)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 2 changes to 2 files (+1 heads)
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg -R clone log -G
+  @  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Test a pull with an import-bundle
+
+  $ hg bundle -R repo --base '0:4' -r '5:7' bundle4.hg
+  3 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > import-bundle http://localhost:$HGPORT/bundle4.hg bundle4.hg
+  > EOF
+  $ hg clone orig clone -r 3 -r 4
+  adding changesets
+  adding manifests
+  adding file changes
+  added 5 changesets with 5 changes to 5 files (+1 heads)
+  updating to branch default
+  4 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 2 changes to 2 files (+1 heads)
+  (run 'hg update' to get a working copy)
+  $ hg -R clone log -G
+  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | @  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | o  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Test a pull with an import-bundle and a following changegroup
+
+  $ hg bundle -R repo --base 2 -r '3:4' bundle5.hg
+  2 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > import-bundle http://localhost:$HGPORT/bundle5.hg bundle5.hg
+  > changegroup 0:4 5:7
+  > EOF
+  $ hg clone orig clone -r 2
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 3 files
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files (+1 heads)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 2 changes to 2 files (+1 heads)
+  (run 'hg heads .' to see heads, 'hg merge' to merge)
+  $ hg -R clone log -G
+  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Test a pull with a changegroup followed by an import-bundle
+
+  $ hg bundle -R repo --base '0:4' -r '5:7' bundle6.hg
+  3 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > changegroup 000000000000 :4
+  > import-bundle http://localhost:$HGPORT/bundle6.hg bundle6.hg
+  > EOF
+  $ hg clone orig clone -r 2
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 3 files
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files (+1 heads)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 2 changes to 2 files (+1 heads)
+  (run 'hg heads .' to see heads, 'hg merge' to merge)
+  $ hg -R clone log -G
+  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Test a pull with two import-bundles and a changegroup
+
+  $ hg bundle -R repo --base 2 -r '3:4' bundle7.hg
+  2 changesets found
+  $ hg bundle -R repo --base '3:4' -r '5:6' bundle8.hg
+  2 changesets found
+  $ cat > repo/.hg/bundle2maker << EOF
+  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
+  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle8.hg
+  > changegroup 0:6 7
+  > EOF
+  $ hg clone orig clone -r 2
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 3 files
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files (+1 heads)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 1 changes to 1 files
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files (+1 heads)
+  (run 'hg heads .' to see heads, 'hg merge' to merge)
+  $ hg -R clone log -G
+  o  7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com>  H
+  |
+  | o  6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com>  G
+  |/|
+  o |  5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com>  F
+  | |
+  | o  4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com>  E
+  |/
+  | o  3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com>  D
+  | |
+  | @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  | |
+  | o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |/
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+Corruption tests
+
+  $ hg clone orig clone -r 2
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 3 files
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg
+  > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg
+  > changegroup 0:6 7
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files (+1 heads)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 1 changes to 1 files
+  transaction abort!
+  rollback completed
+  abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted
+  [255]
+
+No content
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > raw-import-bundle
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  abort: import-bundle format error
+  [255]
+
+Missing size
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  abort: import-bundle format error
+  [255]
+
+Size mismatch
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n42
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files (+1 heads)
+  transaction abort!
+  rollback completed
+  abort: bundle at http://localhost:$HGPORT/bundle7.hg is corrupted
+  [255]
+
+Garbage data
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\ngarbage
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  abort: import-bundle format error
+  [255]
+
+Unknown digest
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\nfoo: bar
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  abort: unknown digest type: foo
+  [255]
+
+Not an HTTP url
+
+  $ cat > repo/.hg/bundle2maker << EOF
+  > raw-import-bundle ssh://localhost:$HGPORT/bundle7.hg\n581
+  > EOF
+  $ hg pull -R clone ssh://user@dummy/repo
+  pulling from ssh://user@dummy/repo
+  searching for changes
+  abort: import-bundle only supports http/https urls
+  [255]
+
+  $ hg -R clone log -G
+  @  2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com>  C
+  |
+  o  1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com>  B
+  |
+  o  0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com>  A
+  
+  $ rm -rf clone
+
+  $ "$TESTDIR/killdaemons.py" $DAEMON_PIDS