Submitter | Mike Hommey |
---|---|
Date | Oct. 11, 2014, 8:38 a.m. |
Message ID | <538073e6c7f6ae680cbd.1413016701@zenigata.glandium.org> |
Download | mbox | patch |
Permalink | /patch/6217/ |
State | Changes Requested |
Headers | show |
Comments
On Sat, Oct 11, 2014 at 05:38:21PM +0900, Mike Hommey wrote: > capabilities = {'HG2X': (), > 'b2x:listkeys': (), > 'b2x:pushkey': (), > 'b2x:changegroup': (), > + 'digests': tuple(digester.DIGESTS.keys()), Note this lacks a capability for the new part type. Mike
On 10/11/2014 1:38 AM, Mike Hommey wrote: > # HG changeset patch > # User Mike Hommey <mh@glandium.org> > # Date 1413016657 -32400 > # Sat Oct 11 17:37:37 2014 +0900 > # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 > # Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 > bundle2: add a part to import external bundles Great patch. I can't wait to start utilizing this! Comments inline. > diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py > --- a/mercurial/bundle2.py > +++ b/mercurial/bundle2.py > +@parthandler('b2x:import-bundle') > +def handleimportbundle(op, inpart): > + """apply a bundle on the repo, given an url and validation information > + > + The part contains a bundle url, as well as hash digest(s) and size > + information to validate the downloaded content. > + The expected format for this part is: > + <URL> > + <SIZE in bytes> > + <DIGEST-TYPE>: <DIGEST-VALUE> > + ... > + Multiple digest types are supported in the same part, in which case they > + are all validated. > + This currently only supports bundle10, but is intended to support bundle20 > + eventually. > + """ Instead of an order-dependent format, how do you feel about the HTTP header style "K: V" format for all metadata? Or is the bundle2 style of expanding functionality to introduce a new part with similar but enhanced functionality? Do you think we should make room for the host fingerprint here so clients following URLs won't see "unknown fingerprint" warnings? There is conceptually a security risk here, but I think if a client trusts the remote that is handing out this data, it can temporarily trust fingerprints via proxy of trust. > diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t > new file mode 100644 > --- /dev/null > +++ b/tests/test-bundle2-import.t > +Start a simple HTTP server to serve bundles (stolen from test-static-http.t) > + > + $ cat > dumb.py <<EOF > + > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys > + > > + > def run(server_class=BaseHTTPServer.HTTPServer, > + > handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler): > + > server_address = ('localhost', int(os.environ['HGPORT'])) > + > httpd = server_class(server_address, handler_class) > + > httpd.serve_forever() > + > > + > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0)) > + > fp = file('dumb.pid', 'wb') > + > fp.write(str(os.getpid()) + '\n') > + > fp.close() > + > run() > + > EOF > + $ python dumb.py 2>/dev/null & > + > +Cannot just read $!, it will not be set to the right value on Windows/MinGW > + > + $ cat > wait.py <<EOF > + > import time > + > while True: > + > try: > + > if '\n' in file('dumb.pid', 'rb').read(): > + > break > + > except IOError: > + > pass > + > time.sleep(0.2) > + > EOF > + $ python wait.py > + $ cat dumb.pid >> $DAEMON_PIDS Why can't dumb.py append to the os.environ['DAEMON_PIDS'] file directly? > +Test a clone with a single import-bundle. > + > + $ hg bundle -R repo -a bundle.hg > + 8 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg > + > EOF > + $ hg clone ssh://user@dummy/repo clone > + requesting all changes > + adding changesets > + adding manifests > + adding file changes > + added 8 changesets with 7 changes to 7 files (+2 heads) > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved This test doesn't actually verify import-bundle is used. We need some kind of UI output or confirmation from the HTTP/bundle server to verify the client is actually fetching a remote URL/bundle. This deficiency is present in subsequent tests. I'm not sure if we want the client to print import-bundle details by default or if it should be hidden behind --debug. Large repositories presumably generating many bundles could lead to console spam, which is a point in --debug's favor. More on this later. > +Test a clone with an import-bundle and a following changegroup > + > + $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg > + 5 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg > + > changegroup 0:4 5:7 > + > EOF > + $ hg clone ssh://user@dummy/repo clone > + requesting all changes > + adding changesets > + adding manifests > + adding file changes > + added 5 changesets with 5 changes to 5 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 2 changes to 2 files (+1 heads) > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved Pretending I'm an average user, this output is kinda verbose and confusing. Maybe we *do* need some kind of normal output saying "fetching changes from URL" to make this clearer. Still, I can't help but think of the Firefox use case where clones are fetching 20+ bundles (of 10k changesets each). > +Corruption tests > + > + $ hg clone orig clone -r 2 > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 3 changes to 3 files > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg > + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg > + > changegroup 0:6 7 > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 2 changes to 2 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 1 changes to 1 files > + transaction abort! > + rollback completed > + abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted > + [255] From an end-user perspective, it's not entirely clear what the state of the local store is after this abort. Was the rollback full or partial? Is there any way we could sneak a partial pull summary message in here? (You'll probably want to do this as a separate patch. And knowing the transaction code, this could be "fun" to implement.) > +No content > +Missing size > +Size mismatch > +Garbage data > +Unknown digest > +Not an HTTP url The comprehensive testing of failure cases is quite nice!
On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote: > On 10/11/2014 1:38 AM, Mike Hommey wrote: > > # HG changeset patch > > # User Mike Hommey <mh@glandium.org> > > # Date 1413016657 -32400 > > # Sat Oct 11 17:37:37 2014 +0900 > > # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 > > # Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 > > bundle2: add a part to import external bundles > > Great patch. I can't wait to start utilizing this! > > Comments inline. > > > diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py > > --- a/mercurial/bundle2.py > > +++ b/mercurial/bundle2.py > > +@parthandler('b2x:import-bundle') > > +def handleimportbundle(op, inpart): > > + """apply a bundle on the repo, given an url and validation information > > + > > + The part contains a bundle url, as well as hash digest(s) and size > > + information to validate the downloaded content. > > + The expected format for this part is: > > + <URL> > > + <SIZE in bytes> > > + <DIGEST-TYPE>: <DIGEST-VALUE> > > + ... > > + Multiple digest types are supported in the same part, in which case they > > + are all validated. > > + This currently only supports bundle10, but is intended to support bundle20 > > + eventually. > > + """ > > Instead of an order-dependent format, how do you feel about the HTTP > header style "K: V" format for all metadata? Or is the bundle2 style of > expanding functionality to introduce a new part with similar but > enhanced functionality? I'm not sure there's much functionality to add to this particular part. If you want some extra functionality, you're probably better covered with a new part type. > Do you think we should make room for the host fingerprint here so > clients following URLs won't see "unknown fingerprint" warnings? > There is conceptually a security risk here, but I think if a client > trusts the remote that is handing out this data, it can temporarily > trust fingerprints via proxy of trust. What do you mean about "unknown fingerprint" warnings? Are you talking about possible ssl certificates problems on the remote pointed at for external bundles? Would you want the import-bundle part to possibly contain certificate material so that the client could identify self-signed certificates? How useful would that be in practice? Note that the reason there are digests for the remote bundle is that they may be served by foreign servers, which trust level is uncertain. The client is already trusting the mercurial server with changegroups it sends directly, so the same level of trust can be given to those digests, but it would be unsafe to blindly trust the foreign server giving the right bundles. (Although, the digests are not mandatory, someone can very well implement import-bundle parts without them, if they don't care) > > diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t > > new file mode 100644 > > --- /dev/null > > +++ b/tests/test-bundle2-import.t > > > +Start a simple HTTP server to serve bundles (stolen from test-static-http.t) > > + > > + $ cat > dumb.py <<EOF > > + > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys > > + > > > + > def run(server_class=BaseHTTPServer.HTTPServer, > > + > handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler): > > + > server_address = ('localhost', int(os.environ['HGPORT'])) > > + > httpd = server_class(server_address, handler_class) > > + > httpd.serve_forever() > > + > > > + > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0)) > > + > fp = file('dumb.pid', 'wb') > > + > fp.write(str(os.getpid()) + '\n') > > + > fp.close() > > + > run() > > + > EOF > > + $ python dumb.py 2>/dev/null & > > + > > +Cannot just read $!, it will not be set to the right value on Windows/MinGW > > + > > + $ cat > wait.py <<EOF > > + > import time > > + > while True: > > + > try: > > + > if '\n' in file('dumb.pid', 'rb').read(): > > + > break > > + > except IOError: > > + > pass > > + > time.sleep(0.2) > > + > EOF > > + $ python wait.py > > + $ cat dumb.pid >> $DAEMON_PIDS > > Why can't dumb.py append to the os.environ['DAEMON_PIDS'] file directly? I just copied that whole thing from tests/test-static-http.t > > +Test a clone with a single import-bundle. > > + > > + $ hg bundle -R repo -a bundle.hg > > + 8 changesets found > > + $ cat > repo/.hg/bundle2maker << EOF > > + > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg > > + > EOF > > + $ hg clone ssh://user@dummy/repo clone > > + requesting all changes > > + adding changesets > > + adding manifests > > + adding file changes > > + added 8 changesets with 7 changes to 7 files (+2 heads) > > + updating to branch default > > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > > This test doesn't actually verify import-bundle is used. We need some > kind of UI output or confirmation from the HTTP/bundle server to verify > the client is actually fetching a remote URL/bundle. This deficiency is > present in subsequent tests. > > I'm not sure if we want the client to print import-bundle details by > default or if it should be hidden behind --debug. Large repositories > presumably generating many bundles could lead to console spam, which is > a point in --debug's favor. More on this later. > > > +Test a clone with an import-bundle and a following changegroup > > + > > + $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg > > + 5 changesets found > > + $ cat > repo/.hg/bundle2maker << EOF > > + > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg > > + > changegroup 0:4 5:7 > > + > EOF > > + $ hg clone ssh://user@dummy/repo clone > > + requesting all changes > > + adding changesets > > + adding manifests > > + adding file changes > > + added 5 changesets with 5 changes to 5 files (+1 heads) > > + adding changesets > > + adding manifests > > + adding file changes > > + added 3 changesets with 2 changes to 2 files (+1 heads) > > + updating to branch default > > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > > Pretending I'm an average user, this output is kinda verbose and > confusing. Maybe we *do* need some kind of normal output saying > "fetching changes from URL" to make this clearer. Still, I can't help > but think of the Firefox use case where clones are fetching 20+ bundles > (of 10k changesets each). Considering the time it takes to clone Firefox, and how many bundles might be involved, it might make sense to show their url as some kind of progress meter that show that things are _not_ looping out of control. > > +Corruption tests > > + > > + $ hg clone orig clone -r 2 > > + adding changesets > > + adding manifests > > + adding file changes > > + added 3 changesets with 3 changes to 3 files > > + updating to branch default > > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > > + > > + $ cat > repo/.hg/bundle2maker << EOF > > + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg > > + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg > > + > changegroup 0:6 7 > > + > EOF > > + $ hg pull -R clone ssh://user@dummy/repo > > + pulling from ssh://user@dummy/repo > > + searching for changes > > + adding changesets > > + adding manifests > > + adding file changes > > + added 2 changesets with 2 changes to 2 files (+1 heads) > > + adding changesets > > + adding manifests > > + adding file changes > > + added 2 changesets with 1 changes to 1 files > > + transaction abort! > > + rollback completed > > + abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted > > + [255] > > From an end-user perspective, it's not entirely clear what the state of > the local store is after this abort. Was the rollback full or partial? Full. There will be a possibility of partial roolbacks in the future, with checkpoints (which, in fact, if we could get them in 3.2, that would be awesome) > Is there any way we could sneak a partial pull summary message in here? > (You'll probably want to do this as a separate patch. And knowing the > transaction code, this could be "fun" to implement.) I think all the UX part can be a separate patch, possibly even landing well after the functionality itself (since bundle2 is still going to be experimental) Mike
On 10/11/2014 10:38 AM, Mike Hommey wrote: > # HG changeset patch > # User Mike Hommey <mh@glandium.org> > # Date 1413016657 -32400 > # Sat Oct 11 17:37:37 2014 +0900 > # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 > # Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 > bundle2: add a part to import external bundles It took me some time to figure out what this was all about. What do "a part" mean here? "Functionality"/"support"/"infrastructure"? Or more like "multi part"? It also seems essential that it is client side support only - that deserves mentioning in the short summary in the first line. Perhaps: bundle2: client side support for bundles containing references to other bundles > diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py > --- a/mercurial/bundle2.py > +++ b/mercurial/bundle2.py > +class digester(object): > + """helper to compute multiple digests for the same data.""" > + > + DIGESTS = { > + 'md5': util.md5, Why care about md5 at all? Why support multiple digests? That seems like unnecessary complexity. The rest of Mercurial might be moving away from sha1 one day, but then this should be handled like everything else. Digests and validation could perhaps be put in a different patch than the multi bundle handling itself so we got smaller parts that were easier to review. > capabilities = {'HG2X': (), > 'b2x:listkeys': (), > 'b2x:pushkey': (), > 'b2x:changegroup': (), > + 'digests': tuple(digester.DIGESTS.keys()), As mentioned, I don't see the point in supporting multiple digests, but I would expect these references to external bundles requires a new capability flag? > +@parthandler('b2x:import-bundle') > +def handleimportbundle(op, inpart): > + """apply a bundle on the repo, given an url and validation information > + > + The part contains a bundle url, as well as hash digest(s) and size > + information to validate the downloaded content. > + The expected format for this part is: > + <URL> > + <SIZE in bytes> > + <DIGEST-TYPE>: <DIGEST-VALUE> > + ... > + Multiple digest types are supported in the same part, in which case they > + are all validated. > + This currently only supports bundle10, but is intended to support bundle20 > + eventually. > + """ > + lines = inpart.read().splitlines() > + try: > + url = lines.pop(0) > + except IndexError: > + raise util.Abort(_('import-bundle format error')) > + parsed_url = urlparse.urlparse(url) > + if parsed_url.scheme not in ('http', 'https'): > + raise util.Abort(_('import-bundle only supports http/https urls')) > + > + try: > + size = int(lines.pop(0)) > + except (ValueError, IndexError): > + raise util.Abort(_('import-bundle format error')) > + digests = {} > + for line in lines: > + if ':' not in line: > + raise util.Abort(_('import-bundle format error')) > + k, v = line.split(':', 1) > + digests[k.strip()] = v.strip() > + > + real_part = digestchecker(urllib2.urlopen(url), size, digests) How will this work with authentication and server certificate validation etc? I would expect it should use (and perhaps extend) the existing URL abstraction layers in url.py and elsewhere. I also wonder why it only supports http(s). I would expect the most common case would be to get all bundles from the same location. Other protocols can of course be added later, but it seems wrong to have any protocol knowledge at all at this level. Breaking bundles up in multiple parts do of course raise the question of whether the reference between them should use "relative" or "absolute" "paths". There are pros and cons, but I think I would prefer relative in most cases. > + > + # Make sure we trigger a transaction creation > + # > + # The addchangegroup function will get a transaction object by itself, but > + # we need to make sure we trigger the creation of a transaction object used > + # for the whole processing scope. > + op.gettransaction() > + import exchange > + cg = exchange.readbundle(op.repo.ui, real_part, None) > + if isinstance(cg, unbundle20): > + raise util.Abort(_('import-bundle only supports bundle10')) Why is that? (Might deserve a comment.) It seems wrong to have an architecture where we don't just automagically support all bundle formats. > + ret = changegroup.addchangegroup(op.repo, cg, 'bundle2', 'bundle2') > + op.records.add('import-bundle', {'return': ret}) > + if op.reply is not None: > + # This is definitly not the final form of this > + # return. But one need to start somewhere. > + part = op.reply.newpart('b2x:reply:import-bundle') > + part.addparam('in-reply-to', str(inpart.id), mandatory=False) > + part.addparam('return', '%i' % ret, mandatory=False) > + if not real_part.validate(): > + raise util.Abort(_('bundle at %s is corrupted') % url) > + assert not inpart.read() > + > @parthandler('b2x:reply:changegroup', ('return', 'in-reply-to')) > def handlereplychangegroup(op, inpart): > ret = int(inpart.params['return']) > replyto = int(inpart.params['in-reply-to']) > op.records.add('changegroup', {'return': ret}, replyto) > > @parthandler('b2x:check:heads') > def handlecheckheads(op, inpart): > /Mads
On Sun, Oct 12, 2014 at 01:54:09AM +0200, Mads Kiilerich wrote: > On 10/11/2014 10:38 AM, Mike Hommey wrote: > ># HG changeset patch > ># User Mike Hommey <mh@glandium.org> > ># Date 1413016657 -32400 > ># Sat Oct 11 17:37:37 2014 +0900 > ># Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 > ># Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 > >bundle2: add a part to import external bundles > > It took me some time to figure out what this was all about. > > What do "a part" mean here? "Functionality"/"support"/"infrastructure"? Or > more like "multi part"? bundle2 contains parts that are treated independently. > It also seems essential that it is client side support only - that deserves > mentioning in the short summary in the first line. > > Perhaps: > bundle2: client side support for bundles containing references to other > bundles > > >diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py > >--- a/mercurial/bundle2.py > >+++ b/mercurial/bundle2.py > > >+class digester(object): > >+ """helper to compute multiple digests for the same data.""" > >+ > >+ DIGESTS = { > >+ 'md5': util.md5, > > Why care about md5 at all? > > Why support multiple digests? That seems like unnecessary complexity. The > rest of Mercurial might be moving away from sha1 one day, but then this > should be handled like everything else. Because both md5 and sha1 are already weakened and they are the only digests we can support as long as mercurial sticks to supporting python 2.4. The bundles don't /need/ to contain all digest types, but having multiple ones when the only available ones are md5 and sha1 kind of increases security (it's harder to create collisions for *both* digests at the same time) > Digests and validation could perhaps be put in a different patch than the > multi bundle handling itself so we got smaller parts that were easier to > review. > > > capabilities = {'HG2X': (), > > 'b2x:listkeys': (), > > 'b2x:pushkey': (), > > 'b2x:changegroup': (), > >+ 'digests': tuple(digester.DIGESTS.keys()), > > As mentioned, I don't see the point in supporting multiple digests The point here is that the client advertizes what digests it supports, for the future case where it might support something else like sha512 or sha-3. Note we could settle to use hashlib if the python running mercurial supports it, adding support for other hashes, but that doesn't have to be in this patch. > but I would expect these references to external bundles requires a new > capability flag? that's an omission I noticed already. cf. http://www.selenic.com/pipermail/mercurial-devel/2014-October/062507.html > >+@parthandler('b2x:import-bundle') > >+def handleimportbundle(op, inpart): > >+ """apply a bundle on the repo, given an url and validation information > >+ > >+ The part contains a bundle url, as well as hash digest(s) and size > >+ information to validate the downloaded content. > >+ The expected format for this part is: > >+ <URL> > >+ <SIZE in bytes> > >+ <DIGEST-TYPE>: <DIGEST-VALUE> > >+ ... > >+ Multiple digest types are supported in the same part, in which case they > >+ are all validated. > >+ This currently only supports bundle10, but is intended to support bundle20 > >+ eventually. > >+ """ > >+ lines = inpart.read().splitlines() > >+ try: > >+ url = lines.pop(0) > >+ except IndexError: > >+ raise util.Abort(_('import-bundle format error')) > >+ parsed_url = urlparse.urlparse(url) > >+ if parsed_url.scheme not in ('http', 'https'): > >+ raise util.Abort(_('import-bundle only supports http/https urls')) > >+ > >+ try: > >+ size = int(lines.pop(0)) > >+ except (ValueError, IndexError): > >+ raise util.Abort(_('import-bundle format error')) > >+ digests = {} > >+ for line in lines: > >+ if ':' not in line: > >+ raise util.Abort(_('import-bundle format error')) > >+ k, v = line.split(':', 1) > >+ digests[k.strip()] = v.strip() > >+ > >+ real_part = digestchecker(urllib2.urlopen(url), size, digests) > > How will this work with authentication and server certificate validation > etc? I would expect it should use (and perhaps extend) the existing URL > abstraction layers in url.py and elsewhere. I thought mercurial's url opener was "installed" as the urllib handler. Isn't it the case? > I also wonder why it only supports http(s). I would expect the most common > case would be to get all bundles from the same location. You're assuming those references would come in a bundle stored on some server, when the use-case behind this is that you connect to a mercurial server for a pull or a clone, and get those references from there. Those references would *not* point to the same server. Also, at the point where the bundle data is handled, there is not access to where it came from (afaik) > Other protocols can > of course be added later, but it seems wrong to have any protocol knowledge > at all at this level. I'm not sure. The first reason I limited the protocols is that it didn't seem reasonable for a random request to a mercurial server to tell you to connect to some ssh server or use a file on your own local file system. > Breaking bundles up in multiple parts do of course raise the question of > whether the reference between them should use "relative" or "absolute" > "paths". There are pros and cons, but I think I would prefer relative in > most cases. > > >+ > >+ # Make sure we trigger a transaction creation > >+ # > >+ # The addchangegroup function will get a transaction object by itself, but > >+ # we need to make sure we trigger the creation of a transaction object used > >+ # for the whole processing scope. > >+ op.gettransaction() > >+ import exchange > >+ cg = exchange.readbundle(op.repo.ui, real_part, None) > >+ if isinstance(cg, unbundle20): > >+ raise util.Abort(_('import-bundle only supports bundle10')) > > Why is that? (Might deserve a comment.) The comment is there in the function description, but it's not "exposed" to the user. > It seems wrong to have an architecture where we don't just automagically > support all bundle formats. The intent *is* to support all bundle formats. It requires more work to support bundle2 bundles. Also, bundle2 bundles don't currently exist outside of push/pull/clone. There is no bundle/unbundle support for creating and unbundling them yet. Mike
On 10/11/2014 01:38 AM, Mike Hommey wrote: > # HG changeset patch > # User Mike Hommey <mh@glandium.org> > # Date 1413016657 -32400 > # Sat Oct 11 17:37:37 2014 +0900 > # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 > # Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 > bundle2: add a part to import external bundles > > Bundle2 opens doors to advanced features allowing to reduce load on > mercurial servers, and improve clone experience for users on unstable or > slow networks. > > For instance, it could be possible to pre-generate a bundle of a > repository, and give a pointer to it to clients cloning the repository, > followed by another changegroup with the remainder. For significantly > big repositories, this could come as several base bundles with e.g. 10k > changesets, which, combined with checkpoints (not part of this change), > would prevent users with flaky networks from starting over any time > their connection fails. > > While the server-side support for those features doesn't exist yet, it > is preferable to have client-side support for this early-on. > > Note, I'm not sure the exchange.py change is right. Also, I'm not sure what > pullop.fetch means. Is it possible to have pullop.fetch set, but to have e.g. > only bookmark updates? In that case, the exchange.py change might be right. > > I'm also unsure if it makes sense to do so many tests. Clone and pull are not > so different from each other, and the different test scenarios are more > testing the multi-part infrastructure than the feature itself. It however > tests all the use-cases I do plan to use. > > diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py > --- a/mercurial/bundle2.py > +++ b/mercurial/bundle2.py > @@ -143,16 +143,18 @@ preserve. > """ > > import util > import struct > import urllib > import string > import obsolete > import pushkey > +import urllib2 > +import urlparse > > import changegroup, error > from i18n import _ > > _pack = struct.pack > _unpack = struct.unpack > > _magicstring = 'HG2X' > @@ -771,20 +773,81 @@ class unbundlepart(unpackermixin): > if size is None: > data = self._payloadstream.read() > else: > data = self._payloadstream.read(size) > if size is None or len(data) < size: > self.consumed = True > return data > > +class digester(object): > + """helper to compute multiple digests for the same data.""" > + > + DIGESTS = { > + 'md5': util.md5, > + 'sha1': util.sha1, > + } > + > + def __init__(self, digests, s=''): > + self._hashes = {} > + for k in digests: > + if k not in self.DIGESTS: > + raise util.Abort(_('unknown digest type: %s') % k) > + self._hashes[k] = self.DIGESTS[k]() > + if s: > + self.update(s) > + > + def update(self, data): > + for h in self._hashes.values(): > + h.update(data) > + > + def __getitem__(self, key): > + if key not in self.DIGESTS: > + raise util.Abort(_('unknown digest type: %s') % k) > + return self._hashes[key].hexdigest() > + > + def __iter__(self): > + return iter(self._hashes) > + > + Move the class in another patch to make this one lighter. Another module maybe be more appropriate too. > +class digestchecker(object): > + """file handle wrapper that additionally checks content against a given > + size and hash digests. > + > + d = digestchecker(fh, size, {'md5': '...'}) > + > + When multiple digests are given, all of them are verified. > + """ > + > + def __init__(self, fh, size, digests): > + self._fh = fh > + self._size = size > + self._digests = dict(digests) > + self._digester = digester(self._digests.keys()) > + > + def read(self, length=-1): > + content = self._fh.read(length) > + self._digester.update(content) > + self._size -= len(content) > + return content > + > + def validate(self): > + if self._size != 0: > + return False > + for k, v in self._digests.items(): > + if v != self._digester[k]: > + return False > + return True > + > + Same here, move the class in another patch to make this one lighter. Another module maybe be more appropriate too. > capabilities = {'HG2X': (), > 'b2x:listkeys': (), > 'b2x:pushkey': (), > 'b2x:changegroup': (), > + 'digests': tuple(digester.DIGESTS.keys()), You need to sort this otherwise this the order will be unstable. Also, this may be more appropriate for capabilities in general (and not just bundle2 capabilities) but this area is a but fuzzy. > } > > def getrepocaps(repo): > """return the bundle2 capabilities for a given repo > > Exists to allow extensions (like evolution) to mutate the capabilities. > """ > caps = capabilities.copy() > @@ -826,16 +889,76 @@ def handlechangegroup(op, inpart): > if op.reply is not None: > # This is definitly not the final form of this > # return. But one need to start somewhere. > part = op.reply.newpart('b2x:reply:changegroup') > part.addparam('in-reply-to', str(inpart.id), mandatory=False) > part.addparam('return', '%i' % ret, mandatory=False) > assert not inpart.read() > > +@parthandler('b2x:import-bundle') > +def handleimportbundle(op, inpart): > + """apply a bundle on the repo, given an url and validation information > + > + The part contains a bundle url, as well as hash digest(s) and size > + information to validate the downloaded content. > + The expected format for this part is: > + <URL> > + <SIZE in bytes> > + <DIGEST-TYPE>: <DIGEST-VALUE> Part can have parameters. You should use such parameter instead of encoding the information into the part payload. > + ... > + Multiple digest types are supported in the same part, in which case they > + are all validated. > + This currently only supports bundle10, but is intended to support bundle20 > + eventually. > + """ Lets play simple and call this b2x:remote-changegroup and support changegroup for now. We will need something wider in the future but this will have to be more complicated. > + lines = inpart.read().splitlines() > + try: > + url = lines.pop(0) > + except IndexError: > + raise util.Abort(_('import-bundle format error')) > + parsed_url = urlparse.urlparse(url) > + if parsed_url.scheme not in ('http', 'https'): > + raise util.Abort(_('import-bundle only supports http/https urls')) > + > + try: > + size = int(lines.pop(0)) > + except (ValueError, IndexError): > + raise util.Abort(_('import-bundle format error')) > + digests = {} > + for line in lines: > + if ':' not in line: > + raise util.Abort(_('import-bundle format error')) > + k, v = line.split(':', 1) > + digests[k.strip()] = v.strip() As pointed before, all this code can be replaced by something using parameter. > + real_part = digestchecker(urllib2.urlopen(url), size, digests) > + > + # Make sure we trigger a transaction creation > + # > + # The addchangegroup function will get a transaction object by itself, but > + # we need to make sure we trigger the creation of a transaction object used > + # for the whole processing scope. > + op.gettransaction() > + import exchange > + cg = exchange.readbundle(op.repo.ui, real_part, None) If we focus on changegroup only, we can use the changegroup module directly. Avoiding dancing around circular import issue. > + if isinstance(cg, unbundle20): > + raise util.Abort(_('import-bundle only supports bundle10')) You should do the opposite (explicitly checking it is a bundle10. otherwise bundle30 bypass this. (we do not have bundle30, but this is a philosophical point, use while list, not blacklist) > + ret = changegroup.addchangegroup(op.repo, cg, 'bundle2', 'bundle2') > + op.records.add('import-bundle', {'return': ret}) This part is wrong (and is why we should focus on changegroup only at the beginning). You are adding a changegroup so you should record a "changegroup" addition. This "changegroup" record is used by pull to report what have been added to the repository (1 new head messages etc). This also mean that you have to patch the exchange code to properly merge multiple "changegroup" reply into a valid result. If we were to support all of bundle2, we would need to merge all the record from the subunbundling into the current unbundling operation. > + if op.reply is not None: > + # This is definitly not the final form of this > + # return. But one need to start somewhere. > + part = op.reply.newpart('b2x:reply:import-bundle') > + part.addparam('in-reply-to', str(inpart.id), mandatory=False) > + part.addparam('return', '%i' % ret, mandatory=False) > + if not real_part.validate(): > + raise util.Abort(_('bundle at %s is corrupted') % url) We probably want a more verbose message. > + assert not inpart.read() > + > @parthandler('b2x:reply:changegroup', ('return', 'in-reply-to')) > def handlereplychangegroup(op, inpart): > ret = int(inpart.params['return']) > replyto = int(inpart.params['in-reply-to']) > op.records.add('changegroup', {'return': ret}, replyto) > > @parthandler('b2x:check:heads') > def handlecheckheads(op, inpart): > diff --git a/mercurial/exchange.py b/mercurial/exchange.py > --- a/mercurial/exchange.py > +++ b/mercurial/exchange.py > @@ -908,17 +908,17 @@ def _pullbundle2(pullop): > if kwargs.keys() == ['format']: > return # nothing to pull > bundle = pullop.remote.getbundle('pull', **kwargs) > try: > op = bundle2.processbundle(pullop.repo, bundle, pullop.gettransaction) > except error.BundleValueError, exc: > raise util.Abort('missing support for %s' % exc) > > - if pullop.fetch: > + if pullop.fetch and op.records['changegroup']: As all part will be using changegroup, this can be reverted > assert len(op.records['changegroup']) == 1 > pullop.cgresult = op.records['changegroup'][0]['return'] This is the part that requires an upgrade to handle multiple changegroup. > > # processing phases change > for namespace, value in op.records['listkeys']: > if namespace == 'phases': > _pullapplyphases(pullop, value) > > diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t > new file mode 100644 > --- /dev/null > +++ b/tests/test-bundle2-import.t > @@ -0,0 +1,535 @@ > +#require killdaemons > + > +Create an extension to test bundle2 import-bundle parts > + > + $ cat > bundle2.py << EOF > + > """A small extension to test bundle2 import-bundle parts. > + > > + > Current bundle2 implementation doesn't provide a way to generate those > + > parts, so they must be created by extensions. > + > """ > + > from mercurial import bundle2, changegroup, commands, exchange > + > from mercurial import extensions, scmutil > + > import os > + > > + > def _getbundlechangegrouppart(bundler, repo, source, bundlecaps=None, > + > b2caps=None, heads=None, common=None, > + > **kwargs): This could use some documentation. It is fairly hard to follow what is happening here. > + > for line in open(repo.join('bundle2maker'), 'r'): > + > line = line.strip() > + > try: > + > verb, args = line.split(None, 1) > + > except ValueError: > + > verb, args = line, '' > + > if verb == 'import-bundle': > + > url, file = args.split() > + > bundledata = open(file, 'rb').read() > + > data = '%s\n%d\n' % (url, len(bundledata)) > + > d = bundle2.digester(b2caps['digests'], bundledata) > + > for k in b2caps['digests']: > + > data += '%s: %s\n' % (k, d[k]) > + > bundler.newpart('b2x:import-bundle', data=data) > + > elif verb == 'raw-import-bundle': > + > bundler.newpart('b2x:import-bundle', > + > data=args.decode('string-escape')) > + > elif verb == 'changegroup': > + > _common, heads = args.split() > + > common.extend(repo.lookup(r) for r in repo.revs(_common)) > + > heads = [repo.lookup(r) for r in repo.revs(heads)] > + > cg = changegroup.getchangegroup(repo, 'changegroup', > + > heads=heads, common=common) > + > bundler.newpart('b2x:changegroup', data=cg.getchunks()) > + > else: > + > raise Exception('unknown verb') > + > > + > exchange.getbundle2partsmapping['changegroup'] = _getbundlechangegrouppart > + > EOF > + > +Start a simple HTTP server to serve bundles (stolen from test-static-http.t) maybe it is time to promote dumb.py into an explicit dumbhttp.py in the tests directory. > + > + $ cat > dumb.py <<EOF > + > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys > + > > + > def run(server_class=BaseHTTPServer.HTTPServer, > + > handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler): > + > server_address = ('localhost', int(os.environ['HGPORT'])) > + > httpd = server_class(server_address, handler_class) > + > httpd.serve_forever() > + > > + > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0)) > + > fp = file('dumb.pid', 'wb') > + > fp.write(str(os.getpid()) + '\n') > + > fp.close() > + > run() > + > EOF > + $ python dumb.py 2>/dev/null & > + > +Cannot just read $!, it will not be set to the right value on Windows/MinGW > + > + $ cat > wait.py <<EOF > + > import time > + > while True: > + > try: > + > if '\n' in file('dumb.pid', 'rb').read(): > + > break > + > except IOError: > + > pass > + > time.sleep(0.2) > + > EOF > + $ python wait.py > + $ cat dumb.pid >> $DAEMON_PIDS > + > + $ cat >> $HGRCPATH << EOF > + > [experimental] > + > bundle2-exp=True > + > [ui] > + > ssh=python "$TESTDIR/dummyssh" > + > logtemplate={rev}:{node|short} {phase} {author} {bookmarks} {desc|firstline} > + > EOF > + > + $ hg init repo > + > + $ hg -R repo unbundle $TESTDIR/bundles/rebase.hg > + adding changesets > + adding manifests > + adding file changes > + added 8 changesets with 7 changes to 7 files (+2 heads) > + (run 'hg heads' to see heads, 'hg merge' to merge) You should probably use a simpler bundle or create some changesets yourself. > + > + $ hg -R repo log -G > + o 7:02de42196ebe draft Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a draft Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c draft Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc draft Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 draft Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | o 2:5fddd98957c8 draft Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 draft Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 draft Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ hg clone repo orig > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + > + $ cat > repo/.hg/hgrc << EOF > + > [extensions] > + > bundle2=$TESTTMP/bundle2.py > + > EOF > + > +Test a clone with a single import-bundle. > + > + $ hg bundle -R repo -a bundle.hg > + 8 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg > + > EOF > + $ hg clone ssh://user@dummy/repo clone > + requesting all changes > + adding changesets > + adding manifests > + adding file changes > + added 8 changesets with 7 changes to 7 files (+2 heads) > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg -R clone log -G > + @ 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Test a clone with an import-bundle and a following changegroup > + > + $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg > + 5 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg > + > changegroup 0:4 5:7 > + > EOF > + $ hg clone ssh://user@dummy/repo clone > + requesting all changes > + adding changesets > + adding manifests > + adding file changes > + added 5 changesets with 5 changes to 5 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 2 changes to 2 files (+1 heads) > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg -R clone log -G > + @ 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Test a clone with a changegroup followed by an import-bundle > + > + $ hg bundle -R repo --base '0:4' -r '5:7' bundle3.hg > + 3 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > changegroup 000000000000 :4 > + > import-bundle http://localhost:$HGPORT/bundle3.hg bundle3.hg > + > EOF > + $ hg clone ssh://user@dummy/repo clone > + requesting all changes > + adding changesets > + adding manifests > + adding file changes > + added 5 changesets with 5 changes to 5 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 2 changes to 2 files (+1 heads) > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg -R clone log -G > + @ 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Test a pull with an import-bundle > + > + $ hg bundle -R repo --base '0:4' -r '5:7' bundle4.hg > + 3 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle4.hg bundle4.hg > + > EOF > + $ hg clone orig clone -r 3 -r 4 > + adding changesets > + adding manifests > + adding file changes > + added 5 changesets with 5 changes to 5 files (+1 heads) > + updating to branch default > + 4 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 2 changes to 2 files (+1 heads) > + (run 'hg update' to get a working copy) > + $ hg -R clone log -G > + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | @ 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Test a pull with an import-bundle and a following changegroup > + > + $ hg bundle -R repo --base 2 -r '3:4' bundle5.hg > + 2 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle5.hg bundle5.hg > + > changegroup 0:4 5:7 > + > EOF > + $ hg clone orig clone -r 2 > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 3 changes to 3 files > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 2 changes to 2 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 2 changes to 2 files (+1 heads) > + (run 'hg heads .' to see heads, 'hg merge' to merge) > + $ hg -R clone log -G > + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Test a pull with a changegroup followed by an import-bundle > + > + $ hg bundle -R repo --base '0:4' -r '5:7' bundle6.hg > + 3 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > changegroup 000000000000 :4 > + > import-bundle http://localhost:$HGPORT/bundle6.hg bundle6.hg > + > EOF > + $ hg clone orig clone -r 2 > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 3 changes to 3 files > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 2 changes to 2 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 2 changes to 2 files (+1 heads) > + (run 'hg heads .' to see heads, 'hg merge' to merge) > + $ hg -R clone log -G > + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Test a pull with two import-bundles and a changegroup > + > + $ hg bundle -R repo --base 2 -r '3:4' bundle7.hg > + 2 changesets found > + $ hg bundle -R repo --base '3:4' -r '5:6' bundle8.hg > + 2 changesets found > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg > + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle8.hg > + > changegroup 0:6 7 > + > EOF > + $ hg clone orig clone -r 2 > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 3 changes to 3 files > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 2 changes to 2 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 1 changes to 1 files > + adding changesets > + adding manifests > + adding file changes > + added 1 changesets with 1 changes to 1 files (+1 heads) > + (run 'hg heads .' to see heads, 'hg merge' to merge) > + $ hg -R clone log -G > + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H > + | > + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G > + |/| > + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F > + | | > + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E > + |/ > + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D > + | | > + | @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | | > + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + |/ > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > +Corruption tests > + > + $ hg clone orig clone -r 2 > + adding changesets > + adding manifests > + adding file changes > + added 3 changesets with 3 changes to 3 files > + updating to branch default > + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > + > + $ cat > repo/.hg/bundle2maker << EOF > + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg > + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg > + > changegroup 0:6 7 > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 2 changes to 2 files (+1 heads) > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 1 changes to 1 files > + transaction abort! > + rollback completed > + abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted > + [255] > + > +No content > + > + $ cat > repo/.hg/bundle2maker << EOF > + > raw-import-bundle > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + abort: import-bundle format error > + [255] > + > +Missing size > + > + $ cat > repo/.hg/bundle2maker << EOF > + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + abort: import-bundle format error > + [255] > + > +Size mismatch > + > + $ cat > repo/.hg/bundle2maker << EOF > + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n42 > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + adding changesets > + adding manifests > + adding file changes > + added 2 changesets with 2 changes to 2 files (+1 heads) > + transaction abort! > + rollback completed > + abort: bundle at http://localhost:$HGPORT/bundle7.hg is corrupted > + [255] > + > +Garbage data > + > + $ cat > repo/.hg/bundle2maker << EOF > + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\ngarbage > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + abort: import-bundle format error > + [255] > + > +Unknown digest > + > + $ cat > repo/.hg/bundle2maker << EOF > + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\nfoo: bar > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + abort: unknown digest type: foo > + [255] > + > +Not an HTTP url > + > + $ cat > repo/.hg/bundle2maker << EOF > + > raw-import-bundle ssh://localhost:$HGPORT/bundle7.hg\n581 > + > EOF > + $ hg pull -R clone ssh://user@dummy/repo > + pulling from ssh://user@dummy/repo > + searching for changes > + abort: import-bundle only supports http/https urls > + [255] > + > + $ hg -R clone log -G > + @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C > + | > + o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B > + | > + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A > + > + $ rm -rf clone > + > + $ "$TESTDIR/killdaemons.py" $DAEMON_PIDS > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@selenic.com > http://selenic.com/mailman/listinfo/mercurial-devel >
On 10/11/2014 03:27 PM, Mike Hommey wrote: > On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote: >> On 10/11/2014 1:38 AM, Mike Hommey wrote: >>> # HG changeset patch >>> # User Mike Hommey <mh@glandium.org> >>> # Date 1413016657 -32400 >>> # Sat Oct 11 17:37:37 2014 +0900 >>> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 >>> # Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 >>> bundle2: add a part to import external bundles >> >> Great patch. I can't wait to start utilizing this! >> >> Comments inline. >> >>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py >>> --- a/mercurial/bundle2.py >>> +++ b/mercurial/bundle2.py >>> +@parthandler('b2x:import-bundle') >>> +def handleimportbundle(op, inpart): >>> + """apply a bundle on the repo, given an url and validation information >>> + >>> + The part contains a bundle url, as well as hash digest(s) and size >>> + information to validate the downloaded content. >>> + The expected format for this part is: >>> + <URL> >>> + <SIZE in bytes> >>> + <DIGEST-TYPE>: <DIGEST-VALUE> >>> + ... >>> + Multiple digest types are supported in the same part, in which case they >>> + are all validated. >>> + This currently only supports bundle10, but is intended to support bundle20 >>> + eventually. >>> + """ >> >> Instead of an order-dependent format, how do you feel about the HTTP >> header style "K: V" format for all metadata? Or is the bundle2 style of >> expanding functionality to introduce a new part with similar but >> enhanced functionality? > > I'm not sure there's much functionality to add to this particular part. > If you want some extra functionality, you're probably better covered > with a new part type. cf my other email about using part parameters instead. > […] >>> +Start a simple HTTP server to serve bundles (stolen from test-static-http.t) >>> + >>> + $ cat > dumb.py <<EOF >>> + > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys >>> + > >>> + > def run(server_class=BaseHTTPServer.HTTPServer, >>> + > handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler): >>> + > server_address = ('localhost', int(os.environ['HGPORT'])) >>> + > httpd = server_class(server_address, handler_class) >>> + > httpd.serve_forever() >>> + > >>> + > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0)) >>> + > fp = file('dumb.pid', 'wb') >>> + > fp.write(str(os.getpid()) + '\n') >>> + > fp.close() >>> + > run() >>> + > EOF >>> + $ python dumb.py 2>/dev/null & >>> + >>> +Cannot just read $!, it will not be set to the right value on Windows/MinGW >>> + >>> + $ cat > wait.py <<EOF >>> + > import time >>> + > while True: >>> + > try: >>> + > if '\n' in file('dumb.pid', 'rb').read(): >>> + > break >>> + > except IOError: >>> + > pass >>> + > time.sleep(0.2) >>> + > EOF >>> + $ python wait.py >>> + $ cat dumb.pid >> $DAEMON_PIDS >> >> Why can't dumb.py append to the os.environ['DAEMON_PIDS'] file directly? > > I just copied that whole thing from tests/test-static-http.t cf other email about promoting that to a real script (if applicable) > >>> +Test a clone with a single import-bundle. >>> + >>> + $ hg bundle -R repo -a bundle.hg >>> + 8 changesets found >>> + $ cat > repo/.hg/bundle2maker << EOF >>> + > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg >>> + > EOF >>> + $ hg clone ssh://user@dummy/repo clone >>> + requesting all changes >>> + adding changesets >>> + adding manifests >>> + adding file changes >>> + added 8 changesets with 7 changes to 7 files (+2 heads) >>> + updating to branch default >>> + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved >> >> This test doesn't actually verify import-bundle is used. We need some >> kind of UI output or confirmation from the HTTP/bundle server to verify >> the client is actually fetching a remote URL/bundle. This deficiency is >> present in subsequent tests. >> >> I'm not sure if we want the client to print import-bundle details by >> default or if it should be hidden behind --debug. Large repositories >> presumably generating many bundles could lead to console spam, which is >> a point in --debug's favor. More on this later. I believe that a --debug or --verbose out would be enough (and useful in the wild) >>> +Test a clone with an import-bundle and a following changegroup >>> + >>> + $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg >>> + 5 changesets found >>> + $ cat > repo/.hg/bundle2maker << EOF >>> + > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg >>> + > changegroup 0:4 5:7 >>> + > EOF >>> + $ hg clone ssh://user@dummy/repo clone >>> + requesting all changes >>> + adding changesets >>> + adding manifests >>> + adding file changes >>> + added 5 changesets with 5 changes to 5 files (+1 heads) >>> + adding changesets >>> + adding manifests >>> + adding file changes >>> + added 3 changesets with 2 changes to 2 files (+1 heads) >>> + updating to branch default >>> + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved >> >> Pretending I'm an average user, this output is kinda verbose and >> confusing. Maybe we *do* need some kind of normal output saying >> "fetching changes from URL" to make this clearer. Still, I can't help >> but think of the Firefox use case where clones are fetching 20+ bundles >> (of 10k changesets each). > > Considering the time it takes to clone Firefox, and how many bundles > might be involved, it might make sense to show their url as some kind of > progress meter that show that things are _not_ looping out of control. We need an overall rework of exchange output since this is going to be an issue for kind of data. My bet is that the server will have to add some parts at the beginning of the bundle with hint of its whole content. We can also have the server sense progress and output related information. In our very case. I thing that the server can insert a part that output "prefetching X of Y changesets from CDN" right before this part. > >>> +Corruption tests >>> + >>> + $ hg clone orig clone -r 2 >>> + adding changesets >>> + adding manifests >>> + adding file changes >>> + added 3 changesets with 3 changes to 3 files >>> + updating to branch default >>> + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved >>> + >>> + $ cat > repo/.hg/bundle2maker << EOF >>> + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg >>> + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg >>> + > changegroup 0:6 7 >>> + > EOF >>> + $ hg pull -R clone ssh://user@dummy/repo >>> + pulling from ssh://user@dummy/repo >>> + searching for changes >>> + adding changesets >>> + adding manifests >>> + adding file changes >>> + added 2 changesets with 2 changes to 2 files (+1 heads) >>> + adding changesets >>> + adding manifests >>> + adding file changes >>> + added 2 changesets with 1 changes to 1 files >>> + transaction abort! >>> + rollback completed >>> + abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted >>> + [255] >> >> From an end-user perspective, it's not entirely clear what the state of >> the local store is after this abort. Was the rollback full or partial? > > Full. There will be a possibility of partial roolbacks in the future, > with checkpoints (which, in fact, if we could get them in 3.2, that > would be awesome) We do not have partial pull now. And all failure are full for now. There is no way such thing could happen for 3.2. I'll be happy to help someone to work on it during the 3.3 cycle. >> Is there any way we could sneak a partial pull summary message in here? >> (You'll probably want to do this as a separate patch. And knowing the >> transaction code, this could be "fun" to implement.) > > I think all the UX part can be a separate patch, possibly even landing > well after the functionality itself (since bundle2 is still going to be > experimental) I'm confused abotu this since we do not have partial pull
On 10/11/2014 04:54 PM, Mads Kiilerich wrote: > On 10/11/2014 10:38 AM, Mike Hommey wrote: >> # HG changeset patch >> # User Mike Hommey <mh@glandium.org> >> # Date 1413016657 -32400 >> # Sat Oct 11 17:37:37 2014 +0900 >> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1 >> # Parent 225306662bbf7c8cc6d867d484eab1e038f6c2a5 >> bundle2: add a part to import external bundles > > It took me some time to figure out what this was all about. > > What do "a part" mean here? "Functionality"/"support"/"infrastructure"? > Or more like "multi part"?\ It refer to a bundle2 part. This is a central concept of bundle2 and fairly clear in this context. > It also seems essential that it is client side support only - that > deserves mentioning in the short summary in the first line. That is a good point. >> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py >> --- a/mercurial/bundle2.py >> +++ b/mercurial/bundle2.py > >> +class digester(object): >> + """helper to compute multiple digests for the same data.""" >> + >> + DIGESTS = { >> + 'md5': util.md5, > > Why care about md5 at all? > > Why support multiple digests? That seems like unnecessary complexity. > The rest of Mercurial might be moving away from sha1 one day, but then > this should be handled like everything else. > > Digests and validation could perhaps be put in a different patch than > the multi bundle handling itself so we got smaller parts that were > easier to review. > >> capabilities = {'HG2X': (), >> 'b2x:listkeys': (), >> 'b2x:pushkey': (), >> 'b2x:changegroup': (), >> + 'digests': tuple(digester.DIGESTS.keys()), > > As mentioned, I don't see the point in supporting multiple digests, but > I would expect these references to external bundles requires a new > capability flag? > >> +@parthandler('b2x:import-bundle') >> +def handleimportbundle(op, inpart): >> + """apply a bundle on the repo, given an url and validation >> information >> + >> + The part contains a bundle url, as well as hash digest(s) and size >> + information to validate the downloaded content. >> + The expected format for this part is: >> + <URL> >> + <SIZE in bytes> >> + <DIGEST-TYPE>: <DIGEST-VALUE> >> + ... >> + Multiple digest types are supported in the same part, in which >> case they >> + are all validated. >> + This currently only supports bundle10, but is intended to support >> bundle20 >> + eventually. >> + """ >> + lines = inpart.read().splitlines() >> + try: >> + url = lines.pop(0) >> + except IndexError: >> + raise util.Abort(_('import-bundle format error')) >> + parsed_url = urlparse.urlparse(url) >> + if parsed_url.scheme not in ('http', 'https'): >> + raise util.Abort(_('import-bundle only supports http/https >> urls')) >> + >> + try: >> + size = int(lines.pop(0)) >> + except (ValueError, IndexError): >> + raise util.Abort(_('import-bundle format error')) >> + digests = {} >> + for line in lines: >> + if ':' not in line: >> + raise util.Abort(_('import-bundle format error')) >> + k, v = line.split(':', 1) >> + digests[k.strip()] = v.strip() >> + >> + real_part = digestchecker(urllib2.urlopen(url), size, digests) > > How will this work with authentication and server certificate validation > etc? I would expect it should use (and perhaps extend) the existing URL > abstraction layers in url.py and elsewhere. > > I also wonder why it only supports http(s). I would expect the most > common case would be to get all bundles from the same location. Other > protocols can of course be added later, but it seems wrong to have any > protocol knowledge at all at this level. Probably simplicity? What would be the other protocol you want supported and how would that work? I can see a case for local file system bundle in some case. The list of supported protocol should probably be advertise so we can change it later (eg: a client may acces the server through ssh behind a firewall withou http access). > Breaking bundles up in multiple parts do of course raise the question of > whether the reference between them should use "relative" or "absolute" > "paths". There are pros and cons, but I think I would prefer relative in > most cases. I do not get what your are trying to say here
On Sat, Oct 11, 2014 at 06:17:57PM -0700, Pierre-Yves David wrote: > > capabilities = {'HG2X': (), > > 'b2x:listkeys': (), > > 'b2x:pushkey': (), > > 'b2x:changegroup': (), > >+ 'digests': tuple(digester.DIGESTS.keys()), > > You need to sort this otherwise this the order will be unstable. Also, this > may be more appropriate for capabilities in general (and not just bundle2 > capabilities) but this area is a but fuzzy. Why would order matter? > > } > > > > def getrepocaps(repo): > > """return the bundle2 capabilities for a given repo > > > > Exists to allow extensions (like evolution) to mutate the capabilities. > > """ > > caps = capabilities.copy() > >@@ -826,16 +889,76 @@ def handlechangegroup(op, inpart): > > if op.reply is not None: > > # This is definitly not the final form of this > > # return. But one need to start somewhere. > > part = op.reply.newpart('b2x:reply:changegroup') > > part.addparam('in-reply-to', str(inpart.id), mandatory=False) > > part.addparam('return', '%i' % ret, mandatory=False) > > assert not inpart.read() > > > >+@parthandler('b2x:import-bundle') > >+def handleimportbundle(op, inpart): > >+ """apply a bundle on the repo, given an url and validation information > >+ > >+ The part contains a bundle url, as well as hash digest(s) and size > >+ information to validate the downloaded content. > >+ The expected format for this part is: > >+ <URL> > >+ <SIZE in bytes> > >+ <DIGEST-TYPE>: <DIGEST-VALUE> > > Part can have parameters. You should use such parameter instead of encoding > the information into the part payload. I'll have to check how that works. > >+ ... > >+ Multiple digest types are supported in the same part, in which case they > >+ are all validated. > >+ This currently only supports bundle10, but is intended to support bundle20 > >+ eventually. > >+ """ > > Lets play simple and call this b2x:remote-changegroup and support > changegroup for now. We will need something wider in the future but this > will have to be more complicated. > > >+ lines = inpart.read().splitlines() > >+ try: > >+ url = lines.pop(0) > >+ except IndexError: > >+ raise util.Abort(_('import-bundle format error')) > >+ parsed_url = urlparse.urlparse(url) > >+ if parsed_url.scheme not in ('http', 'https'): > >+ raise util.Abort(_('import-bundle only supports http/https urls')) > >+ > >+ try: > >+ size = int(lines.pop(0)) > >+ except (ValueError, IndexError): > >+ raise util.Abort(_('import-bundle format error')) > >+ digests = {} > >+ for line in lines: > >+ if ':' not in line: > >+ raise util.Abort(_('import-bundle format error')) > >+ k, v = line.split(':', 1) > >+ digests[k.strip()] = v.strip() > > As pointed before, all this code can be replaced by something using > parameter. > > >+ real_part = digestchecker(urllib2.urlopen(url), size, digests) > >+ > >+ # Make sure we trigger a transaction creation > >+ # > >+ # The addchangegroup function will get a transaction object by itself, but > >+ # we need to make sure we trigger the creation of a transaction object used > >+ # for the whole processing scope. > >+ op.gettransaction() > >+ import exchange > >+ cg = exchange.readbundle(op.repo.ui, real_part, None) > > If we focus on changegroup only, we can use the changegroup module directly. > Avoiding dancing around circular import issue. ... except the code to handle the various HG10* headers and setting up the corresponding decompression is in exchange.readbundle. Mike
On 10/11/2014 06:39 PM, Mike Hommey wrote: > On Sat, Oct 11, 2014 at 06:17:57PM -0700, Pierre-Yves David wrote: >>> capabilities = {'HG2X': (), >>> 'b2x:listkeys': (), >>> 'b2x:pushkey': (), >>> 'b2x:changegroup': (), >>> + 'digests': tuple(digester.DIGESTS.keys()), >> >> You need to sort this otherwise this the order will be unstable. Also, this >> may be more appropriate for capabilities in general (and not just bundle2 >> capabilities) but this area is a but fuzzy. > > Why would order matter? because otherwise, the test will be unstable and the testing goat will find you. >>> def getrepocaps(repo): >>> """return the bundle2 capabilities for a given repo >>> >>> Exists to allow extensions (like evolution) to mutate the capabilities. >>> """ >>> caps = capabilities.copy() >>> @@ -826,16 +889,76 @@ def handlechangegroup(op, inpart): >>> if op.reply is not None: >>> # This is definitly not the final form of this >>> # return. But one need to start somewhere. >>> part = op.reply.newpart('b2x:reply:changegroup') >>> part.addparam('in-reply-to', str(inpart.id), mandatory=False) >>> part.addparam('return', '%i' % ret, mandatory=False) >>> assert not inpart.read() >>> >>> +@parthandler('b2x:import-bundle') >>> +def handleimportbundle(op, inpart): >>> + """apply a bundle on the repo, given an url and validation information >>> + >>> + The part contains a bundle url, as well as hash digest(s) and size >>> + information to validate the downloaded content. >>> + The expected format for this part is: >>> + <URL> >>> + <SIZE in bytes> >>> + <DIGEST-TYPE>: <DIGEST-VALUE> >> >> Part can have parameters. You should use such parameter instead of encoding >> the information into the part payload. > > I'll have to check how that works. outgoing_part.addparam(key, value) value = incoming_part.params[key] > >>> + ... >>> + Multiple digest types are supported in the same part, in which case they >>> + are all validated. >>> + This currently only supports bundle10, but is intended to support bundle20 >>> + eventually. >>> + """ >> >> Lets play simple and call this b2x:remote-changegroup and support >> changegroup for now. We will need something wider in the future but this >> will have to be more complicated. >> >>> + lines = inpart.read().splitlines() >>> + try: >>> + url = lines.pop(0) >>> + except IndexError: >>> + raise util.Abort(_('import-bundle format error')) >>> + parsed_url = urlparse.urlparse(url) >>> + if parsed_url.scheme not in ('http', 'https'): >>> + raise util.Abort(_('import-bundle only supports http/https urls')) >>> + >>> + try: >>> + size = int(lines.pop(0)) >>> + except (ValueError, IndexError): >>> + raise util.Abort(_('import-bundle format error')) >>> + digests = {} >>> + for line in lines: >>> + if ':' not in line: >>> + raise util.Abort(_('import-bundle format error')) >>> + k, v = line.split(':', 1) >>> + digests[k.strip()] = v.strip() >> >> As pointed before, all this code can be replaced by something using >> parameter. >> >>> + real_part = digestchecker(urllib2.urlopen(url), size, digests) >>> + >>> + # Make sure we trigger a transaction creation >>> + # >>> + # The addchangegroup function will get a transaction object by itself, but >>> + # we need to make sure we trigger the creation of a transaction object used >>> + # for the whole processing scope. >>> + op.gettransaction() >>> + import exchange >>> + cg = exchange.readbundle(op.repo.ui, real_part, None) >> >> If we focus on changegroup only, we can use the changegroup module directly. >> Avoiding dancing around circular import issue. > > ... except the code to handle the various HG10* headers and setting up > the corresponding decompression is in exchange.readbundle. hé, true. The rest (about handling record keep applying). > > Mike > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@selenic.com > http://selenic.com/mailman/listinfo/mercurial-devel >
Mike Hommey <mh@glandium.org> writes: > On Sun, Oct 12, 2014 at 01:54:09AM +0200, Mads Kiilerich wrote: >> >> Why care about md5 at all? >> >> Why support multiple digests? That seems like unnecessary complexity. >> The rest of Mercurial might be moving away from sha1 one day, but >> then this should be handled like everything else. > > Because both md5 and sha1 are already weakened and they are the only > digests we can support as long as mercurial sticks to supporting > python 2.4. The bundles don't /need/ to contain all digest types, but > having multiple ones when the only available ones are md5 and sha1 > kind of increases security (it's harder to create collisions for > *both* digests at the same time) Intuitively, that sounds correct. However, I think there are some assumptions behind this claim that may not apply here. One assumption is that the collisions you find for one hash function is independent of the collisions you find for the other. That need not be the case -- indeed, I believe it won't be the case if you're trying to find two simultaneous collisions. This reminds me a little about an exercise I saw as a TA for an introductory course in cryptography. The question was if DES encrypting a message twice would be equivalent to using a key of twice the length? The answer is no. Given both the message and the ciphertext, you can recover the key with just 2 * 2^56 DES invocations: you make a table with all 2^56 possible encryptions of the the message and another table with all 2^56 possible decryptions of the ciphertext. Finding a match between these two tables reveals the key. This is one reason why there is no "double DES" standard. I know this is not the same situation, but it shows that one has to be careful when expecting some kind of benefit from applying crypto primitives twice.
On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote: > This test doesn't actually verify import-bundle is used. We need some > kind of UI output or confirmation from the HTTP/bundle server to verify > the client is actually fetching a remote URL/bundle. This deficiency is > present in subsequent tests. > > I'm not sure if we want the client to print import-bundle details by > default or if it should be hidden behind --debug. Large repositories > presumably generating many bundles could lead to console spam, which is > a point in --debug's favor. More on this later. --debug is *very* verbose, though :( $ hg --debug clone ssh://user@dummy/repo clone running python "dummyssh" user@dummy 'hg -R repo serve --stdio' sending hello command sending between command remote: 283 remote: capabilities: lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch stream bundle2-exp=HG2X%0Ab2x%253Achangegroup%0Ab2x%253Aimport-bundle%3Dhttp%2Chttps%0Ab2x%253Alistkeys%0Ab2x%253Apushkey%0Adigests%3Dmd5%2Csha1 unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024 remote: 1 preparing listkeys for "bookmarks" sending listkeys command query 1; heads sending batch command requesting all changes sending getbundle command start processing of HG2X stream reading bundle2 stream parameters start extraction of bundle2 parts part header size: 24 part type: "b2x:import-bundle" part id: "0" part parameters: 0 found a handler for part 'b2x:import-bundle' payload chunk size: 123 payload chunk size: 0 adding changesets changesets: 1 chunks add changeset cd010b8cd998 changesets: 2 chunks add changeset 42ccdea3bb16 changesets: 3 chunks add changeset 5fddd98957c8 changesets: 4 chunks add changeset 32af7686d403 changesets: 5 chunks add changeset 9520eea781bc changesets: 6 chunks add changeset 24b6387c8c8c changesets: 7 chunks add changeset eea13746799a changesets: 8 chunks add changeset 02de42196ebe adding manifests manifests: 1/8 chunks (12.50%) manifests: 2/8 chunks (25.00%) manifests: 3/8 chunks (37.50%) manifests: 4/8 chunks (50.00%) manifests: 5/8 chunks (62.50%) manifests: 6/8 chunks (75.00%) manifests: 7/8 chunks (87.50%) manifests: 8/8 chunks (100.00%) adding file changes adding A revisions files: 1/7 chunks (14.29%) adding B revisions files: 2/7 chunks (28.57%) adding C revisions files: 3/7 chunks (42.86%) adding D revisions files: 4/7 chunks (57.14%) adding E revisions files: 5/7 chunks (71.43%) adding F revisions files: 6/7 chunks (85.71%) adding H revisions files: 7/7 chunks (100.00%) added 8 changesets with 7 changes to 7 files (+2 heads) part header size: 35 part type: "b2x:listkeys" part id: "1" part parameters: 1 found a handler for part 'b2x:listkeys' payload chunk size: 0 part header size: 39 part type: "b2x:listkeys" part id: "2" part parameters: 1 found a handler for part 'b2x:listkeys' payload chunk size: 0 part header size: 0 end of bundle2 stream checking for updated bookmarks preparing listkeys for "phases" sending listkeys command updating the branch cache updating to branch default resolving manifests branchmerge: False, force: False, partial: False ancestor: 000000000000, local: 000000000000+, remote: 02de42196ebe A: remote created -> g getting A F: remote created -> g getting F H: remote created -> g getting H updating: H 3/3 files (100.00%) 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
On 10/15/2014 01:04 AM, Mike Hommey wrote: > On Sat, Oct 11, 2014 at 11:46:18AM -0700, Gregory Szorc wrote: >> This test doesn't actually verify import-bundle is used. We need some >> kind of UI output or confirmation from the HTTP/bundle server to verify >> the client is actually fetching a remote URL/bundle. This deficiency is >> present in subsequent tests. >> >> I'm not sure if we want the client to print import-bundle details by >> default or if it should be hidden behind --debug. Large repositories >> presumably generating many bundles could lead to console spam, which is >> a point in --debug's favor. More on this later. > > --debug is *very* verbose, though :( adding and extra output to your test should be enought (with an output part). > > $ hg --debug clone ssh://user@dummy/repo clone > running python "dummyssh" user@dummy 'hg -R repo serve --stdio' > sending hello command > sending between command > remote: 283 > remote: capabilities: lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch stream bundle2-exp=HG2X%0Ab2x%253Achangegroup%0Ab2x%253Aimport-bundle%3Dhttp%2Chttps%0Ab2x%253Alistkeys%0Ab2x%253Apushkey%0Adigests%3Dmd5%2Csha1 unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024 > remote: 1 > preparing listkeys for "bookmarks" > sending listkeys command > query 1; heads > sending batch command > requesting all changes > sending getbundle command > start processing of HG2X stream > reading bundle2 stream parameters > start extraction of bundle2 parts > part header size: 24 > part type: "b2x:import-bundle" > part id: "0" > part parameters: 0 > found a handler for part 'b2x:import-bundle' > payload chunk size: 123 > payload chunk size: 0 > adding changesets > changesets: 1 chunks > add changeset cd010b8cd998 > changesets: 2 chunks > add changeset 42ccdea3bb16 > changesets: 3 chunks > add changeset 5fddd98957c8 > changesets: 4 chunks > add changeset 32af7686d403 > changesets: 5 chunks > add changeset 9520eea781bc > changesets: 6 chunks > add changeset 24b6387c8c8c > changesets: 7 chunks > add changeset eea13746799a > changesets: 8 chunks > add changeset 02de42196ebe > adding manifests > manifests: 1/8 chunks (12.50%) > manifests: 2/8 chunks (25.00%) > manifests: 3/8 chunks (37.50%) > manifests: 4/8 chunks (50.00%) > manifests: 5/8 chunks (62.50%) > manifests: 6/8 chunks (75.00%) > manifests: 7/8 chunks (87.50%) > manifests: 8/8 chunks (100.00%) > adding file changes > adding A revisions > files: 1/7 chunks (14.29%) > adding B revisions > files: 2/7 chunks (28.57%) > adding C revisions > files: 3/7 chunks (42.86%) > adding D revisions > files: 4/7 chunks (57.14%) > adding E revisions > files: 5/7 chunks (71.43%) > adding F revisions > files: 6/7 chunks (85.71%) > adding H revisions > files: 7/7 chunks (100.00%) > added 8 changesets with 7 changes to 7 files (+2 heads) > part header size: 35 > part type: "b2x:listkeys" > part id: "1" > part parameters: 1 > found a handler for part 'b2x:listkeys' > payload chunk size: 0 > part header size: 39 > part type: "b2x:listkeys" > part id: "2" > part parameters: 1 > found a handler for part 'b2x:listkeys' > payload chunk size: 0 > part header size: 0 > end of bundle2 stream > checking for updated bookmarks > preparing listkeys for "phases" > sending listkeys command > updating the branch cache > updating to branch default > resolving manifests > branchmerge: False, force: False, partial: False > ancestor: 000000000000, local: 000000000000+, remote: 02de42196ebe > A: remote created -> g > getting A > F: remote created -> g > getting F > H: remote created -> g > getting H > updating: H 3/3 files (100.00%) > 3 files updated, 0 files merged, 0 files removed, 0 files unresolved > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@selenic.com > http://selenic.com/mailman/listinfo/mercurial-devel >
Patch
diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py --- a/mercurial/bundle2.py +++ b/mercurial/bundle2.py @@ -143,16 +143,18 @@ preserve. """ import util import struct import urllib import string import obsolete import pushkey +import urllib2 +import urlparse import changegroup, error from i18n import _ _pack = struct.pack _unpack = struct.unpack _magicstring = 'HG2X' @@ -771,20 +773,81 @@ class unbundlepart(unpackermixin): if size is None: data = self._payloadstream.read() else: data = self._payloadstream.read(size) if size is None or len(data) < size: self.consumed = True return data +class digester(object): + """helper to compute multiple digests for the same data.""" + + DIGESTS = { + 'md5': util.md5, + 'sha1': util.sha1, + } + + def __init__(self, digests, s=''): + self._hashes = {} + for k in digests: + if k not in self.DIGESTS: + raise util.Abort(_('unknown digest type: %s') % k) + self._hashes[k] = self.DIGESTS[k]() + if s: + self.update(s) + + def update(self, data): + for h in self._hashes.values(): + h.update(data) + + def __getitem__(self, key): + if key not in self.DIGESTS: + raise util.Abort(_('unknown digest type: %s') % k) + return self._hashes[key].hexdigest() + + def __iter__(self): + return iter(self._hashes) + + +class digestchecker(object): + """file handle wrapper that additionally checks content against a given + size and hash digests. + + d = digestchecker(fh, size, {'md5': '...'}) + + When multiple digests are given, all of them are verified. + """ + + def __init__(self, fh, size, digests): + self._fh = fh + self._size = size + self._digests = dict(digests) + self._digester = digester(self._digests.keys()) + + def read(self, length=-1): + content = self._fh.read(length) + self._digester.update(content) + self._size -= len(content) + return content + + def validate(self): + if self._size != 0: + return False + for k, v in self._digests.items(): + if v != self._digester[k]: + return False + return True + + capabilities = {'HG2X': (), 'b2x:listkeys': (), 'b2x:pushkey': (), 'b2x:changegroup': (), + 'digests': tuple(digester.DIGESTS.keys()), } def getrepocaps(repo): """return the bundle2 capabilities for a given repo Exists to allow extensions (like evolution) to mutate the capabilities. """ caps = capabilities.copy() @@ -826,16 +889,76 @@ def handlechangegroup(op, inpart): if op.reply is not None: # This is definitly not the final form of this # return. But one need to start somewhere. part = op.reply.newpart('b2x:reply:changegroup') part.addparam('in-reply-to', str(inpart.id), mandatory=False) part.addparam('return', '%i' % ret, mandatory=False) assert not inpart.read() +@parthandler('b2x:import-bundle') +def handleimportbundle(op, inpart): + """apply a bundle on the repo, given an url and validation information + + The part contains a bundle url, as well as hash digest(s) and size + information to validate the downloaded content. + The expected format for this part is: + <URL> + <SIZE in bytes> + <DIGEST-TYPE>: <DIGEST-VALUE> + ... + Multiple digest types are supported in the same part, in which case they + are all validated. + This currently only supports bundle10, but is intended to support bundle20 + eventually. + """ + lines = inpart.read().splitlines() + try: + url = lines.pop(0) + except IndexError: + raise util.Abort(_('import-bundle format error')) + parsed_url = urlparse.urlparse(url) + if parsed_url.scheme not in ('http', 'https'): + raise util.Abort(_('import-bundle only supports http/https urls')) + + try: + size = int(lines.pop(0)) + except (ValueError, IndexError): + raise util.Abort(_('import-bundle format error')) + digests = {} + for line in lines: + if ':' not in line: + raise util.Abort(_('import-bundle format error')) + k, v = line.split(':', 1) + digests[k.strip()] = v.strip() + + real_part = digestchecker(urllib2.urlopen(url), size, digests) + + # Make sure we trigger a transaction creation + # + # The addchangegroup function will get a transaction object by itself, but + # we need to make sure we trigger the creation of a transaction object used + # for the whole processing scope. + op.gettransaction() + import exchange + cg = exchange.readbundle(op.repo.ui, real_part, None) + if isinstance(cg, unbundle20): + raise util.Abort(_('import-bundle only supports bundle10')) + ret = changegroup.addchangegroup(op.repo, cg, 'bundle2', 'bundle2') + op.records.add('import-bundle', {'return': ret}) + if op.reply is not None: + # This is definitly not the final form of this + # return. But one need to start somewhere. + part = op.reply.newpart('b2x:reply:import-bundle') + part.addparam('in-reply-to', str(inpart.id), mandatory=False) + part.addparam('return', '%i' % ret, mandatory=False) + if not real_part.validate(): + raise util.Abort(_('bundle at %s is corrupted') % url) + assert not inpart.read() + @parthandler('b2x:reply:changegroup', ('return', 'in-reply-to')) def handlereplychangegroup(op, inpart): ret = int(inpart.params['return']) replyto = int(inpart.params['in-reply-to']) op.records.add('changegroup', {'return': ret}, replyto) @parthandler('b2x:check:heads') def handlecheckheads(op, inpart): diff --git a/mercurial/exchange.py b/mercurial/exchange.py --- a/mercurial/exchange.py +++ b/mercurial/exchange.py @@ -908,17 +908,17 @@ def _pullbundle2(pullop): if kwargs.keys() == ['format']: return # nothing to pull bundle = pullop.remote.getbundle('pull', **kwargs) try: op = bundle2.processbundle(pullop.repo, bundle, pullop.gettransaction) except error.BundleValueError, exc: raise util.Abort('missing support for %s' % exc) - if pullop.fetch: + if pullop.fetch and op.records['changegroup']: assert len(op.records['changegroup']) == 1 pullop.cgresult = op.records['changegroup'][0]['return'] # processing phases change for namespace, value in op.records['listkeys']: if namespace == 'phases': _pullapplyphases(pullop, value) diff --git a/tests/test-bundle2-import.t b/tests/test-bundle2-import.t new file mode 100644 --- /dev/null +++ b/tests/test-bundle2-import.t @@ -0,0 +1,535 @@ +#require killdaemons + +Create an extension to test bundle2 import-bundle parts + + $ cat > bundle2.py << EOF + > """A small extension to test bundle2 import-bundle parts. + > + > Current bundle2 implementation doesn't provide a way to generate those + > parts, so they must be created by extensions. + > """ + > from mercurial import bundle2, changegroup, commands, exchange + > from mercurial import extensions, scmutil + > import os + > + > def _getbundlechangegrouppart(bundler, repo, source, bundlecaps=None, + > b2caps=None, heads=None, common=None, + > **kwargs): + > for line in open(repo.join('bundle2maker'), 'r'): + > line = line.strip() + > try: + > verb, args = line.split(None, 1) + > except ValueError: + > verb, args = line, '' + > if verb == 'import-bundle': + > url, file = args.split() + > bundledata = open(file, 'rb').read() + > data = '%s\n%d\n' % (url, len(bundledata)) + > d = bundle2.digester(b2caps['digests'], bundledata) + > for k in b2caps['digests']: + > data += '%s: %s\n' % (k, d[k]) + > bundler.newpart('b2x:import-bundle', data=data) + > elif verb == 'raw-import-bundle': + > bundler.newpart('b2x:import-bundle', + > data=args.decode('string-escape')) + > elif verb == 'changegroup': + > _common, heads = args.split() + > common.extend(repo.lookup(r) for r in repo.revs(_common)) + > heads = [repo.lookup(r) for r in repo.revs(heads)] + > cg = changegroup.getchangegroup(repo, 'changegroup', + > heads=heads, common=common) + > bundler.newpart('b2x:changegroup', data=cg.getchunks()) + > else: + > raise Exception('unknown verb') + > + > exchange.getbundle2partsmapping['changegroup'] = _getbundlechangegrouppart + > EOF + +Start a simple HTTP server to serve bundles (stolen from test-static-http.t) + + $ cat > dumb.py <<EOF + > import BaseHTTPServer, SimpleHTTPServer, os, signal, sys + > + > def run(server_class=BaseHTTPServer.HTTPServer, + > handler_class=SimpleHTTPServer.SimpleHTTPRequestHandler): + > server_address = ('localhost', int(os.environ['HGPORT'])) + > httpd = server_class(server_address, handler_class) + > httpd.serve_forever() + > + > signal.signal(signal.SIGTERM, lambda x, y: sys.exit(0)) + > fp = file('dumb.pid', 'wb') + > fp.write(str(os.getpid()) + '\n') + > fp.close() + > run() + > EOF + $ python dumb.py 2>/dev/null & + +Cannot just read $!, it will not be set to the right value on Windows/MinGW + + $ cat > wait.py <<EOF + > import time + > while True: + > try: + > if '\n' in file('dumb.pid', 'rb').read(): + > break + > except IOError: + > pass + > time.sleep(0.2) + > EOF + $ python wait.py + $ cat dumb.pid >> $DAEMON_PIDS + + $ cat >> $HGRCPATH << EOF + > [experimental] + > bundle2-exp=True + > [ui] + > ssh=python "$TESTDIR/dummyssh" + > logtemplate={rev}:{node|short} {phase} {author} {bookmarks} {desc|firstline} + > EOF + + $ hg init repo + + $ hg -R repo unbundle $TESTDIR/bundles/rebase.hg + adding changesets + adding manifests + adding file changes + added 8 changesets with 7 changes to 7 files (+2 heads) + (run 'hg heads' to see heads, 'hg merge' to merge) + + $ hg -R repo log -G + o 7:02de42196ebe draft Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a draft Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c draft Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc draft Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 draft Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | o 2:5fddd98957c8 draft Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 draft Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 draft Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ hg clone repo orig + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + + $ cat > repo/.hg/hgrc << EOF + > [extensions] + > bundle2=$TESTTMP/bundle2.py + > EOF + +Test a clone with a single import-bundle. + + $ hg bundle -R repo -a bundle.hg + 8 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > import-bundle http://localhost:$HGPORT/bundle.hg bundle.hg + > EOF + $ hg clone ssh://user@dummy/repo clone + requesting all changes + adding changesets + adding manifests + adding file changes + added 8 changesets with 7 changes to 7 files (+2 heads) + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg -R clone log -G + @ 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Test a clone with an import-bundle and a following changegroup + + $ hg bundle -R repo --base 000000000000 -r '0:4' bundle2.hg + 5 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > import-bundle http://localhost:$HGPORT/bundle2.hg bundle2.hg + > changegroup 0:4 5:7 + > EOF + $ hg clone ssh://user@dummy/repo clone + requesting all changes + adding changesets + adding manifests + adding file changes + added 5 changesets with 5 changes to 5 files (+1 heads) + adding changesets + adding manifests + adding file changes + added 3 changesets with 2 changes to 2 files (+1 heads) + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg -R clone log -G + @ 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Test a clone with a changegroup followed by an import-bundle + + $ hg bundle -R repo --base '0:4' -r '5:7' bundle3.hg + 3 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > changegroup 000000000000 :4 + > import-bundle http://localhost:$HGPORT/bundle3.hg bundle3.hg + > EOF + $ hg clone ssh://user@dummy/repo clone + requesting all changes + adding changesets + adding manifests + adding file changes + added 5 changesets with 5 changes to 5 files (+1 heads) + adding changesets + adding manifests + adding file changes + added 3 changesets with 2 changes to 2 files (+1 heads) + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg -R clone log -G + @ 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Test a pull with an import-bundle + + $ hg bundle -R repo --base '0:4' -r '5:7' bundle4.hg + 3 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > import-bundle http://localhost:$HGPORT/bundle4.hg bundle4.hg + > EOF + $ hg clone orig clone -r 3 -r 4 + adding changesets + adding manifests + adding file changes + added 5 changesets with 5 changes to 5 files (+1 heads) + updating to branch default + 4 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + adding changesets + adding manifests + adding file changes + added 3 changesets with 2 changes to 2 files (+1 heads) + (run 'hg update' to get a working copy) + $ hg -R clone log -G + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | @ 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | o 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Test a pull with an import-bundle and a following changegroup + + $ hg bundle -R repo --base 2 -r '3:4' bundle5.hg + 2 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > import-bundle http://localhost:$HGPORT/bundle5.hg bundle5.hg + > changegroup 0:4 5:7 + > EOF + $ hg clone orig clone -r 2 + adding changesets + adding manifests + adding file changes + added 3 changesets with 3 changes to 3 files + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files (+1 heads) + adding changesets + adding manifests + adding file changes + added 3 changesets with 2 changes to 2 files (+1 heads) + (run 'hg heads .' to see heads, 'hg merge' to merge) + $ hg -R clone log -G + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Test a pull with a changegroup followed by an import-bundle + + $ hg bundle -R repo --base '0:4' -r '5:7' bundle6.hg + 3 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > changegroup 000000000000 :4 + > import-bundle http://localhost:$HGPORT/bundle6.hg bundle6.hg + > EOF + $ hg clone orig clone -r 2 + adding changesets + adding manifests + adding file changes + added 3 changesets with 3 changes to 3 files + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files (+1 heads) + adding changesets + adding manifests + adding file changes + added 3 changesets with 2 changes to 2 files (+1 heads) + (run 'hg heads .' to see heads, 'hg merge' to merge) + $ hg -R clone log -G + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Test a pull with two import-bundles and a changegroup + + $ hg bundle -R repo --base 2 -r '3:4' bundle7.hg + 2 changesets found + $ hg bundle -R repo --base '3:4' -r '5:6' bundle8.hg + 2 changesets found + $ cat > repo/.hg/bundle2maker << EOF + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle8.hg + > changegroup 0:6 7 + > EOF + $ hg clone orig clone -r 2 + adding changesets + adding manifests + adding file changes + added 3 changesets with 3 changes to 3 files + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files (+1 heads) + adding changesets + adding manifests + adding file changes + added 2 changesets with 1 changes to 1 files + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files (+1 heads) + (run 'hg heads .' to see heads, 'hg merge' to merge) + $ hg -R clone log -G + o 7:02de42196ebe public Nicolas Dumazet <nicdumz.commits@gmail.com> H + | + | o 6:eea13746799a public Nicolas Dumazet <nicdumz.commits@gmail.com> G + |/| + o | 5:24b6387c8c8c public Nicolas Dumazet <nicdumz.commits@gmail.com> F + | | + | o 4:9520eea781bc public Nicolas Dumazet <nicdumz.commits@gmail.com> E + |/ + | o 3:32af7686d403 public Nicolas Dumazet <nicdumz.commits@gmail.com> D + | | + | @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | | + | o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + |/ + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + +Corruption tests + + $ hg clone orig clone -r 2 + adding changesets + adding manifests + adding file changes + added 3 changesets with 3 changes to 3 files + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + + $ cat > repo/.hg/bundle2maker << EOF + > import-bundle http://localhost:$HGPORT/bundle7.hg bundle7.hg + > import-bundle http://localhost:$HGPORT/bundle8.hg bundle7.hg + > changegroup 0:6 7 + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files (+1 heads) + adding changesets + adding manifests + adding file changes + added 2 changesets with 1 changes to 1 files + transaction abort! + rollback completed + abort: bundle at http://localhost:$HGPORT/bundle8.hg is corrupted + [255] + +No content + + $ cat > repo/.hg/bundle2maker << EOF + > raw-import-bundle + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + abort: import-bundle format error + [255] + +Missing size + + $ cat > repo/.hg/bundle2maker << EOF + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + abort: import-bundle format error + [255] + +Size mismatch + + $ cat > repo/.hg/bundle2maker << EOF + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n42 + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files (+1 heads) + transaction abort! + rollback completed + abort: bundle at http://localhost:$HGPORT/bundle7.hg is corrupted + [255] + +Garbage data + + $ cat > repo/.hg/bundle2maker << EOF + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\ngarbage + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + abort: import-bundle format error + [255] + +Unknown digest + + $ cat > repo/.hg/bundle2maker << EOF + > raw-import-bundle http://localhost:$HGPORT/bundle7.hg\n581\nfoo: bar + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + abort: unknown digest type: foo + [255] + +Not an HTTP url + + $ cat > repo/.hg/bundle2maker << EOF + > raw-import-bundle ssh://localhost:$HGPORT/bundle7.hg\n581 + > EOF + $ hg pull -R clone ssh://user@dummy/repo + pulling from ssh://user@dummy/repo + searching for changes + abort: import-bundle only supports http/https urls + [255] + + $ hg -R clone log -G + @ 2:5fddd98957c8 public Nicolas Dumazet <nicdumz.commits@gmail.com> C + | + o 1:42ccdea3bb16 public Nicolas Dumazet <nicdumz.commits@gmail.com> B + | + o 0:cd010b8cd998 public Nicolas Dumazet <nicdumz.commits@gmail.com> A + + $ rm -rf clone + + $ "$TESTDIR/killdaemons.py" $DAEMON_PIDS