Patchwork [V2] help: internals topic for wire protocol

login
register
mail settings
Submitter Gregory Szorc
Date Aug. 18, 2016, 3:09 a.m.
Message ID <4ff105e8cfc56a643e55.1471489780@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/16344/
State Superseded
Headers show

Comments

Gregory Szorc - Aug. 18, 2016, 3:09 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1471489767 25200
#      Wed Aug 17 20:09:27 2016 -0700
# Node ID 4ff105e8cfc56a643e55b079c0d01d7e86d43879
# Parent  997e8cf4d0a29d28759e38659736cb3d1cf9ef3f
help: internals topic for wire protocol

The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.

This patch adds the beginnings of "internals" documentation on the
wire protocol.

The documentation, while fairly comprehensive, is far from exhaustive.
Known missing pieces include a detailed overview of bundle2 (including
the "push back" protocol) as well as higher-level details, such as
how mechanisms like discovery work. But you have to start somewhere.

The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.

As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.

I'm sure this documentation could be bikeshedded for months. Unless
there are factual errors, I'd prefer to have it land and iterate on
improving it rather than wait for perfection.
Pierre-Yves David - Aug. 22, 2016, 12:53 p.m.
On 08/18/2016 05:09 AM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1471489767 25200
> #      Wed Aug 17 20:09:27 2016 -0700
> # Node ID 4ff105e8cfc56a643e55b079c0d01d7e86d43879
> # Parent  997e8cf4d0a29d28759e38659736cb3d1cf9ef3f
> help: internals topic for wire protocol

Can you send V3 as a series with subsection in there own patch? 
Reviewing the the whole topic in one go is an hour long task that is 
hard to schedule.
Augie Fackler - Aug. 22, 2016, 1:57 p.m.
On Mon, Aug 22, 2016 at 02:53:28PM +0200, Pierre-Yves David wrote:
>
>
> On 08/18/2016 05:09 AM, Gregory Szorc wrote:
> ># HG changeset patch
> ># User Gregory Szorc <gregory.szorc@gmail.com>
> ># Date 1471489767 25200
> >#      Wed Aug 17 20:09:27 2016 -0700
> ># Node ID 4ff105e8cfc56a643e55b079c0d01d7e86d43879
> ># Parent  997e8cf4d0a29d28759e38659736cb3d1cf9ef3f
> >help: internals topic for wire protocol
>
> Can you send V3 as a series with subsection in there own patch? Reviewing
> the the whole topic in one go is an hour long task that is hard to schedule.

Good idea. I've been punting on this one due to insufficiently large
round tuits.

>
> --
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/contrib/wix/help.wxs b/contrib/wix/help.wxs
--- a/contrib/wix/help.wxs
+++ b/contrib/wix/help.wxs
@@ -37,16 +37,17 @@ 
         </Component>
 
         <Directory Id="help.internaldir" Name="internals">
           <Component Id="help.internals" Guid="$(var.help.internals.guid)" Win64='$(var.IsX64)'>
             <File Id="internals.bundles.txt"      Name="bundles.txt" KeyPath="yes" />
             <File Id="internals.changegroups.txt" Name="changegroups.txt" />
             <File Id="internals.requirements.txt" Name="requirements.txt" />
             <File Id="internals.revlogs.txt"      Name="revlogs.txt" />
+            <File Id="internals.wireprotocol.txt" Name="wireprotocol.txt" />
           </Component>
         </Directory>
 
       </Directory>
     </DirectoryRef>
   </Fragment>
 
 </Wix>
diff --git a/mercurial/help.py b/mercurial/help.py
--- a/mercurial/help.py
+++ b/mercurial/help.py
@@ -187,16 +187,18 @@  internalstable = sorted([
     (['bundles'], _('Bundles'),
      loaddoc('bundles', subdir='internals')),
     (['changegroups'], _('Changegroups'),
      loaddoc('changegroups', subdir='internals')),
     (['requirements'], _('Repository Requirements'),
      loaddoc('requirements', subdir='internals')),
     (['revlogs'], _('Revision Logs'),
      loaddoc('revlogs', subdir='internals')),
+    (['wireprotocol'], _('Wire Protocol'),
+     loaddoc('wireprotocol', subdir='internals')),
 ])
 
 def internalshelp(ui):
     """Generate the index for the "internals" topic."""
     lines = []
     for names, header, doc in internalstable:
         lines.append(' :%s: %s\n' % (names[0], header))
 
diff --git a/mercurial/help/internals/wireprotocol.txt b/mercurial/help/internals/wireprotocol.txt
new file mode 100644
--- /dev/null
+++ b/mercurial/help/internals/wireprotocol.txt
@@ -0,0 +1,773 @@ 
+The Mercurial wire protocol is a request-response based protocol
+with multiple wire representations.
+
+Each request is modeled as a command name, a dictionary of arguments, and
+optional raw input. Command arguments and their types are intrinsic
+properties of commands. So is the response type of the command. This means
+clients can't always send arbitrary arguments to servers and servers can't
+return multiple response types.
+
+The protocol is synchronous and does not support multiplexing (concurrent
+commands).
+
+Transport Protocols
+===================
+
+HTTP Transport
+--------------
+
+Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
+sent to the base URL of the repository with the command name sent in
+the ``cmd`` query string parameter. e.g.
+``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
+or ``POST`` depending on the command and whether there is a request
+body.
+
+Command arguments can be sent multiple ways.
+
+The simplest is part of the URL query string using ``x-www-form-urlencoded``
+encoding (see Python's ``urllib.urlencode()``. However, many servers impose
+length limitations on the URL. So this mechanism is typically only used if
+the server doesn't support other mechanisms.
+
+If the server supports the ``httpheader`` capability, command arguments can
+be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
+integer starting at 1. A ``x-www-form-urlencoded`` representation of the
+arguments is obtained. This full string is then split into chunks and sent
+in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
+is defined by the server in the ``httpheader`` capability value, which defaults
+to ``1024``. The server reassembles the encoded arguments string by
+concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
+dictionary.
+
+The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
+header to instruct caches to take these headers into consideration when caching
+requests.
+
+If the server supports the ``httppostargs`` capability, the client
+may send command arguments in the HTTP request body as part of an
+HTTP POST request. The command arguments will be URL encoded just like
+they would for sending them via HTTP headers. However, no splitting is
+performed: the raw arguments are included in the HTTP request body.
+
+The client sends a ``X-HgArgs-Post`` header with the string length of the
+encoded arguments data. Additional data may be included in the HTTP
+request body immediately following the argument data. The offset of the
+non-argument data is defined by the ``X-HgArgs-Post`` header. The
+``X-HgArgs-Post`` header is not required if there is no argument data.
+
+Additional command data can be sent as part of the HTTP request body. The
+default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
+A ``Content-Length`` header is currently always sent.
+
+Example HTTP requests::
+
+    GET /repo?cmd=capabilities
+    X-HgArg-1: foo=bar&baz=hello%20world
+
+The ``Content-Type`` HTTP response header identifies the response as coming
+from Mercurial and can also be used to signal an error has occurred.
+
+The ``application/mercurial-0.1`` media type indicates a generic Mercurial
+response. It matches the media type sent by the client.
+
+The ``application/hg-error`` media type indicates a generic error occurred.
+The content of the HTTP response body typically holds text describing the
+error.
+
+The ``application/hg-changegroup`` media type indicates a changegroup response
+type.
+
+Clients also accept the ``text/plain`` media type. All other media
+types should cause the client to error.
+
+Clients should issue a ``User-Agent`` request header that identifies the client.
+The server should not use the ``User-Agent`` for feature detection.
+
+A command returning a ``string`` response issues the
+``application/mercurial-0.1`` media type and the HTTP response body contains
+the raw string value. A ``Content-Length`` header is typically issued.
+
+A command returning a ``stream`` response issues the
+``application/mercurial-0.1`` media type and the HTTP response is typically
+using *chunked transfer* (``Transfer-Encoding: chunked``).
+
+SSH Transport
+=============
+
+The SSH transport is a custom text-based protocol suitable for use over any
+bi-directional stream transport. It is most commonly used with SSH.
+
+A SSH transport server can be started with ``hg serve --stdio``. The stdin,
+stderr, and stdout file descriptors of the started process are used to exchange
+data. When Mercurial connects to a remote server over SSH, it actually starts
+a ``hg serve --stdio`` process on the remote server.
+
+Commands are issued by sending the command name followed by a trailing newline
+``\n`` to the server. e.g. ``capabilities\n``.
+
+Command arguments are sent in the following format::
+
+    <argument> <length>\n<value>
+
+That is, the argument string name followed by a space followed by the
+integer length of the value (expressed as a string) followed by a newline
+(``\n``) followed by the raw argument value.
+
+Dictionary arguments are encoded differently::
+
+    <argument> <# elements>\n
+    <key1> <length1>\n<value1>
+    <key2> <length2>\n<value2>
+    ...
+
+Non-argument data is sent immediately after the final argument value. It is
+encoded in chunks::
+
+    <length>\n<data>
+
+Each command declares a list of supported arguments and their types. If a
+client sends an unknown argument to the server, the server should abort
+immediately. The special argument ``*`` in a command's definition indicates
+that all argument names are allowed.
+
+The definition of supported arguments and types is initially made when a
+new command is implemented. The client and server must initially independently
+agree on the arguments and their types. This initial set of arguments can be
+supplemented through the presence of *capabilities* advertised by the server.
+
+Each command has a defined expected response type.
+
+A ``string`` response type is a length framed value. The response consists of
+the string encoded integer length of a value followed by a newline (``\n``)
+followed by the value. Empty values are allowed (and are represented as
+``0\n``).
+
+A ``stream`` response type consists of raw bytes of data. There is no framing.
+
+A generic error response type is also supported. It consists of a an error
+message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
+written to ``stdout``.
+
+If the server receives an unknown command, it will send an empty ``string``
+response.
+
+The server terminates if it receives an empty command (a ``\n`` character).
+
+Capabilities
+============
+
+Servers advertise supported wire protocol features. This allows clients to
+probe for server features before blindly calling a command or passing a
+specific argument.
+
+The server's features are exposed via a *capabilities* string. This is a
+space-delimited string of tokens/features. Some features are single words
+like ``lookup`` or ``batch``. Others are complicated key-value pairs
+advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word
+values are used, each feature name can define its own encoding of sub-values.
+Comma-delimited and ``x-www-form-urlencoded`` values are common.
+
+The following document capabilities defined by the canonical Mercurial server
+implementation.
+
+batch
+-----
+
+Whether the server supports the ``batch`` command.
+
+This capability/command was introduced in Mercurial 1.9 (released July 2011).
+
+branchmap
+---------
+
+Whether the server supports the ``branchmap`` command.
+
+This capability/command was introduced in Mercurial 1.3 (released July 2009).
+
+bundle2-exp
+-----------
+
+Precursor to ``bundle2`` capability that was used before bundle2 was a
+stable feature.
+
+This capability was introduced in Mercurial 3.0 behind an experimental
+flag. This capability should not be observed in the wild.
+
+bundle2
+-------
+
+Indicates whether the server supports the ``bundle2`` data exchange format.
+
+The value of the capability is a URL quoted, newline (``\n``) delimited
+list of keys or key-value pairs.
+
+A key is simply a URL encoded string.
+
+A key-value pair is a URL encoded key separated from a URL encoded value by
+an ``=``. If the value is a list, elements are delimited by a ``,`` after
+URL encoding.
+
+For example, say we have the values::
+
+  {'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']}
+
+We would first construct a string::
+
+  HG20\nchangegroup=01,02\ndigests=sha1,sha512
+
+We would then URL quote this string::
+
+  HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512
+
+This capability was introduced in Mercurial 3.4 (released May 2015).
+
+changegroupsubset
+-----------------
+
+Whether the server supports the ``changegroupsubset`` command.
+
+This capability was introduced in Mercurial 0.9.2 (released December
+2006).
+
+This capability was introduced at the same time as the ``lookup``
+capability/command.
+
+getbundle
+---------
+
+Whether the server supports the ``getbundle`` command.
+
+This capability was introduced in Mercurial 1.9 (released July 2011).
+
+httpheader
+----------
+
+Whether the server supports receiving command arguments via HTTP request
+headers.
+
+The value of the capability is an integer describing the max header
+length that clients should send. Clients should ignore any content after a
+comma in the value, as this is reserved for future use.
+
+This capability was introduced in Mercurial 1.9 (released July 2011).
+
+httppostargs
+------------
+
+**Experimental**
+
+Indicates that the server supports and prefers clients send command arguments
+via a HTTP POST request as part of the request body.
+
+This capability was introduced in Mercurial 3.8 (released May 2016).
+
+known
+-----
+
+Whether the server supports the ``known`` command.
+
+This capability/command was introduced in Mercurial 1.9 (released July 2011).
+
+lookup
+------
+
+Whether the server supports the ``lookup`` command.
+
+This capability was introduced in Mercurial 0.9.2 (released December
+2006).
+
+This capability was introduced at the same time as the ``changegroupsubset``
+capability/command.
+
+pushkey
+-------
+
+Whether the server supports the ``pushkey`` and ``listkeys`` commands.
+
+This capability was introduced in Mercurial 1.6 (released July 2010).
+
+standardbundle
+--------------
+
+**Unsupported**
+
+This capability was introduced during the Mercurial 0.9.2 development cycle in
+2006. It was never present in a release, as it was replaced by the ``unbundle``
+capability. This capability should not be encountered in the wild.
+
+stream-preferred
+----------------
+
+If present the server prefers that clients clone using the streaming clone
+protocol (``hg clone --uncompressed``) rather than the standard
+changegroup/bundle based protocol.
+
+This capability was introduced in Mercurial 2.2 (released May 2012).
+
+streamreqs
+----------
+
+Indicates whether the server supports *streaming clones* and the *requirements*
+that clients must support to receive it.
+
+If present, the server supports the ``stream_out`` command, which transmits
+raw revlogs from the repository instead of changegroups. This provides a faster
+cloning mechanism at the expense of more bandwidth used.
+
+The value of this capability is a comma-delimited list of repo format
+*requirements*. These are requirements that impact the reading of data in
+the ``.hg/store`` directory. An example value is
+``streamreqs=generaldelta,revlogv1`` indicating the server repo requires
+the ``revlogv1`` and ``generaldelta`` requirements.
+
+If the only format requirement is ``revlogv1``, the server may expose the
+``stream`` capability instead of the ``streamreqs`` capability.
+
+This capability was introduced in Mercurial 1.7 (released November 2010).
+
+stream
+------
+
+Whether the server supports *streaming clones* from ``revlogv1`` repos.
+
+If present, the server supports the ``stream_out`` command, which transmits
+raw revlogs from the repository instead of changegroups. This provides a faster
+cloning mechanism at the expense of more bandwidth used.
+
+This capability was introduced in Mercurial 0.9.1 (released July 2006).
+
+When initially introduced, the value of the capability was the numeric
+revlog revision. e.g. ``stream=1``. This indicates the changegroup is using
+``revlogv1``. This simple integer value wasn't powerful enough, so the
+``streamreqs`` capability was invented to handle cases where the repo
+requirements have more than just ``revlogv1``. Newer servers omit the
+``=1`` since it was the only value supported and the value of ``1`` can
+be implied by clients.
+
+unbundlehash
+------------
+
+Whether the ``unbundle`` commands supports receiving a hash of all the
+heads instead of a list.
+
+For more, see the documentation for the ``unbundle`` command.
+
+This capability was introduced in Mercurial 1.9 (released July 2011).
+
+unbundle
+--------
+
+Whether the server supports pushing via the ``unbundle`` command.
+
+This capability/command has been present since Mercurial 0.9.1 (released
+July 2006).
+
+Mercurial 0.9.2 (released December 2006) added values to the capability
+indicating which bundle types the server supports receiving. This value is a
+comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values
+reflects the priority/preference of that type, where the first value is the
+most preferred type.
+
+Handshake Protocol
+==================
+
+While not explicitly required, it is common for clients to perform a
+*handshake* when connecting to a server. The handshake accomplishes 2 things:
+
+* Obtaining capabilities and other server features
+* Flushing extra server output (e.g. SSH servers may print extra text
+  when connecting that may confuse the wire protocol)
+
+This isn't a traditional *handshake* as far as network protocols go because
+there is no persistent state as a result of the handshake: the handshake is
+simply the issuing of commands and commands are stateless.
+
+The canonical clients perform a capabilities lookup at connection establishment
+time. This is because clients must assume a server only supports the features
+of the original Mercurial server implementation until proven otherwise (from
+advertised capabilities). Nearly every server running today supports features
+that weren't present in the original Mercurial server implementation. Rather
+than wait for a client to perform functionality that needs to consult
+capabilities, it issues the lookup at connection start to avoid any delay later.
+
+For HTTP servers, the client sends a ``capabilities`` command request as
+soon as the connection is established. The server responds with a capabilities
+string, which the client parses.
+
+For SSH servers, the client sends the ``hello`` command (no arguments)
+and a ``between`` command with the ``pairs`` argument having the value
+``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
+
+The ``between`` command has been supported since the original Mercurial
+server. Requesting the empty range will return a ``\n`` string response,
+which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
+followed by the value, which happens to  be a newline).
+
+The ``hello`` command was later introduced. Servers supporting it will issue
+a response to that command before sending the ``1\n\n`` response to the
+``between`` command. Servers not supporting ``hello`` will send an empty
+response (``0\n``).
+
+In addition to the expected output from the ``hello`` and ``between`` commands,
+servers may also send other output, such as *message of the day (MOTD)*
+announcements. Clients assume servers will send this output before the
+Mercurial server replies to the client-issued commands. So any server output
+not conforming to the expected command responses is assumed to be not related
+to Mercurial and can be ignored.
+
+Commands
+========
+
+This section contains a list of all wire protocol commands implemented by
+the canonical Mercurial server.
+
+batch
+-----
+
+Issue multiple commands while sending a single command request. The purpose
+of this command is to allow a client to issue multiple commands while avoiding
+multiple round trips to the server therefore enabling commands to complete
+quicker.
+
+The command accepts a ``cmds`` argument that contains a list of commands to
+execute.
+
+The value of ``cmds`` is a ``;`` delimited list of strings. Each string has the
+form ``<command> <arguments>``. That is, the command name followed by a space
+followed by an argument string.
+
+The argument string is a ``,`` delimited list of ``<key>=<value>`` values
+corresponding to command arguments. Both the argument name and value are
+escaped using a special substitution map::
+
+   : -> :c
+   , -> :o
+   ; -> :s
+   = -> :e
+
+The response type for this command is ``string``. The value contains a
+``;`` delimited list of responses for each requested command. Each value
+in this list is escaped using the same substitution map used for arguments.
+
+If an error occurs, the generic error response may be sent.
+
+between
+-------
+
+(Legacy command used for discovery in old clients)
+
+Obtain nodes between pairs of nodes.
+
+The ``pairs`` arguments contains a space-delimited list of ``-`` delimited
+hex node pairs. e.g.::
+
+   a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896-6dc58916e7c070f678682bfe404d2e2d68291a18
+
+Return type is a ``string``. Value consists of lines corresponding to each
+requested range. Each line contains a space-delimited list of hex nodes.
+A newline ``\n`` terminates each line, including the last one.
+
+branchmap
+---------
+
+Obtain heads in named branches.
+
+Accepts no arguments. Return type is a ``string``.
+
+Return value contains lines with URL encoded branch names followed by a space
+followed by a space-delimited list of hex nodes of heads on that branch.
+e.g.::
+
+    default a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896 6dc58916e7c070f678682bfe404d2e2d68291a18
+    stable baae3bf31522f41dd5e6d7377d0edd8d1cf3fccc
+
+There is no trailing newline.
+
+branches
+--------
+
+Obtain ancestor changesets of specific nodes back to a branch point.
+
+Despite the name, this command has nothing to do with Mercurial named branches.
+Instead, it is related to DAG branches.
+
+The command accepts a ``nodes`` argument, which is a string of space-delimited
+hex nodes.
+
+For each node requested, the server will find the first ancestor node that is
+a DAG root or is a merge.
+
+Return type is a ``string``. Return value contains lines with result data for
+each requested node. Each line contains space-delimited nodes followed by a
+newline (``\n``). The 4 nodes reported on each line correspond to the requested
+node, the ancestor node found, and its 2 parent nodes (which may be the null
+node).
+
+capabilities
+------------
+
+Obtain the capabilities string for the repo.
+
+Unlike the ``hello`` command, the capabilities string is not prefixed.
+There is no trailing newline.
+
+This command does not accept any arguments. Return type is a ``string``.
+
+changegroup
+-----------
+
+(Legacy command: use ``getbundle`` instead)
+
+Obtain a changegroup version 1 with data for changesets that are
+descendants of client-specified changesets.
+
+The ``roots`` arguments contains a list of space-delimited hex nodes.
+
+The server responds with a changegroup version 1 containing all
+changesets between the requested root/base nodes and the repo's head nodes
+at the time of the request.
+
+The return type is a ``stream``.
+
+changegroupsubset
+-----------------
+
+(Legacy command: use ``getbundle`` instead)
+
+Obtain a changegroup version 1 with data for changesetsets between
+client specified base and head nodes.
+
+The ``bases`` argument contains a list of space-delimited hex nodes.
+The ``heads`` argument contains a list of space-delimited hex nodes.
+
+The server responds with a changegroup version 1 containing all
+changesets between the requested base and head nodes at the time of the
+request.
+
+The return type is a ``stream``.
+
+clonebundles
+------------
+
+Obtains a manifest of bundle URLs available to seed clones.
+
+Each returned line contains a URL followed by metadata. See the
+documentation in the ``clonebundles`` extension for more.
+
+The return type is a ``string``.
+
+getbundle
+---------
+
+Obtain a bundle containing repository data.
+
+This command accepts the following arguments:
+
+heads
+   List of space-delimited hex nodes of heads to retrieve.
+common
+   List of space-delimited hex nodes that the client has in common with the
+   server.
+obsmarkers
+   Boolean indicating whether to include obsolescence markers as part
+   of the response. Only works with bundle2.
+bundlecaps
+   Comma-delimited set of strings defining client bundle capabilities.
+listkeys
+   Comma-delimited list of strings of ``pushkey`` namespaces. For each
+   namespace listed, a bundle2 part will be included with the content of
+   that namespace.
+cg
+   Boolean indicating whether changegroup data is requested.
+cbattempted
+   Boolean indicating whether the client attempted to use the *clone bundles*
+   feature before performing this request.
+
+The return type on success is a ``stream`` where the value is bundle.
+On the HTTP transport, the response is zlib compressed.
+
+If an error occurs, a generic error response can be sent.
+
+Unless the client sends a false value for the ``cg`` argument, the returned
+bundle contains a changegroup with the nodes between the specified ``common``
+and ``heads`` nodes. Depending on the command arguments, the type and content
+of the returned bundle can vary significantly.
+
+The default behavior is for the server to send a raw changegroup version
+``01`` response.
+
+If the ``bundlecaps`` provided by the client contain a value beginning
+with ``HG2``, a bundle2 will be returned. The bundle2 data may contain
+additional repository data, such as ``pushkey`` namespace values.
+
+heads
+-----
+
+Returns a list of space-delimited hex nodes of repository heads followed
+by a newline. e.g.
+``a9eeb3adc7ddb5006c088e9eda61791c777cbf7c 31f91a3da534dc849f0d6bfc00a395a97cf218a1\n``
+
+This command does not accept any arguments. The return type is a ``string``.
+
+hello
+-----
+
+Returns lines describing interesting things about the server in an RFC-822
+like format.
+
+Currently, the only line defines the server capabilities. It has the form::
+
+    capabilities: <value>
+
+See above for more about the capabilities string.
+
+SSH clients typically issue this command as soon as a connection is
+established.
+
+This command does not accept any arguments. The return type is a ``string``.
+
+listkeys
+--------
+
+List values in a specified ``pushkey`` namespace.
+
+The ``namespace`` argument defines the pushkey namespace to operate on.
+
+The return type is a ``string``. The value is an encoded dictionary of keys.
+
+Key-value pairs are delimited by newlines (``\n``). Within each line, keys and
+values are separated by a tab (``\t``). Keys and values are both strings.
+
+lookup
+------
+
+Try to resolve a value to a known repository revision.
+
+The ``key`` argument is converted from bytes to an
+``encoding.localstr`` instance then passed into
+``localrepository.__getitem__`` in an attempt to resolve it.
+
+The return type is a ``string``.
+
+Upon successful resolution, returns ``1 <hex node>\n``. On failure,
+returns ``0 <error string>\n``. e.g.::
+
+   1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n
+
+   0 unknown revision 'foo'\n
+
+known
+-----
+
+Determine whether multiple nodes are known.
+
+The ``nodes`` argument is a list of space-delimited hex nodes to check
+for existence.
+
+The return type is ``string``.
+
+Returns a string consisting of ``0``s and ``1``s indicating whether nodes
+are known. If the Nth node specified in the ``nodes`` argument is known,
+a ``1`` will be returned at byte offset N. If the node isn't known, ``0``
+will be present at byte offset N.
+
+There is no trailing newline.
+
+pushkey
+-------
+
+Set a value using the ``pushkey`` protocol.
+
+Accepts arguments ``namespace``, ``key``, ``old``, and ``new``, which
+correspond to the pushkey namespace to operate on, the key within that
+namespace to change, the old value (which may be empty), and the new value.
+All arguments are string types.
+
+The return type is a ``string``. The value depends on the transport protocol.
+
+The SSH transport sends a string encoded integer followed by a newline
+(``\n``) which indicates operation result. The server may send additional
+output on the ``stderr`` stream that should be displayed to the user.
+
+The HTTP transport sends a string encoded integer followed by a newline
+followed by additional server output that should be displayed to the user.
+This may include output from hooks, etc.
+
+The integer result varies by namespace. ``0`` means an error has occurred
+and there should be additional output to display to the user.
+
+stream_out
+----------
+
+Obtain *streaming clone* data.
+
+The return type is either a ``string`` or a ``stream``, depending on
+whether the request was fulfilled properly.
+
+A return value of ``1\n`` indicates the server is not configured to serve
+this data. If this is seen by the client, they may not have verified the
+``stream`` capability is set before making the request.
+
+A return value of ``2\n`` indicates the server was unable to lock the
+repository to generate data.
+
+All other responses are a ``stream`` of bytes. The first line of this data
+contains 2 space-delimited integers corresponding to the path count and
+payload size, respectively::
+
+    <path count> <payload size>\n
+
+The ``<payload size>`` is the total size of path data: it does not include
+the size of the per-path header lines.
+
+Following that header are ``<path count>`` entries. Each entry consists of a
+line with metadata followed by raw revlog data. The line consists of::
+
+    <store path>\0<size>\n
+
+The ``<store path>`` is the encoded store path of the data that follows.
+``<size>`` is the amount of data for this store path/revlog that follows the
+newline.
+
+There is no trailer to indicate end of data. Instead, the client should stop
+reading after ``<path count>`` entries are consumed.
+
+unbundle
+--------
+
+Send a bundle containing data (usually changegroup data) to the server.
+
+Accepts the argument ``heads``, which is a space-delimited list of hex nodes
+corresponding to server repository heads observed by the client. This is used
+to detect race conditions and abort push operations before a server performs
+too much work or a client transfers too much data.
+
+The request payload consists of a bundle to be applied to the repository,
+similarly to as if :hg:`unbundle` were called.
+
+In most scenarios, a special ``push response`` type is returned. This type
+contains an integer describing the change in heads as a result of the
+operation. A value of ``0`` indicates nothing changed. ``1`` means the number
+of heads remained the same. Values ``2`` and larger indicate the number of
+added heads minus 1. e.g. ``3`` means 2 heads were added. Negative values
+indicate the number of fewer heads, also off by 1. e.g. ``-2`` means there
+is 1 fewer head.
+
+The encoding of the ``push response`` type varies by transport.
+
+For the SSH transport, this type is composed of 2 ``string`` responses: an
+empty response (``0\n``) followed by the integer result value. e.g.
+``1\n2``. So the full response might be ``0\n1\n2``.
+
+For the HTTP transport, the response is a ``string`` type composed of an
+integer result value followed by a newline (``\n``) followed by string
+content holding server output that should be displayed on the client (output
+hooks, etc).
+
+In some cases, the server may respond with a ``bundle2`` bundle. In this
+case, the response type is ``stream``. For the HTTP transport, the response
+is zlib compressed.
+
+The server may also respond with a generic error type, which contains a string
+indicating the failure.
diff --git a/tests/test-help.t b/tests/test-help.t
--- a/tests/test-help.t
+++ b/tests/test-help.t
@@ -928,16 +928,17 @@  internals topic renders index of availab
   $ hg help internals
   Technical implementation topics
   """""""""""""""""""""""""""""""
   
        bundles       Bundles
        changegroups  Changegroups
        requirements  Repository Requirements
        revlogs       Revision Logs
+       wireprotocol  Wire Protocol
 
 sub-topics can be accessed
 
   $ hg help internals.changegroups
   Changegroups
   """"""""""""
   
       Changegroups are representations of repository revlog data, specifically
@@ -2890,16 +2891,23 @@  Sub-topic indexes rendered properly
   </td></tr>
   <tr><td>
   <a href="/help/internals.revlogs">
   revlogs
   </a>
   </td><td>
   Revision Logs
   </td></tr>
+  <tr><td>
+  <a href="/help/internals.wireprotocol">
+  wireprotocol
+  </a>
+  </td><td>
+  Wire Protocol
+  </td></tr>
   
   
   
   
   
   </table>
   </div>
   </div>