Patchwork url: add distribution and version to user-agent request header (BC)

Submitter Gregory Szorc
Date July 14, 2016, 5:18 a.m.
Message ID <6ad61d5001b1fbfebf31.1468473506@ubuntu-vm-main>
Permalink /patch/15840/
State Superseded

Comments

Gregory Szorc - July 14, 2016, 5:18 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1468473406 25200
#      Wed Jul 13 22:16:46 2016 -0700
# Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
# Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
url: add distribution and version to user-agent request header (BC)

As a server operator, I've always wanted to know what Mercurial
version clients are running. Unfortunately, there is no easy
way to discern this today: the best you can do is sniff capabilities
from getbundle commands and those aren't updated frequently enough
to tell you anything that interesting.

This patch adds the distribution name and version to the user-agent
HTTP request header. We choose "Mercurial" for the distribution
name because that seems appropriate. The version string comes
from __version__. It should have no spaces and should therefore be
safe to include outside of quotes, parentheses, etc.

Flagging the patch as BC so it shows up in release notes. This
change should be backwards compatible. But I'm sure there is a server
operator somewhere filtering on the existing user-agent request
header. So I want to make noise about this change.
Mike Hommey - July 14, 2016, 5:38 a.m.
On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1468473406 25200
> #      Wed Jul 13 22:16:46 2016 -0700
> # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
> # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
> url: add distribution and version to user-agent request header (BC)
> 
> As a server operator, I've always wanted to know what Mercurial
> version clients are running. Unfortunately, there is no easy
> way to discern this today: the best you can do is sniff capabilities
> from getbundle commands and those aren't updated frequently enough
> to tell you anything that interesting.
> 
> This patch adds the distribution name and version to the user-agent
> HTTP request header. We choose "Mercurial" for the distribution
> name because that seems appropriate. The version string comes
> from __version__. It should have no spaces and should therefore be
> safe to include outside of quotes, parenthesis, etc.
> 
> Flagging the patch as BC so it shows up in release notes. This
> change should be backwards compatible. But I'm sure there is a server
> operator somewhere filtering on the existing user-agent request
> header. So I want to make noise about this change.

Did you check it doesn't break on e.g. bitbucket? They do user-agent
sniffing, but I don't know what exactly they are sniffing for (I just
know that if you do mercurial protocol requests with a git UA, it
rejects you)

Mike
Gregory Szorc - July 14, 2016, 5:47 a.m.
On Wed, Jul 13, 2016 at 10:38 PM, Mike Hommey <mh@glandium.org> wrote:

> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc@gmail.com>
> > # Date 1468473406 25200
> > #      Wed Jul 13 22:16:46 2016 -0700
> > # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
> > # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
> > url: add distribution and version to user-agent request header (BC)
> >
> > As a server operator, I've always wanted to know what Mercurial
> > version clients are running. Unfortunately, there is no easy
> > way to discern this today: the best you can do is sniff capabilities
> > from getbundle commands and those aren't updated frequently enough
> > to tell you anything that interesting.
> >
> > This patch adds the distribution name and version to the user-agent
> > HTTP request header. We choose "Mercurial" for the distribution
> > name because that seems appropriate. The version string comes
> > from __version__. It should have no spaces and should therefore be
> > safe to include outside of quotes, parenthesis, etc.
> >
> > Flagging the patch as BC so it shows up in release notes. This
> > change should be backwards compatible. But I'm sure there is a server
> > operator somewhere filtering on the existing user-agent request
> > header. So I want to make noise about this change.
>
> Did you check it doesn't break on e.g. bitbucket? They do user-agent
> sniffing, but I don't know what exactly they are sniffing for (I just
> know that if you do mercurial protocol requests with a git UA, it
> rejects you)
>

bitbucket works with this new user agent. It also works with the user agent
"foo."

bitbucket also accepts "git" but not "git/" (it 404s). So they appear to be
filtering on "git/"
Augie Fackler - July 14, 2016, 5:48 p.m.
On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1468473406 25200
> #      Wed Jul 13 22:16:46 2016 -0700
> # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
> # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
> url: add distribution and version to user-agent request header (BC)

It's actually intentional that we don't advertise hg version in either
direction to my recollection. That said, I have been meaning to write
a patch like this (but with it behind a config knob) so that big
companies can track how many versions of hg are in use. Can you do a
v2 with this off by default behind a config knob?
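
A minimal sketch of what "off by default behind a config knob" could look like. The knob name and the plain dict standing in for Mercurial's configuration are assumptions for illustration, not existing Mercurial settings:

```python
PROTOCOL_TOKEN = 'mercurial/proto-1.0'  # 1.0 is the _protocol_ version


def build_user_agent(config, version):
    """Return the User-Agent value, adding the version only when enabled.

    `config` is a plain dict standing in for Mercurial's configuration.
    The key 'experimental.sendversion' is hypothetical.
    """
    if config.get('experimental.sendversion', False):
        return '%s Mercurial/%s' % (PROTOCOL_TOKEN, version)
    # Default: keep the existing header so nothing changes for servers.
    return PROTOCOL_TOKEN
```

With an empty config this returns the existing header unchanged, so only clients that opt in would advertise a version.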

>
> As a server operator, I've always wanted to know what Mercurial
> version clients are running. Unfortunately, there is no easy
> way to discern this today: the best you can do is sniff capabilities
> from getbundle commands and those aren't updated frequently enough
> to tell you anything that interesting.
>
> This patch adds the distribution name and version to the user-agent
> HTTP request header. We choose "Mercurial" for the distribution
> name because that seems appropriate. The version string comes
> from __version__. It should have no spaces and should therefore be
> safe to include outside of quotes, parenthesis, etc.
>
> Flagging the patch as BC so it shows up in release notes. This
> change should be backwards compatible. But I'm sure there is a server
> operator somewhere filtering on the existing user-agent request
> header. So I want to make noise about this change.
>
> diff --git a/mercurial/url.py b/mercurial/url.py
> --- a/mercurial/url.py
> +++ b/mercurial/url.py
> @@ -500,18 +500,22 @@ def opener(ui, authinfo=None):
>          ui.debug('http auth: user %s, password %s\n' %
>                   (user, passwd and '*' * len(passwd) or 'not set'))
>
>      handlers.extend((httpbasicauthhandler(passmgr),
>                       httpdigestauthhandler(passmgr)))
>      handlers.extend([h(ui, passmgr) for h in handlerfuncs])
>      opener = urlreq.buildopener(*handlers)
>
> -    # 1.0 here is the _protocol_ version
> -    opener.addheaders = [('User-agent', 'mercurial/proto-1.0')]
> +    opener.addheaders = [('User-agent',
> +                          # 1.0 here is the _protocol_ version
> +                          # "Mercurial/%s" identifies the distribution name
> +                          # and version. Other implementations of the client
> +                          # should choose a different name.
> +                          'mercurial/proto-1.0 Mercurial/%s' % util.version())]
>      opener.addheaders.append(('Accept', 'application/mercurial-0.1'))
>      return opener
>
>  def open(ui, url_, data=None):
>      u = util.url(url_)
>      if u.scheme:
>          u.scheme = u.scheme.lower()
>          url_, authinfo = u.authinfo()
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Gregory Szorc - July 14, 2016, 6:04 p.m.
On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:

> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc@gmail.com>
> > # Date 1468473406 25200
> > #      Wed Jul 13 22:16:46 2016 -0700
> > # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
> > # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
> > url: add distribution and version to user-agent request header (BC)
>
> It's actually intentional that we don't advertise hg version in either
> direction to my recollection.


Do you know why?


> That said, I have been meaning to write
> a patch like this (but with it behind a config knob) so that big
> companies can track how many versions of hg are in use. Can you do a
> v2 with this off by default behind a config knob?
>

I /can/. But I'm not thrilled about making it optional because open source
projects (like Mozilla) don't have a good way of force turning it on :/


>
> >
> > As a server operator, I've always wanted to know what Mercurial
> > version clients are running. Unfortunately, there is no easy
> > way to discern this today: the best you can do is sniff capabilities
> > from getbundle commands and those aren't updated frequently enough
> > to tell you anything that interesting.
> >
> > This patch adds the distribution name and version to the user-agent
> > HTTP request header. We choose "Mercurial" for the distribution
> > name because that seems appropriate. The version string comes
> > from __version__. It should have no spaces and should therefore be
> > safe to include outside of quotes, parenthesis, etc.
> >
> > Flagging the patch as BC so it shows up in release notes. This
> > change should be backwards compatible. But I'm sure there is a server
> > operator somewhere filtering on the existing user-agent request
> > header. So I want to make noise about this change.
> >
> > diff --git a/mercurial/url.py b/mercurial/url.py
> > --- a/mercurial/url.py
> > +++ b/mercurial/url.py
> > @@ -500,18 +500,22 @@ def opener(ui, authinfo=None):
> >          ui.debug('http auth: user %s, password %s\n' %
> >                   (user, passwd and '*' * len(passwd) or 'not set'))
> >
> >      handlers.extend((httpbasicauthhandler(passmgr),
> >                       httpdigestauthhandler(passmgr)))
> >      handlers.extend([h(ui, passmgr) for h in handlerfuncs])
> >      opener = urlreq.buildopener(*handlers)
> >
> > -    # 1.0 here is the _protocol_ version
> > -    opener.addheaders = [('User-agent', 'mercurial/proto-1.0')]
> > +    opener.addheaders = [('User-agent',
> > +                          # 1.0 here is the _protocol_ version
> > +                          # "Mercurial/%s" identifies the distribution name
> > +                          # and version. Other implementations of the client
> > +                          # should choose a different name.
> > +                          'mercurial/proto-1.0 Mercurial/%s' % util.version())]
> >      opener.addheaders.append(('Accept', 'application/mercurial-0.1'))
> >      return opener
> >
> >  def open(ui, url_, data=None):
> >      u = util.url(url_)
> >      if u.scheme:
> >          u.scheme = u.scheme.lower()
> >          url_, authinfo = u.authinfo()
>
Augie Fackler - July 14, 2016, 6:06 p.m.
(+mpm for history confirmation)

On Thu, Jul 14, 2016 at 2:04 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
> On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:
>>
>> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
>> > # HG changeset patch
>> > # User Gregory Szorc <gregory.szorc@gmail.com>
>> > # Date 1468473406 25200
>> > #      Wed Jul 13 22:16:46 2016 -0700
>> > # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
>> > # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
>> > url: add distribution and version to user-agent request header (BC)
>>
>> It's actually intentional that we don't advertise hg version in either
>> direction to my recollection.
>
>
> Do you know why?

I believe it's so clients don't advertise "I'm vulnerable to X!", and
also a bit so that people properly use capabilities and not version
numbers to sniff for behavior.

>
>>
>> That said, I have been meaning to write
>> a patch like this (but with it behind a config knob) so that big
>> companies can track how many versions of hg are in use. Can you do a
>> v2 with this off by default behind a config knob?
>
>
> I /can/. But I'm not thrilled about making it optional because open source
> projects (like Mozilla) don't have a good way of force turning it on :/

I sympathize.
Gregory Szorc - July 14, 2016, 6:22 p.m.
On Thu, Jul 14, 2016 at 11:06 AM, Augie Fackler <raf@durin42.com> wrote:

> (+mpm for history confirmation)
>
> On Thu, Jul 14, 2016 at 2:04 PM, Gregory Szorc <gregory.szorc@gmail.com>
> wrote:
> > On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:
> >>
> >> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
> >> > # HG changeset patch
> >> > # User Gregory Szorc <gregory.szorc@gmail.com>
> >> > # Date 1468473406 25200
> >> > #      Wed Jul 13 22:16:46 2016 -0700
> >> > # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
> >> > # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
> >> > url: add distribution and version to user-agent request header (BC)
> >>
> >> It's actually intentional that we don't advertise hg version in either
> >> direction to my recollection.
> >
> >
> > Do you know why?
>
> I believe it's so clients don't advertise "I'm vulnerable to X!",


Browsers, Git, curl, wget, and nearly every other application advertise
version numbers and therefore vulnerabilities to known issues.


> and
> also a bit so that people properly use capabilities and not version
> numbers to sniff for behavior.
>

I sympathize. To counter that point, the User-Agent can also be used by
servers to work around bugs in known busted clients. This is explicitly
called out as a use case for the header in the HTTP RFCs.

To tie this into the concern about advertising vulnerable clients, servers
could detect vulnerable clients and a) serve a message to them telling them
to upgrade b) refuse to service them because they are broken.


>
> >
> >>
> >> That said, I have been meaning to write
> >> a patch like this (but with it behind a config knob) so that big
> >> companies can track how many versions of hg are in use. Can you do a
> >> v2 with this off by default behind a config knob?
> >
> >
> > > I /can/. But I'm not thrilled about making it optional because open source
> > > projects (like Mozilla) don't have a good way of force turning it on :/
>
> I sympathize.
>
Sean Farley - July 14, 2016, 6:24 p.m.
Gregory Szorc <gregory.szorc@gmail.com> writes:

> On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:
>
>> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
>> > # HG changeset patch
>> > # User Gregory Szorc <gregory.szorc@gmail.com>
>> > # Date 1468473406 25200
>> > #      Wed Jul 13 22:16:46 2016 -0700
>> > # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
>> > # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
>> > url: add distribution and version to user-agent request header (BC)
>>
>> It's actually intentional that we don't advertise hg version in either
>> direction to my recollection.
>
>
> Do you know why?

If I recall, it was to force people to properly program against
capabilities and not versions.

>> That said, I have been meaning to write
>> a patch like this (but with it behind a config knob) so that big
>> companies can track how many versions of hg are in use. Can you do a
>> v2 with this off by default behind a config knob?
>>
>
> I /can/. But I'm not thrilled about making it optional because open source
> projects (like Mozilla) don't have a good way of force turning it on :/

I agree here. I'd either want this on all the time or not at all.
Matt Mackall - July 14, 2016, 10:16 p.m.
On Thu, 2016-07-14 at 11:22 -0700, Gregory Szorc wrote:
> On Thu, Jul 14, 2016 at 11:06 AM, Augie Fackler <raf@durin42.com> wrote:
> 
> > 
> > (+mpm for history confirmation)
> > 
> > On Thu, Jul 14, 2016 at 2:04 PM, Gregory Szorc <gregory.szorc@gmail.com>
> > wrote:
> > > 
> > > On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:
> > > > 
> > > > 
> > > > On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
> > > > > 
> > > > > # HG changeset patch
> > > > > # User Gregory Szorc <gregory.szorc@gmail.com>
> > > > > # Date 1468473406 25200
> > > > > #      Wed Jul 13 22:16:46 2016 -0700
> > > > > # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
> > > > > # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
> > > > > url: add distribution and version to user-agent request header (BC)
> > > > It's actually intentional that we don't advertise hg version in either
> > > > direction to my recollection.
> > > 
> > > Do you know why?
> > I believe it's so clients don't advertise "I'm vulnerable to X!",
> 
> Browsers, Git, curl, wget, and nearly every other application advertises
> version numbers and therefore vulnerabilities to known issues.

Yeah, this is actually a bigger issue in the other direction. The _server_
should not expose its version because it's a sitting target that's vulnerable to
scanning. But our capabilities model makes us somewhat vulnerable to
fingerprinting that can establish ranges. Mostly not a problem: our server
attack surface for unauthenticated users is pretty solid.

> 
> > 
> > and
> > also a bit so that people properly use capabilities and not version
> > numbers to sniff for behavior.
> > 
> I sympathize. To counter that point, the User-Agent can also be used by
> servers to work around bugs in known busted clients. This is explicitly
> called out as a use case for the header in the HTTP RFCs.

Yes, but the web's protocol model is not really the same as ours. In Mercurial,
the client is supposed to make every protocol decision while the server
passively does what it's asked to do while keeping no state and making no
choices. There's no negotiation taking place, so the server never needs or
should know what the client is capable of except via what it asks for.

This model has served us very well for keeping the server side simple and
compatible and the client complexity has been pretty well-contained as well. But
it does have the various downsides you've mentioned like being unable to herd
your kittens remotely.

I'm ok with exposing a version to servers for logging purposes, but we shouldn't
do anything to facilitate looking at it in the server side code.  
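
As a rough illustration of the logging-only use, an operator could tally versions from recorded User-Agent values offline, without the server code ever branching on them. This sketch assumes the "Mercurial/<version>" token format from the patch and a version with no spaces; the function name is made up for the example:

```python
import collections
import re

# Matches the distribution token proposed in the patch, e.g. "Mercurial/3.8.4".
# Case-sensitive, so it does not match the "mercurial/proto-1.0" protocol token.
_VERSION_RE = re.compile(r'\bMercurial/(\S+)')


def count_client_versions(user_agents):
    """Tally client versions from an iterable of User-Agent strings."""
    counts = collections.Counter()
    for ua in user_agents:
        match = _VERSION_RE.search(ua)
        counts[match.group(1) if match else 'unknown'] += 1
    return counts
```

Clients that send only the old header are counted under 'unknown'.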

I think your theory that __version__ won't contain spaces is optimistic.

-- 
Mathematics is the supreme nostalgia of our time.
Mike Hommey - July 14, 2016, 10:38 p.m.
On Thu, Jul 14, 2016 at 05:16:12PM -0500, Matt Mackall wrote:
> Yes, but the web's protocol model is not really the same as ours. In Mercurial,
> the client is supposed to make every protocol decision while the server
> passively does what it's asked to do while keeping no state and making no
> choices. There's no negotiation taking place, so the server never needs or
> should know what the client is capable of except via what it asks for.

Note that negotiation /could/ be added, through the Accept header. bundle2
pushes also expose client capabilities in the reply:changegroup part.

Mike
Pierre-Yves David - July 15, 2016, 1:24 a.m.
On 07/15/2016 12:16 AM, Matt Mackall wrote:
> On Thu, 2016-07-14 at 11:22 -0700, Gregory Szorc wrote:
>> On Thu, Jul 14, 2016 at 11:06 AM, Augie Fackler <raf@durin42.com> wrote:
>>
>>>
>>> (+mpm for history confirmation)
>>>
>>> On Thu, Jul 14, 2016 at 2:04 PM, Gregory Szorc <gregory.szorc@gmail.com>
>>> wrote:
>>>>
>>>> On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
>>>>>>
>>>>>> # HG changeset patch
>>>>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>>>>> # Date 1468473406 25200
>>>>>> #      Wed Jul 13 22:16:46 2016 -0700
>>>>>> # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
>>>>>> # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
>>>>>> url: add distribution and version to user-agent request header (BC)
>>>>> It's actually intentional that we don't advertise hg version in either
>>>>> direction to my recollection.
>>>>
>>>> Do you know why?
>>> I believe it's so clients don't advertise "I'm vulnerable to X!",
>>
>> Browsers, Git, curl, wget, and nearly every other application advertises
>> version numbers and therefore vulnerabilities to known issues.
> 
> Yeah, this is actually a bigger issue in the other direction. The _server_
> should not expose its version because it's a sitting target that's vulnerable to
> scanning. But our capabilities model makes us somewhat vulnerable to
> fingerprinting that can establish ranges. Mostly not a problem: our server
> attack surface for unauthenticated users is pretty solid.
> 
>>
>>>
>>> and
>>> also a bit so that people properly use capabilities and not version
>>> numbers to sniff for behavior.
>>>
>> I sympathize. To counter that point, the User-Agent can also be used by
>> servers to work around bugs in known busted clients. This is explicitly
>> called out as a use case for the header in the HTTP RFCs.
> 
> Yes, but the web's protocol model is not really the same as ours. In Mercurial,
> the client is supposed to make every protocol decision while the server
> passively does what it's asked to do while keeping no state and making no
> choices. There's no negotiation taking place, so the server never needs or
> should know what the client is capable of except via what it asks for.
> 
> This model has served us very well for keeping the server side simple and
> compatible and the client complexity has been pretty well-contained as well. But
> it does have the various downsides you've mentioned like being unable to herd
> your kittens remotely.
> 
> I'm ok with exposing a version to servers for logging purposes, but we shouldn't
> do anything to facilitate looking at it in the server side code.

I'm leaning the same way. Helping people get stats on their clients
seems like a good idea, but we should make sure there are enough safeguards
around this that people stick to capabilities for all logic-related code.

> I think your theory that __version__ won't contain spaces is optimistic.

Yep, we probably want to urlencode this.
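
Something along these lines, as a sketch only. It uses the standard library's quote() instead of anything in util, and the set of characters left unescaped is an assumption:

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def user_agent(version):
    """Build the header value with the version percent-encoded.

    Spaces and other unsafe characters are escaped so the version stays a
    single token. Keeping '.', '+' and '-' unescaped is an illustrative choice.
    """
    safe_version = quote(version, safe='.+-')
    return 'mercurial/proto-1.0 Mercurial/%s' % safe_version
```

A version string containing a space would then be sent with "%20" in its place.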
timeless - July 17, 2016, 8:40 p.m.
If you're going to include things, might I suggest including the python version?

Something so that you can count pypy, cpy, py3, py2.6/py2.7, ironpy...
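
For instance, the standard platform module already reports both pieces. A small sketch; the "Implementation/version" token layout and the function name are only suggestions:

```python
import platform


def python_token():
    """Return a token such as "CPython/2.7.12" or "PyPy/2.7.10".

    platform.python_implementation() names the interpreter and
    platform.python_version() gives its version string.
    """
    return '%s/%s' % (platform.python_implementation(),
                      platform.python_version())
```

This token could be appended to the User-Agent after the Mercurial one.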

On Thu, Jul 14, 2016 at 9:24 PM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 07/15/2016 12:16 AM, Matt Mackall wrote:
>> On Thu, 2016-07-14 at 11:22 -0700, Gregory Szorc wrote:
>>> On Thu, Jul 14, 2016 at 11:06 AM, Augie Fackler <raf@durin42.com> wrote:
>>>
>>>>
>>>> (+mpm for history confirmation)
>>>>
>>>> On Thu, Jul 14, 2016 at 2:04 PM, Gregory Szorc <gregory.szorc@gmail.com>
>>>> wrote:
>>>>>
>>>>> On Thu, Jul 14, 2016 at 10:48 AM, Augie Fackler <raf@durin42.com> wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 10:18:26PM -0700, Gregory Szorc wrote:
>>>>>>>
>>>>>>> # HG changeset patch
>>>>>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>>>>>> # Date 1468473406 25200
>>>>>>> #      Wed Jul 13 22:16:46 2016 -0700
>>>>>>> # Node ID 6ad61d5001b1fbfebf317d0557f158d4b34a0772
>>>>>>> # Parent  52433f89f816e21ca992ac8c4a41cba0345f1b73
>>>>>>> url: add distribution and version to user-agent request header (BC)
>>>>>> It's actually intentional that we don't advertise hg version in either
>>>>>> direction to my recollection.
>>>>>
>>>>> Do you know why?
>>>> I believe it's so clients don't advertise "I'm vulnerable to X!",
>>>
>>> Browsers, Git, curl, wget, and nearly every other application advertises
>>> version numbers and therefore vulnerabilities to known issues.
>>
>> Yeah, this is actually a bigger issue in the other direction. The _server_
>> should not expose its version because it's a sitting target that's vulnerable to
>> scanning. But our capabilities model makes us somewhat vulnerable to
>> fingerprinting that can establish ranges. Mostly not a problem: our server
>> attack surface for unauthenticated users is pretty solid.
>>
>>>
>>>>
>>>> and
>>>> also a bit so that people properly use capabilities and not version
>>>> numbers to sniff for behavior.
>>>>
>>> I sympathize. To counter that point, the User-Agent can also be used by
>>> servers to work around bugs in known busted clients. This is explicitly
>>> called out as a use case for the header in the HTTP RFCs.
>>
>> Yes, but the web's protocol model is not really the same as ours. In Mercurial,
>> the client is supposed to make every protocol decision while the server
>> passively does what it's asked to do while keeping no state and making no
>> choices. There's no negotiation taking place, so the server never needs or
>> should know what the client is capable of except via what it asks for.
>>
>> This model has served us very well for keeping the server side simple and
>> compatible and the client complexity has been pretty well-contained as well. But
>> it does have the various downsides you've mentioned like being unable to herd
>> your kittens remotely.
>>
>> I'm ok with exposing a version to servers for logging purposes, but we shouldn't
>> do anything to facilitate looking at it in the server side code.
>
> I'm leaning the same way, helping people to get state on their clients
> seems like a good idea, but we should  make sure there is enough ward
> around this that people stick on capability for all logic related code.
>
>> I think your theory that __version__ won't contain spaces is optimistic.
>
> Yep, we probably want to urlencode this.
>
> --
> Pierre-Yves David

Patch

diff --git a/mercurial/url.py b/mercurial/url.py
--- a/mercurial/url.py
+++ b/mercurial/url.py
@@ -500,18 +500,22 @@  def opener(ui, authinfo=None):
         ui.debug('http auth: user %s, password %s\n' %
                  (user, passwd and '*' * len(passwd) or 'not set'))
 
     handlers.extend((httpbasicauthhandler(passmgr),
                      httpdigestauthhandler(passmgr)))
     handlers.extend([h(ui, passmgr) for h in handlerfuncs])
     opener = urlreq.buildopener(*handlers)
 
-    # 1.0 here is the _protocol_ version
-    opener.addheaders = [('User-agent', 'mercurial/proto-1.0')]
+    opener.addheaders = [('User-agent',
+                          # 1.0 here is the _protocol_ version
+                          # "Mercurial/%s" identifies the distribution name
+                          # and version. Other implementations of the client
+                          # should choose a different name.
+                          'mercurial/proto-1.0 Mercurial/%s' % util.version())]
     opener.addheaders.append(('Accept', 'application/mercurial-0.1'))
     return opener
 
 def open(ui, url_, data=None):
     u = util.url(url_)
     if u.scheme:
         u.scheme = u.scheme.lower()
         url_, authinfo = u.authinfo()
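
For readers who want to see the effect of the change outside Mercurial, here is a standalone sketch using the Python 3 standard library's urllib.request directly instead of Mercurial's urlreq wrapper. The make_opener name and the version argument are placeholders for illustration:

```python
import urllib.request


def make_opener(version):
    """Build an opener whose default headers mirror the patched behavior."""
    opener = urllib.request.build_opener()
    opener.addheaders = [
        # 1.0 is the _protocol_ version; "Mercurial/<version>" names the
        # distribution and its version.
        ('User-agent', 'mercurial/proto-1.0 Mercurial/%s' % version),
        ('Accept', 'application/mercurial-0.1'),
    ]
    return opener
```

Every request made through the returned opener would carry both headers.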