Patchwork [3,of,7] dispatch: rework the serve --stdio safe argument checks

login
register
mail settings
Submitter Paul Morelle
Date June 20, 2018, 4:36 p.m.
Message ID <81edf3431b95d57257c6.1529512584@belenos.localdomain>
Download mbox | patch
Permalink /patch/32346/
State Accepted
Headers show

Comments

Paul Morelle - June 20, 2018, 4:36 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1529489906 -7200
#      Wed Jun 20 12:18:26 2018 +0200
# Node ID 81edf3431b95d57257c690f7fe125c6676a78e18
# Parent  b7051e4bf783c844f705473a2396458acecc59dc
# EXP-Topic remote-debug
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 81edf3431b95
dispatch: rework the serve --stdio safe argument checks

We prepare the code to check for arguments after the mandatory ones.

We want to relax this check to allow for wider argument passing in certain
conditions (example --debug). We make preliminary refactoring in different
changesets for clarity.
Yuya Nishihara - June 21, 2018, 11:53 a.m.
On Wed, 20 Jun 2018 18:36:24 +0200, Paul Morelle wrote:
>> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1529489906 -7200
> #      Wed Jun 20 12:18:26 2018 +0200
> # Node ID 81edf3431b95d57257c690f7fe125c6676a78e18
> # Parent  b7051e4bf783c844f705473a2396458acecc59dc
> # EXP-Topic remote-debug
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 81edf3431b95
> dispatch: rework the serve --stdio safe argument checks
> 
> We prepare the code to check for arguments after the mandatory ones.
> 
> We want to relax this check to allow for wider argument passing in certain
> conditions (example --debug). We make preliminary refactoring in different
> changesets for clarity.
> 
> diff -r b7051e4bf783 -r 81edf3431b95 mercurial/dispatch.py
> --- a/mercurial/dispatch.py	Wed Jun 20 12:16:48 2018 +0200
> +++ b/mercurial/dispatch.py	Wed Jun 20 12:18:26 2018 +0200
> @@ -285,12 +285,15 @@
>              def unsafe():
>                  msg = _('potentially unsafe serve --stdio invocation: %s')
>                  raise error.Abort(msg % (stringutil.pprint(req.args),))
> -            if (len(req.args) != 4 or
> +            if (len(req.args) < 4 or
>                  req.args[0] != '-R' or
>                  req.args[1].startswith('--') or
>                  req.args[2] != 'serve' or
>                  req.args[3] != '--stdio'):
>                  unsafe()
> +            other_args = req.args[4:]
> +            if other_args:
> +                unsafe()

It's a bit scary to extend this just for debugging aids because argument
parsing at this phase has to be ad-hoc. Can't you instead use the
ssh/authorized_keys file to redirect a server session to 'hg debugserve'?

Alternatively, we could add a wrapper script like hg-ssh.
Paul Morelle - June 21, 2018, 5:46 p.m.
On 21/06/18 13:53, Yuya Nishihara wrote:
> On Wed, 20 Jun 2018 18:36:24 +0200, Paul Morelle wrote:
>>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1529489906 -7200
>> #      Wed Jun 20 12:18:26 2018 +0200
>> # Node ID 81edf3431b95d57257c690f7fe125c6676a78e18
>> # Parent  b7051e4bf783c844f705473a2396458acecc59dc
>> # EXP-Topic remote-debug
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 81edf3431b95
>> dispatch: rework the serve --stdio safe argument checks
>>
>> We prepare the code to check for arguments after the mandatory ones.
>>
>> We want to relax this check to allow for wider argument passing in certain
>> conditions (example --debug). We make preliminary refactoring in different
>> changesets for clarity.
>>
>> diff -r b7051e4bf783 -r 81edf3431b95 mercurial/dispatch.py
>> --- a/mercurial/dispatch.py	Wed Jun 20 12:16:48 2018 +0200
>> +++ b/mercurial/dispatch.py	Wed Jun 20 12:18:26 2018 +0200
>> @@ -285,12 +285,15 @@
>>              def unsafe():
>>                  msg = _('potentially unsafe serve --stdio invocation: %s')
>>                  raise error.Abort(msg % (stringutil.pprint(req.args),))
>> -            if (len(req.args) != 4 or
>> +            if (len(req.args) < 4 or
>>                  req.args[0] != '-R' or
>>                  req.args[1].startswith('--') or
>>                  req.args[2] != 'serve' or
>>                  req.args[3] != '--stdio'):
>>                  unsafe()
>> +            other_args = req.args[4:]
>> +            if other_args:
>> +                unsafe()
> It's a bit scary to extend this just for debugging aids because argument
> parsing at this phase has to be ad-hoc. Can't you instead use the
> ssh/authorized_keys file to redirect a server session to 'hg debugserve'?
>
> Alternatively, we could add a wrapper script like hg-ssh.
Hello Yuya,

If I have correctly understood, your proposition is to keep the
client-side version of this, and move the permission management to the
sshkey/ssh-server level. Is this right?

Something we could do in this area is to replace the call to sshpeer by
`hg debugserve …` when we need the remote-debugging feature.
However, exposing a "debug" command at the wireprotocol level seems bad
; maybe we could introduce a `--remote-argument` flag that would lift
the check with a permission handling similar to what we have today (in
patch 4).

However, having all checks disabled (debugserve) is quite different than
what is proposed in this series, which only allows for more information
to be retrieved, and that's it.
Fully bypassing the argument check would allow the client do a full
range of actions (including arbitrary code execution). This is a much
different access level, and in my current use case it would be a problem.

I don't think that moving the permission handling outside of Mercurial
would be a good move : implementing similar features for HTTP would
still require some Mercurial-side restrictions and permissions handling
anyway. It seems more consistent to implement it on the Mercurial side
for all protocols.

Cheers,

Paul Morelle
Yuya Nishihara - June 22, 2018, 11:15 a.m.
On Thu, 21 Jun 2018 19:46:24 +0200, Paul Morelle wrote:
> On 21/06/18 13:53, Yuya Nishihara wrote:
> > On Wed, 20 Jun 2018 18:36:24 +0200, Paul Morelle wrote:
> >>> # HG changeset patch
> >> # User Boris Feld <boris.feld@octobus.net>
> >> # Date 1529489906 -7200
> >> #      Wed Jun 20 12:18:26 2018 +0200
> >> # Node ID 81edf3431b95d57257c690f7fe125c6676a78e18
> >> # Parent  b7051e4bf783c844f705473a2396458acecc59dc
> >> # EXP-Topic remote-debug
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 81edf3431b95
> >> dispatch: rework the serve --stdio safe argument checks
> >>
> >> We prepare the code to check for arguments after the mandatory ones.
> >>
> >> We want to relax this check to allow for wider argument passing in certain
> >> conditions (example --debug). We make preliminary refactoring in different
> >> changesets for clarity.
> >>
> >> diff -r b7051e4bf783 -r 81edf3431b95 mercurial/dispatch.py
> >> --- a/mercurial/dispatch.py	Wed Jun 20 12:16:48 2018 +0200
> >> +++ b/mercurial/dispatch.py	Wed Jun 20 12:18:26 2018 +0200
> >> @@ -285,12 +285,15 @@
> >>              def unsafe():
> >>                  msg = _('potentially unsafe serve --stdio invocation: %s')
> >>                  raise error.Abort(msg % (stringutil.pprint(req.args),))
> >> -            if (len(req.args) != 4 or
> >> +            if (len(req.args) < 4 or
> >>                  req.args[0] != '-R' or
> >>                  req.args[1].startswith('--') or
> >>                  req.args[2] != 'serve' or
> >>                  req.args[3] != '--stdio'):
> >>                  unsafe()
> >> +            other_args = req.args[4:]
> >> +            if other_args:
> >> +                unsafe()
> > It's a bit scary to extend this just for debugging aids because argument
> > parsing at this phase has to be ad-hoc. Can't you instead use the
> > ssh/authorized_keys file to redirect a server session to 'hg debugserve'?
> >
> > Alternatively, we could add a wrapper script like hg-ssh.
> 
> If I have correctly understood, your proposition is to keep the
> client-side version of this, and move the permission management to the
> sshkey/ssh-server level. Is this right?

Yes.

> Something we could do in this area is to replace the call to sshpeer by
> `hg debugserve …` when we need the remote-debugging feature.
> However, exposing a "debug" command at the wireprotocol level seems bad
> ; maybe we could introduce a `--remote-argument` flag that would lift
> the check with a permission handling similar to what we have today (in
> patch 4).
> 
> However, having all checks disabled (debugserve) is quite different than
> what is proposed in this series, which only allows for more information
> to be retrieved, and that's it.
> Fully bypassing the argument check would allow the client do a full
> range of actions (including arbitrary code execution). This is a much
> different access level, and in my current use case it would be a problem.

Instead of using debugserve, the wrapper script can set ui.debug/traceback
flag just like contrib/hg-ssh does for hooks.

> I don't think that moving the permission handling outside of Mercurial
> would be a good move : implementing similar features for HTTP would
> still require some Mercurial-side restrictions and permissions handling
> anyway. It seems more consistent to implement it on the Mercurial side
> for all protocols.

I agree it should be handled at protocol level, and I think it should be
a wire command/option rather than the command argument hack. IMHO, it's
way safer than constructing/parsing shell command arguments carefully,
and HTTP will need that anyway.
Paul Morelle - June 25, 2018, 5:51 p.m.
On 22/06/18 13:15, Yuya Nishihara wrote:
> On Thu, 21 Jun 2018 19:46:24 +0200, Paul Morelle wrote:
>> On 21/06/18 13:53, Yuya Nishihara wrote:
>>> On Wed, 20 Jun 2018 18:36:24 +0200, Paul Morelle wrote:
>>>>> # HG changeset patch
>>>> # User Boris Feld <boris.feld@octobus.net>
>>>> # Date 1529489906 -7200
>>>> #      Wed Jun 20 12:18:26 2018 +0200
>>>> # Node ID 81edf3431b95d57257c690f7fe125c6676a78e18
>>>> # Parent  b7051e4bf783c844f705473a2396458acecc59dc
>>>> # EXP-Topic remote-debug
>>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 81edf3431b95
>>>> dispatch: rework the serve --stdio safe argument checks
>>>>
>>>> We prepare the code to check for arguments after the mandatory ones.
>>>>
>>>> We want to relax this check to allow for wider argument passing in certain
>>>> conditions (example --debug). We make preliminary refactoring in different
>>>> changesets for clarity.
>>>>
>>>> diff -r b7051e4bf783 -r 81edf3431b95 mercurial/dispatch.py
>>>> --- a/mercurial/dispatch.py	Wed Jun 20 12:16:48 2018 +0200
>>>> +++ b/mercurial/dispatch.py	Wed Jun 20 12:18:26 2018 +0200
>>>> @@ -285,12 +285,15 @@
>>>>              def unsafe():
>>>>                  msg = _('potentially unsafe serve --stdio invocation: %s')
>>>>                  raise error.Abort(msg % (stringutil.pprint(req.args),))
>>>> -            if (len(req.args) != 4 or
>>>> +            if (len(req.args) < 4 or
>>>>                  req.args[0] != '-R' or
>>>>                  req.args[1].startswith('--') or
>>>>                  req.args[2] != 'serve' or
>>>>                  req.args[3] != '--stdio'):
>>>>                  unsafe()
>>>> +            other_args = req.args[4:]
>>>> +            if other_args:
>>>> +                unsafe()
>>> It's a bit scary to extend this just for debugging aids because argument
>>> parsing at this phase has to be ad-hoc. Can't you instead use the
>>> ssh/authorized_keys file to redirect a server session to 'hg debugserve'?
>>>
>>> Alternatively, we could add a wrapper script like hg-ssh.
>> If I have correctly understood, your proposition is to keep the
>> client-side version of this, and move the permission management to the
>> sshkey/ssh-server level. Is this right?
> Yes.
>
>> Something we could do in this area is to replace the call to sshpeer by
>> `hg debugserve …` when we need the remote-debugging feature.
>> However, exposing a "debug" command at the wireprotocol level seems bad
>> ; maybe we could introduce a `--remote-argument` flag that would lift
>> the check with a permission handling similar to what we have today (in
>> patch 4).
>>
>> However, having all checks disabled (debugserve) is quite different than
>> what is proposed in this series, which only allows for more information
>> to be retrieved, and that's it.
>> Fully bypassing the argument check would allow the client do a full
>> range of actions (including arbitrary code execution). This is a much
>> different access level, and in my current use case it would be a problem.
> Instead of using debugserve, the wrapper script can set ui.debug/traceback
> flag just like contrib/hg-ssh does for hooks.
Doing this at the authorized_keys level is not an option for us. In one
of our target environment for this kind of debugging, any change to the
ssh script requires validation by admin people that can take from a
couple of days to a couple of weeks, making it impossible to turn
debugging on for a couple of commands, or easily compare various settings.

The ability to control the used debug options from the client is very
valuable in this case. We have already gathered many useful information
from it!

I agree that ad-hoc argument parsing is less than optimal. Handling
three extra flags (--debug, --profile and --traceback) is not too awful,
but then we need to pass some associated configuration option (e.g.
profiling.time-track=real) which will make things even more verbose.

Maybe we could leverage the contents of `mercurial/configitems.py` by
adding a "serversafe" attribute to the configuration items? Such config
item could be specified on the client side (if the user is allowed to).

What do you think about this?

>> I don't think that moving the permission handling outside of Mercurial
>> would be a good move : implementing similar features for HTTP would
>> still require some Mercurial-side restrictions and permissions handling
>> anyway. It seems more consistent to implement it on the Mercurial side
>> for all protocols.
> I agree it should be handled at protocol level, and I think it should be
> a wire command/option rather than the command argument hack. IMHO, it's
> way safer than constructing/parsing shell command arguments carefully,
> and HTTP will need that anyway.
The life cycle of the server is quite different in ssh and http, and
this has different implications.
For the ssh client, there are multiple initializations and cachings that
are worth tracking, so tracking the full run is what is most appropriate
for our needs.
For http, the arguments will have to be processed at the hgweb level
(instead of the protocol level).

(note: we will need a similar kind of handling to give access to the
obsolete changesets of a server)

--
Paul Morelle
Yuya Nishihara - June 26, 2018, 11:53 a.m.
On Mon, 25 Jun 2018 19:51:52 +0200, Paul Morelle wrote:
> On 22/06/18 13:15, Yuya Nishihara wrote:
> > On Thu, 21 Jun 2018 19:46:24 +0200, Paul Morelle wrote:
> >> On 21/06/18 13:53, Yuya Nishihara wrote:
> >>> On Wed, 20 Jun 2018 18:36:24 +0200, Paul Morelle wrote:
> >>>>> # HG changeset patch
> >>>> # User Boris Feld <boris.feld@octobus.net>
> >>>> # Date 1529489906 -7200
> >>>> #      Wed Jun 20 12:18:26 2018 +0200
> >>>> # Node ID 81edf3431b95d57257c690f7fe125c6676a78e18
> >>>> # Parent  b7051e4bf783c844f705473a2396458acecc59dc
> >>>> # EXP-Topic remote-debug
> >>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 81edf3431b95
> >>>> dispatch: rework the serve --stdio safe argument checks
> >>>>
> >>>> We prepare the code to check for arguments after the mandatory ones.
> >>>>
> >>>> We want to relax this check to allow for wider argument passing in certain
> >>>> conditions (example --debug). We make preliminary refactoring in different
> >>>> changesets for clarity.
> >>>>
> >>>> diff -r b7051e4bf783 -r 81edf3431b95 mercurial/dispatch.py
> >>>> --- a/mercurial/dispatch.py	Wed Jun 20 12:16:48 2018 +0200
> >>>> +++ b/mercurial/dispatch.py	Wed Jun 20 12:18:26 2018 +0200
> >>>> @@ -285,12 +285,15 @@
> >>>>              def unsafe():
> >>>>                  msg = _('potentially unsafe serve --stdio invocation: %s')
> >>>>                  raise error.Abort(msg % (stringutil.pprint(req.args),))
> >>>> -            if (len(req.args) != 4 or
> >>>> +            if (len(req.args) < 4 or
> >>>>                  req.args[0] != '-R' or
> >>>>                  req.args[1].startswith('--') or
> >>>>                  req.args[2] != 'serve' or
> >>>>                  req.args[3] != '--stdio'):
> >>>>                  unsafe()
> >>>> +            other_args = req.args[4:]
> >>>> +            if other_args:
> >>>> +                unsafe()
> >>> It's a bit scary to extend this just for debugging aids because argument
> >>> parsing at this phase has to be ad-hoc. Can't you instead use the
> >>> ssh/authorized_keys file to redirect a server session to 'hg debugserve'?
> >>>
> >>> Alternatively, we could add a wrapper script like hg-ssh.
> >> If I have correctly understood, your proposition is to keep the
> >> client-side version of this, and move the permission management to the
> >> sshkey/ssh-server level. Is this right?
> > Yes.
> >
> >> Something we could do in this area is to replace the call to sshpeer by
> >> `hg debugserve …` when we need the remote-debugging feature.
> >> However, exposing a "debug" command at the wireprotocol level seems bad
> >> ; maybe we could introduce a `--remote-argument` flag that would lift
> >> the check with a permission handling similar to what we have today (in
> >> patch 4).
> >>
> >> However, having all checks disabled (debugserve) is quite different than
> >> what is proposed in this series, which only allows for more information
> >> to be retrieved, and that's it.
> >> Fully bypassing the argument check would allow the client do a full
> >> range of actions (including arbitrary code execution). This is a much
> >> different access level, and in my current use case it would be a problem.
> > Instead of using debugserve, the wrapper script can set ui.debug/traceback
> > flag just like contrib/hg-ssh does for hooks.
> Doing this at the authorized_keys level is not an option for us. In one
> of our target environment for this kind of debugging, any change to the
> ssh script requires validation by admin people that can take from a
> couple of days to a couple of weeks, making it impossible to turn
> debugging on for a couple of commands, or easily compare various settings.

IMHO, that doesn't justify why we have to put everything in hg core.
"hg serve" itself is an SSH script in security POV.

> The ability to control the used debug options from the client is very
> valuable in this case. We have already gathered many useful information
> from it!
> 
> I agree that ad-hoc argument parsing is less than optimal. Handling
> three extra flags (--debug, --profile and --traceback) is not too awful,
> but then we need to pass some associated configuration option (e.g.
> profiling.time-track=real) which will make things even more verbose.
> 
> Maybe we could leverage the contents of `mercurial/configitems.py` by
> adding a "serversafe" attribute to the configuration items? Such config
> item could be specified on the client side (if the user is allowed to).

No, that sounds worse than the original patch. IIUC, the reason why we have
hard-coded check for req.args is that our option parser isn't strict and is
likely to allow command-line injection. Making the whitelist fully configurable
would be dangerous.

> >> I don't think that moving the permission handling outside of Mercurial
> >> would be a good move : implementing similar features for HTTP would
> >> still require some Mercurial-side restrictions and permissions handling
> >> anyway. It seems more consistent to implement it on the Mercurial side
> >> for all protocols.
> > I agree it should be handled at protocol level, and I think it should be
> > a wire command/option rather than the command argument hack. IMHO, it's
> > way safer than constructing/parsing shell command arguments carefully,
> > and HTTP will need that anyway.
> The life cycle of the server is quite different in ssh and http, and
> this has different implications.
> For the ssh client, there are multiple initializations and cachings that
> are worth tracking, so tracking the full run is what is most appropriate
> for our needs.
> For http, the arguments will have to be processed at the hgweb level
> (instead of the protocol level).

I know the situation of HTTP and SSH is slightly different, but how important
it is to catch the early-stage thingy by extending the SSH command-line
interface, which we have to support forever? I think wire-protocol level
implementation will be good enough, and better in the long run.
Augie Fackler - June 26, 2018, 3:32 p.m.
> On Jun 26, 2018, at 07:53, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Mon, 25 Jun 2018 19:51:52 +0200, Paul Morelle wrote:
>> On 22/06/18 13:15, Yuya Nishihara wrote:
>>> On Thu, 21 Jun 2018 19:46:24 +0200, Paul Morelle wrote:
>> Doing this at the authorized_keys level is not an option for us. In one
>> of our target environment for this kind of debugging, any change to the
>> ssh script requires validation by admin people that can take from a
>> couple of days to a couple of weeks, making it impossible to turn
>> debugging on for a couple of commands, or easily compare various settings.
> 
> IMHO, that doesn't justify why we have to put everything in hg core.
> "hg serve" itself is an SSH script in security POV.

Agreed. It's also a strong argument for infrastructure-as-code so you can test configuration changes like this on a disposable VM, then apply to the real environment once it's right.

> 
>> The ability to control the used debug options from the client is very
>> valuable in this case. We have already gathered many useful information
>> from it!
>> 
>> I agree that ad-hoc argument parsing is less than optimal. Handling
>> three extra flags (--debug, --profile and --traceback) is not too awful,
>> but then we need to pass some associated configuration option (e.g.
>> profiling.time-track=real) which will make things even more verbose.
>> 
>> Maybe we could leverage the contents of `mercurial/configitems.py` by
>> adding a "serversafe" attribute to the configuration items? Such config
>> item could be specified on the client side (if the user is allowed to).
> 
> No, that sounds worse than the original patch. IIUC, the reason why we have
> hard-coded check for req.args is that our option parser isn't strict and is
> likely to allow command-line injection. Making the whitelist fully configurable
> would be dangerous.

Yep. I'm dubious we'd ever take such a patch.

> 
>>>> I don't think that moving the permission handling outside of Mercurial
>>>> would be a good move : implementing similar features for HTTP would
>>>> still require some Mercurial-side restrictions and permissions handling
>>>> anyway. It seems more consistent to implement it on the Mercurial side
>>>> for all protocols.
>>> I agree it should be handled at protocol level, and I think it should be
>>> a wire command/option rather than the command argument hack. IMHO, it's
>>> way safer than constructing/parsing shell command arguments carefully,
>>> and HTTP will need that anyway.
>> The life cycle of the server is quite different in ssh and http, and
>> this has different implications.
>> For the ssh client, there are multiple initializations and cachings that
>> are worth tracking, so tracking the full run is what is most appropriate
>> for our needs.
>> For http, the arguments will have to be processed at the hgweb level
>> (instead of the protocol level).
> 
> I know the situation of HTTP and SSH is slightly different, but how important
> it is to catch the early-stage thingy by extending the SSH command-line
> interface, which we have to support forever? I think wire-protocol level
> implementation will be good enough, and better in the long run.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Boris Feld - July 15, 2018, 10:33 a.m.
Hello,

This topic is important to us, I will try to provide more context about 
our needsand constraintsin order to reach a consensus.

Our ultimate goal is to activate various debugging output (profile, 
debug log...) on the server via the client. We have already used this 
series in real condition by having a custom Mercurial version deployed. 
The data we get from using this feature proved to be very valuable and 
speed up our works a lot. Weidentified the source of multipleperformance 
issuesthanks to it.

The properties that are important for us are the following and we will 
explain why a bit further:
     - We want to activate it as soon as possible (ssh: dispatch level, 
http: wsgi server),
     - Controlled by the client,
     - Security managed by core.

Yuya would like to move the implementation outside of core dispatch, for 
security and maintenance concerns. Either, into custom ssh script, or at 
the protocol level. Neither of these two locations fits our needs (and 
constraints).

So we need to find a way to do this early extra validation of parameter 
in a way that is good enough in terms of security and maintenance.

# Longer explanation

## Why does it has to be set up so early?

A significant part of the slowness we observe happen very early, during 
the initial setup (extensions setup, various data loading, custom 
extensions code, etc...). So the earlier the extra debug/profile is 
setup, the more useful data we gather.

Doing this at the protocol level would be too late, as a lot of the 
important time-consuming activities have already happened.

## Why control it from the client instead of being a custom ssh script?

Some of the people we work with have a high level of security (eg: we 
don't have a full shell access on their server and we only have access 
to a subset of their repositories).

This means multiples things:

* We still want some security management, using `hg debugserve` would 
allow arbitrary code to be executed, something we still want to prevent.

* Deploying a new ssh script is something complex that takes time. 
Admins have to look at the new script and deploy it. It can take 
multiple days, making us unable to collect these valuable data (in time).

* If we, mercurial core developers, are afraid of getting the validation 
logic in core wrong, I don't think we should expect people not familiar 
with Mercurial to be able to get that right. So moving the 
responsibility to validate input to admins seems dangerous.

* Having an example script in `contrib/` would not help much as people 
usually already have some custom script.

* Controlling option from the client is critical. When investigating 
something, we try various options to look at the issue from different 
angles (profile cpu vs time). Having to redeploy a new script every time 
we want to change an option break our ability to investigate quickly as 
it often requires synchronization between different teams.

## Why do you want this into core

* Make it much simpler for us to use it at more place (including to help 
random people on the list and bug tracker). Having to deploy a custom 
Mercurial binaries on the server is a strong barrier.

* Given how valuable this is to us, we expect it to be valuable for 
others. In fact server-side debug message is something various users 
asked for in the past.

* Maintaining a custom series of patch to be moved around is time 
wasting. We would prefer focusing our work on upstreaming code for the 
benefits of all of us rather than maintaining a custom patch series. Our 
work days are more about upstreaming code into core for the benefit of 
all of us rather than the other way around.


# What next?

With the current constraints, the proposed approach was the best 
solution we came up with. Admins still have the control of the feature 
with few configuration knobs and turning them on is quick and easy. 
Debug outputs are provided on demand so we can test against servers in 
real conditions without impacting others users and it's enabled early 
enough so we could gather valuable data.

It helped us to identify existing issues and send appropriate patches. 
For example, the fncache improvement that recently landed originated 
from this.

However, we understand Yuya's concerns about security. Maybe we can 
rework the command line parsing tooling to extract a small kernel that 
would fit our need here while being easier to validate and test security 
wise?

We are willing to discuss potential other solutions, we may have 
missedone that fits our needs.

Cheers,
Yuya Nishihara - July 16, 2018, 10:48 a.m.
On Sun, 15 Jul 2018 12:33:48 +0200, Boris FELD wrote:
> The properties that are important for us are the following and we will 
> explain why a bit further:
>      - We want to activate it as soon as possible (ssh: dispatch level, 
> http: wsgi server),
>      - Controlled by the client,
>      - Security managed by core.
> 
> Yuya would like to move the implementation outside of core dispatch, for 
> security and maintenance concerns. Either, into custom ssh script, or at 
> the protocol level. Neither of these two locations fits our needs (and 
> constraints).
> 
> So we need to find a way to do this early extra validation of parameter 
> in a way that is good enough in terms of security and maintenance.
> 
> # Longer explanation
> 
> ## Why does it has to be set up so early?
> 
> A significant part of the slowness we observe happen very early, during 
> the initial setup (extensions setup, various data loading, custom 
> extensions code, etc...). So the earlier the extra debug/profile is 
> setup, the more useful data we gather.

I don't disagree with that, but I just doubt if that would be worth making
the dispatcher hack complicated and adding more variations of ssh entry
arguments. Honestly, I'm thinking that the use of "hg serve" as an entry
point was a design mistake which we can't fix, so I don't think it's good
idea to extend the syntax of the entry point.

> Doing this at the protocol level would be too late, as a lot of the 
> important time-consuming activities have already happened.
> 
> ## Why control it from the client instead of being a custom ssh script?
> 
> Some of the people we work with have a high level of security (eg: we 
> don't have a full shell access on their server and we only have access 
> to a subset of their repositories).
> 
> This means multiples things:
> 
> * We still want some security management, using `hg debugserve` would 
> allow arbitrary code to be executed, something we still want to prevent.
> 
> * Deploying a new ssh script is something complex that takes time. 
> Admins have to look at the new script and deploy it. It can take 
> multiple days, making us unable to collect these valuable data (in time).
> 
> * If we, mercurial core developers, are afraid of getting the validation 
> logic in core wrong, I don't think we should expect people not familiar 
> with Mercurial to be able to get that right. So moving the 
> responsibility to validate input to admins seems dangerous.
> 
> * Having an example script in `contrib/` would not help much as people 
> usually already have some custom script.

Why? If there's already a custom script installed, IIUC, only thing the
sysadmin will have to do is to replace 'hg serve' with
/path/to/contrib/hg-ssh-script, one time. After that, you can inject
commands understood by that script via sys.argv or stdin.

If the server has a robust pre-validation logic of 'hg serve' command,
or hg-ssh is used for example, argv injection wouldn't work anyway unless
the validation rule is relaxed or the existing script is adjusted to allow
extra command arguments.

> * Controlling option from the client is critical. When investigating 
> something, we try various options to look at the issue from different 
> angles (profile cpu vs time). Having to redeploy a new script every time 
> we want to change an option break our ability to investigate quickly as 
> it often requires synchronization between different teams.
> 
> ## Why do you want this into core
> 
> * Make it much simpler for us to use it at more place (including to help 
> random people on the list and bug tracker). Having to deploy a custom 
> Mercurial binaries on the server is a strong barrier.
> 
> * Given how valuable this is to us, we expect it to be valuable for 
> others. In fact server-side debug message is something various users 
> asked for in the past.

Well, server-side "debug" message could be handled at protocol level
unlike profiling of the early uisetup() thingy. That said, I'm slightly
interested in how ssh issues are debugged in various production environments
as there might be a better approach.

Patch

diff -r b7051e4bf783 -r 81edf3431b95 mercurial/dispatch.py
--- a/mercurial/dispatch.py	Wed Jun 20 12:16:48 2018 +0200
+++ b/mercurial/dispatch.py	Wed Jun 20 12:18:26 2018 +0200
@@ -285,12 +285,15 @@ 
             def unsafe():
                 msg = _('potentially unsafe serve --stdio invocation: %s')
                 raise error.Abort(msg % (stringutil.pprint(req.args),))
-            if (len(req.args) != 4 or
+            if (len(req.args) < 4 or
                 req.args[0] != '-R' or
                 req.args[1].startswith('--') or
                 req.args[2] != 'serve' or
                 req.args[3] != '--stdio'):
                 unsafe()
+            other_args = req.args[4:]
+            if other_args:
+                unsafe()
 
         try:
             debugger = 'pdb'