Patchwork dispatch: strip command line options like config file options

login
register
mail settings
Submitter Tony Tung
Date Feb. 8, 2016, 11:36 p.m.
Message ID <97240c9addb84f73752d.1454974575@dev8692.prn1.facebook.com>
Download mbox | patch
Permalink /patch/13073/
State Accepted
Delegated to: Yuya Nishihara
Headers show

Comments

Tony Tung - Feb. 8, 2016, 11:36 p.m.
# HG changeset patch
# User Tony Tung <ttung@fb.com>
# Date 1454974530 28800
#      Mon Feb 08 15:35:30 2016 -0800
# Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
# Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
dispatch: strip command line options like config file options.

Currently, whitespace in command line --config options are considered
significant while whitespace in config files are not considered
significant.  This diff strips the leading and trailing whitespace from
command line config options.
Yuya Nishihara - Feb. 11, 2016, 7:49 a.m.
On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
> # HG changeset patch
> # User Tony Tung <ttung@fb.com>
> # Date 1454974530 28800
> #      Mon Feb 08 15:35:30 2016 -0800
> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
> dispatch: strip command line options like config file options.
> 
> Currently, whitespace in command line --config options are considered
> significant while whitespace in config files are not considered
> significant.  This diff strips the leading and trailing whitespace from
> command line config options.
> 
> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
> --- a/mercurial/dispatch.py
> +++ b/mercurial/dispatch.py
> @@ -609,7 +609,8 @@
>  
>      for cfg in config:
>          try:
> -            name, value = cfg.split('=', 1)
> +            name, value = [cfgelem.strip()
> +                           for cfgelem in cfg.split('=', 1)]

I think that's up to parser whether white space is significant or not.
Tony Tung - Feb. 11, 2016, 8:06 a.m.
> On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
>> # HG changeset patch
>> # User Tony Tung <ttung@fb.com>
>> # Date 1454974530 28800
>> #      Mon Feb 08 15:35:30 2016 -0800
>> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
>> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
>> dispatch: strip command line options like config file options.
>> 
>> Currently, whitespace in command line --config options are considered
>> significant while whitespace in config files are not considered
>> significant.  This diff strips the leading and trailing whitespace from
>> command line config options.
>> 
>> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
>> --- a/mercurial/dispatch.py
>> +++ b/mercurial/dispatch.py
>> @@ -609,7 +609,8 @@
>> 
>>     for cfg in config:
>>         try:
>> -            name, value = cfg.split('=', 1)
>> +            name, value = [cfgelem.strip()
>> +                           for cfgelem in cfg.split('=', 1)]
> 
> I think that's up to parser whether white space is significant or not.

The parser definitely has a say into the significance of the outer whitespaces, but it has no visibility into the inner whitespaces, i.e., the spaces surrounding the =.

Thanks,
Tony
Yuya Nishihara - Feb. 11, 2016, 10:09 a.m.
On Thu, 11 Feb 2016 08:06:40 +0000, Tony Tung wrote:
> > On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
> > 
> > On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
> >> # HG changeset patch
> >> # User Tony Tung <ttung@fb.com>
> >> # Date 1454974530 28800
> >> #      Mon Feb 08 15:35:30 2016 -0800
> >> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
> >> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
> >> dispatch: strip command line options like config file options.
> >> 
> >> Currently, whitespace in command line --config options are considered
> >> significant while whitespace in config files are not considered
> >> significant.  This diff strips the leading and trailing whitespace from
> >> command line config options.
> >> 
> >> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
> >> --- a/mercurial/dispatch.py
> >> +++ b/mercurial/dispatch.py
> >> @@ -609,7 +609,8 @@
> >> 
> >>     for cfg in config:
> >>         try:
> >> -            name, value = cfg.split('=', 1)
> >> +            name, value = [cfgelem.strip()
> >> +                           for cfgelem in cfg.split('=', 1)]
> > 
> > I think that's up to parser whether white space is significant or not.
> 
> The parser definitely has a say into the significance of the outer
> whitespaces, but it has no visibility into the inner whitespaces,
> i.e., the spaces surrounding the =.

I'm not sure if I understand what you mean. I wanted to tell that
--config is parsed as a command-line option, which rule can be different
from a config file. And if --config contains a whitespace, it would be
considered explicitly specified.

  --config foo.bar="baz "

That said, because config key shouldn't start/end with space,
--config " foo . bar "=baz might be an error.
Tony Tung - Feb. 11, 2016, 11:25 p.m.
> On Feb 11, 2016, at 2:09 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Thu, 11 Feb 2016 08:06:40 +0000, Tony Tung wrote:
>>> On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
>>> 
>>> On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
>>>> # HG changeset patch
>>>> # User Tony Tung <ttung@fb.com>
>>>> # Date 1454974530 28800
>>>> #      Mon Feb 08 15:35:30 2016 -0800
>>>> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
>>>> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
>>>> dispatch: strip command line options like config file options.
>>>> 
>>>> Currently, whitespace in command line --config options are considered
>>>> significant while whitespace in config files are not considered
>>>> significant.  This diff strips the leading and trailing whitespace from
>>>> command line config options.
>>>> 
>>>> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
>>>> --- a/mercurial/dispatch.py
>>>> +++ b/mercurial/dispatch.py
>>>> @@ -609,7 +609,8 @@
>>>> 
>>>>    for cfg in config:
>>>>        try:
>>>> -            name, value = cfg.split('=', 1)
>>>> +            name, value = [cfgelem.strip()
>>>> +                           for cfgelem in cfg.split('=', 1)]
>>> 
>>> I think that's up to parser whether white space is significant or not.
>> 
>> The parser definitely has a say into the significance of the outer
>> whitespaces, but it has no visibility into the inner whitespaces,
>> i.e., the spaces surrounding the =.
> 
> I'm not sure if I understand what you mean. I wanted to tell that
> --config is parsed as a command-line option, which rule can be different
> from a config file. And if --config contains a whitespace, it would be
> considered explicitly specified.
> 
>  --config foo.bar="baz "
> 
> That said, because config key shouldn't start/end with space,
> --config " foo . bar "=baz might be an error.

Why is explicit whitespace in a config file treated differently than explicit whitespace in the command line?

Thanks,
Tony
Yuya Nishihara - Feb. 12, 2016, 3:52 a.m.
On Thu, 11 Feb 2016 23:25:10 +0000, Tony Tung wrote:
> > On Feb 11, 2016, at 2:09 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> > 
> > On Thu, 11 Feb 2016 08:06:40 +0000, Tony Tung wrote:
> >>> On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
> >>> 
> >>> On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
> >>>> # HG changeset patch
> >>>> # User Tony Tung <ttung@fb.com>
> >>>> # Date 1454974530 28800
> >>>> #      Mon Feb 08 15:35:30 2016 -0800
> >>>> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
> >>>> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
> >>>> dispatch: strip command line options like config file options.
> >>>> 
> >>>> Currently, whitespace in command line --config options are considered
> >>>> significant while whitespace in config files are not considered
> >>>> significant.  This diff strips the leading and trailing whitespace from
> >>>> command line config options.
> >>>> 
> >>>> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
> >>>> --- a/mercurial/dispatch.py
> >>>> +++ b/mercurial/dispatch.py
> >>>> @@ -609,7 +609,8 @@
> >>>> 
> >>>>    for cfg in config:
> >>>>        try:
> >>>> -            name, value = cfg.split('=', 1)
> >>>> +            name, value = [cfgelem.strip()
> >>>> +                           for cfgelem in cfg.split('=', 1)]
> >>> 
> >>> I think that's up to parser whether white space is significant or not.
> >> 
> >> The parser definitely has a say into the significance of the outer
> >> whitespaces, but it has no visibility into the inner whitespaces,
> >> i.e., the spaces surrounding the =.
> > 
> > I'm not sure if I understand what you mean. I wanted to tell that
> > --config is parsed as a command-line option, which rule can be different
> > from a config file. And if --config contains a whitespace, it would be
> > considered explicitly specified.
> > 
> >  --config foo.bar="baz "
> > 
> > That said, because config key shouldn't start/end with space,
> > --config " foo . bar "=baz might be an error.
> 
> Why is explicit whitespace in a config file treated differently than explicit
> whitespace in the command line?

Well, I don't have strong reasoning, sorry. I just feel stripping whitespaces
in command argument is different thing than stripping whitespaces while parsing
config file. Perhaps one reason is command arguments are first parsed by the
shell.

And I thought ' section . name '=value might be invalid because _parseconfig()
says "use --config section.name=value".
Pierre-Yves David - Feb. 12, 2016, 11:48 a.m.
On 02/11/2016 11:25 PM, Tony Tung wrote:
>> On Feb 11, 2016, at 2:09 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>>
>> On Thu, 11 Feb 2016 08:06:40 +0000, Tony Tung wrote:
>>>> On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
>>>>
>>>> On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
>>>>> # HG changeset patch
>>>>> # User Tony Tung <ttung@fb.com>
>>>>> # Date 1454974530 28800
>>>>> #      Mon Feb 08 15:35:30 2016 -0800
>>>>> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
>>>>> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
>>>>> dispatch: strip command line options like config file options.
>>>>>
>>>>> Currently, whitespace in command line --config options are considered
>>>>> significant while whitespace in config files are not considered
>>>>> significant.  This diff strips the leading and trailing whitespace from
>>>>> command line config options.
>>>>>
>>>>> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
>>>>> --- a/mercurial/dispatch.py
>>>>> +++ b/mercurial/dispatch.py
>>>>> @@ -609,7 +609,8 @@
>>>>>
>>>>>     for cfg in config:
>>>>>         try:
>>>>> -            name, value = cfg.split('=', 1)
>>>>> +            name, value = [cfgelem.strip()
>>>>> +                           for cfgelem in cfg.split('=', 1)]
>>>>
>>>> I think that's up to parser whether white space is significant or not.
>>>
>>> The parser definitely has a say into the significance of the outer
>>> whitespaces, but it has no visibility into the inner whitespaces,
>>> i.e., the spaces surrounding the =.
>>
>> I'm not sure if I understand what you mean. I wanted to tell that
>> --config is parsed as a command-line option, which rule can be different
>> from a config file. And if --config contains a whitespace, it would be
>> considered explicitly specified.
>>
>>   --config foo.bar="baz "
>>
>> That said, because config key shouldn't start/end with space,
>> --config " foo . bar "=baz might be an error.
>
> Why is explicit whitespace in a config file treated differently than explicit whitespace in the command line?

I think that what Yuya means here is:

- white space stripping is handled by the .ini file parsing,
- that .ini parsing is third party code with probably non-trivial logic,
- We (Yuya, I, etc) are not certain what these .ini parsing rules are,
- Therefor we are not confident that "simply stripping white space" from 
the command line will give us a behavior consistent with "what happen in 
the config file"

So, from my point of view, to move forward here we need, to clarify what 
are the rules regarding white space in .ini file parsing and make a case 
that the proposal for command line option is consistent with these rules.

Cheers,
Yuya Nishihara - Feb. 12, 2016, 1:26 p.m.
On Fri, 12 Feb 2016 11:48:44 +0000, Pierre-Yves David wrote:
> On 02/11/2016 11:25 PM, Tony Tung wrote:
> >> On Feb 11, 2016, at 2:09 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> >> On Thu, 11 Feb 2016 08:06:40 +0000, Tony Tung wrote:
> >>>> On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
> >>>>
> >>>> On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
> >>>>> # HG changeset patch
> >>>>> # User Tony Tung <ttung@fb.com>
> >>>>> # Date 1454974530 28800
> >>>>> #      Mon Feb 08 15:35:30 2016 -0800
> >>>>> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
> >>>>> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
> >>>>> dispatch: strip command line options like config file options.
> >>>>>
> >>>>> Currently, whitespace in command line --config options are considered
> >>>>> significant while whitespace in config files are not considered
> >>>>> significant.  This diff strips the leading and trailing whitespace from
> >>>>> command line config options.
> >>>>>
> >>>>> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
> >>>>> --- a/mercurial/dispatch.py
> >>>>> +++ b/mercurial/dispatch.py
> >>>>> @@ -609,7 +609,8 @@
> >>>>>
> >>>>>     for cfg in config:
> >>>>>         try:
> >>>>> -            name, value = cfg.split('=', 1)
> >>>>> +            name, value = [cfgelem.strip()
> >>>>> +                           for cfgelem in cfg.split('=', 1)]
> >>>>
> >>>> I think that's up to parser whether white space is significant or not.
> >>>
> >>> The parser definitely has a say into the significance of the outer
> >>> whitespaces, but it has no visibility into the inner whitespaces,
> >>> i.e., the spaces surrounding the =.
> >>
> >> I'm not sure if I understand what you mean. I wanted to tell that
> >> --config is parsed as a command-line option, which rule can be different
> >> from a config file. And if --config contains a whitespace, it would be
> >> considered explicitly specified.
> >>
> >>   --config foo.bar="baz "
> >>
> >> That said, because config key shouldn't start/end with space,
> >> --config " foo . bar "=baz might be an error.
> >
> > Why is explicit whitespace in a config file treated differently than explicit whitespace in the command line?
> 
> I think that what Yuya means here is:
> 
> - white space stripping is handled by the .ini file parsing,
> - that .ini parsing is third party code with probably non-trivial logic,
> - We (Yuya, I, etc) are not certain what these .ini parsing rules are,
> - Therefor we are not confident that "simply stripping white space" from 
> the command line will give us a behavior consistent with "what happen in 
> the config file"

We have own config parser. My opinion is that it is common to strip
whitespaces while parsing ini-style file, but it isn't for command option.

That said, it appears I'm wrong. gcc does it.

  % echo 'FOO' | gcc -E -D" FOO = a " -
  a

https://github.com/gcc-mirror/gcc/blob/gcc_5_3_0_release/libcpp/directives.c#L2361

So, I'm inclined to accepting this patch.
Pierre-Yves David - Feb. 12, 2016, 1:30 p.m.
On 02/12/2016 01:26 PM, Yuya Nishihara wrote:
> On Fri, 12 Feb 2016 11:48:44 +0000, Pierre-Yves David wrote:
>> On 02/11/2016 11:25 PM, Tony Tung wrote:
>>>> On Feb 11, 2016, at 2:09 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>>>> On Thu, 11 Feb 2016 08:06:40 +0000, Tony Tung wrote:
>>>>>> On Feb 10, 2016, at 11:49 PM, Yuya Nishihara <yuya@tcha.org> wrote:
>>>>>>
>>>>>> On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
>>>>>>> # HG changeset patch
>>>>>>> # User Tony Tung <ttung@fb.com>
>>>>>>> # Date 1454974530 28800
>>>>>>> #      Mon Feb 08 15:35:30 2016 -0800
>>>>>>> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
>>>>>>> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
>>>>>>> dispatch: strip command line options like config file options.
>>>>>>>
>>>>>>> Currently, whitespace in command line --config options are considered
>>>>>>> significant while whitespace in config files are not considered
>>>>>>> significant.  This diff strips the leading and trailing whitespace from
>>>>>>> command line config options.
>>>>>>>
>>>>>>> diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
>>>>>>> --- a/mercurial/dispatch.py
>>>>>>> +++ b/mercurial/dispatch.py
>>>>>>> @@ -609,7 +609,8 @@
>>>>>>>
>>>>>>>      for cfg in config:
>>>>>>>          try:
>>>>>>> -            name, value = cfg.split('=', 1)
>>>>>>> +            name, value = [cfgelem.strip()
>>>>>>> +                           for cfgelem in cfg.split('=', 1)]
>>>>>>
>>>>>> I think that's up to parser whether white space is significant or not.
>>>>>
>>>>> The parser definitely has a say into the significance of the outer
>>>>> whitespaces, but it has no visibility into the inner whitespaces,
>>>>> i.e., the spaces surrounding the =.
>>>>
>>>> I'm not sure if I understand what you mean. I wanted to tell that
>>>> --config is parsed as a command-line option, which rule can be different
>>>> from a config file. And if --config contains a whitespace, it would be
>>>> considered explicitly specified.
>>>>
>>>>    --config foo.bar="baz "
>>>>
>>>> That said, because config key shouldn't start/end with space,
>>>> --config " foo . bar "=baz might be an error.
>>>
>>> Why is explicit whitespace in a config file treated differently than explicit whitespace in the command line?
>>
>> I think that what Yuya means here is:
>>
>> - white space stripping is handled by the .ini file parsing,
>> - that .ini parsing is third party code with probably non-trivial logic,
>> - We (Yuya, I, etc) are not certain what these .ini parsing rules are,
>> - Therefor we are not confident that "simply stripping white space" from
>> the command line will give us a behavior consistent with "what happen in
>> the config file"
>
> We have own config parser. My opinion is that it is common to strip
> whitespaces while parsing ini-style file, but it isn't for command option.
>
> That said, it appears I'm wrong. gcc does it.
>
>    % echo 'FOO' | gcc -E -D" FOO = a " -
>    a
>
> https://github.com/gcc-mirror/gcc/blob/gcc_5_3_0_release/libcpp/directives.c#L2361
>
> So, I'm inclined to accepting this patch.

Go for it then.
Yuya Nishihara - Feb. 12, 2016, 2:09 p.m.
On Mon, 8 Feb 2016 15:36:15 -0800, Tony Tung wrote:
> # HG changeset patch
> # User Tony Tung <ttung@fb.com>
> # Date 1454974530 28800
> #      Mon Feb 08 15:35:30 2016 -0800
> # Node ID 97240c9addb84f73752dad57e6531dfea30f3f0d
> # Parent  01a5143cd25f285f8c745a92986cd7186bb32c90
> dispatch: strip command line options like config file options.

Pushed to the clowncopter, thanks for long discussion.

check-commit says hi.

Patch

diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
--- a/mercurial/dispatch.py
+++ b/mercurial/dispatch.py
@@ -609,7 +609,8 @@ 
 
     for cfg in config:
         try:
-            name, value = cfg.split('=', 1)
+            name, value = [cfgelem.strip()
+                           for cfgelem in cfg.split('=', 1)]
             section, name = name.split('.', 1)
             if not section or not name:
                 raise IndexError
diff --git a/tests/test-ui-config.py.out b/tests/test-ui-config.py.out
--- a/tests/test-ui-config.py.out
+++ b/tests/test-ui-config.py.out
@@ -1,5 +1,5 @@ 
 [('string', 'string value'), ('bool1', 'true'), ('bool2', 'false'), ('boolinvalid', 'foo'), ('int1', '42'), ('int2', '-42'), ('intinvalid', 'foo')]
-[('list1', 'foo'), ('list2', 'foo bar baz'), ('list3', 'alice, bob'), ('list4', 'foo bar baz alice, bob'), ('list5', 'abc d"ef"g "hij def"'), ('list6', '"hello world", "how are you?"'), ('list7', 'Do"Not"Separate'), ('list8', '"Do"Separate'), ('list9', '"Do\\"NotSeparate"'), ('list10', 'string "with extraneous" quotation mark"'), ('list11', 'x, y'), ('list12', '"x", "y"'), ('list13', '""" key = "x", "y" """'), ('list14', ',,,,     '), ('list15', '" just with starting quotation'), ('list16', '"longer quotation" with "no ending quotation'), ('list17', 'this is \\" "not a quotation mark"'), ('list18', '\n \n\nding\ndong')]
+[('list1', 'foo'), ('list2', 'foo bar baz'), ('list3', 'alice, bob'), ('list4', 'foo bar baz alice, bob'), ('list5', 'abc d"ef"g "hij def"'), ('list6', '"hello world", "how are you?"'), ('list7', 'Do"Not"Separate'), ('list8', '"Do"Separate'), ('list9', '"Do\\"NotSeparate"'), ('list10', 'string "with extraneous" quotation mark"'), ('list11', 'x, y'), ('list12', '"x", "y"'), ('list13', '""" key = "x", "y" """'), ('list14', ',,,,'), ('list15', '" just with starting quotation'), ('list16', '"longer quotation" with "no ending quotation'), ('list17', 'this is \\" "not a quotation mark"'), ('list18', 'ding\ndong')]
 ---
 'string value'
 'true'