Patchwork [STABLE] templater: strip single backslash before quotation mark in quoted template

login
register
mail settings
Submitter Yuya Nishihara
Date May 8, 2015, 10:30 a.m.
Message ID <0dae4fd11c554338c2f0.1431081003@mimosa>
Download mbox | patch
Permalink /patch/8965/
State Accepted
Delegated to: Pierre-Yves David
Headers show

Comments

Yuya Nishihara - May 8, 2015, 10:30 a.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1431076286 -32400
#      Fri May 08 18:11:26 2015 +0900
# Branch stable
# Node ID 0dae4fd11c554338c2f0083d259d996e45aa1536
# Parent  169d2470d283f19f7c94112dc1eff1d355f29f40
templater: strip single backslash before quotation mark in quoted template

db7463aa080f fixed the issue of double escapes, but it made the following
template fail with syntax error because of <\">. Strictly speaking, <\">
appears to be invalid in non-string part, but we are likely to escape <">
if surrounded by quotes, and we are used to write such templates by trial
and error.

  [templates]
  sl = "{tags % \"{ifeq(tag,'tip','',label('log.tag', ' {tag}'))}\"}"

So, for backward compatibility between 2.8.1 and 3.4, a single backslash
before quotation mark is stripped only in quoted template. We don't care
for <\"> in string literal in quoted template, which never worked as expected
before.

  template  result
  --------- ------------------------
  {\"\"}    parse error
  "{""}"    {""} -> <>
  "{\"\"}"  {""} -> <>
  {"\""}    {"\""} -> <">
  '{"\""}'  {"\""} -> <">
  "{"\""}"  parse error (don't care)
Augie Fackler - May 8, 2015, 2:59 p.m.
On Fri, May 08, 2015 at 07:30:03PM +0900, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1431076286 -32400
> #      Fri May 08 18:11:26 2015 +0900
> # Branch stable
> # Node ID 0dae4fd11c554338c2f0083d259d996e45aa1536
> # Parent  169d2470d283f19f7c94112dc1eff1d355f29f40
> templater: strip single backslash before quotation mark in quoted template

This looks right to me, but I'm not taking it so that someone else can review.

>
> db7463aa080f fixed the issue of double escapes, but it made the following
> template fail with syntax error because of <\">. Strictly speaking, <\">
> appears to be invalid in non-string part, but we are likely to escape <">
> if surrounded by quotes, and we are used to write such templates by trial
> and error.
>
>   [templates]
>   sl = "{tags % \"{ifeq(tag,'tip','',label('log.tag', ' {tag}'))}\"}"
>
> So, for backward compatibility between 2.8.1 and 3.4, a single backslash
> before quotation mark is stripped only in quoted template. We don't care
> for <\"> in string literal in quoted template, which never worked as expected
> before.
>
>   template  result
>   --------- ------------------------
>   {\"\"}    parse error
>   "{""}"    {""} -> <>
>   "{\"\"}"  {""} -> <>
>   {"\""}    {"\""} -> <">
>   '{"\""}'  {"\""} -> <">
>   "{"\""}"  parse error (don't care)
>
> diff --git a/mercurial/templater.py b/mercurial/templater.py
> --- a/mercurial/templater.py
> +++ b/mercurial/templater.py
> @@ -623,7 +623,12 @@ def parsestring(s, quoted=True):
>      if quoted:
>          if len(s) < 2 or s[0] != s[-1]:
>              raise SyntaxError(_('unmatched quotes'))
> -        return s[1:-1]
> +        # de-backslash-ify only <\">. it is invalid syntax in non-string part of
> +        # template, but we are likely to escape <"> in quoted string and it was
> +        # accepted before, thanks to issue4290. <\\"> is unmodified because it
> +        # is ambiguous and it was processed as such before 2.8.1.
> +        q = s[0]
> +        return s[1:-1].replace('\\\\' + q, '\\\\\\' + q).replace('\\' + q, q)
>
>      return s
>
> diff --git a/tests/test-command-template.t b/tests/test-command-template.t
> --- a/tests/test-command-template.t
> +++ b/tests/test-command-template.t
> @@ -2273,6 +2273,25 @@ Test string escaping:
>    <>\n<]>
>    <>\n<
>
> +Test exception in quoted template. single backslash before quotation mark is
> +stripped before parsing:
> +
> +  $ cat <<'EOF' > escquotetmpl
> +  > changeset = "\" \\" \\\" \\\\" {files % \"{file}\"}\n"
> +  > EOF
> +  $ cd latesttag
> +  $ hg log -r 2 --style ../escquotetmpl
> +  " \" \" \\" head1
> +
> +  $ hg log -r 2 -T esc --config templates.esc='{\"invalid\"}\n'
> +  hg: parse error at 1: syntax error
> +  [255]
> +  $ hg log -r 2 -T esc --config templates.esc='"{\"valid\"}\n"'
> +  valid
> +  $ hg log -r 2 -T esc --config templates.esc="'"'{\'"'"'valid\'"'"'}\n'"'"
> +  valid
> +  $ cd ..
> +
>  Test leading backslashes:
>
>    $ cd latesttag
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Pierre-Yves David - May 8, 2015, 7:48 p.m.
On 05/08/2015 07:59 AM, Augie Fackler wrote:
> On Fri, May 08, 2015 at 07:30:03PM +0900, Yuya Nishihara wrote:
>> # HG changeset patch
>> # User Yuya Nishihara <yuya@tcha.org>
>> # Date 1431076286 -32400
>> #      Fri May 08 18:11:26 2015 +0900
>> # Branch stable
>> # Node ID 0dae4fd11c554338c2f0083d259d996e45aa1536
>> # Parent  169d2470d283f19f7c94112dc1eff1d355f29f40
>> templater: strip single backslash before quotation mark in quoted template
>
> This looks right to me, but I'm not taking it so that someone else can review.
>
>>
>> db7463aa080f fixed the issue of double escapes, but it made the following
>> template fail with syntax error because of <\">. Strictly speaking, <\">
>> appears to be invalid in non-string part, but we are likely to escape <">
>> if surrounded by quotes, and we are used to write such templates by trial
>> and error.
>>
>>    [templates]
>>    sl = "{tags % \"{ifeq(tag,'tip','',label('log.tag', ' {tag}'))}\"}"
>>
>> So, for backward compatibility between 2.8.1 and 3.4, a single backslash
>> before quotation mark is stripped only in quoted template. We don't care
>> for <\"> in string literal in quoted template, which never worked as expected
>> before.
>>
>>    template  result
>>    --------- ------------------------
>>    {\"\"}    parse error
>>    "{""}"    {""} -> <>
>>    "{\"\"}"  {""} -> <>
>>    {"\""}    {"\""} -> <">
>>    '{"\""}'  {"\""} -> <">
>>    "{"\""}"  parse error (don't care)
>>
>> diff --git a/mercurial/templater.py b/mercurial/templater.py
>> --- a/mercurial/templater.py
>> +++ b/mercurial/templater.py
>> @@ -623,7 +623,12 @@ def parsestring(s, quoted=True):
>>       if quoted:
>>           if len(s) < 2 or s[0] != s[-1]:
>>               raise SyntaxError(_('unmatched quotes'))
>> -        return s[1:-1]
>> +        # de-backslash-ify only <\">. it is invalid syntax in non-string part of
>> +        # template, but we are likely to escape <"> in quoted string and it was
>> +        # accepted before, thanks to issue4290. <\\"> is unmodified because it
>> +        # is ambiguous and it was processed as such before 2.8.1.
>> +        q = s[0]
>> +        return s[1:-1].replace('\\\\' + q, '\\\\\\' + q).replace('\\' + q, q)


This looks good to me too, but maybe we should inflate this comment to 
contains the magic table explaining the final behavior?

What do you think?
Augie Fackler - May 8, 2015, 7:49 p.m.
On Fri, May 8, 2015 at 3:48 PM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
> This looks good to me too, but maybe we should inflate this comment to
> contains the magic table explaining the final behavior?
>
> What do you think?


More documentation here probably makes sense, but i'm happy to see
that as a followup if you are.
Pierre-Yves David - May 8, 2015, 8:02 p.m.
On 05/08/2015 12:49 PM, Augie Fackler wrote:
> On Fri, May 8, 2015 at 3:48 PM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>> This looks good to me too, but maybe we should inflate this comment to
>> contains the magic table explaining the final behavior?
>>
>> What do you think?
>
>
> More documentation here probably makes sense, but i'm happy to see
> that as a followup if you are.

I've stolen the table from the commit message and added it to the comment.

Yuya is encouraged to follow up if needed.

Patch

diff --git a/mercurial/templater.py b/mercurial/templater.py
--- a/mercurial/templater.py
+++ b/mercurial/templater.py
@@ -623,7 +623,12 @@  def parsestring(s, quoted=True):
     if quoted:
         if len(s) < 2 or s[0] != s[-1]:
             raise SyntaxError(_('unmatched quotes'))
-        return s[1:-1]
+        # de-backslash-ify only <\">. it is invalid syntax in non-string part of
+        # template, but we are likely to escape <"> in quoted string and it was
+        # accepted before, thanks to issue4290. <\\"> is unmodified because it
+        # is ambiguous and it was processed as such before 2.8.1.
+        q = s[0]
+        return s[1:-1].replace('\\\\' + q, '\\\\\\' + q).replace('\\' + q, q)
 
     return s
 
diff --git a/tests/test-command-template.t b/tests/test-command-template.t
--- a/tests/test-command-template.t
+++ b/tests/test-command-template.t
@@ -2273,6 +2273,25 @@  Test string escaping:
   <>\n<]>
   <>\n<
 
+Test exception in quoted template. single backslash before quotation mark is
+stripped before parsing:
+
+  $ cat <<'EOF' > escquotetmpl
+  > changeset = "\" \\" \\\" \\\\" {files % \"{file}\"}\n"
+  > EOF
+  $ cd latesttag
+  $ hg log -r 2 --style ../escquotetmpl
+  " \" \" \\" head1
+
+  $ hg log -r 2 -T esc --config templates.esc='{\"invalid\"}\n'
+  hg: parse error at 1: syntax error
+  [255]
+  $ hg log -r 2 -T esc --config templates.esc='"{\"valid\"}\n"'
+  valid
+  $ hg log -r 2 -T esc --config templates.esc="'"'{\'"'"'valid\'"'"'}\n'"'"
+  valid
+  $ cd ..
+
 Test leading backslashes:
 
   $ cd latesttag