Patchwork [3,of,3] templater: make strings in template expressions be "string-escape"-ed correctly

Submitter Katsunori FUJIWARA
Date March 9, 2014, 4:09 p.m.
Message ID <5ab28a2e9962f78f90cf.1394381365@feefifofum>
Permalink /patch/3901/
State Accepted

Comments

Katsunori FUJIWARA - March 9, 2014, 4:09 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1394380903 -32400
#      Mon Mar 10 01:01:43 2014 +0900
# Branch stable
# Node ID 5ab28a2e9962f78f90cf3e38483af1bd24035e1a
# Parent  a54c0d830499da186d56bf071c3e698eb818de0c
templater: make strings in template expressions be "string-escape"-ed correctly

Changeset 64b4f0cd7336 (released with 2.8.1) fixed "recursively
evaluate string literals as templates" problem (issue4102) by moving
the location of "string-escape"-ing from "tokenizer()" to
"compiletemplate()".

But the following parts of template expressions are not processed by
"compiletemplate()", which may cause unexpected results.

  - 'expr' of 'if(expr, then, else)'
  - 'expr's of 'ifeq(expr, expr, then, else)'
  - 'sep' of 'join(list, sep)'
  - 'text' and 'style' of 'rstdoc(text, style)'
  - 'text' and 'chars' of 'strip(text, chars)'
  - 'pat' and 'repl' of 'sub(pat, repl, expr)'

For example, '\n' in "{join(extras, '\n')}" is not "string-escape"-ed
and is treated as a literal '\n'. This breaks the "Display the contents
of the 'extra' field, one per line" example in "hg help templates".
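
As a rough illustration of the symptom (a Python 3 sketch, not Mercurial
code): "string_escape" below is a hypothetical stand-in for Python 2's
str.decode("string-escape"), approximated with the "unicode_escape" codec
and assumed to be given ASCII input only. The file names are the ones used
by the tests in this patch.

```python
def string_escape(text):
    # Hypothetical stand-in for Python 2's str.decode("string-escape");
    # approximated with the unicode_escape codec, ASCII input only.
    return text.encode("ascii").decode("unicode_escape")

files = ["fourth", "second", "third"]

# What join() receives for '\n' in the template: a backslash and an 'n'.
sep = r"\n"

# Without "string-escape"-ing, the separator stays two literal characters.
unescaped = sep.join(files)

# With "string-escape"-ing, the separator becomes a real newline.
escaped = string_escape(sep).join(files)

print(repr(unescaped))
print(repr(escaped))
```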

Simply applying "string-escape" to each of the parts above may not work
correctly, because nested inner expressions already apply
"string-escape" to their string literals. For example:

  - "{join(files, '\n')}" doesn't return "string-escape"-ed string, but
  - "{join(files, if(branch, '\n', '\n'))}" does
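
The risk of escaping in both places can be sketched as follows (Python 3,
illustrative only; "string_escape" is again a hypothetical approximation
of Python 2's "string-escape" codec for ASCII input). The literal is the
one used by the tests in this patch: one pass and two passes give
different results, so an outer function must not blindly escape a value
that an inner expression has already escaped.

```python
def string_escape(text):
    # Hypothetical stand-in for Python 2's str.decode("string-escape");
    # approximated with the unicode_escape codec, ASCII input only.
    return text.encode("ascii").decode("unicode_escape")

# The characters as written in the template: \x5c\x786e
literal = r"\x5c\x786e"

# One pass: \x5c becomes a backslash, \x78 becomes 'x', "6e" is kept.
once = string_escape(literal)

# A second pass interprets the result of the first one again.
twice = string_escape(once)

print(repr(once))
print(repr(twice))
```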

To fix this problem, this patch does:

  - introduce a "rawstring" token and a "runrawstring" method to
    correctly handle strings that must not be "string-escape"-ed, and

  - make the "runstring" method return the "string-escape"-ed string,
    delaying "string-escape"-ing until evaluation
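
A simplified model of the two token kinds may help (Python 3 sketch; the
names "runstring", "runrawstring" and "methods" mirror the patch, while
"string_escape" and "evaluate" are hypothetical helpers introduced only
for this illustration, and the real templater does considerably more).

```python
def string_escape(text):
    # Hypothetical stand-in for Python 2's str.decode("string-escape");
    # approximated with the unicode_escape codec, ASCII input only.
    return text.encode("ascii").decode("unicode_escape")

def runstring(context, mapping, data):
    # "string" tokens are unescaped only when they are evaluated.
    return string_escape(data)

def runrawstring(context, mapping, data):
    # "rawstring" tokens (written as r"...") are returned untouched.
    return data

methods = {
    "string": lambda e, c: (runstring, e[1]),
    "rawstring": lambda e, c: (runrawstring, e[1]),
}

def evaluate(exp, context=None, mapping=None):
    # Hypothetical helper: compile a single token, then run it.
    func, data = methods[exp[0]](exp, context)
    return func(context, mapping, data)

cooked = evaluate(("string", r"\x6e"))
raw = evaluate(("rawstring", r"\x6e"))

print(repr(cooked))
print(repr(raw))
```

Because escaping happens in exactly one place (the run function of a
"string" token), a value passed through nested expressions is escaped once
regardless of nesting depth.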

This patch invokes "compiletemplate()" with "strtoken=exp[0]" in
"gettemplate()", because "exp[1]" is not yet evaluated. This code path
is tested via mapping ("expr % '{template}'").

On the other hand, this patch invokes it with "strtoken='rawstring'"
in "_evalifliteral()", because "t" is the result of evaluating "arg"
and is already "string-escape"-ed if "arg" is a "string" expression.

This patch doesn't test "string-escape"-ing on 'expr' of 'if(expr,
then, else)', because it doesn't affect the result.
Matt Mackall - March 9, 2014, 9:38 p.m.
On Mon, 2014-03-10 at 01:09 +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1394380903 -32400
> #      Mon Mar 10 01:01:43 2014 +0900
> # Branch stable
> # Node ID 5ab28a2e9962f78f90cf3e38483af1bd24035e1a
> # Parent  a54c0d830499da186d56bf071c3e698eb818de0c
> templater: make strings in template expressions be "string-escape"-ed correctly

These are queued for stable, thanks.

Patch

diff --git a/mercurial/templater.py b/mercurial/templater.py
--- a/mercurial/templater.py
+++ b/mercurial/templater.py
@@ -21,6 +21,7 @@ 
     ")": (0, None, None),
     "symbol": (0, ("symbol",), None),
     "string": (0, ("string",), None),
+    "rawstring": (0, ("rawstring",), None),
     "end": (0, None, None),
 }
 
@@ -50,7 +51,7 @@ 
                     continue
                 if d == c:
                     if not decode:
-                        yield ('string', program[s:pos].replace('\\', r'\\'), s)
+                        yield ('rawstring', program[s:pos], s)
                         break
                     yield ('string', program[s:pos], s)
                     break
@@ -76,23 +77,22 @@ 
         pos += 1
     yield ('end', None, pos)
 
-def compiletemplate(tmpl, context):
+def compiletemplate(tmpl, context, strtoken="string"):
     parsed = []
     pos, stop = 0, len(tmpl)
     p = parser.parser(tokenizer, elements)
     while pos < stop:
         n = tmpl.find('{', pos)
         if n < 0:
-            parsed.append(("string", tmpl[pos:].decode("string-escape")))
+            parsed.append((strtoken, tmpl[pos:]))
             break
         if n > 0 and tmpl[n - 1] == '\\':
             # escaped
-            parsed.append(("string",
-                           (tmpl[pos:n - 1] + "{").decode("string-escape")))
+            parsed.append((strtoken, (tmpl[pos:n - 1] + "{")))
             pos = n + 1
             continue
         if n > pos:
-            parsed.append(("string", tmpl[pos:n].decode("string-escape")))
+            parsed.append((strtoken, tmpl[pos:n]))
 
         pd = [tmpl, n + 1, stop]
         parseres, pos = p.parse(pd)
@@ -127,13 +127,16 @@ 
     return context._filters[f]
 
 def gettemplate(exp, context):
-    if exp[0] == 'string':
-        return compiletemplate(exp[1], context)
+    if exp[0] == 'string' or exp[0] == 'rawstring':
+        return compiletemplate(exp[1], context, strtoken=exp[0])
     if exp[0] == 'symbol':
         return context._load(exp[1])
     raise error.ParseError(_("expected template specifier"))
 
 def runstring(context, mapping, data):
+    return data.decode("string-escape")
+
+def runrawstring(context, mapping, data):
     return data
 
 def runsymbol(context, mapping, key):
@@ -256,8 +259,9 @@ 
 
 def _evalifliteral(arg, context, mapping):
     t = stringify(arg[0](context, mapping, arg[1]))
-    if arg[0] == runstring:
-        yield runtemplate(context, mapping, compiletemplate(t, context))
+    if arg[0] == runstring or arg[0] == runrawstring:
+        yield runtemplate(context, mapping,
+                          compiletemplate(t, context, strtoken='rawstring'))
     else:
         yield t
 
@@ -346,6 +350,7 @@ 
 
 methods = {
     "string": lambda e, c: (runstring, e[1]),
+    "rawstring": lambda e, c: (runrawstring, e[1]),
     "symbol": lambda e, c: (runsymbol, e[1]),
     "group": lambda e, c: compileexp(e[1], c),
 #    ".": buildmember,
diff --git a/tests/test-command-template.t b/tests/test-command-template.t
--- a/tests/test-command-template.t
+++ b/tests/test-command-template.t
@@ -1610,6 +1610,92 @@ 
   <>\n<]>
   <>\n<
 
+"string-escape"-ed "\x5c\x786e" becomes r"\x6e" (once) or r"n" (twice)
+
+  $ hg log -R a -r 0 --template '{if("1", "\x5c\x786e", "NG")}\n'
+  \x6e
+  $ hg log -R a -r 0 --template '{if("1", r"\x5c\x786e", "NG")}\n'
+  \x5c\x786e
+  $ hg log -R a -r 0 --template '{if("", "NG", "\x5c\x786e")}\n'
+  \x6e
+  $ hg log -R a -r 0 --template '{if("", "NG", r"\x5c\x786e")}\n'
+  \x5c\x786e
+
+  $ hg log -R a -r 2 --template '{ifeq("no perso\x6e", desc, "\x5c\x786e", "NG")}\n'
+  \x6e
+  $ hg log -R a -r 2 --template '{ifeq(r"no perso\x6e", desc, "NG", r"\x5c\x786e")}\n'
+  \x5c\x786e
+  $ hg log -R a -r 2 --template '{ifeq(desc, "no perso\x6e", "\x5c\x786e", "NG")}\n'
+  \x6e
+  $ hg log -R a -r 2 --template '{ifeq(desc, r"no perso\x6e", "NG", r"\x5c\x786e")}\n'
+  \x5c\x786e
+
+  $ hg log -R a -r 8 --template '{join(files, "\n")}\n'
+  fourth
+  second
+  third
+  $ hg log -R a -r 8 --template '{join(files, r"\n")}\n'
+  fourth\nsecond\nthird
+
+  $ hg log -R a -r 2 --template '{rstdoc("1st\n\n2nd", "htm\x6c")}'
+  <p>
+  1st
+  </p>
+  <p>
+  2nd
+  </p>
+  $ hg log -R a -r 2 --template '{rstdoc(r"1st\n\n2nd", "html")}'
+  <p>
+  1st\n\n2nd
+  </p>
+  $ hg log -R a -r 2 --template '{rstdoc("1st\n\n2nd", r"htm\x6c")}'
+  1st
+  
+  2nd
+
+  $ hg log -R a -r 2 --template '{strip(desc, "\x6e")}\n'
+  o perso
+  $ hg log -R a -r 2 --template '{strip(desc, r"\x6e")}\n'
+  no person
+  $ hg log -R a -r 2 --template '{strip("no perso\x6e", "\x6e")}\n'
+  o perso
+  $ hg log -R a -r 2 --template '{strip(r"no perso\x6e", r"\x6e")}\n'
+  no perso
+
+  $ hg log -R a -r 2 --template '{sub("\\x6e", "\x2d", desc)}\n'
+  -o perso-
+  $ hg log -R a -r 2 --template '{sub(r"\\x6e", "-", desc)}\n'
+  no person
+  $ hg log -R a -r 2 --template '{sub("n", r"\x2d", desc)}\n'
+  \x2do perso\x2d
+  $ hg log -R a -r 2 --template '{sub("n", "\x2d", "no perso\x6e")}\n'
+  -o perso-
+  $ hg log -R a -r 2 --template '{sub("n", r"\x2d", r"no perso\x6e")}\n'
+  \x2do perso\x6e
+
+  $ hg log -R a -r 8 --template '{files % "{file}\n"}'
+  fourth
+  second
+  third
+  $ hg log -R a -r 8 --template '{files % r"{file}\n"}\n'
+  fourth\nsecond\nthird\n
+
+Test string escapeing in nested expression:
+
+  $ hg log -R a -r 8 --template '{ifeq(r"\x6e", if("1", "\x5c\x786e"), join(files, "\x5c\x786e"))}\n'
+  fourth\x6esecond\x6ethird
+  $ hg log -R a -r 8 --template '{ifeq(if("1", r"\x6e"), "\x5c\x786e", join(files, "\x5c\x786e"))}\n'
+  fourth\x6esecond\x6ethird
+
+  $ hg log -R a -r 8 --template '{join(files, ifeq(branch, "default", "\x5c\x786e"))}\n'
+  fourth\x6esecond\x6ethird
+  $ hg log -R a -r 8 --template '{join(files, ifeq(branch, "default", r"\x5c\x786e"))}\n'
+  fourth\x5c\x786esecond\x5c\x786ethird
+
+  $ hg log -R a -r 3:4 --template '{rev}:{sub(if("1", "\x6e"), ifeq(branch, "foo", r"\x5c\x786e", "\x5c\x786e"), desc)}\n'
+  3:\x6eo user, \x6eo domai\x6e
+  4:\x5c\x786eew bra\x5c\x786ech
+
 Test recursive evaluation:
 
   $ hg init r