Patchwork [2,of,4,py3] revsetlang: work around repr() returning unicode on py3

Submitter Augie Fackler
Date March 21, 2017, 7:13 p.m.
Message ID <63466a54ec9e266031cc.1490123581@arthedain.pit.corp.google.com>
Permalink /patch/19536/
State Accepted

Comments

Augie Fackler - March 21, 2017, 7:13 p.m.
# HG changeset patch
# User Augie Fackler <augie@google.com>
# Date 1489900459 14400
#      Sun Mar 19 01:14:19 2017 -0400
# Node ID 63466a54ec9e266031cce12fbd699ccfccd40738
# Parent  a4745fd9219ed5b408bfc0403a4a8e6acd41df6c
revsetlang: work around repr() returning unicode on py3

I'm not confident in my choice of "encode to utf8" here: it seems
reasonable that we could expect the revset engine to do the right
thing there. On the other hand these strings might have encoding
weirdness to begin with if there's `branch(some-weird-byte-sequence)`.
Yuya Nishihara - March 22, 2017, 1:41 p.m.
On Tue, 21 Mar 2017 15:13:01 -0400, Augie Fackler wrote:
> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1489900459 14400
> #      Sun Mar 19 01:14:19 2017 -0400
> # Node ID 63466a54ec9e266031cce12fbd699ccfccd40738
> # Parent  a4745fd9219ed5b408bfc0403a4a8e6acd41df6c
> revsetlang: work around repr() returning unicode on py3
> 
> I'm not confident in my choice of "encode to utf8" here: it seems
> reasonable that we could expect the revset engine to do the right
> thing there. On the other hand these strings might have encoding
> weirdness to begin with if there's `branch(some-weird-byte-sequence)`.

> --- a/mercurial/revsetlang.py
> +++ b/mercurial/revsetlang.py
> @@ -607,7 +607,11 @@ def formatspec(expr, *args):
>      '''
>  
>      def quote(s):
> -        return repr(str(s))
> +        r = repr(bytes(s))
> +        if pycompat.ispy3:
> +            r = r[1:] # strip off b prefix
> +            return r.encode('utf-8')

utf-8 is the wrong encoding in principle, but it's harmless here because
repr(bytes) never contains non-ASCII characters.

That said, it would be better to use util.escapestr():

  "'" + util.escapestr(s) + "'"

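For comparison, here is a minimal Python 3 sketch of the two quoting
approaches discussed above. It is not the Mercurial code: escapestr() below
is a hypothetical stand-in for util.escapestr(), assumed here to wrap
codecs.escape_encode(), and the names quote_via_repr and quote_via_escape
are made up for illustration. Byte literals are written explicitly because
this standalone snippet does not go through Mercurial's module loader.

```python
import codecs


def quote_via_repr(s):
    # Approach from the patch: repr() of bytes is a str on Python 3,
    # e.g. "b'foo'", so strip the leading "b" and encode back to bytes.
    # repr(bytes) only ever produces ASCII, so 'ascii' is sufficient.
    r = repr(bytes(s))
    return r[1:].encode('ascii')


def escapestr(s):
    # Hypothetical stand-in for util.escapestr(): escapes backslashes,
    # single quotes and non-printable bytes, returning bytes.
    return codecs.escape_encode(s)[0]


def quote_via_escape(s):
    # Approach suggested in the review: always wrap in single quotes.
    return b"'" + escapestr(s) + b"'"


for sample in (b'foo', b'\xff', b"it's"):
    print(sample, quote_via_repr(sample), quote_via_escape(sample))
```

The two forms agree for plain and non-ASCII input, but differ when the
input contains a single quote: repr() switches to double quotes, while the
escapestr()-based form keeps single quotes and escapes the embedded one.
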
Patch

diff --git a/mercurial/revsetlang.py b/mercurial/revsetlang.py
--- a/mercurial/revsetlang.py
+++ b/mercurial/revsetlang.py
@@ -607,7 +607,11 @@  def formatspec(expr, *args):
     '''
 
     def quote(s):
-        return repr(str(s))
+        r = repr(bytes(s))
+        if pycompat.ispy3:
+            r = r[1:] # strip off b prefix
+            return r.encode('utf-8')
+        return r
 
     def argtype(c, arg):
         if c == 'd':