Patchwork [STABLE] extdiff: escape path for docstring (issue5301)

login
register
mail settings
Submitter Matt Mackall
Date July 25, 2016, 9:26 p.m.
Message ID <13a9fb0d88ac8544e28b.1469481963@ruin.waste.org>
Download mbox | patch
Permalink /patch/15987/
State Accepted
Headers show

Comments

Matt Mackall - July 25, 2016, 9:26 p.m.
# HG changeset patch
# User Matt Mackall <mpm@selenic.com>
# Date 1468877135 18000
#      Mon Jul 18 16:25:35 2016 -0500
# Node ID 13a9fb0d88ac8544e28b17e1663315d31f11859b
# Parent  953839de96ab574caa40557c542c262286c6287c
extdiff: escape path for docstring (issue5301)

The existing code (a) assumed path would be specified in
encoding.encoding and (b) assumed unicode() objects wouldn't cause
other parts of Mercurial to blow up. Both are dangerous assumptions.

Since we don't know the encoding of path and can't pass non-ASCII
through docstrings, just escape the path and drop the early _(). Will
have to suffice until we can teach docstrings to handle UTF-8b
escaping.

This has the side-effect that the line containing the path is now
variable by the time it reaches _() and thus can't be translated.
Yuya Nishihara - July 26, 2016, 12:58 p.m.
On Mon, 25 Jul 2016 16:26:03 -0500, Matt Mackall wrote:
> # HG changeset patch
> # User Matt Mackall <mpm@selenic.com>
> # Date 1468877135 18000
> #      Mon Jul 18 16:25:35 2016 -0500
> # Node ID 13a9fb0d88ac8544e28b17e1663315d31f11859b
> # Parent  953839de96ab574caa40557c542c262286c6287c
> extdiff: escape path for docstring (issue5301)

Removed unused import and queued this, thanks.

> This has the side-effect that the line containing the path is now
> variable by the time it reaches _() and thus can't be translated.

That's unfortunate, but the direction of this patch seems legit. Maybe
we can rephrase the docstring to not include %(path)s in sentence.
Matt Mackall - July 26, 2016, 5:45 p.m.
On Tue, 2016-07-26 at 21:58 +0900, Yuya Nishihara wrote:
> On Mon, 25 Jul 2016 16:26:03 -0500, Matt Mackall wrote:
> > 
> > # HG changeset patch
> > # User Matt Mackall <mpm@selenic.com>
> > # Date 1468877135 18000
> > #      Mon Jul 18 16:25:35 2016 -0500
> > # Node ID 13a9fb0d88ac8544e28b17e1663315d31f11859b
> > # Parent  953839de96ab574caa40557c542c262286c6287c
> > extdiff: escape path for docstring (issue5301)
> Removed unused import and queued this, thanks.

Seems my pyflakes setup is still busted.

> > This has the side-effect that the line containing the path is now
> > variable by the time it reaches _() and thus can't be translated.
> That's unfortunate, but the direction of this patch seems legit. Maybe
> we can rephrase the docstring to not include %(path)s in sentence.

That's probably the right answer, since we have exactly one user at present.
Yuya Nishihara - July 27, 2016, 1:37 p.m.
On Tue, 26 Jul 2016 12:45:00 -0500, Matt Mackall wrote:
> On Tue, 2016-07-26 at 21:58 +0900, Yuya Nishihara wrote:
> > On Mon, 25 Jul 2016 16:26:03 -0500, Matt Mackall wrote:
> > > This has the side-effect that the line containing the path is now
> > > variable by the time it reaches _() and thus can't be translated.
> > That's unfortunate, but the direction of this patch seems legit. Maybe
> > we can rephrase the docstring to not include %(path)s in sentence.
> 
> That's probably the right answer, since we have exactly one user at present.

I'll send patches when the freeze is over.

I want to refactor save() closure to a top-level class so that its __doc__
can be collected. Otherwise, we couldn't introduce new i18n strings in addition
to the existing extdiff() command.
Matt Mackall - July 27, 2016, 7:23 p.m.
On Wed, 2016-07-27 at 22:37 +0900, Yuya Nishihara wrote:
> On Tue, 26 Jul 2016 12:45:00 -0500, Matt Mackall wrote:
> > 
> > On Tue, 2016-07-26 at 21:58 +0900, Yuya Nishihara wrote:
> > > 
> > > On Mon, 25 Jul 2016 16:26:03 -0500, Matt Mackall wrote:
> > > > 
> > > > This has the side-effect that the line containing the path is now
> > > > variable by the time it reaches _() and thus can't be translated.
> > > That's unfortunate, but the direction of this patch seems legit. Maybe
> > > we can rephrase the docstring to not include %(path)s in sentence.
> > That's probably the right answer, since we have exactly one user at present.
> I'll send patches when the freeze is over.

Yeah, this certainly isn't pressing.

Patch

diff -r 953839de96ab -r 13a9fb0d88ac hgext/extdiff.py
--- a/hgext/extdiff.py	Sun Jul 17 15:13:51 2016 -0700
+++ b/hgext/extdiff.py	Mon Jul 18 16:25:35 2016 -0500
@@ -365,7 +365,10 @@ 
                 if options:
                     options = ' ' + options
                 return dodiff(ui, repo, cmdline + options, pats, opts)
-            doc = _('''\
+            # We can't pass non-ASCII through docstrings (and path is
+            # in an unknown encoding anyway)
+            docpath = path.encode("string-escape")
+            mydiff.__doc__ = '''\
 use %(path)s to diff repository (or selected files)
 
     Show differences between revisions for the specified files, using
@@ -376,15 +379,7 @@ 
     that revision is compared to the working directory, and, when no
     revisions are specified, the working directory files are compared
     to its parent.\
-''') % {'path': util.uirepr(path)}
-
-            # We must translate the docstring right away since it is
-            # used as a format string. The string will unfortunately
-            # be translated again in commands.helpcmd and this will
-            # fail when the docstring contains non-ASCII characters.
-            # Decoding the string to a Unicode string here (using the
-            # right encoding) prevents that.
-            mydiff.__doc__ = doc.decode(encoding.encoding)
+''' % {'path': util.uirepr(docpath)}
             return mydiff
         command(cmd, extdiffopts[:], _('hg %s [OPTION]... [FILE]...') % cmd,
                 inferrepo=True)(save(cmdline))
diff -r 953839de96ab -r 13a9fb0d88ac tests/test-extdiff.t
--- a/tests/test-extdiff.t	Sun Jul 17 15:13:51 2016 -0700
+++ b/tests/test-extdiff.t	Mon Jul 18 16:25:35 2016 -0500
@@ -389,3 +389,23 @@ 
   $ cd ..
 
 #endif
+
+Test handling of non-ASCII paths in generated docstrings (issue5301)
+
+  >>> open("u", "w").write("\xa5\xa5")
+  $ U=`cat u`
+
+  $ HGPLAIN=1 hg --config hgext.extdiff= --config extdiff.cmd.td=hi help -k xyzzy
+  abort: no matches
+  (try "hg help" for a list of topics)
+  [255]
+
+  $ HGPLAIN=1 hg --config hgext.extdiff= --config extdiff.cmd.td=hi help td > /dev/null
+
+  $ LC_MESSAGES=ja_JP.UTF-8 hg --config hgext.extdiff= --config extdiff.cmd.td=$U help -k xyzzy
+  abort: no matches
+  (try "hg help" for a list of topics)
+  [255]
+
+  $ LC_MESSAGES=ja_JP.UTF-8 hg --config hgext.extdiff= --config extdiff.cmd.td=$U help td | grep "^use"
+  use '\xa5\xa5' to diff repository (or selected files)