Patchwork templatekw: workaround for utf-8 round-trip of {desc}

Submitter Yuya Nishihara
Date Feb. 25, 2016, 3:55 p.m.
Message ID <0dc2f0669300594a30d5.1456415723@mimosa>
Permalink /patch/13391/
State Accepted

Comments

Yuya Nishihara - Feb. 25, 2016, 3:55 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1451215297 -32400
#      Sun Dec 27 20:21:37 2015 +0900
# Node ID 0dc2f0669300594a30d5eb845cf41a248a082d47
# Parent  8b4ad169d564f2a45d1897b850da475b7ad08d20
templatekw: workaround for utf-8 round-trip of {desc}

Though our encoding strategy is best effort, {desc} is a primitive keyword,
so it is worth trying hard to preserve its UTF-8 bytes.
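
To illustrate why a plain strip() can lose the UTF-8 bytes, here is a minimal
Python 3 sketch. It does not use Mercurial itself: LocalStr, tolocal,
fromlocal and show_description are simplified, hypothetical stand-ins for the
helpers touched by this patch, assuming the local string type carries a copy
of the original UTF-8 bytes and that string methods return a plain value
without that copy. The real encoding module handles more cases.

```python
import json


class LocalStr(bytes):
    """Hypothetical stand-in: local-encoded bytes that also remember
    the original UTF-8 bytes they were converted from."""

    def __new__(cls, utf8, local):
        s = bytes.__new__(cls, local)
        s._utf8 = utf8
        return s


def tolocal(utf8, enc="ascii"):
    """Convert UTF-8 bytes to the local encoding (simplified assumption)."""
    text = utf8.decode("utf-8")
    try:
        return text.encode(enc)
    except UnicodeEncodeError:
        # Lossy conversion: keep the original UTF-8 bytes alongside.
        return LocalStr(utf8, text.encode(enc, "replace"))


def fromlocal(s, enc="ascii"):
    """Convert local bytes back to UTF-8, using the saved copy if present."""
    if isinstance(s, LocalStr):
        return s._utf8
    return s.decode(enc).encode("utf-8")


def show_description(desc):
    """Strip in UTF-8 space, then convert back, mirroring the patch's idea."""
    if isinstance(desc, LocalStr):
        return tolocal(fromlocal(desc).strip())
    return desc.strip()


desc = tolocal("non-ascii branch: \u00e9\n".encode("utf-8"))

# bytes.strip() returns plain bytes, so the saved UTF-8 copy is dropped.
naive = desc.strip()
naive_json = json.dumps(fromlocal(naive).decode("utf-8"))

# Stripping the UTF-8 form first keeps the round trip intact.
fixed = show_description(desc)
fixed_json = json.dumps(fromlocal(fixed).decode("utf-8"))

print(naive_json)
print(fixed_json)
```

In this sketch the naive result serializes with a "?" in place of the
non-ASCII character, while the round-tripped result serializes it as \u00e9,
which is the behavior the added {desc|json} test checks for.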
Sean Farley - Feb. 25, 2016, 7:21 p.m.
Yuya Nishihara <yuya@tcha.org> writes:

> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1451215297 -32400
> #      Sun Dec 27 20:21:37 2015 +0900
> # Node ID 0dc2f0669300594a30d5eb845cf41a248a082d47
> # Parent  8b4ad169d564f2a45d1897b850da475b7ad08d20
> templatekw: workaround for utf-8 round-trip of {desc}
>
> Though our encoding strategy is best effort, {desc} is a primitive keyword,
> so it is worth trying hard to preserve its UTF-8 bytes.

Sadly, looks good to me. Thanks for the test, too.

Patch

diff --git a/mercurial/templatekw.py b/mercurial/templatekw.py
--- a/mercurial/templatekw.py
+++ b/mercurial/templatekw.py
@@ -9,6 +9,7 @@  from __future__ import absolute_import
 
 from .node import hex, nullid
 from . import (
+    encoding,
     error,
     hbisect,
     patch,
@@ -257,7 +258,12 @@  def showdate(repo, ctx, templ, **args):
 
 def showdescription(repo, ctx, templ, **args):
     """:desc: String. The text of the changeset description."""
-    return ctx.description().strip()
+    s = ctx.description()
+    if isinstance(s, encoding.localstr):
+        # try hard to preserve utf-8 bytes
+        return encoding.tolocal(encoding.fromlocal(s).strip())
+    else:
+        return s.strip()
 
 def showdiffstat(repo, ctx, templ, **args):
     """:diffstat: String. Statistics of changes with the following format:
diff --git a/tests/test-command-template.t b/tests/test-command-template.t
--- a/tests/test-command-template.t
+++ b/tests/test-command-template.t
@@ -3556,12 +3556,14 @@  Set up repository for non-ascii encoding
   > open('utf-8', 'w').write('\xc3\xa9')
   > EOF
   $ HGENCODING=utf-8 hg branch -q `cat utf-8`
-  $ HGENCODING=utf-8 hg ci -qAm 'non-ascii branch' utf-8
+  $ HGENCODING=utf-8 hg ci -qAm "non-ascii branch: `cat utf-8`" utf-8
 
 json filter should try round-trip conversion to utf-8:
 
   $ HGENCODING=ascii hg log -T "{branch|json}\n" -r0
   "\u00e9"
+  $ HGENCODING=ascii hg log -T "{desc|json}\n" -r0
+  "non-ascii branch: \u00e9"
 
 json filter takes input as utf-8b: