Patchwork [02,of,11,V2] util: replace 'ellipsis' implementation by 'encoding.trim'

login
register
mail settings
Submitter Katsunori FUJIWARA
Date July 5, 2014, 6 p.m.
Message ID <d8ef3634ea3d248fca19.1404583201@feefifofum>
Download mbox | patch
Permalink /patch/5106/
State Accepted
Commit 86c2d792a4b782a379ebf4f71d2267af1df681a1
Headers show

Comments

Katsunori FUJIWARA - July 5, 2014, 6 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1404583001 -32400
#      Sun Jul 06 02:56:41 2014 +0900
# Node ID d8ef3634ea3d248fca19054f8c3af06e622eb575
# Parent  fd924cea4eb92cd9445f3e370346f1eaee0647ac
util: replace 'ellipsis' implementation by 'encoding.trim'

Before this patch, 'util.ellipsis' tried to avoid splitting at
intermediate multi-byte sequence, but its implementation was incorrect.

Internal function '_ellipsis' trims specified unicode sequence not at
most maxlength 'columns in display', but at most maxlength number of
'unicode characters'.

    def _ellipsis(text, maxlength):
        if len(text) <= maxlength:
            return text, False
        else:
            return "%s..." % (text[:maxlength - 3]), True

In many encodings, number of unicode characters can be different from
columns in display.

This patch replaces 'ellipsis' implementation by 'encoding.trim',
which can trim string at most maxlength columns in display correctly,
even though specified string contains multi-byte characters.

'_ellipsis' is removed in this patch, because it is referred only from
'ellipsis'.

Patch

diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -1318,23 +1318,9 @@ 
         r = None
     return author[author.find('<') + 1:r]
 
-def _ellipsis(text, maxlength):
-    if len(text) <= maxlength:
-        return text, False
-    else:
-        return "%s..." % (text[:maxlength - 3]), True
-
 def ellipsis(text, maxlength=400):
-    """Trim string to at most maxlength (default: 400) characters."""
-    try:
-        # use unicode not to split at intermediate multi-byte sequence
-        utext, truncated = _ellipsis(text.decode(encoding.encoding),
-                                     maxlength)
-        if not truncated:
-            return text
-        return utext.encode(encoding.encoding)
-    except (UnicodeDecodeError, UnicodeEncodeError):
-        return _ellipsis(text, maxlength)[0]
+    """Trim string to at most maxlength (default: 400) columns in display."""
+    return encoding.trim(text, maxlength, ellipsis='...')
 
 def unitcountfn(*unittable):
     '''return a function that renders a readable count of some quantity'''