Patchwork registrar: make format strings unicodes and not bytes

Submitter Augie Fackler
Date Oct. 7, 2016, 12:49 p.m.
Message ID <c79b21f1d5dea9de7195.1475844540@augie-macbookair2.roam.corp.google.com>
Permalink /patch/16884/
State Superseded

Comments

Augie Fackler - Oct. 7, 2016, 12:49 p.m.
# HG changeset patch
# User Augie Fackler <augie@google.com>
# Date 1475843538 14400
#      Fri Oct 07 08:32:18 2016 -0400
# Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
# Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
registrar: make format strings unicodes and not bytes

Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
break anything on Python 2.
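
A minimal sketch of the Python 3 failure described above, assuming the format string ends up as bytes while the docstring stays a native str. The names `example_predicate` and `format_doc` are hypothetical and only mirror the shape of `_docformat % (name, doc)`; this is not Mercurial's code.

```python
def example_predicate():
    """Hypothetical docstring; on Python 3 this is always a str."""


def format_doc(docformat, name, func):
    # Illustrative stand-in for "_docformat % (name, doc)".
    return docformat % (name, func.__doc__)


# A bytes format string cannot interpolate str arguments on Python 3.
try:
    format_doc(b"``%s``\n    %s", "example", example_predicate)
    bytes_format_ok = True
except TypeError:
    bytes_format_ok = False

# A text format string works with the str docstring.
text_result = format_doc(u"``%s``\n    %s", "example", example_predicate)
```

On Python 3 the bytes variant raises TypeError, while the `u""` variant formats normally.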
Gregory Szorc - Oct. 7, 2016, 1:18 p.m.
LGTM. As a refresher, we don't rewrite docstrings to b'' in the module
transformer because Python 3 doesn't like that. Python 2 accepts both str
and unicode for docstrings.

On Fri, Oct 7, 2016 at 2:49 PM, Augie Fackler <raf@durin42.com> wrote:

> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1475843538 14400
> #      Fri Oct 07 08:32:18 2016 -0400
> # Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
> # Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
> registrar: make format strings unicodes and not bytes
>
> Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
> break anything on Python 2.
>
> diff --git a/mercurial/registrar.py b/mercurial/registrar.py
> --- a/mercurial/registrar.py
> +++ b/mercurial/registrar.py
> @@ -121,7 +121,7 @@ class revsetpredicate(_funcregistrarbase
>      Otherwise, explicit 'revset.loadpredicate()' is needed.
>      """
>      _getname = _funcregistrarbase._parsefuncdecl
> -    _docformat = "``%s``\n    %s"
> +    _docformat = u"``%s``\n    %s"
>
>      def _extrasetup(self, name, func, safe=False, takeorder=False):
>          func._safe = safe
> @@ -160,7 +160,7 @@ class filesetpredicate(_funcregistrarbas
>      Otherwise, explicit 'fileset.loadpredicate()' is needed.
>      """
>      _getname = _funcregistrarbase._parsefuncdecl
> -    _docformat = "``%s``\n    %s"
> +    _docformat = u"``%s``\n    %s"
>
>      def _extrasetup(self, name, func, callstatus=False,
> callexisting=False):
>          func._callstatus = callstatus
> @@ -169,7 +169,7 @@ class filesetpredicate(_funcregistrarbas
>  class _templateregistrarbase(_funcregistrarbase):
>      """Base of decorator to register functions as template specific one
>      """
> -    _docformat = ":%s: %s"
> +    _docformat = u":%s: %s"
>
>  class templatekeyword(_templateregistrarbase):
>      """Decorator to register template keyword
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Martijn Pieters - Oct. 7, 2016, 3:01 p.m.
On 7 October 2016 at 14:49, Augie Fackler <raf@durin42.com> wrote:

> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1475843538 14400
> #      Fri Oct 07 08:32:18 2016 -0400
> # Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
> # Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
> registrar: make format strings unicodes and not bytes
>
> Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
> break anything on Python 2.
>

This will break in Python 2 if one of the two interpolated strings is not
ASCII-decodable.

These strings should be `str` in 2, `str` in 3.
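
A sketch of the Python 2 failure mode, written for Python 3 so it can be run today. It assumes Python 2's default ASCII codec: when a byte string is interpolated into a `u""` format, Python 2 implicitly decodes it as ASCII. The helper `py2_unicode_interpolate` is hypothetical and only emulates that implicit step.

```python
def py2_unicode_interpolate(fmt, *args):
    """Emulate Python 2's implicit ASCII decode of byte-string arguments."""
    decoded = tuple(
        a.decode("ascii") if isinstance(a, bytes) else a for a in args
    )
    return fmt % decoded


ascii_doc = b"plain ASCII help text"
utf8_doc = u"caf\u00e9".encode("utf-8")  # contains non-ASCII bytes

# ASCII-only byte strings interpolate without trouble.
ok = py2_unicode_interpolate(u":%s: %s", b"keyword", ascii_doc)

# Non-ASCII byte strings fail during the implicit decode.
try:
    py2_unicode_interpolate(u":%s: %s", b"keyword", utf8_doc)
    failed = False
except UnicodeDecodeError:
    failed = True
```

The second call is the case to worry about: a docstring or name holding non-ASCII bytes would raise UnicodeDecodeError.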

Yuya Nishihara - Oct. 7, 2016, 3:10 p.m.
On Fri, 7 Oct 2016 17:01:19 +0200, Martijn Pieters wrote:
> On 7 October 2016 at 14:49, Augie Fackler <raf@durin42.com> wrote:
> 
> > # HG changeset patch
> > # User Augie Fackler <augie@google.com>
> > # Date 1475843538 14400
> > #      Fri Oct 07 08:32:18 2016 -0400
> > # Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
> > # Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
> > registrar: make format strings unicodes and not bytes
> >
> > Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
> > break anything on Python 2.
> >
> 
> This will break in Python 2 if one of the two interpolated strings is not
> ASCII-decodable.
> 
> These strings should be `str` in 2, `str` in 3.

Good point. So we can use """ instead of " to trick the importer?
Augie Fackler - Oct. 7, 2016, 3:11 p.m.
> On Oct 7, 2016, at 17:10, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Fri, 7 Oct 2016 17:01:19 +0200, Martijn Pieters wrote:
>> On 7 October 2016 at 14:49, Augie Fackler <raf@durin42.com> wrote:
>> 
>>> # HG changeset patch
>>> # User Augie Fackler <augie@google.com>
>>> # Date 1475843538 14400
>>> #      Fri Oct 07 08:32:18 2016 -0400
>>> # Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
>>> # Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
>>> registrar: make format strings unicodes and not bytes
>>> 
>>> Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
>>> break anything on Python 2.
>>> 
>> 
>> This will break in Python 2 if one of the two interpolated strings is not
>> ASCII-decodable.
>> 
>> These strings should be `str` in 2, `str` in 3.
> 
> Good point. So we can use """ instead of " to trick the importer?

Clever. I mailed a v2 that uses sysstr, which seems like it's probably good enough? Maybe even better than depending on the subtlety of how we're handling triple-quoted strings.
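
A rough sketch of what a sysstr-style helper could look like, so that the format string is the native `str` on each Python version. This is an assumption for illustration, not the v2 patch and not Mercurial's actual helper; the name `to_native_str` and the choice of Latin-1 are hypothetical.

```python
import sys


def to_native_str(s):
    """Return s as the native str type of the running interpreter."""
    if sys.version_info[0] >= 3 and isinstance(s, bytes):
        # Latin-1 maps every byte to a code point, so this never raises.
        return s.decode("latin-1")
    # On Python 2, bytes is already the native str; return it unchanged.
    return s


# The format string becomes a native str, matching native-str docstrings.
docformat = to_native_str(b":%s: %s")
rendered = docformat % ("keyword", "help text")
```

With this approach the literal can stay bytes in the source, and the conversion to the native type is explicit at the point of use.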
Yuya Nishihara - Oct. 7, 2016, 3:23 p.m.
On Fri, 7 Oct 2016 17:11:50 +0200, Augie Fackler wrote:
> > On Oct 7, 2016, at 17:10, Yuya Nishihara <yuya@tcha.org> wrote:
> > On Fri, 7 Oct 2016 17:01:19 +0200, Martijn Pieters wrote:
> >> On 7 October 2016 at 14:49, Augie Fackler <raf@durin42.com> wrote:
> >> 
> >>> # HG changeset patch
> >>> # User Augie Fackler <augie@google.com>
> >>> # Date 1475843538 14400
> >>> #      Fri Oct 07 08:32:18 2016 -0400
> >>> # Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
> >>> # Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
> >>> registrar: make format strings unicodes and not bytes
> >>> 
> >>> Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
> >>> break anything on Python 2.
> >>> 
> >> 
> >> This will break in Python 2 if one of the two interpolated strings is not
> >> ASCII-decodable.
> >> 
> >> These strings should be `str` in 2, `str` in 3.
> > 
> > Good point. So we can use """ instead of " to trick the importer?
> 
> Clever. I mailed a v2 that uses sysstr, which seems like it's probably good enough? Maybe even better than depending on the subtlety of how we're handling triple-quoted strings.

That seems also fine.
Augie Fackler - Oct. 7, 2016, 3:28 p.m.
> On Oct 7, 2016, at 17:23, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Fri, 7 Oct 2016 17:11:50 +0200, Augie Fackler wrote:
>>> On Oct 7, 2016, at 17:10, Yuya Nishihara <yuya@tcha.org> wrote:
>>> On Fri, 7 Oct 2016 17:01:19 +0200, Martijn Pieters wrote:
>>>> On 7 October 2016 at 14:49, Augie Fackler <raf@durin42.com> wrote:
>>>> 
>>>>> # HG changeset patch
>>>>> # User Augie Fackler <augie@google.com>
>>>>> # Date 1475843538 14400
>>>>> #      Fri Oct 07 08:32:18 2016 -0400
>>>>> # Node ID c79b21f1d5dea9de719504be30ebdb5635263d37
>>>>> # Parent  f3a2125968377fb1d4b9ea3f4917260d5aca3536
>>>>> registrar: make format strings unicodes and not bytes
>>>>> 
>>>>> Fixes issues on Python 3, wherein docstrings are unicodes. Shouldn't
>>>>> break anything on Python 2.
>>>>> 
>>>> 
>>>> This will break in Python 2 if one of the two interpolated strings is not
>>>> ASCII-decodable.
>>>> 
>>>> These strings should be `str` in 2, `str` in 3.
>>> 
>>> Good point. So we can use """ instead of " to trick the importer?
>> 
>> Clever. I mailed a v2 that uses sysstr, which seems like it's probably good enough? Maybe even better than depending on the subtlety of how we're handling triple-quoted strings.
> 
> That seems also fine.

Shall I go ahead and push that version then?
Martijn Pieters - Oct. 7, 2016, 4:20 p.m.
On 7 October 2016 at 17:28, Augie Fackler <raf@durin42.com> wrote:
>
> >>>> These strings should be `str` in 2, `str` in 3.
> >>>
> >>> Good point. So we can use """ instead of " to trick the importer?
> >>
> >> Clever. I mailed a v2 that uses sysstr, which seems like it's probably good enough? Maybe even better than depending on the subtlety of how we're handling triple-quoted strings.
> >
> > That seems also fine.
> Shall I go ahead and push that version then?
>

I'd prefer the explicit version, yes.
Yuya Nishihara - Oct. 8, 2016, 6:27 a.m.
On Fri, 7 Oct 2016 18:20:14 +0200, Martijn Pieters wrote:
> On 7 October 2016 at 17:28, Augie Fackler <raf@durin42.com> wrote:
> >
> > >>>> These strings should be `str` in 2, `str` in 3.
> > >>>
> > >>> Good point. So we can use """ instead of " to trick the importer?
> > >>
> > >> Clever. I mailed a v2 that uses sysstr, which seems like it's probably good enough? Maybe even better than depending on the subtlety of how we're handling triple-quoted strings.
> > >
> > > That seems also fine.
> > Shall I go ahead and push that version then?
> >
> 
> I'd prefer the explicit version, yes.

+1

Patch

diff --git a/mercurial/registrar.py b/mercurial/registrar.py
--- a/mercurial/registrar.py
+++ b/mercurial/registrar.py
@@ -121,7 +121,7 @@  class revsetpredicate(_funcregistrarbase
     Otherwise, explicit 'revset.loadpredicate()' is needed.
     """
     _getname = _funcregistrarbase._parsefuncdecl
-    _docformat = "``%s``\n    %s"
+    _docformat = u"``%s``\n    %s"
 
     def _extrasetup(self, name, func, safe=False, takeorder=False):
         func._safe = safe
@@ -160,7 +160,7 @@  class filesetpredicate(_funcregistrarbas
     Otherwise, explicit 'fileset.loadpredicate()' is needed.
     """
     _getname = _funcregistrarbase._parsefuncdecl
-    _docformat = "``%s``\n    %s"
+    _docformat = u"``%s``\n    %s"
 
     def _extrasetup(self, name, func, callstatus=False, callexisting=False):
         func._callstatus = callstatus
@@ -169,7 +169,7 @@  class filesetpredicate(_funcregistrarbas
 class _templateregistrarbase(_funcregistrarbase):
     """Base of decorator to register functions as template specific one
     """
-    _docformat = ":%s: %s"
+    _docformat = u":%s: %s"
 
 class templatekeyword(_templateregistrarbase):
     """Decorator to register template keyword