Patchwork [1,of,2] py3: make the string unicode so its iterable in py3k

Submitter Mateusz Kwapich
Date Oct. 8, 2016, 3:55 p.m.
Message ID <225efa4bf7f497e55f0b.1475942145@devvm314.lla2.facebook.com>
Permalink /patch/16939/
State Accepted

Comments

Mateusz Kwapich - Oct. 8, 2016, 3:55 p.m.
# HG changeset patch
# User Mateusz Kwapich <mitrandir@fb.com>
# Date 1475941528 25200
#      Sat Oct 08 08:45:28 2016 -0700
# Node ID 225efa4bf7f497e55f0ba57f64a33dce39eaeb29
# Parent  8f34e217338be6a1b997807521e95f9f7409d722
py3: make the string unicode so its iterable in py3k
Pulkit Goyal - Oct. 8, 2016, 4 p.m.
On Sat, Oct 8, 2016 at 5:55 PM, Mateusz Kwapich <mitrandir@fb.com> wrote:
> # HG changeset patch
> # User Mateusz Kwapich <mitrandir@fb.com>
> # Date 1475941528 25200
> #      Sat Oct 08 08:45:28 2016 -0700
> # Node ID 225efa4bf7f497e55f0ba57f64a33dce39eaeb29
> # Parent  8f34e217338be6a1b997807521e95f9f7409d722
> py3: make the string unicode so its iterable in py3k
>
> diff --git a/mercurial/store.py b/mercurial/store.py
> --- a/mercurial/store.py
> +++ b/mercurial/store.py
> @@ -65,7 +65,7 @@ def _reserved():
>
>      these characters will be escaped by encodefunctions
>      '''
> -    winreserved = [ord(x) for x in '\\:*?"<>|']
> +    winreserved = [ord(x) for x in u'\\:*?"<>|']

ord() accepts both unicode and bytes in the Python 2 world, so it's better
to avoid using pycompat.sysstr() here.

>      for x in range(32):
>          yield x
>      for x in range(126, 256):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Martijn Pieters - Oct. 8, 2016, 4:07 p.m.
On 8 October 2016 at 18:00, Pulkit Goyal <7895pulkit@gmail.com> wrote:

> On Sat, Oct 8, 2016 at 5:55 PM, Mateusz Kwapich <mitrandir@fb.com> wrote:
> > # HG changeset patch
> > # User Mateusz Kwapich <mitrandir@fb.com>
> > # Date 1475941528 25200
> > #      Sat Oct 08 08:45:28 2016 -0700
> > # Node ID 225efa4bf7f497e55f0ba57f64a33dce39eaeb29
> > # Parent  8f34e217338be6a1b997807521e95f9f7409d722
> > py3: make the string unicode so its iterable in py3k
> >
> > diff --git a/mercurial/store.py b/mercurial/store.py
> > --- a/mercurial/store.py
> > +++ b/mercurial/store.py
> > @@ -65,7 +65,7 @@ def _reserved():
> >
> >      these characters will be escaped by encodefunctions
> >      '''
> > -    winreserved = [ord(x) for x in '\\:*?"<>|']
> > +    winreserved = [ord(x) for x in u'\\:*?"<>|']
>
> ord() accepts both unicode and bytes in the Python 2 world, so it's better
> to avoid using pycompat.sysstr() here.


This comment is slightly confusing; you make it sound as if iteration over
a bytestring in Py3 would be okay too (it isn't). With the above change we
avoid iterating over a `bytes` object in Py3 which would be bad.

In Py2 this is an iteration over unicode and then indeed ord() is fine (as
it was when iterating over str).
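
To make the difference concrete, here is a minimal sketch of the behaviour
described above. It assumes the unprefixed literal ends up as a `bytes`
object under Py3, which is what this thread implies. The names
RESERVED_TEXT, RESERVED_BYTES and ord_each are made up for illustration and
do not appear in mercurial/store.py.

```python
# Illustrative only: these names are hypothetical, not taken from store.py.
RESERVED_TEXT = u'\\:*?"<>|'   # unicode literal, as in the patched line
RESERVED_BYTES = b'\\:*?"<>|'  # assumed form of the unprefixed literal on Py3

# Iterating over text yields one-character strings on both Py2 and Py3,
# so ord() works on every item.
codes_from_text = [ord(ch) for ch in RESERVED_TEXT]


def ord_each(seq):
    """Apply ord() to every item of seq; return None if ord() rejects an item."""
    try:
        return [ord(item) for item in seq]
    except TypeError:
        return None


# On Py3, iterating over bytes yields ints, and ord() raises TypeError for
# an int. On Py2, b'...' is plain str, so this call would succeed there.
codes_from_bytes = ord_each(RESERVED_BYTES)

print(codes_from_text)
print(codes_from_bytes)
```

On Py3, list(RESERVED_BYTES) already gives the integer values, so the ord()
call is the part that fails, not the iteration itself.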



> >      for x in range(32):
> >          yield x
> >      for x in range(126, 256):

Patch

diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -65,7 +65,7 @@  def _reserved():
 
     these characters will be escaped by encodefunctions
     '''
-    winreserved = [ord(x) for x in '\\:*?"<>|']
+    winreserved = [ord(x) for x in u'\\:*?"<>|']
     for x in range(32):
         yield x
     for x in range(126, 256):