Patchwork [2,of,2,V3] py3: use namedtuple._replace to produce new tokens

login
register
mail settings
Submitter Martijn Pieters
Date Oct. 14, 2016, 4:55 p.m.
Message ID <d20dd7db86044bdca798.1476464126@mjpieters-mbp>
Download mbox | patch
Permalink /patch/17087/
State Accepted
Headers show

Comments

Martijn Pieters - Oct. 14, 2016, 4:55 p.m.
# HG changeset patch
# User Martijn Pieters <mjpieters@fb.com>
# Date 1476347257 -3600
#      Thu Oct 13 09:27:37 2016 +0100
# Node ID d20dd7db86044bdca79825499b913840d726d841
# Parent  9031460519503abe5dc430c8ece29d198121cd65
py3: use namedtuple._replace to produce new tokens
Pierre-Yves David - Oct. 16, 2016, 2:30 p.m.
On 10/14/2016 06:55 PM, Martijn Pieters wrote:
> # HG changeset patch
> # User Martijn Pieters <mjpieters@fb.com>
> # Date 1476347257 -3600
> #      Thu Oct 13 09:27:37 2016 +0100
> # Node ID d20dd7db86044bdca79825499b913840d726d841
> # Parent  9031460519503abe5dc430c8ece29d198121cd65
> py3: use namedtuple._replace to produce new tokens

We seems to be using a private function of some stdlib type?
Can you elaborate on why this is a good move?

> diff --git a/mercurial/__init__.py b/mercurial/__init__.py
> --- a/mercurial/__init__.py
> +++ b/mercurial/__init__.py
> @@ -233,9 +233,7 @@
>              """
>              st = tokens[j]
>              if st.type == token.STRING and st.string.startswith(("'", '"')):
> -                rt = tokenize.TokenInfo(st.type, 'u%s' % st.string,
> -                                        st.start, st.end, st.line)
> -                tokens[j] = rt
> +                tokens[j] = st._replace(string='u%s' % st.string)
>
>          for i, t in enumerate(tokens):
>              # Convert most string literals to byte literals. String literals
> @@ -266,8 +264,7 @@
>                      continue
>
>                  # String literal. Prefix to make a b'' string.
> -                yield tokenize.TokenInfo(t.type, 'b%s' % s, t.start, t.end,
> -                                          t.line)
> +                yield t._replace(string='b%s' % t.string)
>                  continue
>
>              # Insert compatibility imports at "from __future__ import" line.
> @@ -287,10 +284,8 @@
>                  for u in tokenize.tokenize(io.BytesIO(l).readline):
>                      if u.type in (tokenize.ENCODING, token.ENDMARKER):
>                          continue
> -                    yield tokenize.TokenInfo(u.type, u.string,
> -                                             (r, c + u.start[1]),
> -                                             (r, c + u.end[1]),
> -                                             '')
> +                    yield u._replace(
> +                        start=(r, c + u.start[1]), end=(r, c + u.end[1]))
>                  continue
>
>              # This looks like a function call.
> @@ -322,8 +317,7 @@
>                  # It changes iteritems to items as iteritems is not
>                  # present in Python 3 world.
>                  elif fn == 'iteritems':
> -                    yield tokenize.TokenInfo(t.type, 'items',
> -                                             t.start, t.end, t.line)
> +                    yield t._replace(string='items')
>                      continue
>
>              # Emit unmodified token.
Yuya Nishihara - Oct. 16, 2016, 2:48 p.m.
On Sun, 16 Oct 2016 16:30:04 +0200, Pierre-Yves David wrote:
> On 10/14/2016 06:55 PM, Martijn Pieters wrote:
> > # HG changeset patch
> > # User Martijn Pieters <mjpieters@fb.com>
> > # Date 1476347257 -3600
> > #      Thu Oct 13 09:27:37 2016 +0100
> > # Node ID d20dd7db86044bdca79825499b913840d726d841
> > # Parent  9031460519503abe5dc430c8ece29d198121cd65
> > py3: use namedtuple._replace to produce new tokens
> 
> We seems to be using a private function of some stdlib type?
> Can you elaborate on why this is a good move?

It is a public function. All public functions of namedtuple start with '_'
to avoid conflicts with field names.
Pierre-Yves David - Oct. 16, 2016, 2:51 p.m.
On 10/16/2016 04:48 PM, Yuya Nishihara wrote:
> On Sun, 16 Oct 2016 16:30:04 +0200, Pierre-Yves David wrote:
>> On 10/14/2016 06:55 PM, Martijn Pieters wrote:
>>> # HG changeset patch
>>> # User Martijn Pieters <mjpieters@fb.com>
>>> # Date 1476347257 -3600
>>> #      Thu Oct 13 09:27:37 2016 +0100
>>> # Node ID d20dd7db86044bdca79825499b913840d726d841
>>> # Parent  9031460519503abe5dc430c8ece29d198121cd65
>>> py3: use namedtuple._replace to produce new tokens
>>
>> We seems to be using a private function of some stdlib type?
>> Can you elaborate on why this is a good move?
>
> It is a public function. All public functions of namedtuple start with '_'
> to avoid conflicts with field names.

Haaaaa, much confusion.
Martijn Pieters - Oct. 16, 2016, 9:05 p.m.
On 16 Oct 2016, at 15:30, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> 
>> py3: use namedtuple._replace to produce new tokens
> 
> We seems to be using a private function of some stdlib type?
> Can you elaborate on why this is a good move?

It is not private. This is one of the exceptions to the Python style guide; the namedtuple methods all have a single leading underscore to avoid clashing with the field names. If they didn't, you wouldn't be able to name a field `replace`.

See the https://docs.python.org/3/library/collections.html#collections.namedtuple documentation:

> In addition to the methods inherited from tuples, named tuples support three additional methods and two attributes. To prevent conflicts with field names, the method and attribute names start with an underscore.

and https://docs.python.org/3/library/collections.html#collections.somenamedtuple._replace for the specific method:

> Return a new instance of the named tuple replacing specified fields with new values

--
Martijn
Pierre-Yves David - Oct. 17, 2016, 11:29 a.m.
On 10/16/2016 11:05 PM, Martijn Pieters wrote:
> On 16 Oct 2016, at 15:30, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
>>
>>> py3: use namedtuple._replace to produce new tokens
>>
>> We seems to be using a private function of some stdlib type?
>> Can you elaborate on why this is a good move?
>
> It is not private. This is one of the exceptions to the Python style guide; the namedtuple methods all have a single leading underscore to avoid clashing with the field names. If they didn't, you wouldn't be able to name a field `replace`.

That was probably worth mentioning in the description. Thanks for the 
enlightening, I've accepted it as is.

Cheers

Patch

diff --git a/mercurial/__init__.py b/mercurial/__init__.py
--- a/mercurial/__init__.py
+++ b/mercurial/__init__.py
@@ -233,9 +233,7 @@ 
             """
             st = tokens[j]
             if st.type == token.STRING and st.string.startswith(("'", '"')):
-                rt = tokenize.TokenInfo(st.type, 'u%s' % st.string,
-                                        st.start, st.end, st.line)
-                tokens[j] = rt
+                tokens[j] = st._replace(string='u%s' % st.string)
 
         for i, t in enumerate(tokens):
             # Convert most string literals to byte literals. String literals
@@ -266,8 +264,7 @@ 
                     continue
 
                 # String literal. Prefix to make a b'' string.
-                yield tokenize.TokenInfo(t.type, 'b%s' % s, t.start, t.end,
-                                          t.line)
+                yield t._replace(string='b%s' % t.string)
                 continue
 
             # Insert compatibility imports at "from __future__ import" line.
@@ -287,10 +284,8 @@ 
                 for u in tokenize.tokenize(io.BytesIO(l).readline):
                     if u.type in (tokenize.ENCODING, token.ENDMARKER):
                         continue
-                    yield tokenize.TokenInfo(u.type, u.string,
-                                             (r, c + u.start[1]),
-                                             (r, c + u.end[1]),
-                                             '')
+                    yield u._replace(
+                        start=(r, c + u.start[1]), end=(r, c + u.end[1]))
                 continue
 
             # This looks like a function call.
@@ -322,8 +317,7 @@ 
                 # It changes iteritems to items as iteritems is not
                 # present in Python 3 world.
                 elif fn == 'iteritems':
-                    yield tokenize.TokenInfo(t.type, 'items',
-                                             t.start, t.end, t.line)
+                    yield t._replace(string='items')
                     continue
 
             # Emit unmodified token.