Patchwork D5048: hgdemandimport: port line consuming to Python 2

Submitter phabricator
Date Oct. 13, 2018, 7:28 a.m.
Message ID <differential-rev-PHID-DREV-5oylehvcd2ebahjutqq6-req@phab.mercurial-scm.org>
Permalink /patch/35833/
State New

Comments

phabricator - Oct. 13, 2018, 7:28 a.m.
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This commit rewrites the line-consuming code to call next() on a
  generator in order to appease Python 2.
  
  With this commit, we are now able to emit some tokens on Python 2.7 using
  the Python 3.7 tokenizer. But there are still bugs...
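
  As background (the toy generator below is invented for illustration
  and is not part of the patch): Python 2 iterators expose a .next()
  method rather than .__next__(), so grabbing the bound __next__ method
  as the old code did raises AttributeError on Python 2, while the
  builtin next() behaves the same on both versions:

      from itertools import chain

      def toy_lines():
          yield b'first\n'
          yield b'second\n'

      gen = chain(toy_lines(), iter([b'']))

      # gen.__next__ exists only on Python 3; Python 2 spells it gen.next.
      # The builtin next(gen) works identically on both.
      print(next(gen))  # b'first\n'
      print(next(gen))  # b'second\n'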

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5048

AFFECTED FILES
  hgdemandimport/py3tokenize.py

CHANGE DETAILS

To: indygreg, #hg-reviewers
Cc: mercurial-devel

Patch

diff --git a/hgdemandimport/py3tokenize.py b/hgdemandimport/py3tokenize.py
--- a/hgdemandimport/py3tokenize.py
+++ b/hgdemandimport/py3tokenize.py
@@ -62,6 +62,7 @@ 
 # * Adjusted for relative imports.
 # * absolute_import added.
 # * Removed re.ASCII.
+# * Various backports to work on Python 2.7.
 
 from __future__ import absolute_import
 
@@ -256,7 +257,7 @@ 
 class StopTokenizing(Exception): pass
 
 
-class Untokenizer:
+class Untokenizer(object):
 
     def __init__(self):
         self.tokens = []
@@ -503,11 +504,25 @@ 
     """
     # This import is here to avoid problems when the itertools module is not
     # built yet and tokenize is imported.
-    from itertools import chain, repeat
+    from itertools import repeat
     encoding, consumed = detect_encoding(readline)
-    rl_gen = iter(readline, b"")
-    empty = repeat(b"")
-    return _tokenize(chain(consumed, rl_gen, empty).__next__, encoding)
+
+    def lines():
+        for line in consumed:
+            yield line
+
+        while True:
+            try:
+                yield readline()
+            except StopIteration:
+                break
+
+        # Yield b'' forever (not the repeat iterator itself), matching
+        # the original chain(consumed, rl_gen, repeat(b'')) behavior.
+        for line in repeat(b''):
+            yield line
+
+    return _tokenize(lines(), encoding)
 
 
 def _tokenize(readline, encoding):
@@ -531,7 +546,7 @@ 
             # hence `line` itself will always be overwritten at the end
             # of this loop.
             last_line = line
-            line = readline()
+            line = next(readline)
         except StopIteration:
             line = b''
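
For reference, here is a self-contained sketch of the ported
line-consuming pattern (the make_lines() helper and the sample input
are invented for illustration; the patch inlines this logic in
tokenize(), and its loop relies on readline raising StopIteration
rather than on the empty-bytes EOF check shown here):

    from itertools import repeat
    import io

    def make_lines(readline, consumed):
        # Generator equivalent of chain(consumed, iter(readline, b''),
        # repeat(b'')), consumable with next() on Python 2 and 3 alike.
        def lines():
            for line in consumed:
                yield line
            while True:
                try:
                    line = readline()
                except StopIteration:
                    break
                if not line:
                    # Stop at EOF, like the iter(readline, b'') sentinel.
                    break
                yield line
            # Pad with b'' forever, as repeat(b'') did in the original.
            for line in repeat(b''):
                yield line
        return lines()

    buf = io.BytesIO(b'a = 1\nb = 2\n')
    gen = make_lines(buf.readline, [b'# coding: utf-8\n'])
    print(next(gen))  # b'# coding: utf-8\n'
    print(next(gen))  # b'a = 1\n'
    print(next(gen))  # b'b = 2\n'
    print(next(gen))  # b'', and on every later call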