Patchwork D299: pycompat: introduce a wrapper for __builtins__.{raw_,}input()

login
register
mail settings
Submitter phabricator
Date Aug. 9, 2017, 2:25 p.m.
Message ID <differential-rev-PHID-DREV-rxlxdwz3rbnxxijdzrkx-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/22798/
State Superseded
Headers show

Comments

phabricator - Aug. 9, 2017, 2:25 p.m.
durin42 created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  In order to make this work, we have to wrap the io streams in a
  TextIOWrapper so that __builtins__.input() can do unicode IO on Python
  
  3. We can't just restore the original (unicode) sys.std* because we
  
  might be running a cmdserver, and if we blindly restore sys.* to the
  original values then we end up breaking the cmdserver. Sadly,
  TextIOWrapper tries to close the underlying stream during its __del__,
  so we have to make a sublcass to prevent that.
  
  If you see errors like:
  
  TypeError: a bytes-like object is required, not 'str'
  
  On an input() or print() call on Python 3, the substitution of
  sys.std* is probably the root cause.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

AFFECTED FILES
  hgext/hgk.py
  mercurial/pycompat.py
  mercurial/ui.py

CHANGE DETAILS




To: durin42, #hg-reviewers
Cc: mercurial-devel
phabricator - Aug. 9, 2017, 3:41 p.m.
yuja added inline comments.

INLINE COMMENTS

> pycompat.py:84
> +                    setattr(sys, stream, noclosetextio(s))
> +            return bytestr(input(*args, **kwargs))
> +        finally:

Needs to specify encoding because user input may contain
non-ascii characters.

Perhaps it should be `noclosetextio(s, encoding=encoding.encoding)` and
`encoding.strtolocal(input(...))`.
Alternatively, forcing `latin-1` might work, but I'm not sure.

> ui.py:1222
>          with self.timeblockedsection('stdio'):
> -            line = raw_input(' ')
> +            line = pycompat.bytesinput(r' ')
>          sys.stdin = oldin

I think it's better to specify input and output explicitly.

`bytesinput(..., self.fin, self.fout)`

Setting BytesIO to `sys.stdin/stdout` is just plain wrong on Python 3.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

To: durin42, #hg-reviewers
Cc: yuja, mercurial-devel
phabricator - Aug. 11, 2017, 7:52 p.m.
durin42 marked an inline comment as done.
durin42 added inline comments.

INLINE COMMENTS

> yuja wrote in pycompat.py:84
> Needs to specify encoding because user input may contain
> non-ascii characters.
> 
> Perhaps it should be `noclosetextio(s, encoding=encoding.encoding)` and
> `encoding.strtolocal(input(...))`.
> Alternatively, forcing `latin-1` might work, but I'm not sure.

Ugh, gross. I can't use encoding from pycompat, can I?

> yuja wrote in ui.py:1222
> I think it's better to specify input and output explicitly.
> 
> `bytesinput(..., self.fin, self.fout)`
> 
> Setting BytesIO to `sys.stdin/stdout` is just plain wrong on Python 3.

Yeah, it's pretty clowny.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

To: durin42, #hg-reviewers
Cc: yuja, mercurial-devel
phabricator - Aug. 12, 2017, 10:31 a.m.
yuja added inline comments.

INLINE COMMENTS

> hgk.py:100
>              try:
> -                line = raw_input().split(' ')
> +                line = pycompat.bytesinput(sys.stdin, sys.stdout).split(' ')
>                  node1 = line[0]

Perhaps hgk should use ui.fin and ui.fout.

> durin42 wrote in pycompat.py:84
> Ugh, gross. I can't use encoding from pycompat, can I?

Nah. So this could be `util.bytesinput(fin, fout)` and `encoding.strio()`.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

To: durin42, #hg-reviewers
Cc: yuja, mercurial-devel
phabricator - Aug. 15, 2017, 8:50 p.m.
durin42 marked 5 inline comments as done.
durin42 added inline comments.

INLINE COMMENTS

> yuja wrote in pycompat.py:84
> Nah. So this could be `util.bytesinput(fin, fout)` and `encoding.strio()`.

Sigh. I'm not sure I got the encoding.py bit quite right, take a look?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

To: durin42, #hg-reviewers
Cc: yuja, mercurial-devel
phabricator - Aug. 16, 2017, 3:34 a.m.
yuja accepted this revision.
yuja added inline comments.
This revision is now accepted and ready to land.

INLINE COMMENTS

> util.py:180
> +            sys.stdin, sys.stdout = encoding.strio(fin), encoding.strio(fout)
> +            return pycompat.bytestr(input(*args, **kwargs))
> +        else:

s/pycompat.bytestr/encoding.strtolocal/ and queued, thanks.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

To: durin42, #hg-reviewers, yuja
Cc: yuja, mercurial-devel
phabricator - Aug. 16, 2017, 4:41 a.m.
yuja added inline comments.

INLINE COMMENTS

> encoding.py:585
> +    def __init__(self, buffer, **kwargs):
> +        kwargs[r'encoding'] = encoding
> +        super(strio, self).__init__(buffer, **kwargs)

Fixed as `s/encoding/_sysstr(encoding)/`.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D299

To: durin42, #hg-reviewers, yuja
Cc: yuja, mercurial-devel

Patch

diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -1219,7 +1219,7 @@ 
         # prompt ' ' must exist; otherwise readline may delete entire line
         # - http://bugs.python.org/issue12833
         with self.timeblockedsection('stdio'):
-            line = raw_input(' ')
+            line = pycompat.bytesinput(r' ')
         sys.stdin = oldin
         sys.stdout = oldout
 
diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py
--- a/mercurial/pycompat.py
+++ b/mercurial/pycompat.py
@@ -69,6 +69,23 @@ 
     stdout = sys.stdout.buffer
     stderr = sys.stderr.buffer
 
+    class noclosetextio(io.TextIOWrapper):
+        def __del__(self):
+            """Override __del__ so it doesn't close the underlying stream."""
+
+    def bytesinput(*args, **kwargs):
+        origs = {}
+        try:
+            for stream in 'stdin stdout stderr'.split():
+                s = getattr(sys, stream)
+                origs[stream] = s
+                if not isinstance(s, io.TextIOBase):
+                    setattr(sys, stream, noclosetextio(s))
+            return bytestr(input(*args, **kwargs))
+        finally:
+            for stream, restore in origs.items():
+                setattr(sys, stream, restore)
+
     # Since Python 3 converts argv to wchar_t type by Py_DecodeLocale() on Unix,
     # we can use os.fsencode() to get back bytes argv.
     #
@@ -303,6 +320,7 @@ 
     stdin = sys.stdin
     stdout = sys.stdout
     stderr = sys.stderr
+    bytesinput = raw_input
     if getattr(sys, 'argv', None) is not None:
         sysargv = sys.argv
     sysplatform = sys.platform
diff --git a/hgext/hgk.py b/hgext/hgk.py
--- a/hgext/hgk.py
+++ b/hgext/hgk.py
@@ -48,6 +48,7 @@ 
     commands,
     obsolete,
     patch,
+    pycompat,
     registrar,
     scmutil,
 )
@@ -96,7 +97,7 @@ 
     while True:
         if opts['stdin']:
             try:
-                line = raw_input().split(' ')
+                line = pycompat.bytesinput().split(' ')
                 node1 = line[0]
                 if len(line) > 1:
                     node2 = line[1]
@@ -177,7 +178,7 @@ 
     prefix = ""
     if opts['stdin']:
         try:
-            (type, r) = raw_input().split(' ')
+            (type, r) = pycompat.bytesinput().split(' ')
             prefix = "    "
         except EOFError:
             return
@@ -195,7 +196,7 @@ 
         catcommit(ui, repo, n, prefix)
         if opts['stdin']:
             try:
-                (type, r) = raw_input().split(' ')
+                (type, r) = pycompat.bytesinput().split(' ')
             except EOFError:
                 break
         else: