Patchwork [3,of,3] cmdserver: protect pipe server streams against corruption caused by direct io

login
register
mail settings
Submitter Yuya Nishihara
Date Nov. 15, 2014, 10:34 a.m.
Message ID <15b63d55d18198a1e262.1416047667@mimosa>
Download mbox | patch
Permalink /patch/6744/
State Accepted
Commit 69f86b9370350a939ed415f49e2ff110085d447d
Headers show

Comments

Yuya Nishihara - Nov. 15, 2014, 10:34 a.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1416027043 -32400
#      Sat Nov 15 13:50:43 2014 +0900
# Node ID 15b63d55d18198a1e2621c79419b7b1f66a8875e
# Parent  d6e1a0cd0da71ebd777d2f35a52b6390ca3915e8
cmdserver: protect pipe server streams against corruption caused by direct io

Because pipe-mode server uses stdio as IPC channel, other modules should not
touch stdio directly and use ui instead.  However, this strategy is brittle
because several Python functions read and write stdio implicitly.

    print 'hello'  # should use ui.write()
    # => ch = 'h', size = 1701604463 'ello', data = '\n'

This patch adds protection for such mistakes.  Both stdio files and low-level
file descriptors are redirected to /dev/null while command server uses them.
Gregory Szorc - Nov. 15, 2014, 4:40 p.m.
On 11/15/14 2:34 AM, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1416027043 -32400
> #      Sat Nov 15 13:50:43 2014 +0900
> # Node ID 15b63d55d18198a1e2621c79419b7b1f66a8875e
> # Parent  d6e1a0cd0da71ebd777d2f35a52b6390ca3915e8
> cmdserver: protect pipe server streams against corruption caused by direct io
>
> Because pipe-mode server uses stdio as IPC channel, other modules should not
> touch stdio directly and use ui instead.  However, this strategy is brittle
> because several Python functions read and write stdio implicitly.
>
>      print 'hello'  # should use ui.write()
>      # => ch = 'h', size = 1701604463 'ello', data = '\n'
>
> This patch adds protection for such mistakes.  Both stdio files and low-level
> file descriptors are redirected to /dev/null while command server uses them.

Did you consider installing a wrapper object into sys.std* that converts 
raw I/O into the proper format and having the pipe server and ui write 
into the "real" fds? That would seem like a safer and more robust 
approach, especially for people with extensions using 3rd party 
libraries that may not send all stdio through Mercurial APIs.
Yuya Nishihara - Nov. 16, 2014, 3:02 a.m.
On Sat, 15 Nov 2014 08:40:15 -0800, Gregory Szorc wrote:
> On 11/15/14 2:34 AM, Yuya Nishihara wrote:
> > # HG changeset patch
> > # User Yuya Nishihara <yuya@tcha.org>
> > # Date 1416027043 -32400
> > #      Sat Nov 15 13:50:43 2014 +0900
> > # Node ID 15b63d55d18198a1e2621c79419b7b1f66a8875e
> > # Parent  d6e1a0cd0da71ebd777d2f35a52b6390ca3915e8
> > cmdserver: protect pipe server streams against corruption caused by direct io
> >
> > Because pipe-mode server uses stdio as IPC channel, other modules should not
> > touch stdio directly and use ui instead.  However, this strategy is brittle
> > because several Python functions read and write stdio implicitly.
> >
> >      print 'hello'  # should use ui.write()
> >      # => ch = 'h', size = 1701604463 'ello', data = '\n'
> >
> > This patch adds protection for such mistakes.  Both stdio files and low-level
> > file descriptors are redirected to /dev/null while command server uses them.
> 
> Did you consider installing a wrapper object into sys.std* that converts 
> raw I/O into the proper format and having the pipe server and ui write 
> into the "real" fds? That would seem like a safer and more robust 
> approach, especially for people with extensions using 3rd party 
> libraries that may not send all stdio through Mercurial APIs.

sys.std* wrapper is useless if underlying libraries use Std-C functions [1].
Also, it could cause trouble if they expect that sys.std* are real file
objects.

 [1]: https://docs.python.org/2.7/library/sys.html#sys.stdin

Mercurial extensions should conform to the rule of Mercurial internal API.
This patch is the last ditch to avoid corruption of command-server protocol
in case they are broken.

Regards,
Matt Mackall - Nov. 17, 2014, 11:22 p.m.
On Sat, 2014-11-15 at 19:34 +0900, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1416027043 -32400
> #      Sat Nov 15 13:50:43 2014 +0900
> # Node ID 15b63d55d18198a1e2621c79419b7b1f66a8875e
> # Parent  d6e1a0cd0da71ebd777d2f35a52b6390ca3915e8
> cmdserver: protect pipe server streams against corruption caused by direct io

These are queued for default, thanks.

Patch

diff --git a/mercurial/commandserver.py b/mercurial/commandserver.py
--- a/mercurial/commandserver.py
+++ b/mercurial/commandserver.py
@@ -7,7 +7,7 @@ 
 
 from i18n import _
 import struct
-import os, errno, traceback, SocketServer
+import sys, os, errno, traceback, SocketServer
 import dispatch, encoding, util
 
 logfile = None
@@ -248,6 +248,29 @@  class server(object):
 
         return 0
 
+def _protectio(ui):
+    """ duplicates streams and redirect original to null if ui uses stdio """
+    ui.flush()
+    newfiles = []
+    nullfd = os.open(os.devnull, os.O_RDWR)
+    for f, sysf, mode in [(ui.fin, sys.stdin, 'rb'),
+                          (ui.fout, sys.stdout, 'wb')]:
+        if f is sysf:
+            newfd = os.dup(f.fileno())
+            os.dup2(nullfd, f.fileno())
+            f = os.fdopen(newfd, mode)
+        newfiles.append(f)
+    os.close(nullfd)
+    return tuple(newfiles)
+
+def _restoreio(ui, fin, fout):
+    """ restores streams from duplicated ones """
+    ui.flush()
+    for f, uif in [(fin, ui.fin), (fout, ui.fout)]:
+        if f is not uif:
+            os.dup2(f.fileno(), uif.fileno())
+            f.close()
+
 class pipeservice(object):
     def __init__(self, ui, repo, opts):
         self.ui = ui
@@ -258,8 +281,14 @@  class pipeservice(object):
 
     def run(self):
         ui = self.ui
-        sv = server(ui, self.repo, ui.fin, ui.fout)
-        return sv.serve()
+        # redirect stdio to null device so that broken extensions or in-process
+        # hooks will never cause corruption of channel protocol.
+        fin, fout = _protectio(ui)
+        try:
+            sv = server(ui, self.repo, fin, fout)
+            return sv.serve()
+        finally:
+            _restoreio(ui, fin, fout)
 
 class _requesthandler(SocketServer.StreamRequestHandler):
     def handle(self):
diff --git a/tests/test-commandserver.t b/tests/test-commandserver.t
--- a/tests/test-commandserver.t
+++ b/tests/test-commandserver.t
@@ -492,6 +492,7 @@  check that local configs for the cached 
   foo
 
   $ cat <<EOF > dbgui.py
+  > import os, sys
   > from mercurial import cmdutil, commands
   > cmdtable = {}
   > command = cmdutil.command(cmdtable)
@@ -501,6 +502,14 @@  check that local configs for the cached 
   > @command("debugprompt", norepo=True)
   > def debugprompt(ui):
   >     ui.write("%s\\n" % ui.prompt("prompt:"))
+  > @command("debugreadstdin", norepo=True)
+  > def debugreadstdin(ui):
+  >     ui.write("read: %r\n" % sys.stdin.read(1))
+  > @command("debugwritestdout", norepo=True)
+  > def debugwritestdout(ui):
+  >     os.write(1, "low-level stdout fd and\n")
+  >     sys.stdout.write("stdout should be redirected to /dev/null\n")
+  >     sys.stdout.flush()
   > EOF
   $ cat <<EOF >> .hg/hgrc
   > [extensions]
@@ -518,10 +527,15 @@  check that local configs for the cached 
   ...     runcommand(server, ['debugprompt', '--config',
   ...                         'ui.interactive=True'],
   ...                input=cStringIO.StringIO('5678\n'))
+  ...     runcommand(server, ['debugreadstdin'])
+  ...     runcommand(server, ['debugwritestdout'])
   *** runcommand debuggetpass --config ui.interactive=True
   password: 1234
   *** runcommand debugprompt --config ui.interactive=True
   prompt: 5678
+  *** runcommand debugreadstdin
+  read: ''
+  *** runcommand debugwritestdout
 
 
 run commandserver in commandserver, which is silly but should work: