Patchwork [1,of,8] hgk: stop using util.bytesinput() to read a single line from stdin

login
register
mail settings
Submitter Yuya Nishihara
Date March 9, 2018, 12:35 p.m.
Message ID <f7d9876d750e048b4c0e.1520598934@mimosa>
Download mbox | patch
Permalink /patch/29161/
State Accepted
Headers show

Comments

Yuya Nishihara - March 9, 2018, 12:35 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1520323525 21600
#      Tue Mar 06 02:05:25 2018 -0600
# Node ID f7d9876d750e048b4c0e0ec0682928e86a8e8ecb
# Parent  b434965f984eff168a7caaa239277b15729bd0b1
hgk: stop using util.bytesinput() to read a single line from stdin

Appears that the stdio here is an IPC channel between hg and hgk (tk)
processes, which shouldn't need a fancy readline support.
Gregory Szorc - March 9, 2018, 5:25 p.m.
On Fri, Mar 9, 2018 at 4:35 AM, Yuya Nishihara <yuya@tcha.org> wrote:

> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1520323525 21600
> #      Tue Mar 06 02:05:25 2018 -0600
> # Node ID f7d9876d750e048b4c0e0ec0682928e86a8e8ecb
> # Parent  b434965f984eff168a7caaa239277b15729bd0b1
> hgk: stop using util.bytesinput() to read a single line from stdin
>

Queued this series, thanks.

If you have this stdio stuff paged into your brain, you may want to look at
hook.py and what it is doing with stdio. Essentially, it is doing dup() and
dup2() to temporarily redirect stdout to stderr such that the wire protocol
can intercept output and forward it to the wire protocol client (or the
CLI). See hook.redirect(True) in the wire protocol server code and follow
the trail from there.

In the case of shell hooks, I believe those processes inherit the parent's
file descriptors. Which after hook.py mucks with the file descriptors, is
actually the stderr stream.

I question the appropriateness of the approach. I think it would be better
to e.g. send ui.ferr to shell hooks and to temporarily muck with
sys.stdout/sys.stderr when running Python hooks. But this has implications
and I haven't thought it through. I'd *really* like to see us not have to
do the dup()/dup2() dance in the hooks because that is mucking with global
state and can make it hard to debug server processes.


>
> Appears that the stdio here is an IPC channel between hg and hgk (tk)
> processes, which shouldn't need a fancy readline support.
>
> diff --git a/hgext/hgk.py b/hgext/hgk.py
> --- a/hgext/hgk.py
> +++ b/hgext/hgk.py
> @@ -51,7 +51,6 @@ from mercurial import (
>      pycompat,
>      registrar,
>      scmutil,
> -    util,
>  )
>
>  cmdtable = {}
> @@ -105,15 +104,15 @@ def difftree(ui, repo, node1=None, node2
>
>      while True:
>          if opts[r'stdin']:
> -            try:
> -                line = util.bytesinput(ui.fin, ui.fout).split(' ')
> -                node1 = line[0]
> -                if len(line) > 1:
> -                    node2 = line[1]
> -                else:
> -                    node2 = None
> -            except EOFError:
> +            line = ui.fin.readline()
> +            if not line:
>                  break
> +            line = line.rstrip(pycompat.oslinesep).split(b' ')
> +            node1 = line[0]
> +            if len(line) > 1:
> +                node2 = line[1]
> +            else:
> +                node2 = None
>          node1 = repo.lookup(node1)
>          if node2:
>              node2 = repo.lookup(node2)
> @@ -186,12 +185,11 @@ def catfile(ui, repo, type=None, r=None,
>      #
>      prefix = ""
>      if opts[r'stdin']:
> -        try:
> -            (type, r) = util.bytesinput(ui.fin, ui.fout).split(' ')
> -            prefix = "    "
> -        except EOFError:
> +        line = ui.fin.readline()
> +        if not line:
>              return
> -
> +        (type, r) = line.rstrip(pycompat.oslinesep).split(b' ')
> +        prefix = "    "
>      else:
>          if not type or not r:
>              ui.warn(_("cat-file: type or revision not supplied\n"))
> @@ -204,10 +202,10 @@ def catfile(ui, repo, type=None, r=None,
>          n = repo.lookup(r)
>          catcommit(ui, repo, n, prefix)
>          if opts[r'stdin']:
> -            try:
> -                (type, r) = util.bytesinput(ui.fin, ui.fout).split(' ')
> -            except EOFError:
> +            line = ui.fin.readline()
> +            if not line:
>                  break
> +            (type, r) = line.rstrip(pycompat.oslinesep).split(b' ')
>          else:
>              break
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Yuya Nishihara - March 10, 2018, 7:49 a.m.
On Fri, 9 Mar 2018 09:25:10 -0800, Gregory Szorc wrote:
> If you have this stdio stuff paged into your brain, you may want to look at
> hook.py and what it is doing with stdio. Essentially, it is doing dup() and
> dup2() to temporarily redirect stdout to stderr such that the wire protocol
> can intercept output and forward it to the wire protocol client (or the
> CLI).

and hook output never be interleaved into the wire protocol channel.

> See hook.redirect(True) in the wire protocol server code and follow
> the trail from there.
> 
> In the case of shell hooks, I believe those processes inherit the parent's
> file descriptors. Which after hook.py mucks with the file descriptors, is
> actually the stderr stream.

I think these dup()s are no longer needed for shell hooks since a subprocess
output is sent to ui.fout which is actually ui.ferr in ssh session. They're
still valid for in-process hooks, though.

> I question the appropriateness of the approach. I think it would be better
> to e.g. send ui.ferr to shell hooks and to temporarily muck with
> sys.stdout/sys.stderr when running Python hooks. But this has implications
> and I haven't thought it through. I'd *really* like to see us not have to
> do the dup()/dup2() dance in the hooks because that is mucking with global
> state and can make it hard to debug server processes.

Yeah. I doubt if the current code would work well in threaded hgweb.

I think we can do a similar trick to commandserver, which basically nullifes
the global sys.stdin/stdout to protect IPC channels from being corrupted by
third-party extensions, and use ui.fin/fout thoroughly.

https://www.mercurial-scm.org/repo/hg/file/4.5.2/mercurial/commandserver.py#l334

Regarding the wire protocol and hooks, this means:

 - do dup()/dup2() business globally, not locally in hook.py
 - isolate wire-protocol streams from ui.fin/fout
   (ssh-wire: dup()ed stdin/stdout, ui.fin -> stdin -> null, stdout -> null,
    ui.fout -> ui.ferr -> stderr)
 - in-process hook must use ui instead of print() because print()s will
   be sent to /dev/null
 - shell hook output is redirected to stderr as long as ui.fout points to
   stderr (the stderr will be dup()ed to stdout after fork)

Patch

diff --git a/hgext/hgk.py b/hgext/hgk.py
--- a/hgext/hgk.py
+++ b/hgext/hgk.py
@@ -51,7 +51,6 @@  from mercurial import (
     pycompat,
     registrar,
     scmutil,
-    util,
 )
 
 cmdtable = {}
@@ -105,15 +104,15 @@  def difftree(ui, repo, node1=None, node2
 
     while True:
         if opts[r'stdin']:
-            try:
-                line = util.bytesinput(ui.fin, ui.fout).split(' ')
-                node1 = line[0]
-                if len(line) > 1:
-                    node2 = line[1]
-                else:
-                    node2 = None
-            except EOFError:
+            line = ui.fin.readline()
+            if not line:
                 break
+            line = line.rstrip(pycompat.oslinesep).split(b' ')
+            node1 = line[0]
+            if len(line) > 1:
+                node2 = line[1]
+            else:
+                node2 = None
         node1 = repo.lookup(node1)
         if node2:
             node2 = repo.lookup(node2)
@@ -186,12 +185,11 @@  def catfile(ui, repo, type=None, r=None,
     #
     prefix = ""
     if opts[r'stdin']:
-        try:
-            (type, r) = util.bytesinput(ui.fin, ui.fout).split(' ')
-            prefix = "    "
-        except EOFError:
+        line = ui.fin.readline()
+        if not line:
             return
-
+        (type, r) = line.rstrip(pycompat.oslinesep).split(b' ')
+        prefix = "    "
     else:
         if not type or not r:
             ui.warn(_("cat-file: type or revision not supplied\n"))
@@ -204,10 +202,10 @@  def catfile(ui, repo, type=None, r=None,
         n = repo.lookup(r)
         catcommit(ui, repo, n, prefix)
         if opts[r'stdin']:
-            try:
-                (type, r) = util.bytesinput(ui.fin, ui.fout).split(' ')
-            except EOFError:
+            line = ui.fin.readline()
+            if not line:
                 break
+            (type, r) = line.rstrip(pycompat.oslinesep).split(b' ')
         else:
             break