Patchwork [V2] logtoprocess: new experimental extension

login
register
mail settings
Submitter Martijn Pieters
Date April 5, 2016, 5:31 p.m.
Message ID <ee2dd1a3f2e15b92b4d7.1459877468@mjpieters-mbp>
Download mbox | patch
Permalink /patch/14387/
State Accepted
Commit a368da441b322c01b99a6c82c3f2442e5691a76a
Headers show

Comments

Martijn Pieters - April 5, 2016, 5:31 p.m.
# HG changeset patch
# User Martijn Pieters <mjpieters@fb.com>
# Date 1459874399 -3600
#      Tue Apr 05 17:39:59 2016 +0100
# Node ID ee2dd1a3f2e15b92b4d7f525443e442c08399ec1
# Parent  1e97bcbb87767e1a96909900595a0b6a7791407b
logtoprocess: new experimental extension

This extension lets you direct specific ui.log() information to shell
commands, spawned in the background. This can be used to, say, capture
timings, commandfailure tracebacks, etc., making the output available
to other command-line tools.
Pierre-Yves David - April 5, 2016, 5:42 p.m.
On 04/05/2016 10:31 AM, Martijn Pieters wrote:
> # HG changeset patch
> # User Martijn Pieters <mjpieters@fb.com>
> # Date 1459874399 -3600
> #      Tue Apr 05 17:39:59 2016 +0100
> # Node ID ee2dd1a3f2e15b92b4d7f525443e442c08399ec1
> # Parent  1e97bcbb87767e1a96909900595a0b6a7791407b
> logtoprocess: new experimental extension

We need an entry on the wiki page as a first step:

https://www.mercurial-scm.org/wiki/ExperimentalExtensionsPlan#Current_Candidate_extensions
Augie Fackler - April 5, 2016, 6:10 p.m.
On Tue, Apr 05, 2016 at 06:31:08PM +0100, Martijn Pieters wrote:
> # HG changeset patch
> # User Martijn Pieters <mjpieters@fb.com>
> # Date 1459874399 -3600
> #      Tue Apr 05 17:39:59 2016 +0100
> # Node ID ee2dd1a3f2e15b92b4d7f525443e442c08399ec1
> # Parent  1e97bcbb87767e1a96909900595a0b6a7791407b
> logtoprocess: new experimental extension

OOC, is this part of the grand blackbox-replacement plan that I
vaguely remember from the sprint?

(haven't looked closely at this yet)


>
> This extension lets you direct specific ui.log() information to shell
> commands, spawned in the background. This can be used to, say, capture
> timings, commandfailure tracebacks, etc., making the output available
> to other command-line tools.
>
> diff --git a/hgext/logtoprocess.py b/hgext/logtoprocess.py
> new file mode 100644
> --- /dev/null
> +++ b/hgext/logtoprocess.py
> @@ -0,0 +1,127 @@
> +# logtoprocess.py - send ui.log() data to a subprocess
> +#
> +# Copyright 2016 Facebook, Inc.
> +#
> +# This software may be used and distributed according to the terms of the
> +# GNU General Public License version 2 or any later version.
> +"""Send ui.log() data to a subprocess (EXPERIMENTAL)
> +
> +This extension lets you specify a shell command per ui.log() event,
> +sending all remaining arguments to as environment variables to that command.
> +
> +Each positional argument to the method results in a `MSG[N]` key in the
> +environment, starting at 1 (so `MSG1`, `MSG2`, etc.). Each keyword argument
> +is set as a `OPT_UPPERCASE_KEY` variable (so the key is uppercased, and
> +prefixed with `OPT_`). The original event name is passed in the `EVENT`
> +environment variable, and the process ID of mercurial is given in `HGPID`.
> +
> +So given a call `ui.log('foo', 'bar', 'baz', spam='eggs'), a script configured
> +for the `foo` event can expect an environment with `MSG1=bar`, `MSG2=baz`, and
> +`OPT_SPAM=eggs`.
> +
> +Scripts are configured in the `[logtoprocess]` section, each key an event name.
> +For example::
> +
> +  [logtoprocess]
> +  commandexception = echo "$MSG2$MSG3" > /var/log/mercurial_exceptions.log
> +
> +would log the warning message and traceback of any failed command dispatch.
> +
> +Scripts are run asychronously as detached daemon processes; mercurial will
> +not ensure that they exit cleanly.
> +
> +"""
> +
> +import os
> +import platform
> +import subprocess
> +import sys
> +
> +from itertools import chain
> +
> +# Note for extension authors: ONLY specify testedwith = 'internal' for
> +# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
> +# be specifying the version(s) of Mercurial they are tested with, or
> +# leave the attribute unspecified.
> +testedwith = 'internal'
> +
> +def uisetup(ui):
> +    if platform.system() == 'Windows':
> +        # no fork on Windows, but we can create a detached process
> +        # https://msdn.microsoft.com/en-us/library/windows/desktop/ms684863(v=vs.85).aspx
> +        # No stdlib constant exists for this value
> +        DETACHED_PROCESS = 0x00000008
> +        _creationflags = DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
> +
> +        def runshellcommand(script, env):
> +            # we can't use close_fds *and* redirect stdin. I'm not sure that we
> +            # need to because the detached process has no console connection.
> +            subprocess.Popen(
> +                script, shell=True, env=env, close_fds=True,
> +                creationflags=_creationflags)
> +    else:
> +        def runshellcommand(script, env):
> +            # double-fork to completely detach from the parent process
> +            # based on http://code.activestate.com/recipes/278731
> +            pid = os.fork()
> +            if pid:
> +                # parent
> +                return
> +            # subprocess.Popen() forks again, all we need to add is
> +            # flag the new process as a new session.
> +            if sys.version_info < (3, 2):
> +                newsession = {'preexec_fn': os.setsid}
> +            else:
> +                newsession = {'start_new_session': True}
> +            try:
> +                # connect stdin to devnull to make sure the subprocess can't
> +                # muck up that stream for mercurial.
> +                subprocess.Popen(
> +                    script, shell=True, stdin=open(os.devnull, 'r'), env=env,
> +                    close_fds=True, **newsession)
> +            finally:
> +                # mission accomplished, this child needs to exit and not
> +                # continue the hg process here.
> +                os._exit(0)
> +
> +    class logtoprocessui(ui.__class__):
> +        def log(self, event, *msg, **opts):
> +            """Map log events to external commands
> +
> +            Arguments are passed on as environment variables.
> +
> +            """
> +            script = ui.config('logtoprocess', event)
> +            if script:
> +                if msg:
> +                    # try to format the log message given the remaining
> +                    # arguments
> +                    try:
> +                        # Python string formatting with % either uses a
> +                        # dictionary *or* tuple, but not both. If we have
> +                        # keyword options, assume we need a mapping.
> +                        formatted = msg[0] % (opts or msg[1:])
> +                    except (TypeError, KeyError):
> +                        # Failed to apply the arguments, ignore
> +                        formatted = msg[0]
> +                    messages = (formatted,) + msg[1:]
> +                else:
> +                    messages = msg
> +                # positional arguments are listed as MSG[N] keys in the
> +                # environment
> +                msgpairs = (
> +                    ('MSG{0:d}'.format(i), str(m))
> +                    for i, m in enumerate(messages, 1))
> +                # keyword arguments get prefixed with OPT_ and uppercased
> +                optpairs = (
> +                    ('OPT_{0}'.format(key.upper()), str(value))
> +                    for key, value in opts.iteritems())
> +                env = dict(chain(os.environ.items(), msgpairs, optpairs),
> +                           EVENT=event, HGPID=str(os.getpid()))
> +                # Connect stdin to /dev/null to prevent child processes messing
> +                # with mercurial's stdin.
> +                runshellcommand(script, env)
> +            return super(logtoprocessui, self).log(event, *msg, **opts)
> +
> +    # Replace the class for this instance and all clones created from it:
> +    ui.__class__ = logtoprocessui
> diff --git a/tests/test-logtoprocess.t b/tests/test-logtoprocess.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-logtoprocess.t
> @@ -0,0 +1,54 @@
> +Test if logtoprocess correctly captures command-related log calls.
> +
> +  $ hg init
> +  $ cat > $TESTTMP/foocommand.py << EOF
> +  > from mercurial import cmdutil
> +  > from time import sleep
> +  > cmdtable = {}
> +  > command = cmdutil.command(cmdtable)
> +  > @command('foo', [])
> +  > def foo(ui, repo):
> +  >     ui.log('foo', 'a message: %(bar)s\n', bar='spam')
> +  > EOF
> +  $ cat >> $HGRCPATH << EOF
> +  > [extensions]
> +  > logtoprocess=
> +  > foocommand=$TESTTMP/foocommand.py
> +  > [logtoprocess]
> +  > command=echo 'logtoprocess command output:';
> +  >     echo "\$EVENT";
> +  >     echo "\$MSG1";
> +  >     echo "\$MSG2"
> +  > commandfinish=echo 'logtoprocess commandfinish output:';
> +  >     echo "\$EVENT";
> +  >     echo "\$MSG1";
> +  >     echo "\$MSG2";
> +  >     echo "\$MSG3"
> +  > foo=echo 'logtoprocess foo output:';
> +  >     echo "\$EVENT";
> +  >     echo "\$MSG1";
> +  >     echo "\$OPT_BAR"
> +  > EOF
> +
> +Running a command triggers both a ui.log('command') and a
> +ui.log('commandfinish') call. The foo command also uses ui.log.
> +
> +Use head to ensure we wait for all lines to be produced, and sort to avoid
> +ordering issues between the various processes we spawn:
> +  $ hg foo | head -n 17 | sort
> +
> +
> +
> +  0
> +  a message: spam
> +  command
> +  commandfinish
> +  foo
> +  foo
> +  foo
> +  foo
> +  foo exited 0 after * seconds (glob)
> +  logtoprocess command output:
> +  logtoprocess commandfinish output:
> +  logtoprocess foo output:
> +  spam
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Martijn Pieters - April 5, 2016, 6:14 p.m.
On 5 April 2016 at 18:42, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
> We need an entry on the wiki page as a first step:
>
> https://www.mercurial-scm.org/wiki/ExperimentalExtensionsPlan#Current_Candidate_extensions

Done.
Martijn Pieters - April 5, 2016, 6:16 p.m.
On 5 April 2016 at 19:10, Augie Fackler <raf@durin42.com> wrote:
> OOC, is this part of the grand blackbox-replacement plan that I
> vaguely remember from the sprint?
>
> (haven't looked closely at this yet)

This is one part of our internal error logging extension that I ripped
apart into constituent parts. I have an email in draft to talk more
about this.

I deliberately kept this separate from blackbox as I felt the goals
where different, but if people want to see this as part of blackbox
anyway then we can discuss that option.
Jun Wu - April 5, 2016, 6:35 p.m.
On 04/05/2016 07:10 PM, Augie Fackler wrote:
> OOC, is this part of the grand blackbox-replacement plan that I
> vaguely remember from the sprint?

Let me help you recall :)

At the spring you showed some interest about the errorredirect extension that 
allows a custom external script to send the crash log remotely. I said I will
upstream the extension.

Soon after, we found we actually want a more general extension that "pipe"s
different kinds of logs to external scripts. And then:

Mar 21 16:49:52 <junw>	durin42, about the errorredirect extension, we are 
making it more general so it will be basically allowing hooks for different 
"event"s of ui.log
Mar 21 16:53:09 <junw>	so we can put things like ui.log(service='remotelog', 
...) to get more idea about performance
Mar 21 16:53:45 <junw>	Martijn is working on this (making errorredirect more 
general)
Mar 21 17:21:23 <durin42>	junw: awesome
Augie Fackler - April 5, 2016, 6:36 p.m.
On Tue, Apr 5, 2016 at 2:35 PM, Jun Wu <quark@fb.com> wrote:
> On 04/05/2016 07:10 PM, Augie Fackler wrote:
>>
>> OOC, is this part of the grand blackbox-replacement plan that I
>> vaguely remember from the sprint?
>
>
> Let me help you recall :)
>
> At the spring you showed some interest about the errorredirect extension
> that allows a custom external script to send the crash log remotely. I said
> I will
> upstream the extension.
>
> Soon after, we found we actually want a more general extension that "pipe"s
> different kinds of logs to external scripts. And then:
>
> Mar 21 16:49:52 <junw>  durin42, about the errorredirect extension, we are
> making it more general so it will be basically allowing hooks for different
> "event"s of ui.log
> Mar 21 16:53:09 <junw>  so we can put things like
> ui.log(service='remotelog', ...) to get more idea about performance
> Mar 21 16:53:45 <junw>  Martijn is working on this (making errorredirect
> more general)
> Mar 21 17:21:23 <durin42>       junw: awesome

Awesome. Thanks for doing my homework for me. :)
Matt Mackall - April 14, 2016, 8:40 p.m.
On Tue, 2016-04-05 at 18:31 +0100, Martijn Pieters wrote:
> # HG changeset patch
> # User Martijn Pieters <mjpieters@fb.com>
> # Date 1459874399 -3600
> #      Tue Apr 05 17:39:59 2016 +0100
> # Node ID ee2dd1a3f2e15b92b4d7f525443e442c08399ec1
> # Parent  1e97bcbb87767e1a96909900595a0b6a7791407b
> logtoprocess: new experimental extension

Queued for default, thanks. Check-code says hi.

-- 
Mathematics is the supreme nostalgia of our time.
Matt Harbison - April 16, 2016, 3:43 a.m.
On Tue, 05 Apr 2016 13:31:08 -0400, Martijn Pieters <mj@zopatista.com>  
wrote:

> # HG changeset patch
> # User Martijn Pieters <mjpieters@fb.com>
> # Date 1459874399 -3600
> #      Tue Apr 05 17:39:59 2016 +0100
> # Node ID ee2dd1a3f2e15b92b4d7f525443e442c08399ec1
> # Parent  1e97bcbb87767e1a96909900595a0b6a7791407b
> logtoprocess: new experimental extension
>
> This extension lets you direct specific ui.log() information to shell
> commands, spawned in the background. This can be used to, say, capture
> timings, commandfailure tracebacks, etc., making the output available
> to other command-line tools.
>
...

> +
> +Running a command triggers both a ui.log('command') and a
> +ui.log('commandfinish') call. The foo command also uses ui.log.
> +
> +Use head to ensure we wait for all lines to be produced, and sort to  
> avoid
> +ordering issues between the various processes we spawn:
> +  $ hg foo | head -n 17 | sort
> +
> +
> +
> +  0
> +  a message: spam
> +  command
> +  commandfinish
> +  foo
> +  foo
> +  foo
> +  foo
> +  foo exited 0 after * seconds (glob)
> +  logtoprocess command output:
> +  logtoprocess commandfinish output:
> +  logtoprocess foo output:
> +  spam

FWIW, I'm not seeing any output from this command on Windows 7.  The exit  
code is still 0.
Martijn Pieters - April 18, 2016, 10:32 a.m.
On 16 April 2016 at 04:43, Matt Harbison <mharbison72@gmail.com> wrote:
> FWIW, I'm not seeing any output from this command on Windows 7.  The exit
> code is still 0.

Ah, yes, that'd be because we explicitly disconnect the subprocess
entirely on Windows (there is no forking option). Suggestions on how
to improve on this are most welcome!

Patch

diff --git a/hgext/logtoprocess.py b/hgext/logtoprocess.py
new file mode 100644
--- /dev/null
+++ b/hgext/logtoprocess.py
@@ -0,0 +1,127 @@ 
+# logtoprocess.py - send ui.log() data to a subprocess
+#
+# Copyright 2016 Facebook, Inc.
+#
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+"""Send ui.log() data to a subprocess (EXPERIMENTAL)
+
+This extension lets you specify a shell command per ui.log() event,
+sending all remaining arguments to as environment variables to that command.
+
+Each positional argument to the method results in a `MSG[N]` key in the
+environment, starting at 1 (so `MSG1`, `MSG2`, etc.). Each keyword argument
+is set as a `OPT_UPPERCASE_KEY` variable (so the key is uppercased, and
+prefixed with `OPT_`). The original event name is passed in the `EVENT`
+environment variable, and the process ID of mercurial is given in `HGPID`.
+
+So given a call `ui.log('foo', 'bar', 'baz', spam='eggs'), a script configured
+for the `foo` event can expect an environment with `MSG1=bar`, `MSG2=baz`, and
+`OPT_SPAM=eggs`.
+
+Scripts are configured in the `[logtoprocess]` section, each key an event name.
+For example::
+
+  [logtoprocess]
+  commandexception = echo "$MSG2$MSG3" > /var/log/mercurial_exceptions.log
+
+would log the warning message and traceback of any failed command dispatch.
+
+Scripts are run asychronously as detached daemon processes; mercurial will
+not ensure that they exit cleanly.
+
+"""
+
+import os
+import platform
+import subprocess
+import sys
+
+from itertools import chain
+
+# Note for extension authors: ONLY specify testedwith = 'internal' for
+# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
+# be specifying the version(s) of Mercurial they are tested with, or
+# leave the attribute unspecified.
+testedwith = 'internal'
+
+def uisetup(ui):
+    if platform.system() == 'Windows':
+        # no fork on Windows, but we can create a detached process
+        # https://msdn.microsoft.com/en-us/library/windows/desktop/ms684863(v=vs.85).aspx
+        # No stdlib constant exists for this value
+        DETACHED_PROCESS = 0x00000008
+        _creationflags = DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
+
+        def runshellcommand(script, env):
+            # we can't use close_fds *and* redirect stdin. I'm not sure that we
+            # need to because the detached process has no console connection.
+            subprocess.Popen(
+                script, shell=True, env=env, close_fds=True,
+                creationflags=_creationflags)
+    else:
+        def runshellcommand(script, env):
+            # double-fork to completely detach from the parent process
+            # based on http://code.activestate.com/recipes/278731
+            pid = os.fork()
+            if pid:
+                # parent
+                return
+            # subprocess.Popen() forks again, all we need to add is
+            # flag the new process as a new session.
+            if sys.version_info < (3, 2):
+                newsession = {'preexec_fn': os.setsid}
+            else:
+                newsession = {'start_new_session': True}
+            try:
+                # connect stdin to devnull to make sure the subprocess can't
+                # muck up that stream for mercurial.
+                subprocess.Popen(
+                    script, shell=True, stdin=open(os.devnull, 'r'), env=env,
+                    close_fds=True, **newsession)
+            finally:
+                # mission accomplished, this child needs to exit and not
+                # continue the hg process here.
+                os._exit(0)
+
+    class logtoprocessui(ui.__class__):
+        def log(self, event, *msg, **opts):
+            """Map log events to external commands
+
+            Arguments are passed on as environment variables.
+
+            """
+            script = ui.config('logtoprocess', event)
+            if script:
+                if msg:
+                    # try to format the log message given the remaining
+                    # arguments
+                    try:
+                        # Python string formatting with % either uses a
+                        # dictionary *or* tuple, but not both. If we have
+                        # keyword options, assume we need a mapping.
+                        formatted = msg[0] % (opts or msg[1:])
+                    except (TypeError, KeyError):
+                        # Failed to apply the arguments, ignore
+                        formatted = msg[0]
+                    messages = (formatted,) + msg[1:]
+                else:
+                    messages = msg
+                # positional arguments are listed as MSG[N] keys in the
+                # environment
+                msgpairs = (
+                    ('MSG{0:d}'.format(i), str(m))
+                    for i, m in enumerate(messages, 1))
+                # keyword arguments get prefixed with OPT_ and uppercased
+                optpairs = (
+                    ('OPT_{0}'.format(key.upper()), str(value))
+                    for key, value in opts.iteritems())
+                env = dict(chain(os.environ.items(), msgpairs, optpairs),
+                           EVENT=event, HGPID=str(os.getpid()))
+                # Connect stdin to /dev/null to prevent child processes messing
+                # with mercurial's stdin.
+                runshellcommand(script, env)
+            return super(logtoprocessui, self).log(event, *msg, **opts)
+
+    # Replace the class for this instance and all clones created from it:
+    ui.__class__ = logtoprocessui
diff --git a/tests/test-logtoprocess.t b/tests/test-logtoprocess.t
new file mode 100644
--- /dev/null
+++ b/tests/test-logtoprocess.t
@@ -0,0 +1,54 @@ 
+Test if logtoprocess correctly captures command-related log calls.
+
+  $ hg init
+  $ cat > $TESTTMP/foocommand.py << EOF
+  > from mercurial import cmdutil
+  > from time import sleep
+  > cmdtable = {}
+  > command = cmdutil.command(cmdtable)
+  > @command('foo', [])
+  > def foo(ui, repo):
+  >     ui.log('foo', 'a message: %(bar)s\n', bar='spam')
+  > EOF
+  $ cat >> $HGRCPATH << EOF
+  > [extensions]
+  > logtoprocess=
+  > foocommand=$TESTTMP/foocommand.py
+  > [logtoprocess]
+  > command=echo 'logtoprocess command output:';
+  >     echo "\$EVENT";
+  >     echo "\$MSG1";
+  >     echo "\$MSG2"
+  > commandfinish=echo 'logtoprocess commandfinish output:';
+  >     echo "\$EVENT";
+  >     echo "\$MSG1";
+  >     echo "\$MSG2";
+  >     echo "\$MSG3"
+  > foo=echo 'logtoprocess foo output:';
+  >     echo "\$EVENT";
+  >     echo "\$MSG1";
+  >     echo "\$OPT_BAR"
+  > EOF
+
+Running a command triggers both a ui.log('command') and a
+ui.log('commandfinish') call. The foo command also uses ui.log.
+
+Use head to ensure we wait for all lines to be produced, and sort to avoid 
+ordering issues between the various processes we spawn:
+  $ hg foo | head -n 17 | sort
+  
+  
+  
+  0
+  a message: spam
+  command
+  commandfinish
+  foo
+  foo
+  foo
+  foo
+  foo exited 0 after * seconds (glob)
+  logtoprocess command output:
+  logtoprocess commandfinish output:
+  logtoprocess foo output:
+  spam