Patchwork [6,of,8,V4] worker: make killworkers safe to be called multiple times

login
register
mail settings
Submitter Jun Wu
Date Nov. 12, 2016, 3:11 a.m.
Message ID <596cee9d5c12dcfa2d27.1478920297@x1c>
Download mbox | patch
Permalink /patch/17512/
State Changes Requested
Headers show

Comments

Jun Wu - Nov. 12, 2016, 3:11 a.m.
# HG changeset patch
# User Jun Wu <quark@fb.com>
# Date 1478919468 0
#      Sat Nov 12 02:57:48 2016 +0000
# Node ID 596cee9d5c12dcfa2d272b5441d5060a26a6440c
# Parent  ca128e326027530210f6509e3ec8b28b97bdce12
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r 596cee9d5c12
worker: make killworkers safe to be called multiple times

Previously, calling "killworkers" multiple times may send kill signals to
children multiple times.

This patch uses the property that "set.pop()" is an "atomic test and update"
operation at the Python code level (cannot be interrupted by a Python signal
handler). That property is true for both CPython and PyPy. So even if
"killworkers" are called multiple times, it will only send kill signal to a
child process at most once.

This is needed because "problem[0]" is not tested and set atomically:

    if st and not problem[0]:
        # at this time, another SIGCHILD happens, killworkers may run twice
        problem[0] = st
        killworkers()

Note: we can also use the fact that "if next(itertools.count())" is "atomic"
at the Python code level, to make sure "problem[0]" is tested and set
atomically and "killworkers" only runs once:

    problemcount = itertools.count()
    if st and next(problemcount) == 0:
        problem[0] = st
        killworkers()

However, we cannot do an "atomic" operation around "waitpid" so we may send
an unnecessary kill signal to a waited child. I don't have a good idea on
this. Hopefully it won't cause anything serious.
Yuya Nishihara - Nov. 14, 2016, 1:03 p.m.
On Sat, 12 Nov 2016 03:11:37 +0000, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark@fb.com>
> # Date 1478919468 0
> #      Sat Nov 12 02:57:48 2016 +0000
> # Node ID 596cee9d5c12dcfa2d272b5441d5060a26a6440c
> # Parent  ca128e326027530210f6509e3ec8b28b97bdce12
> # Available At https://bitbucket.org/quark-zju/hg-draft
> #              hg pull https://bitbucket.org/quark-zju/hg-draft -r 596cee9d5c12
> worker: make killworkers safe to be called multiple times

[...]

> --- a/mercurial/worker.py
> +++ b/mercurial/worker.py
> @@ -89,5 +89,14 @@ def _posixworker(ui, func, staticargs, a
>      def killworkers():
>          # if one worker bails, there's no good reason to wait for the rest
> -        for p in pids:
> +        # note: use "while pids" instead of "for p in pids" as other code may
> +        # get run by a signal handler and change "pids" in the mean time.
> +        while pids:
> +            try:
> +                # note: set.pop() cannot be interrupted by a Python signal
> +                # handler - this is true for both cpython and pypy. we use this
> +                # property to make sure a child receive the signal once.
> +                p = pids.pop()
> +            except KeyError:
> +                break # pids is empty
>              try:
>                  os.kill(p, signal.SIGTERM)

The process "p" is about to be killed but removed from the pids set, which
would mean we can't wait killed processes at cleanup().

> @@ -109,5 +118,8 @@ def _posixworker(ui, func, staticargs, a
>                          raise
>              if p:
> -                pids.remove(p)
> +                try:
> +                    pids.remove(p)
> +                except KeyError: # in case p was poped by "killworkers"
> +                    pass

Patch

diff --git a/mercurial/worker.py b/mercurial/worker.py
--- a/mercurial/worker.py
+++ b/mercurial/worker.py
@@ -89,5 +89,14 @@  def _posixworker(ui, func, staticargs, a
     def killworkers():
         # if one worker bails, there's no good reason to wait for the rest
-        for p in pids:
+        # note: use "while pids" instead of "for p in pids" as other code may
+        # get run by a signal handler and change "pids" in the mean time.
+        while pids:
+            try:
+                # note: set.pop() cannot be interrupted by a Python signal
+                # handler - this is true for both cpython and pypy. we use this
+                # property to make sure a child receive the signal once.
+                p = pids.pop()
+            except KeyError:
+                break # pids is empty
             try:
                 os.kill(p, signal.SIGTERM)
@@ -109,5 +118,8 @@  def _posixworker(ui, func, staticargs, a
                         raise
             if p:
-                pids.remove(p)
+                try:
+                    pids.remove(p)
+                except KeyError: # in case p was poped by "killworkers"
+                    pass
                 st = _exitstatus(st)
             if st and not problem[0]: