Patchwork [2,of,5,STABLE,V2] worker: wait worker pid explicitly

Submitter Jun Wu
Date July 28, 2016, 8:50 p.m.
Message ID <644e4dfd02fa16e77e55.1469739005@x1c>
Permalink /patch/16000/
State Rejected
Delegated to: Yuya Nishihara

Comments

Jun Wu - July 28, 2016, 8:50 p.m.
# HG changeset patch
# User Jun Wu <quark@fb.com>
# Date 1469735480 -3600
#      Thu Jul 28 20:51:20 2016 +0100
# Node ID 644e4dfd02fa16e77e550113cafe4fca3d0a9c69
# Parent  391a26627ecf994c767c01125843184cfed49de4
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r 644e4dfd02fa
worker: wait worker pid explicitly

Before this patch, waitforworkers uses os.wait() to collect child workers, and
waits for only len(pids) processes. Because os.wait() returns the status of
whichever child exits first, this can cause serious issues if other code
spawns new processes and does not reap them:

1. worker.py may get the wrong exit code and kill innocent workers.
2. worker.py may continue without waiting for all workers to complete.

This patch fixes the issue by using os.waitpid() to wait for each worker pid
explicitly.

However, this patch introduces a new issue: a worker failure may not be
handled immediately, because the loop blocks on each pid in list order. The
issue will be addressed in the next patches.
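
The delayed failure handling described above can be reproduced outside
Mercurial. The following is only a sketch, not code from worker.py: the
spawn helper, the exit codes and the delays are all made up for
illustration, and it assumes a POSIX system where os.fork() is available.
A healthy worker that is still busy sits first in the list, so the blocking
waitpid on it hides the failure of the second worker until the first one
finishes.

```python
import os
import time


def spawn(exitcode, delay):
    """Hypothetical helper: fork a child that sleeps, then exits with exitcode."""
    pid = os.fork()
    if pid == 0:
        # Child process: pretend to work, then exit without running cleanup.
        time.sleep(delay)
        os._exit(exitcode)
    return pid


slow_ok = spawn(0, 0.5)    # healthy worker that is still busy
fast_fail = spawn(1, 0.0)  # worker that fails almost at once

start = time.monotonic()
observed = []
for pid in [slow_ok, fast_fail]:
    # Blocks until this particular pid exits, even if a later pid in the
    # list has already failed.
    _, status = os.waitpid(pid, 0)
    observed.append((pid, os.WEXITSTATUS(status), time.monotonic() - start))

# Seconds elapsed before the loop noticed the failing worker.
failure_seen_after = observed[1][2]
```

In this sketch the failure is only observed after roughly the runtime of the
slow worker, which is the window in which the other workers keep running
instead of being killed.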

Patch

diff --git a/mercurial/worker.py b/mercurial/worker.py
--- a/mercurial/worker.py
+++ b/mercurial/worker.py
@@ -95,8 +95,8 @@  def _posixworker(ui, func, staticargs, a
                 if err.errno != errno.ESRCH:
                     raise
     def waitforworkers():
-        for _pid in pids:
-            st = _exitstatus(os.wait()[1])
+        for pid in pids:
+            st = _exitstatus(os.waitpid(pid, 0)[1])
             if st and not problem[0]:
                 problem[0] = st
                 killworkers()
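
The effect of the patched loop can be tried in isolation. This is a minimal
sketch under stated assumptions, not the Mercurial implementation:
exitstatus is a hypothetical stand-in for worker.py's _exitstatus, spawn
and the chosen exit codes are invented for the example, and os.fork()
requires a POSIX system. An unrelated child that exits with a nonzero code
is started alongside two workers; waiting on each worker pid explicitly
never picks up the unrelated child's status.

```python
import os
import time


def exitstatus(status):
    """Hypothetical stand-in for _exitstatus: decode a raw wait status."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return None


def spawn(exitcode, delay=0.0):
    """Hypothetical helper: fork a child that sleeps, then exits with exitcode."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(exitcode)
    return pid


# A child started by "other code" that nobody reaps; it exits with code 3.
stray = spawn(3)
# Two well-behaved workers that exit with code 0.
workers = [spawn(0, 0.2), spawn(0, 0.1)]

results = {}
for pid in workers:
    # Same shape as the patched loop: wait for this exact pid.
    results[pid] = exitstatus(os.waitpid(pid, 0)[1])

# The stray child was not consumed by the loop and can still be reaped here.
stray_pid, stray_status = os.waitpid(stray, 0)
```

With os.wait() in place of os.waitpid(pid, 0), the first call could return
the stray child's status instead, which is the situation the commit message
describes.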