[4 of 6] plan9: prevent potential wait()

Submitter jas@corpus-callosum.com
Date Aug. 12, 2013, 11:01 p.m.
Message ID <b55c76e1f4c1726e2a53.1376348470@acme.buf.io>
Permalink /patch/2172/
State Changes Requested

Comments

jas@corpus-callosum.com - Aug. 12, 2013, 11:01 p.m.
# HG changeset patch
# User Jeff Sickel <jas@corpus-callosum.com>
# Date 1376347561 18000
#      Mon Aug 12 17:46:01 2013 -0500
# Branch stable
# Node ID b55c76e1f4c1726e2a537efa1b99c360626ac23d
# Parent  c92381647a5b63b85355bf62a60df8d13fc5f858
plan9: prevent potential wait()
Augie Fackler - Aug. 13, 2013, 8:58 p.m.
On Mon, Aug 12, 2013 at 06:01:10PM -0500, Jeff Sickel wrote:
> # HG changeset patch
> # User Jeff Sickel <jas@corpus-callosum.com>
> # Date 1376347561 18000
> #      Mon Aug 12 17:46:01 2013 -0500
> # Branch stable
> # Node ID b55c76e1f4c1726e2a537efa1b99c360626ac23d
> # Parent  c92381647a5b63b85355bf62a60df8d13fc5f858
> plan9: prevent potential wait()

Not sure I understand what the value of this is - how does this wait()
for you, rather than being a (trivial, stupidly cheap) arithmetic
operation?

>
> diff -r c92381647a5b -r b55c76e1f4c1 mercurial/worker.py
> --- a/mercurial/worker.py	Mon Aug 12 17:44:31 2013 -0500
> +++ b/mercurial/worker.py	Mon Aug 12 17:46:01 2013 -0500
> @@ -48,6 +48,9 @@
>  def worthwhile(ui, costperop, nops):
>      '''try to determine whether the benefit of multiple processes can
>      outweigh the cost of starting them'''
> +    # this trivial calculation does not benefit plan 9
> +    if sys.platform == 'plan9':
> +        return 0
>      linear = costperop * nops
>      workers = _numworkers(ui)
>      benefit = linear - (_startupcost * workers + linear / workers)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
jas@corpus-callosum.com - Aug. 14, 2013, 2:20 a.m.
A small number, less than 130 forked children, may be
beneficial on Plan 9 (see following test script).  But
in certain cases, like cloning a large repository or
updating something that produces a significant number of
children, we end up with an OSError exception that will
cause hg to roll back the transaction and delete any
changes written to the work area.  So until we have
a solution that lets us fork() thousands of children,
as a clone of cpython would do in the worker calculation,
we just punt and don't fork any.

acme# cat children.py
#!/bin/python
import sys, os
import errno

def usage():
    help = "usage: children.py N\n"
    help += "   N is the number of forked children you want to wait for in the run.\n"
    print help
    sys.exit(1)

def run(n):
    pids = []
    rfd, wfd = os.pipe()
    
    for i in range(n):
        pid = os.fork()
        if pid == 0:
            # print 'child %d' % (os.getpid())
            os._exit(0)
        pids.append(pid)
    
    for i in pids:
        try:
            w = os.wait()
        except OSError, e:
            print e

if __name__ == '__main__':
    args = len(sys.argv)
    if args == 1:
        print "starting 10 children"
        run(10)
    elif args == 2:
        n = sys.argv[1]
        if n.isdigit():
            print "starting %s children" % (n)
            run(int(n))
            sys.exit(0)
        usage()
    else:
        usage()

acme# for(i in `{seq 128 132}) python children.py $i
starting 128 children
starting 129 children
starting 130 children
[Errno 6] No children
starting 131 children
[Errno 6] No children
[Errno 6] No children
starting 132 children
[Errno 6] No children
[Errno 6] No children
[Errno 6] No children



On Aug 13, 2013, at 3:58 PM, Augie Fackler <raf@durin42.com> wrote:

> On Mon, Aug 12, 2013 at 06:01:10PM -0500, Jeff Sickel wrote:
>> # HG changeset patch
>> # User Jeff Sickel <jas@corpus-callosum.com>
>> # Date 1376347561 18000
>> #      Mon Aug 12 17:46:01 2013 -0500
>> # Branch stable
>> # Node ID b55c76e1f4c1726e2a537efa1b99c360626ac23d
>> # Parent  c92381647a5b63b85355bf62a60df8d13fc5f858
>> plan9: prevent potential wait()
> 
> Not sure I understand what the value of this is - how does this wait()
> for you, rather than being a (trivial, stupidly cheap) arithmetic
> operation?
> 
>> 
>> diff -r c92381647a5b -r b55c76e1f4c1 mercurial/worker.py
>> --- a/mercurial/worker.py	Mon Aug 12 17:44:31 2013 -0500
>> +++ b/mercurial/worker.py	Mon Aug 12 17:46:01 2013 -0500
>> @@ -48,6 +48,9 @@
>> def worthwhile(ui, costperop, nops):
>>     '''try to determine whether the benefit of multiple processes can
>>     outweigh the cost of starting them'''
>> +    # this trivial calculation does not benefit plan 9
>> +    if sys.platform == 'plan9':
>> +        return 0
>>     linear = costperop * nops
>>     workers = _numworkers(ui)
>>     benefit = linear - (_startupcost * workers + linear / workers)
Augie Fackler - Aug. 15, 2013, 3:06 p.m.
On Aug 13, 2013, at 10:20 PM, Jeff Sickel <jas@corpus-callosum.com> wrote:

> A small number, less than 130 forked children, may be
> beneficial on Plan 9 (see following test script).  But
> in certain cases, like cloning a large repository or
> updating something that produces a significant number of
> children, we end up with an OSError exception that will
> cause hg to roll back the transaction and delete any
> changes written to the work area.  So until we have
> a solution that lets us fork() thousands of children,
> as a clone of cpython would do in the worker calculation,
> we just punt and don't fork any.

What I'm hearing as I read this is we need a limit of how many workers to fork, and on plan9 that should be ~127.
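
A minimal sketch of the kind of cap being suggested here, not Mercurial's actual code. The helper name capworkers and the _maxworkers table are hypothetical, and 127 is simply the figure from this message.

```python
import sys

# Hypothetical per-platform ceilings on forked workers. 127 is the figure
# suggested in this thread for Plan 9, not a value taken from Mercurial.
_maxworkers = {'plan9': 127}

def capworkers(requested, platform=None):
    '''return requested, clamped to the platform ceiling if one is known'''
    if platform is None:
        platform = sys.platform
    limit = _maxworkers.get(platform)
    if limit is None:
        # no known ceiling for this platform: leave the request alone
        return requested
    return min(requested, limit)
```

With a cap like this, a request for 1000 workers on 'plan9' would be clamped to 127, while other platforms would be left unchanged.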
erik quanstrom - Aug. 15, 2013, 5:01 p.m.
On Thu Aug 15 11:06:45 EDT 2013, raf@durin42.com wrote:
> 
> On Aug 13, 2013, at 10:20 PM, Jeff Sickel <jas@corpus-callosum.com> wrote:
> 
> > A small number, less than 130 forked children, may be
> > beneficial on Plan 9 (see following test script).  But
> > in certain cases, like cloning a large repository or
> > updating something that produces a significant number of
> > children, we end up with an OSError exception that will
> > cause hg to roll back the transaction and delete any
> > changes written to the work area.  So until we have
> > a solution that lets us fork() thousands of children,
> > as a clone of cpython would do in the worker calculation,
> > we just punt and don't fork any.
> 
> What I'm hearing as I read this is we need a limit of how many workers to fork, and on plan9 that should be ~127.

yes.

plan 9 will store a maximum of 128 wait messages, so if
a scheme relies on wait messages and might have a burst of
more than 128 exiting children with no opportunity to clean
them up, then the scheme can be confused by the lack of wait
messages.

- erik
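
One way to stay under that limit is to reap children as you go, so that the number of unreaped children never exceeds a chosen bound. The sketch below is an illustration only: runbatched and maxpending are hypothetical names, 100 is an arbitrary value below 128, and it has not been tried on Plan 9.

```python
import os

def runbatched(n, maxpending=100):
    '''fork n trivial children, never leaving more than maxpending unreaped

    maxpending is an illustrative bound kept below the 128 wait messages
    described above. Returns the number of children reaped.
    '''
    reaped = 0
    pending = []
    for _ in range(n):
        if len(pending) >= maxpending:
            # reap the oldest child before forking another one
            os.waitpid(pending.pop(0), 0)
            reaped += 1
        pid = os.fork()
        if pid == 0:
            # child: exit immediately, like the test script earlier
            os._exit(0)
        pending.append(pid)
    # collect whatever is still outstanding
    for pid in pending:
        os.waitpid(pid, 0)
        reaped += 1
    return reaped
```

Because each child is waited for by pid before the bound is exceeded, at most maxpending exit statuses can be queued at any one time.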
Matt Mackall - Aug. 15, 2013, 9:59 p.m.
On Tue, 2013-08-13 at 21:20 -0500, Jeff Sickel wrote:
> A small number, less than 130 forked children, may be
> beneficial on Plan 9 (see following test script).

These are concurrent children, yes? The worker code should start up
children proportional to CPUs. How many CPUs do you have?
erik quanstrom - Aug. 15, 2013, 11:18 p.m.
On Thu Aug 15 18:06:04 EDT 2013, mpm@selenic.com wrote:
> On Tue, 2013-08-13 at 21:20 -0500, Jeff Sickel wrote:
> > A small number, less than 130 forked children, may be
> > beneficial on Plan 9 (see following test script).
> 
> These are concurrent children, yes? The worker code should start up
> children proportional to CPUs. How many CPUs do you have?

countcpus() will return 1.  os.name is 'posix'.

- erik

Patch

diff -r c92381647a5b -r b55c76e1f4c1 mercurial/worker.py
--- a/mercurial/worker.py	Mon Aug 12 17:44:31 2013 -0500
+++ b/mercurial/worker.py	Mon Aug 12 17:46:01 2013 -0500
@@ -48,6 +48,9 @@ 
 def worthwhile(ui, costperop, nops):
     '''try to determine whether the benefit of multiple processes can
     outweigh the cost of starting them'''
+    # this trivial calculation does not benefit plan 9
+    if sys.platform == 'plan9':
+        return 0
     linear = costperop * nops
     workers = _numworkers(ui)
     benefit = linear - (_startupcost * workers + linear / workers)
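
For reference, the arithmetic in the context lines above can be worked through on its own. This is a sketch under assumptions: the standalone function name benefit is hypothetical, and the startup cost and worker count in the usage note are made-up inputs. The diff context ends before the function returns, so whatever threshold the real code applies is not shown here.

```python
def benefit(costperop, nops, workers, startupcost):
    '''estimated time saved by splitting nops operations across workers'''
    # cost of doing every operation serially
    linear = costperop * nops
    # parallel cost: start each worker, then do an equal share of the work
    return linear - (startupcost * workers + linear / workers)
```

With made-up inputs of 4 workers and a startup cost of 0.01 per worker, 1000 operations at 0.001 each give a positive benefit (about 0.71), while 10 such operations give a negative one (about -0.03), so forking would not pay off for the small job.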