Patchwork [11,of,11] merge: don't fiddle with name lookups or i18n in hot loops

login
register
mail settings
Submitter Idan Kamara
Date Feb. 12, 2013, 9:26 a.m.
Message ID <CAMz0A7=2fMP4y7aWK2ePBn8nNcKKUf2OSXp7gjoE3YuSABfOUA@mail.gmail.com>
Download mbox | patch
Permalink /patch/965/
State Superseded
Commit 8048c519dc6a5f12d10cfdff155c6d8aac295b45
Headers show

Comments

Idan Kamara - Feb. 12, 2013, 9:26 a.m.
On Tue, Feb 12, 2013 at 3:02 AM, Bryan O'Sullivan <bos@serpentine.com>
wrote:
>
> On Sun, Feb 10, 2013 at 3:25 AM, Idan Kamara <idankk86@gmail.com> wrote:
>>
>> No, he didn't. And had he given me the chance to reply to his last
>> response, I would have shown it:
>
>
> Yep, that's a bug. But it's just a bug - I fixed another one in the new
> parallel code just a little while ago.

So we should exit on all exceptions, but we can't print a traceback
since that's not desirable when not running from the command line
(and it wasn't readable anyway since they got interleaved in one
another).

Maybe we should serialize the exceptions to the master and have
him decide what to do (e.g. reraise)?

>
>>
>> And the only reason the sys.exit I had a problem with is
>> 'working' is thanks to this:
>>
>> http://selenic.com/repo/hg/file/0027a5cec9d0/mercurial/dispatch.py#l201
>
>
> Correct, and that's deliberate. Perhaps you could tell me what you'd
> rather see, because I obviously think the use of sys.exit in this context
is
> reasonable based on what I currently know.

I think raising util.Abort instead should do it.

I wrote this test to check these things:

import unittest, silenttestrunner, os

from mercurial import worker, ui, util

class testworker(unittest.TestCase):
    def testchilderror(self):
        def workerfunc(x):
            raise util.Abort

        g = worker._platformworker(ui.ui(), workerfunc, (), [1])
        self.assertRaises(util.Abort, g.next)

if __name__ == '__main__':
    silenttestrunner.main(__name__)

It will currently fail on a StopIteration because the child doesn't exit,
making it call g.next on a finished generator. So doing this fixes it:
Isaac Jurado - Feb. 12, 2013, 10:12 a.m.
On Tue, Feb 12, 2013 at 10:26 AM, Idan Kamara <idankk86@gmail.com> wrote:
> On Tue, Feb 12, 2013 at 3:02 AM, Bryan O'Sullivan <bos@serpentine.com>
> wrote:
>>
>> On Sun, Feb 10, 2013 at 3:25 AM, Idan Kamara <idankk86@gmail.com> wrote:
>>>
>>> No, he didn't. And had he given me the chance to reply to his last
>>> response, I would have shown it:
>>
>>
>> Yep, that's a bug. But it's just a bug - I fixed another one in the new
>> parallel code just a little while ago.
>
> So we should exit on all exceptions, but we can't print a traceback
> since that's not desirable when not running from the command line
> (and it wasn't readable anyway since they got interleaved in one
> another).
>
> Maybe we should serialize the exceptions to the master and have
> him decide what to do (e.g. reraise)?

That could be a problem if the serialized exception needs more than
512 bytes in size.  The elegance of Bryan's design relies on the fact
that the operating system ensures atomicity for small writes to the
shared pipe.
Idan Kamara - Feb. 12, 2013, 10:35 a.m.
On Tue, Feb 12, 2013 at 12:12 PM, Isaac Jurado <diptongo@gmail.com> wrote:
>
> On Tue, Feb 12, 2013 at 10:26 AM, Idan Kamara <idankk86@gmail.com> wrote:
> > On Tue, Feb 12, 2013 at 3:02 AM, Bryan O'Sullivan <bos@serpentine.com>
> > wrote:
> >>
> >> On Sun, Feb 10, 2013 at 3:25 AM, Idan Kamara <idankk86@gmail.com>
> >> wrote:
> >>>
> >>> No, he didn't. And had he given me the chance to reply to his last
> >>> response, I would have shown it:
> >>
> >>
> >> Yep, that's a bug. But it's just a bug - I fixed another one in the new
> >> parallel code just a little while ago.
> >
> > So we should exit on all exceptions, but we can't print a traceback
> > since that's not desirable when not running from the command line
> > (and it wasn't readable anyway since they got interleaved in one
> > another).
> >
> > Maybe we should serialize the exceptions to the master and have
> > him decide what to do (e.g. reraise)?
>
> That could be a problem if the serialized exception needs more than
> 512 bytes in size.  The elegance of Bryan's design relies on the fact
> that the operating system ensures atomicity for small writes to the
> shared pipe.

Ok, and since we'd like to include the traceback, we'll most likely need
more than 512 bytes. So unless there's a trick to be made here, we'll
need a pipe for every worker.

It might be worth looking at multiprocessing.Pool, which does all this
and has a chance of working sensibly on Windows too (there's a
backport of it to 2.4-5).
Isaac Jurado - Feb. 12, 2013, 11:01 a.m.
On Tue, Feb 12, 2013 at 11:35 AM, Idan Kamara <idankk86@gmail.com> wrote:
> On Tue, Feb 12, 2013 at 12:12 PM, Isaac Jurado <diptongo@gmail.com> wrote:
>>
>> On Tue, Feb 12, 2013 at 10:26 AM, Idan Kamara <idankk86@gmail.com> wrote:
>> >
>> > Maybe we should serialize the exceptions to the master and have
>> > him decide what to do (e.g. reraise)?
>>
>> That could be a problem if the serialized exception needs more than
>> 512 bytes in size.  The elegance of Bryan's design relies on the fact
>> that the operating system ensures atomicity for small writes to the
>> shared pipe.
>
> Ok, and since we'd like to include the traceback, we'll most likely need
> more than 512 bytes. So unless there's a trick to be made here, we'll
> need a pipe for every worker.
>
> It might be worth looking at multiprocessing.Pool, which does all this
> and has a chance of working sensibly on Windows too (there's a
> backport of it to 2.4-5).

As far as I know, the multiprocessing.Pool uses the same shared pipe
design with two locks (read and write) so there is only one process
operating on the pipe at a time.   On win32 there is no write lock,
though.

I've never used the command server, so I need to ask: what does it do
when it raises an exception?
Idan Kamara - Feb. 12, 2013, 11:15 a.m.
On Tue, Feb 12, 2013 at 1:01 PM, Isaac Jurado <diptongo@gmail.com> wrote:
>
> On Tue, Feb 12, 2013 at 11:35 AM, Idan Kamara <idankk86@gmail.com> wrote:
> > On Tue, Feb 12, 2013 at 12:12 PM, Isaac Jurado <diptongo@gmail.com>
> > wrote:
> >>
> >> On Tue, Feb 12, 2013 at 10:26 AM, Idan Kamara <idankk86@gmail.com>
> >> wrote:
> >> >
> >> > Maybe we should serialize the exceptions to the master and have
> >> > him decide what to do (e.g. reraise)?
> >>
> >> That could be a problem if the serialized exception needs more than
> >> 512 bytes in size.  The elegance of Bryan's design relies on the fact
> >> that the operating system ensures atomicity for small writes to the
> >> shared pipe.
> >
> > Ok, and since we'd like to include the traceback, we'll most likely need
> > more than 512 bytes. So unless there's a trick to be made here, we'll
> > need a pipe for every worker.
> >
> > It might be worth looking at multiprocessing.Pool, which does all this
> > and has a chance of working sensibly on Windows too (there's a
> > backport of it to 2.4-5).
>
> As far as I know, the multiprocessing.Pool uses the same shared pipe
> design with two locks (read and write) so there is only one process
> operating on the pipe at a time.   On win32 there is no write lock,
> though.

I tried to check this earlier (by looking at /proc/<child>/fd) and it seemed
like each process in the pool had its own pipe, but either way, it doesn't
matter.

Assuming multiprocessing.Pool "just works" with no drastic performance
loss, it could be beneficial to go that route for obvious reasons (also,
a simple test shows it can transparently raise exceptions that came
from the pool in the master).

>
> I've never used the command server, so I need to ask: what does it do
> when it raises an exception?

It's not specific to the command server. Uncaught exceptions are
handled in dispatch, the module that is used to invoke top level
commands. Then, they're transformed to an exit code which is returned
to the caller: 'hg' simply calls sys.exit with it while other callers use it
differently (command server sends it to the client and goes back to
listening).
Isaac Jurado - Feb. 12, 2013, 3:03 p.m.
On Tue, Feb 12, 2013 at 12:15 PM, Idan Kamara <idankk86@gmail.com> wrote:
> On Tue, Feb 12, 2013 at 1:01 PM, Isaac Jurado <diptongo@gmail.com> wrote:
>> As far as I know, the multiprocessing.Pool uses the same shared pipe
>> design with two locks (read and write) so there is only one process
>> operating on the pipe at a time.   On win32 there is no write lock,
>> though.
>
> I tried to check this earlier (by looking at /proc/<child>/fd) and it seemed
> like each process in the pool had its own pipe, but either way, it doesn't
> matter.

I didn't think about checking that (silly me), I just overviewed the code.

> Assuming multiprocessing.Pool "just works" with no drastic performance
> loss, it could be beneficial to go that route for obvious reasons (also,
> a simple test shows it can transparently raise exceptions that came
> from the pool in the master).

To be honest, the multiprocessing module does some pretty weird
things, like squences of peek + sleep or threading in the parent
process to combine data from different pipes into one queue.  I've
always had  the sensation of not really knowing what's going on behind
the scenes.  And for process management code, that really creeps me
out.  But it's just my humble opinion.

> It's not specific to the command server. Uncaught exceptions are
> handled in dispatch, the module that is used to invoke top level
> commands. Then, they're transformed to an exit code which is returned
> to the caller: 'hg' simply calls sys.exit with it while other callers use it
> differently (command server sends it to the client and goes back to
> listening).

Right, so the communication is not really dropped.  The command server
uses pipes, is that right?
Idan Kamara - Feb. 12, 2013, 3:48 p.m.
On Tue, Feb 12, 2013 at 5:03 PM, Isaac Jurado <diptongo@gmail.com> wrote:
>
> On Tue, Feb 12, 2013 at 12:15 PM, Idan Kamara <idankk86@gmail.com> wrote:
> > On Tue, Feb 12, 2013 at 1:01 PM, Isaac Jurado <diptongo@gmail.com>
> > wrote:
> >> As far as I know, the multiprocessing.Pool uses the same shared pipe
> >> design with two locks (read and write) so there is only one process
> >> operating on the pipe at a time.   On win32 there is no write lock,
> >> though.
> >
> > I tried to check this earlier (by looking at /proc/<child>/fd) and it
> > seemed
> > like each process in the pool had its own pipe, but either way, it
> > doesn't
> > matter.
>
> I didn't think about checking that (silly me), I just overviewed the code.
>
> > Assuming multiprocessing.Pool "just works" with no drastic performance
> > loss, it could be beneficial to go that route for obvious reasons (also,
> > a simple test shows it can transparently raise exceptions that came
> > from the pool in the master).
>
> To be honest, the multiprocessing module does some pretty weird
> things, like squences of peek + sleep or threading in the parent
> process to combine data from different pipes into one queue.  I've
> always had  the sensation of not really knowing what's going on behind
> the scenes.  And for process management code, that really creeps me
> out.  But it's just my humble opinion.

My experience with it is fairly superficial so you could be right. But
it was just a suggestion that going with something that (should be)
fairly well tested and cross-platform could help us in the long run.

Perhaps someone who used it more extensively can weigh in.

>
> > It's not specific to the command server. Uncaught exceptions are
> > handled in dispatch, the module that is used to invoke top level
> > commands. Then, they're transformed to an exit code which is returned
> > to the caller: 'hg' simply calls sys.exit with it while other callers
> > use it
> > differently (command server sends it to the client and goes back to
> > listening).
>
> Right, so the communication is not really dropped.  The command server
> uses pipes, is that right?

Yes, although there's at least one implementation out there (that I
know of) that uses sockets.
Bryan O'Sullivan - Feb. 12, 2013, 4:27 p.m.
On Tue, Feb 12, 2013 at 7:48 AM, Idan Kamara <idankk86@gmail.com> wrote:

> My experience with it is fairly superficial so you could be right. But
> it was just a suggestion that going with something that (should be)
> fairly well tested and cross-platform could help us in the long run.
>

multiprocessing is a non-starter for the simple reason that we still need
to support Python < 2.6. It's also just kind of horrendous in every way,
which I admit sways me.
Idan Kamara - Feb. 12, 2013, 4:29 p.m.
On Tue, Feb 12, 2013 at 6:27 PM, Bryan O'Sullivan <bos@serpentine.com> wrote:
>
> On Tue, Feb 12, 2013 at 7:48 AM, Idan Kamara <idankk86@gmail.com> wrote:
>>
>> My experience with it is fairly superficial so you could be right. But
>> it was just a suggestion that going with something that (should be)
>> fairly well tested and cross-platform could help us in the long run.
>
>
> multiprocessing is a non-starter for the simple reason that we still need
> to support Python < 2.6. It's also just kind of horrendous in every way,
> which I admit sways me.

http://pypi.python.org/pypi/multiprocessing/

Patch

diff --git a/mercurial/worker.py b/mercurial/worker.py
--- a/mercurial/worker.py
+++ b/mercurial/worker.py
@@ -83,7 +83,7 @@ 
                 for i, item in func(*(staticargs + (pargs,))):
                     os.write(wfd, '%d %s\n' % (i, item))
                 os._exit(0)
-            except KeyboardInterrupt:
+            except:
                 os._exit(255)
     os.close(wfd)
     fp = os.fdopen(rfd, 'rb', 0)
@@ -96,7 +96,7 @@ 
         for i in xrange(workers):
             problems |= os.wait()[1]
         if problems:
-            sys.exit(1)
+            raise util.Abort(_('child error'))
     try:
         for line in fp:
             l = line.split(' ', 1)