Submitter | Bryan O'Sullivan |
---|---|
Date | Jan. 22, 2016, 7 p.m. |
Message ID | <6bb64e83ac188ee99528.1453489217@bryano-mbp.local> |
Download | mbox | patch |
Permalink | /patch/12868/ |
State | Superseded |
Commit | a6833e464b070c2f51edff8d6e55be79f5d9db22 |
Delegated to: | Laurent Charignon |
Headers | show |
Comments
On Fri, Jan 22, 2016 at 2:00 PM, Bryan O'Sullivan <bos@serpentine.com> wrote: > # HG changeset patch > # User Bryan O'Sullivan <bryano@fb.com> > # Date 1453489213 28800 > # Fri Jan 22 11:00:13 2016 -0800 > # Branch stable > # Node ID 6bb64e83ac188ee99528ad0237028f00047ab505 > # Parent 0b752757a0c4f30ac755389014e2af9d2e0b4a3f > run-tests: "fix" race condition in race condition fix > > Laurent's commit 3203dfe341f9 still suffers from a race: by the > time the "job" function tries to assign to channels[channel], that > list has been truncated to empty. The result is that every job > thread raises an IndexError. > > Earlier, I tried an approach of correctly locking channels, but > that caused run-tests to hang on KeyboardInterrupt sometimes. > > This approach is strictly hackier, but seems to actually work > reliably. I really wish this worked :( (py)[timeless@gcc2-power8 crew]$ hg import ../crew0/0 applying ../crew0/0 (py)[timeless@gcc2-power8 crew]$ cd tests/ (py)[timeless@gcc2-power8 tests]$ ./run-tests.py -l -j120 ..ss^C Skipped test-convert-svn-source.t: missing feature: subversion python bindings Skipped test-convert-svn-move.t: missing feature: subversion python bindings # Ran 122 tests, 2 skipped, 0 warned, 0 failed. INTERRUPTED: test-convert-cvs.t (after 2 seconds) INTERRUPTED: test-contrib-check-code.t (after 2 seconds) INTERRUPTED: test-pull-pull-corruption.t (after 2 seconds) INTERRUPTED: test-mv-cp-st-diff.t (after 2 seconds) INTERRUPTED: test-check-code.t (after 2 seconds) INTERRUPTED: test-check-py3-compat.t (after 2 seconds) INTERRUPTED: test-convert-svn-encoding.t (after 2 seconds) INTERRUPTED: test-check-commit.t (after 2 seconds) EINTERRUPTED: test-histedit-fold.t (after 1 seconds) EINTERRUPTED: test-lfconvert.t (after 1 seconds) EINTERRUPTED: test-hgweb-symrev.t (after 2 seconds) INTERRUPTED: test-hgweb-empty.t (after 1 seconds) INTERRUPTED: test-bundle2-exchange.t (after 2 seconds) INTERRUPTED: test-encoding-textwrap.t (after 2 seconds) INTERRUPTED: test-rebase-check-restore.t (after 2 seconds) INTERRUPTED: test-gendoc.t (after 3 seconds) INTERRUPTED: test-obsolete-checkheads.t (after 3 seconds) INTERRUPTED: test-notify.t (after 2 seconds) INTERRUPTED: test-check-execute.t (after 3 seconds) INTERRUPTED: test-contrib-perf.t (after 4 seconds) INTERRUPTED: test-bundle2-format.t (after 4 seconds) INTERRUPTED: test-acl.t (after 5 seconds) INTERRUPTED: test-hgweb-commands.t (after 5 seconds) INTERRUPTED: test-patchbomb.t (after 7 seconds) INTERRUPTED: test-merge-changedelete.t (after 7 seconds) INTERRUPTED: test-merge-force.t (after 10 seconds) INTERRUPTED: test-rename-merge2.t (after 10 seconds) ^C^C^C^C^C^C^C^Z [4]+ Stopped ./run-tests.py -l -j120 (py)[timeless@gcc2-power8 tests]$ hg log -r . changeset: 30698:6bb64e83ac18 branch: stable tag: tip parent: 30636:0b752757a0c4 user: Bryan O'Sullivan <bryano@fb.com> date: Fri Jan 22 11:00:13 2016 -0800 summary: run-tests: "fix" race condition in race condition fix (py)[timeless@gcc2-power8 tests]$ hg log -r . -v changeset: 30698:6bb64e83ac18 branch: stable tag: tip parent: 30636:0b752757a0c4 user: Bryan O'Sullivan <bryano@fb.com> date: Fri Jan 22 11:00:13 2016 -0800 files: tests/run-tests.py description: run-tests: "fix" race condition in race condition fix Laurent's commit 3203dfe341f9 still suffers from a race: by the time the "job" function tries to assign to channels[channel], that list has been truncated to empty. The result is that every job thread raises an IndexError. Earlier, I tried an approach of correctly locking channels, but that caused run-tests to hang on KeyboardInterrupt sometimes. This approach is strictly hackier, but seems to actually work reliably. (py)[timeless@gcc2-power8 tests]$
On 01/22/2016 11:30 AM, timeless wrote: > On Fri, Jan 22, 2016 at 2:00 PM, Bryan O'Sullivan <bos@serpentine.com> wrote: >> # HG changeset patch >> # User Bryan O'Sullivan <bryano@fb.com> >> # Date 1453489213 28800 >> # Fri Jan 22 11:00:13 2016 -0800 >> # Branch stable >> # Node ID 6bb64e83ac188ee99528ad0237028f00047ab505 >> # Parent 0b752757a0c4f30ac755389014e2af9d2e0b4a3f >> run-tests: "fix" race condition in race condition fix >> >> Laurent's commit 3203dfe341f9 still suffers from a race: by the >> time the "job" function tries to assign to channels[channel], that >> list has been truncated to empty. The result is that every job >> thread raises an IndexError. >> >> Earlier, I tried an approach of correctly locking channels, but >> that caused run-tests to hang on KeyboardInterrupt sometimes. >> >> This approach is strictly hackier, but seems to actually work >> reliably. > > I really wish this worked :( How much of your issue is related to "bryan patch failure to fix the race introduced by Laurent" and how much is it of "run-tests.py interruption have been borked since forever"? I've personnally never have been confortable with ctrl^C run-tests.py in years.
On Fri, Jan 22, 2016 at 11:30 AM, timeless <timeless@gmail.com> wrote: > I really wish this worked :( > What kind of system are you using? This patch works reliably on my Mac and Linux boxes.
This is: $ uname -a Linux gcc2-power8.osuosl.org 3.17.4-301.fc21.ppc64le #1 SMP Mon Dec 1 07:51:01 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux $ lsb_release -a LSB Version: :core-4.1-noarch:core-4.1-ppc64le Distributor ID: Fedora Description: Fedora release 21 (Twenty One) Release: 21 Codename: TwentyOne $ virtualenv --version 1.11.6 $ python --version Python 2.7.8 I'm ssh'ing from a mac, but I doubt that's relevant. $ cat /proc/cpuinfo |perl -pne 's/(processor\s*:).*/$1 X/'|sort|uniq -c|sort -n 1 firmware : OPAL v3 1 machine : PowerNV 8247-22L 1 model : 8247-22L 1 platform : PowerNV 1 timebase : 512000000 160 160 clock : 3425.000000MHz 160 cpu : POWER8E (raw), altivec supported 160 processor : X 160 revision : 2.1 (pvr 004b 0201) (X is just a processor number, so there are 160 lines w/ numbers 0..159, not very interesting) On Fri, Jan 22, 2016 at 4:09 PM, Bryan O'Sullivan <bos@serpentine.com> wrote: > > On Fri, Jan 22, 2016 at 11:30 AM, timeless <timeless@gmail.com> wrote: >> >> I really wish this worked :( > > > What kind of system are you using? This patch works reliably on my Mac and > Linux boxes.
On 01/22/2016 11:00 AM, Bryan O'Sullivan wrote: > # HG changeset patch > # User Bryan O'Sullivan <bryano@fb.com> > # Date 1453489213 28800 > # Fri Jan 22 11:00:13 2016 -0800 > # Branch stable > # Node ID 6bb64e83ac188ee99528ad0237028f00047ab505 > # Parent 0b752757a0c4f30ac755389014e2af9d2e0b4a3f > run-tests: "fix" race condition in race condition fix After discussion with Laurent, I'm taking this as this seems strictly better than the current situation. Pushed to the clowncopter.
On Fri, Jan 22, 2016 at 4:03 PM, Pierre-Yves David < pierre-yves.david@ens-lyon.org> wrote: > After discussion with Laurent, I'm taking this as this seems strictly > better than the current situation. > Thanks, I agree with your reasoning :-)
Patch
diff --git a/tests/run-tests.py b/tests/run-tests.py --- a/tests/run-tests.py +++ b/tests/run-tests.py @@ -1526,13 +1526,16 @@ class TestSuite(unittest.TestSuite): channels[channel] = "=" + test.name[5:].split(".")[0] try: test(result) - channels[channel] = '' done.put(None) except KeyboardInterrupt: - channels[channel] = '' + pass except: # re-raises done.put(('!', test, 'run-test raised an error, see traceback')) raise + try: + channels[channel] = '' + except IndexError: + pass def stat(): count = 0