Patchwork D8928: worker: don't expose readinto() on _blockingreader since pickle is picky

login
register
mail settings
Submitter phabricator
Date Aug. 15, 2020, 5:40 a.m.
Message ID <differential-rev-PHID-DREV-dapchx2pc4jwbi2eg3uk-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/47028/
State Superseded
Headers show

Comments

phabricator - Aug. 15, 2020, 5:40 a.m.
martinvonz created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  The `pickle` module expects the input to be buffered and a whole
  object to be available when `pickle.load()` is called, which is not
  necessarily true when we send data from workers back to the parent
  process (i.e., it seems like a bad assumption for the `pickle` module
  to make). We added a workaround for that in
  https://phab.mercurial-scm.org/D8076, which made `read()` continue
  until all the requested bytes have been read.
  
  As we found out at work after a lot of investigation (I've spent the
  last two days on this), the native version of `pickle.load()` has
  started calling `readinto()` on the input since Python 3.8. That
  started being called in
  https://github.com/python/cpython/commit/91f4380cedbae32b49adbea2518014a5624c6523
  (and only by the C version of `pickle.load()`)). Before that, it was
  only `read()` and `readline()` that were called. The problem with that
  was that `readinto()` on our `_blockingreader` was simply delegating
  to the underlying, *unbuffered* object. The symptom we saw was that
  `hg fix` started failing sometimes on Python 3.8 on Mac. I failed very
  relyable in some cases. I still haven't figured out under what
  circumstances it fail and I've been unable to reproduce it in test
  cases (I've tried writing larger amounts of data, using different
  numbers of workers, and making the formatters sleep). I have, however,
  been able to reproduce it a 3-4 times on Linux, but then it stopped
  reproducing on the following few hundred attempts.
  
  To fix the problem, we can simply remove the implementation of
  `readinto()`, since the unpickler will then fall back to calling
  `read()`. (The fallback was added a bit later, in
  https://github.com/python/cpython/commit/b19f7ecfa3adc6ba1544225317b9473649815b38.)

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8928

AFFECTED FILES
  mercurial/worker.py

CHANGE DETAILS




To: martinvonz, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/mercurial/worker.py b/mercurial/worker.py
--- a/mercurial/worker.py
+++ b/mercurial/worker.py
@@ -71,8 +71,8 @@ 
         def __init__(self, wrapped):
             self._wrapped = wrapped
 
-        def __getattr__(self, attr):
-            return getattr(self._wrapped, attr)
+        def readline(self):
+            return self._wrapped.readline()
 
         # issue multiple reads until size is fulfilled
         def read(self, size=-1):