Submitter | Matt Harbison |
---|---|
Date | June 17, 2017, 2:02 a.m. |
Message ID | <op.y1yfyexa9lwrgf@envy> |
Download | mbox | patch |
Permalink | /patch/21439/ |
State | Not Applicable |
Headers | show |
Comments
> On Jun 16, 2017, at 22:02, Matt Harbison <mharbison72@gmail.com> wrote: > > On Fri, 16 Jun 2017 09:59:30 -0400, Augie Fackler <raf@durin42.com> wrote: > >> On Fri, Jun 16, 2017 at 12:18:18AM -0400, Matt Harbison wrote: >>> So apparently, this is a symptom of not having %SystemRoot% in the >>> environment when calling CreateProcess(). >>> >>> https://bugs.python.org/issue13524 >>> https://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ >>> >>> I see that setup.py special cases this variable. I did a search for 'env >>> =', and it looks like hooks and pager start with empty environments, so they >>> must not inherit this. IDR if any recent changes were made that start with >>> an empty environment. >>> >>> The thing I can't get my mind around is the hit and miss nature of the >>> error, if this is really the problem. >> >> It sounds like it should be harmless to just always forward >> %SystemRoot% - should we just do that? > > Seems reasonable, but run-tests._getenv() already does an os.environ.copy(), so it should be there? > > It does seem like a good idea to do it for hooks and other things executed, where the environment is built from scratch. The question is where? There's util.popen[2-4](), plus some direct calls to subprocess.Popen(), and an os.system(). I considered util.shellenviron(), but there are far fewer calls to this than places where processes are spawned. My gut (without having gone through the calls) is that we should probably build up a subproc module in hg and standardize the various ways we call things - right now it's kind of messy, and I strongly suspect we only need 2 or 3 functions (naive call, call with stdin/stdout/stderr wired up, call with no stdin but stdout/stderr captured). Absent that cleanup, yeah, just do the necessary cleanup on each callsite - the os.system() call should (I think) forward everything though, so that one at least should be fine... > I just hit it here: > > --- c:/Users/Matt/projects/hg/tests/test-log.t > +++ c:/Users/Matt/projects/hg/tests/test-log.t.err > @@ -1476,72 +1476,64 @@ > $ hg resolve -m foo > (no more unresolved files) > $ hg ci -m "First merge, related" > + abort: Invalid Signature > + [255] > > I was wondering if it's right on the edge of some limit, and paths or whatever being set by the test runner are (sometimes) pushing it out of the environment block. But the docs say there's no limit to the block size: > > https://msdn.microsoft.com/en-us/library/windows/desktop/ms682653(v=vs.85).aspx > >>> _______________________________________________ >>> Mercurial-devel mailing list >>> Mercurial-devel@mercurial-scm.org >>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
On Sun, 18 Jun 2017 22:19:29 -0400, Augie Fackler wrote: > > > On Jun 16, 2017, at 22:02, Matt Harbison <mharbison72@gmail.com> wrote: > > > > On Fri, 16 Jun 2017 09:59:30 -0400, Augie Fackler <raf@durin42.com> wrote: > > > >> On Fri, Jun 16, 2017 at 12:18:18AM -0400, Matt Harbison wrote: > >>> So apparently, this is a symptom of not having %SystemRoot% in the > >>> environment when calling CreateProcess(). > >>> > >>> https://bugs.python.org/issue13524 > >>> https://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ > >>> > >>> I see that setup.py special cases this variable. I did a search for 'env > >>> =', and it looks like hooks and pager start with empty environments, so they > >>> must not inherit this. IDR if any recent changes were made that start with > >>> an empty environment. > >>> > >>> The thing I can't get my mind around is the hit and miss nature of the > >>> error, if this is really the problem. > >> > >> It sounds like it should be harmless to just always forward > >> %SystemRoot% - should we just do that? > > > > Seems reasonable, but run-tests._getenv() already does an os.environ.copy(), so it should be there? > > > > It does seem like a good idea to do it for hooks and other things executed, where the environment is built from scratch. The question is where? There's util.popen[2-4](), plus some direct calls to subprocess.Popen(), and an os.system(). I considered util.shellenviron(), but there are far fewer calls to this than places where processes are spawned. (+CC foozy since he has Windows) Is the problem only seen in tests? I don't think environment variables are cleared in hg side.
At Tue, 20 Jun 2017 00:30:28 +0900, Yuya Nishihara wrote: > > On Sun, 18 Jun 2017 22:19:29 -0400, Augie Fackler wrote: > > > > > On Jun 16, 2017, at 22:02, Matt Harbison <mharbison72@gmail.com> wrote: > > > > > > On Fri, 16 Jun 2017 09:59:30 -0400, Augie Fackler <raf@durin42.com> wrote: > > > > > >> On Fri, Jun 16, 2017 at 12:18:18AM -0400, Matt Harbison wrote: > > >>> So apparently, this is a symptom of not having %SystemRoot% in the > > >>> environment when calling CreateProcess(). > > >>> > > >>> https://bugs.python.org/issue13524 > > >>> https://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ > > >>> > > >>> I see that setup.py special cases this variable. I did a search for 'env > > >>> =', and it looks like hooks and pager start with empty environments, so they > > >>> must not inherit this. IDR if any recent changes were made that start with > > >>> an empty environment. > > >>> > > >>> The thing I can't get my mind around is the hit and miss nature of the > > >>> error, if this is really the problem. > > >> > > >> It sounds like it should be harmless to just always forward > > >> %SystemRoot% - should we just do that? > > > > > > Seems reasonable, but run-tests._getenv() already does an os.environ.copy(), so it should be there? > > > > > > It does seem like a good idea to do it for hooks and other things executed, where the environment is built from scratch. The question is where? There's util.popen[2-4](), plus some direct calls to subprocess.Popen(), and an os.system(). I considered util.shellenviron(), but there are far fewer calls to this than places where processes are spawned. > > (+CC foozy since he has Windows) > > Is the problem only seen in tests? I don't think environment variables are > cleared in hg side. > I can't reproduce this issue on my Windows VM, yet :-< I'm using 32bit Python 2.7.9 on 64bit Windows7, but is 64bit or more recent Python (like as your environment) required to reproduce this ? BTW, on hg side, there is no code path, which causes replacement or removal of SystemRoot environment variable, AFAIK. - util.popen This involves os.popen(), and always inherits current environment variables. - util.popen2/3/4 This involves subprocess.Popen() with 'env=env', but (1) default value of 'env' is None (= inherit current ones), and (2) all existing callers in sources below invoke without explicit env argument. - hgext/convert/*.py - mercurial/sshpeer.py - util.system This involves subprocess.Popen() with result of shellenviron(environ), which updates existing variables or adds new variables. And there is no replacement of SystemRoot on caller sides. Execution of external hook, editor, and so on uses this API via ui.system(). - os.popen (for other than Mercurial APIs above) This always inherits current environment variables. (not used directly in mercurial/** and hgext/**) - os.system (for other than Mercurial APIs above) This always inherits current environment variables. - hgext/convert/gnuarch.py - hgext/factotum.py - mercurial/statprof.py - mercurial/util.py (for tempfilter) - subprocess.Popen (for other than Mercurial APIs above) - hgext/convert/common.py - hgext/fsmonitor/pywatchman/__init__.py - mercurial/util.py (for pipefilter) no explicit env - hgext/logtoprocess.py - mercurial/subrepo.py using copy of encoding.environ - mercurial/ui.py (for pager) using util.shellenviron() with rcutil.defaultpagerenv() On run-tests.py side, subprocess.Popen() is sometimes used with explicit env argument, but the result of Test._getenv() or None is used in such case. This doesn't cause replacement or removal of SystemRoot (at least, with current implementation). Any of intermediate layers (MinGW or Python ?) causes issue ?
On Mon, 19 Jun 2017 11:30:28 -0400, Yuya Nishihara <yuya@tcha.org> wrote: > On Sun, 18 Jun 2017 22:19:29 -0400, Augie Fackler wrote: >> >> > On Jun 16, 2017, at 22:02, Matt Harbison <mharbison72@gmail.com> >> wrote: >> > >> > On Fri, 16 Jun 2017 09:59:30 -0400, Augie Fackler <raf@durin42.com> >> wrote: >> > >> >> On Fri, Jun 16, 2017 at 12:18:18AM -0400, Matt Harbison wrote: >> >>> So apparently, this is a symptom of not having %SystemRoot% in the >> >>> environment when calling CreateProcess(). >> >>> >> >>> https://bugs.python.org/issue13524 >> >>> >> https://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ >> >>> >> >>> I see that setup.py special cases this variable. I did a search >> for 'env >> >>> =', and it looks like hooks and pager start with empty >> environments, so they >> >>> must not inherit this. IDR if any recent changes were made that >> start with >> >>> an empty environment. >> >>> >> >>> The thing I can't get my mind around is the hit and miss nature of >> the >> >>> error, if this is really the problem. >> >> >> >> It sounds like it should be harmless to just always forward >> >> %SystemRoot% - should we just do that? >> > >> > Seems reasonable, but run-tests._getenv() already does an >> os.environ.copy(), so it should be there? >> > >> > It does seem like a good idea to do it for hooks and other things >> executed, where the environment is built from scratch. The question is >> where? There's util.popen[2-4](), plus some direct calls to >> subprocess.Popen(), and an os.system(). I considered >> util.shellenviron(), but there are far fewer calls to this than places >> where processes are spawned. > > (+CC foozy since he has Windows) > > Is the problem only seen in tests? I don't think environment variables > are > cleared in hg side. I've only ever seen it while running the tests on this one Win7 x64 system. But since it only started in the last few weeks and is rare, I'd say not enough data at this point. I didn't think hg cleared anything, and the get-with-headers.py example in this thread pretty much skips hg code (it does import util.py). I've got a second Windows machine setup (but it's much slower, and really can't deal with much more than --jobs=2), and I added code to run-tests.py and dispatch.py to complain if SystemRoot is not set. I haven't seen it pop up in the last few days (~6-8 full runs).
On Mon, 19 Jun 2017 14:47:22 -0400, FUJIWARA Katsunori <foozy@lares.dti.ne.jp> wrote: > At Tue, 20 Jun 2017 00:30:28 +0900, > Yuya Nishihara wrote: >> >> On Sun, 18 Jun 2017 22:19:29 -0400, Augie Fackler wrote: >> > >> > > On Jun 16, 2017, at 22:02, Matt Harbison <mharbison72@gmail.com> >> wrote: >> > > >> > > On Fri, 16 Jun 2017 09:59:30 -0400, Augie Fackler <raf@durin42.com> >> wrote: >> > > >> > >> On Fri, Jun 16, 2017 at 12:18:18AM -0400, Matt Harbison wrote: >> > >>> So apparently, this is a symptom of not having %SystemRoot% in the >> > >>> environment when calling CreateProcess(). >> > >>> >> > >>> https://bugs.python.org/issue13524 >> > >>> >> https://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ >> (+CC foozy since he has Windows) >> >> Is the problem only seen in tests? I don't think environment variables >> are >> cleared in hg side. >> > > I can't reproduce this issue on my Windows VM, yet :-< > > I'm using 32bit Python 2.7.9 on 64bit Windows7, but is 64bit or more > recent Python (like as your environment) required to reproduce this ? Not sure. I've only tested with 2.7.13 x64. I run with '-t9000 -j9' to burn through faster, which (I think) causes all kinds of occasional one off failures (port in use, etc). > BTW, on hg side, there is no code path, which causes replacement or > removal of SystemRoot environment variable, AFAIK. > > - util.popen > > This involves os.popen(), and always inherits current environment > variables. > > - util.popen2/3/4 > > This involves subprocess.Popen() with 'env=env', but (1) default > value of 'env' is None (= inherit current ones), and (2) all > existing callers in sources below invoke without explicit env > argument. > > - hgext/convert/*.py > - mercurial/sshpeer.py > > - util.system > > This involves subprocess.Popen() with result of > shellenviron(environ), which updates existing variables or adds > new variables. And there is no replacement of SystemRoot on caller > sides. > > Execution of external hook, editor, and so on uses this API via > ui.system(). > > - os.popen (for other than Mercurial APIs above) > > This always inherits current environment variables. > > (not used directly in mercurial/** and hgext/**) > > - os.system (for other than Mercurial APIs above) > > This always inherits current environment variables. > > - hgext/convert/gnuarch.py > - hgext/factotum.py > - mercurial/statprof.py > - mercurial/util.py (for tempfilter) > > - subprocess.Popen (for other than Mercurial APIs above) > > - hgext/convert/common.py > - hgext/fsmonitor/pywatchman/__init__.py > - mercurial/util.py (for pipefilter) > no explicit env > > - hgext/logtoprocess.py > - mercurial/subrepo.py > using copy of encoding.environ > > - mercurial/ui.py (for pager) > using util.shellenviron() with rcutil.defaultpagerenv() Thanks for this. I hadn't looked too closely at all of these call sites, and was a little suspicious about some of them. But I see how they pick up the existing environment now. > On run-tests.py side, subprocess.Popen() is sometimes used with > explicit env argument, but the result of Test._getenv() or None is > used in such case. This doesn't cause replacement or removal of > SystemRoot (at least, with current implementation). > > Any of intermediate layers (MinGW or Python ?) causes issue ? Not sure. This is a regular Windows build (doesn't link against MSYS), and nothing has changed software-wise in a long time.
On Mon, 19 Jun 2017 11:30:28 -0400, Yuya Nishihara <yuya@tcha.org> wrote: > On Sun, 18 Jun 2017 22:19:29 -0400, Augie Fackler wrote: >> >> > On Jun 16, 2017, at 22:02, Matt Harbison <mharbison72@gmail.com> >> wrote: >> > >> > On Fri, 16 Jun 2017 09:59:30 -0400, Augie Fackler <raf@durin42.com> >> wrote: >> > >> >> On Fri, Jun 16, 2017 at 12:18:18AM -0400, Matt Harbison wrote: >> >>> So apparently, this is a symptom of not having %SystemRoot% in the >> >>> environment when calling CreateProcess(). >> >>> >> >>> https://bugs.python.org/issue13524 >> >>> >> https://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ >> >>> >> >>> I see that setup.py special cases this variable. I did a search >> for 'env >> >>> =', and it looks like hooks and pager start with empty >> environments, so they >> >>> must not inherit this. IDR if any recent changes were made that >> start with >> >>> an empty environment. >> >>> >> >>> The thing I can't get my mind around is the hit and miss nature of >> the >> >>> error, if this is really the problem. >> >> >> >> It sounds like it should be harmless to just always forward >> >> %SystemRoot% - should we just do that? >> > >> > Seems reasonable, but run-tests._getenv() already does an >> os.environ.copy(), so it should be there? >> > >> > It does seem like a good idea to do it for hooks and other things >> executed, where the environment is built from scratch. The question is >> where? There's util.popen[2-4](), plus some direct calls to >> subprocess.Popen(), and an os.system(). I considered >> util.shellenviron(), but there are far fewer calls to this than places >> where processes are spawned. > > (+CC foozy since he has Windows) > > Is the problem only seen in tests? I don't think environment variables > are > cleared in hg side. I hit this problem again this weekend, after it was quiet for the past couple of weeks. It looks like it might be an issue with nearly running low on memory. When it happened this time, Windows popped up a dialog box saying memory was running low, offering to kill some programs. I had TaskManager open, and saw the Performance > System > Commit (MB) line was running around 5400/6076. I closed thg, which exited thg.exe and hg.exe (listed at ~300 MB each in the process list), and the issue stopped. I was able to recreate it again after a day of quiet by opening up a bunch of tabs in FireFox, and pushing the memory usage around that threshold. I tend to run the tests with -j9, and I've seen the first number bounce around between 4900 and 5700+ during these failures. So I'm not sure what the exact problem threshold is, as tests start and exit. Interestingly, the free memory number in Resource Monitor at the same time indicates only 150MB-20MB free. One of the "optimizations" of the SSD install software was to cap the page file, which is probably why I hadn't seen this until recently. Kostia had mentioned to me that he was seeing errors saying “application failed to start 0xc0000142”, which I also saw (along with dialog box failures of various msys executables, like env.exe and grep.exe). So maybe this is useful to others wanting to run Windows tests. It seems unlikely that it would be seen in the wild (the page file usually isn't capped), and I doubt there's anything we can do about it anyway.
Patch
--- c:/Users/Matt/projects/hg/tests/test-log.t +++ c:/Users/Matt/projects/hg/tests/test-log.t.err @@ -1476,72 +1476,64 @@ $ hg resolve -m foo (no more unresolved files) $ hg ci -m "First merge, related" + abort: Invalid Signature + [255] I was wondering if it's right on the edge of some limit, and paths or whatever being set by the test runner are (sometimes) pushing it out of the environment block. But the docs say there's no limit to the block size: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682653(v=vs.85).aspx >> _______________________________________________ >> Mercurial-devel mailing list >> Mercurial-devel@mercurial-scm.org >> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel _______________________________________________ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel