Patchwork stable ordering of test output

login
register
mail settings
Submitter Augie Fackler
Date April 13, 2017, 8:17 p.m.
Message ID <CAPLqtW+yWYW4PuSMi+sA9kbGM+-t=+vVMg6Us74k0Yh6-ohUFw@mail.gmail.com>
Download mbox | patch
Permalink /patch/20183/
State Changes Requested
Delegated to: Yuya Nishihara
Headers show

Comments

Augie Fackler - April 13, 2017, 8:17 p.m.
On Thu, Apr 13, 2017 at 3:55 PM, Augie Fackler <raf@durin42.com> wrote:
> On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
>>> On the other hand, this is probably not so bundle2 specific. We have
>>> some "select" logic to read stdout and stderr as soon as possible. This
>>> is the main suspect as it is possible that this logic behave different
>>> under linux and other unix (not too much effort have been put into it).
>>
>> posix.poll() waits every type of operation no matter if fd is e.g. writable
>> or not. IIRC, this doesn't always work on FreeBSD since the underlying resource
>> of read/write ends might be shared in the kernel.
>>
>> But I don't think this is the source of the unstable output.
>
> I've had a little time today between things to try and debug this.
> What I've found so far:
>
> 1) when the test passes, the remote: output is printed by the
> _forwardoutput method in sshpeer, presumably since stderr makes it to
> the client before the close of stdout.
> 2) When the test fails (as on BSD, and I guess Solaris), the client
> notices that stdout closed before stderr. It then aborts the
> transaction and sshpeer.cleanup() notices some data chilling on stderr
> and ensures it gets read and printed.
>
> I'm not really sure why BSD systems would be quicker at communicating
> the closed FD than other systems. I'm poking at dummyssh now to see if
> maybe it's weirdness from there...

Here's a patch that seems to work. I'm not happy about it, but it
makes the behavior consistent, and it looks mostly harmless.

# HG changeset patch
# User Augie Fackler <augie@google.com>
# Date 1492114180 14400
#      Thu Apr 13 16:09:40 2017 -0400
# Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
# Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
sshpeer: try harder to snag stderr when stdout closes unexpectedly

Resolves test failures on FreeBSD, but I'm not happy about the fix.
Danek Duvall - April 13, 2017, 9:52 p.m.
Augie Fackler wrote:

> On Thu, Apr 13, 2017 at 3:55 PM, Augie Fackler <raf@durin42.com> wrote:
> > On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> >> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
> >>> On the other hand, this is probably not so bundle2 specific. We have
> >>> some "select" logic to read stdout and stderr as soon as possible. This
> >>> is the main suspect as it is possible that this logic behave different
> >>> under linux and other unix (not too much effort have been put into it).
> >>
> >> posix.poll() waits every type of operation no matter if fd is e.g. writable
> >> or not. IIRC, this doesn't always work on FreeBSD since the underlying resource
> >> of read/write ends might be shared in the kernel.
> >>
> >> But I don't think this is the source of the unstable output.
> >
> > I've had a little time today between things to try and debug this.
> > What I've found so far:
> >
> > 1) when the test passes, the remote: output is printed by the
> > _forwardoutput method in sshpeer, presumably since stderr makes it to
> > the client before the close of stdout.
> > 2) When the test fails (as on BSD, and I guess Solaris), the client
> > notices that stdout closed before stderr. It then aborts the
> > transaction and sshpeer.cleanup() notices some data chilling on stderr
> > and ensures it gets read and printed.
> >
> > I'm not really sure why BSD systems would be quicker at communicating
> > the closed FD than other systems. I'm poking at dummyssh now to see if
> > maybe it's weirdness from there...
> 
> Here's a patch that seems to work. I'm not happy about it, but it
> makes the behavior consistent, and it looks mostly harmless.

Confirmed that it fixes the problem on Solaris, too.

Thanks!

Danek
Matt Harbison - April 14, 2017, 1:36 a.m.
On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler <raf@durin42.com> wrote:

> On Thu, Apr 13, 2017 at 3:55 PM, Augie Fackler <raf@durin42.com> wrote:
>> On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>>> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
>>>> On the other hand, this is probably not so bundle2 specific. We have
>>>> some "select" logic to read stdout and stderr as soon as possible.  
>>>> This
>>>> is the main suspect as it is possible that this logic behave different
>>>> under linux and other unix (not too much effort have been put into  
>>>> it).
>>>
>>> posix.poll() waits every type of operation no matter if fd is e.g.  
>>> writable
>>> or not. IIRC, this doesn't always work on FreeBSD since the underlying  
>>> resource
>>> of read/write ends might be shared in the kernel.
>>>
>>> But I don't think this is the source of the unstable output.
>>
>> I've had a little time today between things to try and debug this.
>> What I've found so far:
>>
>> 1) when the test passes, the remote: output is printed by the
>> _forwardoutput method in sshpeer, presumably since stderr makes it to
>> the client before the close of stdout.
>> 2) When the test fails (as on BSD, and I guess Solaris), the client
>> notices that stdout closed before stderr. It then aborts the
>> transaction and sshpeer.cleanup() notices some data chilling on stderr
>> and ensures it gets read and printed.
>>
>> I'm not really sure why BSD systems would be quicker at communicating
>> the closed FD than other systems. I'm poking at dummyssh now to see if
>> maybe it's weirdness from there...
>
> Here's a patch that seems to work. I'm not happy about it, but it
> makes the behavior consistent, and it looks mostly harmless.

This fixes it for Windows too.  Thanks!

> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1492114180 14400
> #      Thu Apr 13 16:09:40 2017 -0400
> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
> sshpeer: try harder to snag stderr when stdout closes unexpectedly
>
> Resolves test failures on FreeBSD, but I'm not happy about the fix.
>
> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> --- a/mercurial/sshpeer.py
> +++ b/mercurial/sshpeer.py
> @@ -110,9 +110,17 @@ class doublepipe(object):
>              if mainready:
>                  meth = getattr(self._main, methname)
>                  if data is None:
> -                    return meth()
> +                    r = meth()
>                  else:
> -                    return meth(data)
> +                    r = meth(data)
> +                if not r and data != 0:
> +                    # We've observed a condition that indicates the
> +                    # stdout closed unexpectedly. Check stderr one
> +                    # more time and snag anything that's there before
> +                    # letting anyone know the main part of the pipe
> +                    # closed prematurely.
> +                    _forwardoutput(self._ui, self._side)
> +                return r
>
>      def close(self):
>          return self._main.close()
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - April 15, 2017, 9:48 a.m.
On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1492114180 14400
> #      Thu Apr 13 16:09:40 2017 -0400
> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
> sshpeer: try harder to snag stderr when stdout closes unexpectedly
> 
> Resolves test failures on FreeBSD, but I'm not happy about the fix.
> 
> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> --- a/mercurial/sshpeer.py
> +++ b/mercurial/sshpeer.py
> @@ -110,9 +110,17 @@ class doublepipe(object):
>              if mainready:
>                  meth = getattr(self._main, methname)
>                  if data is None:
> -                    return meth()
> +                    r = meth()
>                  else:
> -                    return meth(data)
> +                    r = meth(data)
> +                if not r and data != 0:

I'm not sure what this condition is intended for. It's always true for
write() because r is None and data is a str.

> +                    # We've observed a condition that indicates the
> +                    # stdout closed unexpectedly. Check stderr one
> +                    # more time and snag anything that's there before
> +                    # letting anyone know the main part of the pipe
> +                    # closed prematurely.
> +                    _forwardoutput(self._ui, self._side)
> +                return r
Augie Fackler - April 15, 2017, 5:04 p.m.
> On Apr 15, 2017, at 5:48 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
>> # HG changeset patch
>> # User Augie Fackler <augie@google.com>
>> # Date 1492114180 14400
>> #      Thu Apr 13 16:09:40 2017 -0400
>> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
>> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
>> sshpeer: try harder to snag stderr when stdout closes unexpectedly
>> 
>> Resolves test failures on FreeBSD, but I'm not happy about the fix.
>> 
>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
>> --- a/mercurial/sshpeer.py
>> +++ b/mercurial/sshpeer.py
>> @@ -110,9 +110,17 @@ class doublepipe(object):
>>             if mainready:
>>                 meth = getattr(self._main, methname)
>>                 if data is None:
>> -                    return meth()
>> +                    r = meth()
>>                 else:
>> -                    return meth(data)
>> +                    r = meth(data)
>> +                if not r and data != 0:
> 
> I'm not sure what this condition is intended for. It's always true for
> write() because r is None and data is a str.

This forwarder is also used for read(), where data is the number of bytes to be read. At least, I think that’s right, now I’m doubting myself.

> 
>> +                    # We've observed a condition that indicates the
>> +                    # stdout closed unexpectedly. Check stderr one
>> +                    # more time and snag anything that's there before
>> +                    # letting anyone know the main part of the pipe
>> +                    # closed prematurely.
>> +                    _forwardoutput(self._ui, self._side)
>> +                return r
Augie Fackler - April 19, 2017, 7:22 p.m.
On Sat, Apr 15, 2017 at 1:04 PM, Augie Fackler <raf@durin42.com> wrote:
>
>> On Apr 15, 2017, at 5:48 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>>
>> On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
>>> # HG changeset patch
>>> # User Augie Fackler <augie@google.com>
>>> # Date 1492114180 14400
>>> #      Thu Apr 13 16:09:40 2017 -0400
>>> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
>>> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
>>> sshpeer: try harder to snag stderr when stdout closes unexpectedly
>>>
>>> Resolves test failures on FreeBSD, but I'm not happy about the fix.
>>>
>>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
>>> --- a/mercurial/sshpeer.py
>>> +++ b/mercurial/sshpeer.py
>>> @@ -110,9 +110,17 @@ class doublepipe(object):
>>>             if mainready:
>>>                 meth = getattr(self._main, methname)
>>>                 if data is None:
>>> -                    return meth()
>>> +                    r = meth()
>>>                 else:
>>> -                    return meth(data)
>>> +                    r = meth(data)
>>> +                if not r and data != 0:
>>
>> I'm not sure what this condition is intended for. It's always true for
>> write() because r is None and data is a str.
>
> This forwarder is also used for read(), where data is the number of bytes to be read. At least, I think that’s right, now I’m doubting myself.

Should I go ahead and mail this patch? Perhaps with some extra
comments? Or do people object to this fix?

>
>>
>>> +                    # We've observed a condition that indicates the
>>> +                    # stdout closed unexpectedly. Check stderr one
>>> +                    # more time and snag anything that's there before
>>> +                    # letting anyone know the main part of the pipe
>>> +                    # closed prematurely.
>>> +                    _forwardoutput(self._ui, self._side)
>>> +                return r
>
Jun Wu - April 19, 2017, 10:48 p.m.
I'm +1 on this patch. It does not look harmful.

Excerpts from Augie Fackler's message of 2017-04-19 15:22:42 -0400:
> >> I'm not sure what this condition is intended for. It's always true for
> >> write() because r is None and data is a str.
> >
> > This forwarder is also used for read(), where data is the number of
> > bytes to be read. At least, I think that’s right, now I’m doubting
> > myself.
> 
> Should I go ahead and mail this patch? Perhaps with some extra
> comments? Or do people object to this fix?
Yuya Nishihara - April 20, 2017, 11:27 a.m.
On Wed, 19 Apr 2017 15:22:42 -0400, Augie Fackler wrote:
> On Sat, Apr 15, 2017 at 1:04 PM, Augie Fackler <raf@durin42.com> wrote:
> >
> >> On Apr 15, 2017, at 5:48 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> >>
> >> On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
> >>> # HG changeset patch
> >>> # User Augie Fackler <augie@google.com>
> >>> # Date 1492114180 14400
> >>> #      Thu Apr 13 16:09:40 2017 -0400
> >>> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
> >>> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
> >>> sshpeer: try harder to snag stderr when stdout closes unexpectedly
> >>>
> >>> Resolves test failures on FreeBSD, but I'm not happy about the fix.
> >>>
> >>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> >>> --- a/mercurial/sshpeer.py
> >>> +++ b/mercurial/sshpeer.py
> >>> @@ -110,9 +110,17 @@ class doublepipe(object):
> >>>             if mainready:
> >>>                 meth = getattr(self._main, methname)
> >>>                 if data is None:
> >>> -                    return meth()
> >>> +                    r = meth()
> >>>                 else:
> >>> -                    return meth(data)
> >>> +                    r = meth(data)
> >>> +                if not r and data != 0:
> >>
> >> I'm not sure what this condition is intended for. It's always true for
> >> write() because r is None and data is a str.
> >
> > This forwarder is also used for read(), where data is the number of bytes to be read. At least, I think that’s right, now I’m doubting myself.
> 
> Should I go ahead and mail this patch? Perhaps with some extra
> comments? Or do people object to this fix?

Seems fine, but can you update the patch to not always call _forwardoutput()
by write() ? Perhaps this hack can be moved to read() and readline().

Patch

diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
--- a/mercurial/sshpeer.py
+++ b/mercurial/sshpeer.py
@@ -110,9 +110,17 @@  class doublepipe(object):
             if mainready:
                 meth = getattr(self._main, methname)
                 if data is None:
-                    return meth()
+                    r = meth()
                 else:
-                    return meth(data)
+                    r = meth(data)
+                if not r and data != 0:
+                    # We've observed a condition that indicates the
+                    # stdout closed unexpectedly. Check stderr one
+                    # more time and snag anything that's there before
+                    # letting anyone know the main part of the pipe
+                    # closed prematurely.
+                    _forwardoutput(self._ui, self._side)
+                return r

     def close(self):
         return self._main.close()