Patchwork [stable,v2] py3: stop using block-buffered stdout to TTY on Python 3

login
register
mail settings
Submitter Manuel Jacob
Date June 6, 2020, 10:55 p.m.
Message ID <d1754bf05591d2aee0f5.1591484120@tmp>
Download mbox | patch
Permalink /patch/46482/
State New
Headers show

Comments

Manuel Jacob - June 6, 2020, 10:55 p.m.
# HG changeset patch
# User Manuel Jacob <me@manueljacob.de>
# Date 1591333053 -7200
#      Fri Jun 05 06:57:33 2020 +0200
# Branch stable
# Node ID d1754bf05591d2aee0f5f9dec185faf912a21190
# Parent  1f114c797961733ea03d0de88390f9555a7edef4
# EXP-Topic fix_block-buffered_stdout
py3: stop using block-buffered stdout to TTY on Python 3

The previous code (added in 227ba1afcb65) incorrectly assumed that
sys.stdout.buffer is line-buffered. However the interpreter initializes it
with a block-buffered stream or an unbuffered stream (when the -u option or
the PYTHONUNBUFFERED environment variable is set), never with a line-buffered
stream.

To see the difference with the hg command, you need to do something slow, e.g.
`hg pull https://www.mercurial-scm.org/repo/hg`. On Python 3 on Linux, without
this patch, all output is printed right before the command ends. With this
patch, output is printed almost immediately.

I did some quick and dirty benchmarking, printing on my terminal.


Unfavorable case:
* for 100 times:
  * for 1000 times:
    * call `ui.write(b'x')`
  * call `ui.write(b'\n')`
* ui.flush()

Results:
CPython 2.7 is unaffected (line-buffered): wall 0.231795 comb 0.230000 user 0.230000 sys 0.000000
CPython 3.8 before patch (block-buffered): wall 0.190882 comb 0.190000 user 0.190000 sys 0.000000
CPython 3.8 after patch (unbuffered):      wall 0.654466 comb 0.650000 user 0.480000 sys 0.170000
CPython 3.8 hand-rolled line-buffered [*]: wall 0.218919 comb 0.220000 user 0.220000 sys 0.000000


Favorable case:
* for 100000 times:
  * call `ui.write(b'Testing write performance\n')`

Results:
CPython 2.7 is unaffected (line-buffered): wall 0.717060 comb 0.710000 user 0.510000 sys 0.200000
CPython 3.8 before patch (block-buffered): wall 0.255527 comb 0.260000 user 0.180000 sys 0.080000
CPython 3.8 after patch (unbuffered):      wall 0.616500 comb 0.620000 user 0.450000 sys 0.170000
CPython 3.8 hand-rolled line-buffered [*]: wall 0.811805 comb 0.810000 user 0.570000 sys 0.240000


[*] For this, I implemented a simplified version of what is suggested in the
TODO: stdout was set to block-buffered and ui.write() flushed the stream if
the output contains a newline character.
Pierre-Yves David - June 9, 2020, 11:21 a.m.
Sigh, this seems like a good enough fix for now. We can rewrite the UI 
layer in rust later ;-)

On 6/7/20 12:55 AM, Manuel Jacob wrote:
> # HG changeset patch
> # User Manuel Jacob <me@manueljacob.de>
> # Date 1591333053 -7200
> #      Fri Jun 05 06:57:33 2020 +0200
> # Branch stable
> # Node ID d1754bf05591d2aee0f5f9dec185faf912a21190
> # Parent  1f114c797961733ea03d0de88390f9555a7edef4
> # EXP-Topic fix_block-buffered_stdout
> py3: stop using block-buffered stdout to TTY on Python 3
> 
> The previous code (added in 227ba1afcb65) incorrectly assumed that
> sys.stdout.buffer is line-buffered. However the interpreter initializes it
> with a block-buffered stream or an unbuffered stream (when the -u option or
> the PYTHONUNBUFFERED environment variable is set), never with a line-buffered
> stream.
> 
> To see the difference with the hg command, you need to do something slow, e.g.
> `hg pull https://www.mercurial-scm.org/repo/hg`. On Python 3 on Linux, without
> this patch, all output is printed right before the command ends. With this
> patch, output is printed almost immediately.
> 
> I did some quick and dirty benchmarking, printing on my terminal.
> 
> 
> Unfavorable case:
> * for 100 times:
>    * for 1000 times:
>      * call `ui.write(b'x')`
>    * call `ui.write(b'\n')`
> * ui.flush()
> 
> Results:
> CPython 2.7 is unaffected (line-buffered): wall 0.231795 comb 0.230000 user 0.230000 sys 0.000000
> CPython 3.8 before patch (block-buffered): wall 0.190882 comb 0.190000 user 0.190000 sys 0.000000
> CPython 3.8 after patch (unbuffered):      wall 0.654466 comb 0.650000 user 0.480000 sys 0.170000
> CPython 3.8 hand-rolled line-buffered [*]: wall 0.218919 comb 0.220000 user 0.220000 sys 0.000000
> 
> 
> Favorable case:
> * for 100000 times:
>    * call `ui.write(b'Testing write performance\n')`
> 
> Results:
> CPython 2.7 is unaffected (line-buffered): wall 0.717060 comb 0.710000 user 0.510000 sys 0.200000
> CPython 3.8 before patch (block-buffered): wall 0.255527 comb 0.260000 user 0.180000 sys 0.080000
> CPython 3.8 after patch (unbuffered):      wall 0.616500 comb 0.620000 user 0.450000 sys 0.170000
> CPython 3.8 hand-rolled line-buffered [*]: wall 0.811805 comb 0.810000 user 0.570000 sys 0.240000
> 
> 
> [*] For this, I implemented a simplified version of what is suggested in the
> TODO: stdout was set to block-buffered and ui.write() flushed the stream if
> the output contains a newline character.
> 
> diff -r 1f114c797961 -r d1754bf05591 mercurial/utils/procutil.py
> --- a/mercurial/utils/procutil.py	Thu Apr 30 00:33:00 2020 -0400
> +++ b/mercurial/utils/procutil.py	Fri Jun 05 06:57:33 2020 +0200
> @@ -55,12 +55,20 @@
>   # on Windows).
>   # Python 3 rolls its own standard I/O streams.
>   if isatty(stdout):
> +    # TODO: Consider mitigating the performance impact of using unbuffered
> +    # writes on Python 3 and Windows, e.g. by using fully buffered streams and
> +    # ensuring on a higher level that lines are flushed.
>       if pycompat.iswindows:
>           # Windows doesn't support line buffering
>           stdout = os.fdopen(stdout.fileno(), 'wb', 0)
> -    elif not pycompat.ispy3:
> -        # on Python 3, stdout (sys.stdout.buffer) is already line buffered and
> -        # buffering=1 is not handled in binary mode
> +    elif pycompat.ispy3:
> +        # Python 3 doesn't support line buffering on binary streams. Use the
> +        # underlying unbuffered raw stream. If the ``raw`` attribute doesn't
> +        # exist, stdout is already unbuffered or replaced by a custom stream.
> +        # TODO: After Mercurial 5.4, use this code path on Python 3 on Windows.
> +        # We shouldn't risk changing working behavior in the stable branch.
> +        stdout = getattr(stdout, 'raw', stdout)
> +    else:
>           stdout = os.fdopen(stdout.fileno(), 'wb', 1)
>   
>   if pycompat.iswindows:
> 
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Augie Fackler - June 9, 2020, 9:24 p.m.
Agreed, this LGTM. Yuya? I know you've been good at spotting issues in this area that I miss.

> On Jun 9, 2020, at 07:21, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> 
> Sigh, this seems like a good enough fix for now. We can rewrite the UI layer in rust later ;-)
> 
> On 6/7/20 12:55 AM, Manuel Jacob wrote:
>> # HG changeset patch
>> # User Manuel Jacob <me@manueljacob.de>
>> # Date 1591333053 -7200
>> #      Fri Jun 05 06:57:33 2020 +0200
>> # Branch stable
>> # Node ID d1754bf05591d2aee0f5f9dec185faf912a21190
>> # Parent  1f114c797961733ea03d0de88390f9555a7edef4
>> # EXP-Topic fix_block-buffered_stdout
>> py3: stop using block-buffered stdout to TTY on Python 3
>> The previous code (added in 227ba1afcb65) incorrectly assumed that
>> sys.stdout.buffer is line-buffered. However the interpreter initializes it
>> with a block-buffered stream or an unbuffered stream (when the -u option or
>> the PYTHONUNBUFFERED environment variable is set), never with a line-buffered
>> stream.
>> To see the difference with the hg command, you need to do something slow, e.g.
>> `hg pull https://www.mercurial-scm.org/repo/hg`. On Python 3 on Linux, without
>> this patch, all output is printed right before the command ends. With this
>> patch, output is printed almost immediately.
>> I did some quick and dirty benchmarking, printing on my terminal.
>> Unfavorable case:
>> * for 100 times:
>>   * for 1000 times:
>>     * call `ui.write(b'x')`
>>   * call `ui.write(b'\n')`
>> * ui.flush()
>> Results:
>> CPython 2.7 is unaffected (line-buffered): wall 0.231795 comb 0.230000 user 0.230000 sys 0.000000
>> CPython 3.8 before patch (block-buffered): wall 0.190882 comb 0.190000 user 0.190000 sys 0.000000
>> CPython 3.8 after patch (unbuffered):      wall 0.654466 comb 0.650000 user 0.480000 sys 0.170000
>> CPython 3.8 hand-rolled line-buffered [*]: wall 0.218919 comb 0.220000 user 0.220000 sys 0.000000
>> Favorable case:
>> * for 100000 times:
>>   * call `ui.write(b'Testing write performance\n')`
>> Results:
>> CPython 2.7 is unaffected (line-buffered): wall 0.717060 comb 0.710000 user 0.510000 sys 0.200000
>> CPython 3.8 before patch (block-buffered): wall 0.255527 comb 0.260000 user 0.180000 sys 0.080000
>> CPython 3.8 after patch (unbuffered):      wall 0.616500 comb 0.620000 user 0.450000 sys 0.170000
>> CPython 3.8 hand-rolled line-buffered [*]: wall 0.811805 comb 0.810000 user 0.570000 sys 0.240000
>> [*] For this, I implemented a simplified version of what is suggested in the
>> TODO: stdout was set to block-buffered and ui.write() flushed the stream if
>> the output contains a newline character.
>> diff -r 1f114c797961 -r d1754bf05591 mercurial/utils/procutil.py
>> --- a/mercurial/utils/procutil.py	Thu Apr 30 00:33:00 2020 -0400
>> +++ b/mercurial/utils/procutil.py	Fri Jun 05 06:57:33 2020 +0200
>> @@ -55,12 +55,20 @@
>>  # on Windows).
>>  # Python 3 rolls its own standard I/O streams.
>>  if isatty(stdout):
>> +    # TODO: Consider mitigating the performance impact of using unbuffered
>> +    # writes on Python 3 and Windows, e.g. by using fully buffered streams and
>> +    # ensuring on a higher level that lines are flushed.
>>      if pycompat.iswindows:
>>          # Windows doesn't support line buffering
>>          stdout = os.fdopen(stdout.fileno(), 'wb', 0)
>> -    elif not pycompat.ispy3:
>> -        # on Python 3, stdout (sys.stdout.buffer) is already line buffered and
>> -        # buffering=1 is not handled in binary mode
>> +    elif pycompat.ispy3:
>> +        # Python 3 doesn't support line buffering on binary streams. Use the
>> +        # underlying unbuffered raw stream. If the ``raw`` attribute doesn't
>> +        # exist, stdout is already unbuffered or replaced by a custom stream.
>> +        # TODO: After Mercurial 5.4, use this code path on Python 3 on Windows.
>> +        # We shouldn't risk changing working behavior in the stable branch.
>> +        stdout = getattr(stdout, 'raw', stdout)
>> +    else:
>>          stdout = os.fdopen(stdout.fileno(), 'wb', 1)
>>    if pycompat.iswindows:
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> 
> -- 
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Manuel Jacob - June 10, 2020, 9:32 a.m.
Yuya and me agreed that it is better to not use unbuffered I/O. See the 
discussion on v1 and subsequent improved approaches.

On 2020-06-09 23:24, Augie Fackler wrote:
> Agreed, this LGTM. Yuya? I know you've been good at spotting issues in
> this area that I miss.
> 
>> On Jun 9, 2020, at 07:21, Pierre-Yves David 
>> <pierre-yves.david@ens-lyon.org> wrote:
>> 
>> Sigh, this seems like a good enough fix for now. We can rewrite the UI 
>> layer in rust later ;-)
>> 
>> On 6/7/20 12:55 AM, Manuel Jacob wrote:
>>> # HG changeset patch
>>> # User Manuel Jacob <me@manueljacob.de>
>>> # Date 1591333053 -7200
>>> #      Fri Jun 05 06:57:33 2020 +0200
>>> # Branch stable
>>> # Node ID d1754bf05591d2aee0f5f9dec185faf912a21190
>>> # Parent  1f114c797961733ea03d0de88390f9555a7edef4
>>> # EXP-Topic fix_block-buffered_stdout
>>> py3: stop using block-buffered stdout to TTY on Python 3
>>> The previous code (added in 227ba1afcb65) incorrectly assumed that
>>> sys.stdout.buffer is line-buffered. However the interpreter 
>>> initializes it
>>> with a block-buffered stream or an unbuffered stream (when the -u 
>>> option or
>>> the PYTHONUNBUFFERED environment variable is set), never with a 
>>> line-buffered
>>> stream.
>>> To see the difference with the hg command, you need to do something 
>>> slow, e.g.
>>> `hg pull https://www.mercurial-scm.org/repo/hg`. On Python 3 on 
>>> Linux, without
>>> this patch, all output is printed right before the command ends. With 
>>> this
>>> patch, output is printed almost immediately.
>>> I did some quick and dirty benchmarking, printing on my terminal.
>>> Unfavorable case:
>>> * for 100 times:
>>>   * for 1000 times:
>>>     * call `ui.write(b'x')`
>>>   * call `ui.write(b'\n')`
>>> * ui.flush()
>>> Results:
>>> CPython 2.7 is unaffected (line-buffered): wall 0.231795 comb 
>>> 0.230000 user 0.230000 sys 0.000000
>>> CPython 3.8 before patch (block-buffered): wall 0.190882 comb 
>>> 0.190000 user 0.190000 sys 0.000000
>>> CPython 3.8 after patch (unbuffered):      wall 0.654466 comb 
>>> 0.650000 user 0.480000 sys 0.170000
>>> CPython 3.8 hand-rolled line-buffered [*]: wall 0.218919 comb 
>>> 0.220000 user 0.220000 sys 0.000000
>>> Favorable case:
>>> * for 100000 times:
>>>   * call `ui.write(b'Testing write performance\n')`
>>> Results:
>>> CPython 2.7 is unaffected (line-buffered): wall 0.717060 comb 
>>> 0.710000 user 0.510000 sys 0.200000
>>> CPython 3.8 before patch (block-buffered): wall 0.255527 comb 
>>> 0.260000 user 0.180000 sys 0.080000
>>> CPython 3.8 after patch (unbuffered):      wall 0.616500 comb 
>>> 0.620000 user 0.450000 sys 0.170000
>>> CPython 3.8 hand-rolled line-buffered [*]: wall 0.811805 comb 
>>> 0.810000 user 0.570000 sys 0.240000
>>> [*] For this, I implemented a simplified version of what is suggested 
>>> in the
>>> TODO: stdout was set to block-buffered and ui.write() flushed the 
>>> stream if
>>> the output contains a newline character.
>>> diff -r 1f114c797961 -r d1754bf05591 mercurial/utils/procutil.py
>>> --- a/mercurial/utils/procutil.py	Thu Apr 30 00:33:00 2020 -0400
>>> +++ b/mercurial/utils/procutil.py	Fri Jun 05 06:57:33 2020 +0200
>>> @@ -55,12 +55,20 @@
>>>  # on Windows).
>>>  # Python 3 rolls its own standard I/O streams.
>>>  if isatty(stdout):
>>> +    # TODO: Consider mitigating the performance impact of using 
>>> unbuffered
>>> +    # writes on Python 3 and Windows, e.g. by using fully buffered 
>>> streams and
>>> +    # ensuring on a higher level that lines are flushed.
>>>      if pycompat.iswindows:
>>>          # Windows doesn't support line buffering
>>>          stdout = os.fdopen(stdout.fileno(), 'wb', 0)
>>> -    elif not pycompat.ispy3:
>>> -        # on Python 3, stdout (sys.stdout.buffer) is already line 
>>> buffered and
>>> -        # buffering=1 is not handled in binary mode
>>> +    elif pycompat.ispy3:
>>> +        # Python 3 doesn't support line buffering on binary streams. 
>>> Use the
>>> +        # underlying unbuffered raw stream. If the ``raw`` attribute 
>>> doesn't
>>> +        # exist, stdout is already unbuffered or replaced by a 
>>> custom stream.
>>> +        # TODO: After Mercurial 5.4, use this code path on Python 3 
>>> on Windows.
>>> +        # We shouldn't risk changing working behavior in the stable 
>>> branch.
>>> +        stdout = getattr(stdout, 'raw', stdout)
>>> +    else:
>>>          stdout = os.fdopen(stdout.fileno(), 'wb', 1)
>>>    if pycompat.iswindows:
>>> _______________________________________________
>>> Mercurial-devel mailing list
>>> Mercurial-devel@mercurial-scm.org
>>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>> 
>> --
>> Pierre-Yves David
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - June 10, 2020, 11:58 a.m.
On Wed, 10 Jun 2020 11:32:29 +0200, Manuel Jacob wrote:
> Yuya and me agreed that it is better to not use unbuffered I/O. See the
> discussion on v1 and subsequent improved approaches.

Yes. IMO, unbuffered I/O is the last option given how casually we do use
ui.write(). I have little concern about the speed of terminal output, which
is inherently slow or unreadable, but it shouldn't be unnecessarily slow if
a pager is attached.

Patch

diff -r 1f114c797961 -r d1754bf05591 mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py	Thu Apr 30 00:33:00 2020 -0400
+++ b/mercurial/utils/procutil.py	Fri Jun 05 06:57:33 2020 +0200
@@ -55,12 +55,20 @@ 
 # on Windows).
 # Python 3 rolls its own standard I/O streams.
 if isatty(stdout):
+    # TODO: Consider mitigating the performance impact of using unbuffered
+    # writes on Python 3 and Windows, e.g. by using fully buffered streams and
+    # ensuring on a higher level that lines are flushed.
     if pycompat.iswindows:
         # Windows doesn't support line buffering
         stdout = os.fdopen(stdout.fileno(), 'wb', 0)
-    elif not pycompat.ispy3:
-        # on Python 3, stdout (sys.stdout.buffer) is already line buffered and
-        # buffering=1 is not handled in binary mode
+    elif pycompat.ispy3:
+        # Python 3 doesn't support line buffering on binary streams. Use the
+        # underlying unbuffered raw stream. If the ``raw`` attribute doesn't
+        # exist, stdout is already unbuffered or replaced by a custom stream.
+        # TODO: After Mercurial 5.4, use this code path on Python 3 on Windows.
+        # We shouldn't risk changing working behavior in the stable branch.
+        stdout = getattr(stdout, 'raw', stdout)
+    else:
         stdout = os.fdopen(stdout.fileno(), 'wb', 1)
 
 if pycompat.iswindows: