Patchwork [stable] py3: always flush ui streams on Python 3

login
register
mail settings
Submitter Manuel Jacob
Date June 2, 2020, 6:07 p.m.
Message ID <ebbc45544673c33ea3be.1591121253@tmp>
Download mbox | patch
Permalink /patch/46458/
State Superseded
Headers show

Comments

Manuel Jacob - June 2, 2020, 6:07 p.m.
# HG changeset patch
# User Manuel Jacob <me@manueljacob.de>
# Date 1591120869 -7200
#      Tue Jun 02 20:01:09 2020 +0200
# Branch stable
# Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
# Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
py3: always flush ui streams on Python 3

On Python 2, we rely on that stdout and stderr are line buffered. Python 3’s
io module doesn’t offer line buffered binary streams.
Augie Fackler - June 2, 2020, 10:04 p.m.
> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
> 
> # HG changeset patch
> # User Manuel Jacob <me@manueljacob.de>
> # Date 1591120869 -7200
> #      Tue Jun 02 20:01:09 2020 +0200
> # Branch stable
> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
> py3: always flush ui streams on Python 3
> 
> On Python 2, we rely on that stdout and stderr are line buffered. Python 3’s
> io module doesn’t offer line buffered binary streams.

We use the underlying non-buffer thing though for std{err, out} though, or so I thought. Are you observing  behavior that this corrects?

> 
> diff --git a/mercurial/ui.py b/mercurial/ui.py
> --- a/mercurial/ui.py
> +++ b/mercurial/ui.py
> @@ -1198,9 +1198,14 @@ class ui(object):
>                     label = opts.get('label', b'')
>                     msg = self.label(msg, label)
>                 dest.write(msg)
> +            # On Python 2, stdout and stderr are usually line buffered, but
>             # stderr may be buffered under win32 when redirected to files,
>             # including stdout.
> -            if dest is self._ferr and not getattr(self._ferr, 'closed', False):
> +            # On Python 3, we use the underlying binary buffer, which does not
> +            # support line buffering.
> +            if (pycompat.ispy3 or dest is self._ferr) and not getattr(
> +                self._ferr, 'closed', False
> +            ):
>                 dest.flush()
>         except IOError as err:
>             if dest is self._ferr and err.errno in (
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Manuel Jacob - June 3, 2020, 3:46 a.m.
On 2020-06-03 00:04, Augie Fackler wrote:
>> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
>> 
>> # HG changeset patch
>> # User Manuel Jacob <me@manueljacob.de>
>> # Date 1591120869 -7200
>> #      Tue Jun 02 20:01:09 2020 +0200
>> # Branch stable
>> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
>> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
>> py3: always flush ui streams on Python 3
>> 
>> On Python 2, we rely on that stdout and stderr are line buffered. 
>> Python 3’s
>> io module doesn’t offer line buffered binary streams.
> 
> We use the underlying non-buffer thing though for std{err, out}
> though, or so I thought. Are you observing  behavior that this
> corrects?

pycompat says:

     stdout = sys.stdout.buffer
     stderr = sys.stderr.buffer

They are <_io.BufferedWriter name='<stdout>'> and <_io.BufferedWriter 
name='<stderr>'>, respectively.

I observed the error in TortoiseHg, which is a long-running process.

To trigger the problem with the hg command, you would need to do 
something slow that prints on `self._fout`, e.g. `hg pull 
https://www.mercurial-scm.org/repo/hg --debug`.

>> diff --git a/mercurial/ui.py b/mercurial/ui.py
>> --- a/mercurial/ui.py
>> +++ b/mercurial/ui.py
>> @@ -1198,9 +1198,14 @@ class ui(object):
>>                     label = opts.get('label', b'')
>>                     msg = self.label(msg, label)
>>                 dest.write(msg)
>> +            # On Python 2, stdout and stderr are usually line 
>> buffered, but
>>             # stderr may be buffered under win32 when redirected to 
>> files,
>>             # including stdout.
>> -            if dest is self._ferr and not getattr(self._ferr, 
>> 'closed', False):
>> +            # On Python 3, we use the underlying binary buffer, which 
>> does not
>> +            # support line buffering.
>> +            if (pycompat.ispy3 or dest is self._ferr) and not 
>> getattr(
>> +                self._ferr, 'closed', False
>> +            ):
>>                 dest.flush()
>>         except IOError as err:
>>             if dest is self._ferr and err.errno in (
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Manuel Jacob - June 3, 2020, 4:14 a.m.
On 2020-06-03 05:46, Manuel Jacob wrote:
> On 2020-06-03 00:04, Augie Fackler wrote:
>>> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
>>> 
>>> # HG changeset patch
>>> # User Manuel Jacob <me@manueljacob.de>
>>> # Date 1591120869 -7200
>>> #      Tue Jun 02 20:01:09 2020 +0200
>>> # Branch stable
>>> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
>>> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
>>> py3: always flush ui streams on Python 3
>>> 
>>> On Python 2, we rely on that stdout and stderr are line buffered. 
>>> Python 3’s
>>> io module doesn’t offer line buffered binary streams.
>> 
>> We use the underlying non-buffer thing though for std{err, out}
>> though, or so I thought. Are you observing  behavior that this
>> corrects?
> 
> pycompat says:
> 
>     stdout = sys.stdout.buffer
>     stderr = sys.stderr.buffer
> 
> They are <_io.BufferedWriter name='<stdout>'> and <_io.BufferedWriter
> name='<stderr>'>, respectively.
> 
> I observed the error in TortoiseHg, which is a long-running process.
> 
> To trigger the problem with the hg command, you would need to do
> something slow that prints on `self._fout`, e.g. `hg pull
> https://www.mercurial-scm.org/repo/hg --debug`.

Actually, you don’t need to specify --debug.

Just run `hg pull https://www.mercurial-scm.org/repo/hg` and observe 
that without the patch, "pulling from 
https://www.mercurial-scm.org/repo/hg" will not be immediately printed 
on your screen.

I’ve sent an updated patch with this example in the description and with 
a correction.
Yuya Nishihara - June 3, 2020, 11:13 a.m.
On Wed, 03 Jun 2020 05:46:05 +0200, Manuel Jacob wrote:
> On 2020-06-03 00:04, Augie Fackler wrote:
> >> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
> >> 
> >> # HG changeset patch
> >> # User Manuel Jacob <me@manueljacob.de>
> >> # Date 1591120869 -7200
> >> #      Tue Jun 02 20:01:09 2020 +0200
> >> # Branch stable
> >> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
> >> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
> >> py3: always flush ui streams on Python 3
> >> 
> >> On Python 2, we rely on that stdout and stderr are line buffered. 
> >> Python 3’s
> >> io module doesn’t offer line buffered binary streams.
> > 
> > We use the underlying non-buffer thing though for std{err, out}
> > though, or so I thought. Are you observing  behavior that this
> > corrects?
> 
> pycompat says:
> 
>      stdout = sys.stdout.buffer
>      stderr = sys.stderr.buffer
> 
> They are <_io.BufferedWriter name='<stdout>'> and <_io.BufferedWriter 
> name='<stderr>'>, respectively.
> 
> I observed the error in TortoiseHg, which is a long-running process.
> 
> To trigger the problem with the hg command, you would need to do 
> something slow that prints on `self._fout`, e.g. `hg pull 
> https://www.mercurial-scm.org/repo/hg --debug`.

Denis, do you have any idea? My assumption was sys.stdout.buffer was magically
line-buffered, but it appears not at least with Python 3.8.3 on Linux.

https://www.mercurial-scm.org/repo/hg/rev/227ba1afcb65

> >> --- a/mercurial/ui.py
> >> +++ b/mercurial/ui.py
> >> @@ -1198,9 +1198,14 @@ class ui(object):
> >>                     label = opts.get('label', b'')
> >>                     msg = self.label(msg, label)
> >>                 dest.write(msg)
> >> +            # On Python 2, stdout and stderr are usually line 
> >> buffered, but
> >>             # stderr may be buffered under win32 when redirected to 
> >> files,
> >>             # including stdout.
> >> -            if dest is self._ferr and not getattr(self._ferr, 
> >> 'closed', False):
> >> +            # On Python 3, we use the underlying binary buffer, which 
> >> does not
> >> +            # support line buffering.
> >> +            if (pycompat.ispy3 or dest is self._ferr) and not 
> >> getattr(
> >> +                self._ferr, 'closed', False
> >> +            ):
> >>                 dest.flush()

Flushing stdout per write() sounds bad. If that's the only way to work
around Py3 io module, we'll probably need to check if the stdout is
a tty or not.
Augie Fackler - June 3, 2020, 8:40 p.m.
> On Jun 3, 2020, at 07:13, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Wed, 03 Jun 2020 05:46:05 +0200, Manuel Jacob wrote:
>> On 2020-06-03 00:04, Augie Fackler wrote:
>>>> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
>>>> 
>>>> # HG changeset patch
>>>> # User Manuel Jacob <me@manueljacob.de>
>>>> # Date 1591120869 -7200
>>>> #      Tue Jun 02 20:01:09 2020 +0200
>>>> # Branch stable
>>>> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
>>>> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
>>>> py3: always flush ui streams on Python 3
>>>> 
>>>> On Python 2, we rely on that stdout and stderr are line buffered. 
>>>> Python 3’s
>>>> io module doesn’t offer line buffered binary streams.
>>> 
>>> We use the underlying non-buffer thing though for std{err, out}
>>> though, or so I thought. Are you observing  behavior that this
>>> corrects?
>> 
>> pycompat says:
>> 
>>     stdout = sys.stdout.buffer
>>     stderr = sys.stderr.buffer
>> 
>> They are <_io.BufferedWriter name='<stdout>'> and <_io.BufferedWriter 
>> name='<stderr>'>, respectively.
>> 
>> I observed the error in TortoiseHg, which is a long-running process.
>> 
>> To trigger the problem with the hg command, you would need to do 
>> something slow that prints on `self._fout`, e.g. `hg pull 
>> https://www.mercurial-scm.org/repo/hg --debug`.
> 
> Denis, do you have any idea? My assumption was sys.stdout.buffer was magically
> line-buffered, but it appears not at least with Python 3.8.3 on Linux.
> 
> https://www.mercurial-scm.org/repo/hg/rev/227ba1afcb65
> 
>>>> --- a/mercurial/ui.py
>>>> +++ b/mercurial/ui.py
>>>> @@ -1198,9 +1198,14 @@ class ui(object):
>>>>                    label = opts.get('label', b'')
>>>>                    msg = self.label(msg, label)
>>>>                dest.write(msg)
>>>> +            # On Python 2, stdout and stderr are usually line 
>>>> buffered, but
>>>>            # stderr may be buffered under win32 when redirected to 
>>>> files,
>>>>            # including stdout.
>>>> -            if dest is self._ferr and not getattr(self._ferr, 
>>>> 'closed', False):
>>>> +            # On Python 3, we use the underlying binary buffer, which 
>>>> does not
>>>> +            # support line buffering.
>>>> +            if (pycompat.ispy3 or dest is self._ferr) and not 
>>>> getattr(
>>>> +                self._ferr, 'closed', False
>>>> +            ):
>>>>                dest.flush()
> 
> Flushing stdout per write() sounds bad. If that's the only way to work
> around Py3 io module, we'll probably need to check if the stdout is
> a tty or not.

Agreed. Maybe we could sniff for a \n in the output stream, but I dunno if that's likely to be a win.
Manuel Jacob - June 4, 2020, 5:54 a.m.
On 2020-06-03 22:40, Augie Fackler wrote:
>> On Jun 3, 2020, at 07:13, Yuya Nishihara <yuya@tcha.org> wrote:
>> 
>> On Wed, 03 Jun 2020 05:46:05 +0200, Manuel Jacob wrote:
>>> On 2020-06-03 00:04, Augie Fackler wrote:
>>>>> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
>>>>> 
>>>>> # HG changeset patch
>>>>> # User Manuel Jacob <me@manueljacob.de>
>>>>> # Date 1591120869 -7200
>>>>> #      Tue Jun 02 20:01:09 2020 +0200
>>>>> # Branch stable
>>>>> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
>>>>> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
>>>>> py3: always flush ui streams on Python 3
>>>>> 
>>>>> On Python 2, we rely on that stdout and stderr are line buffered.
>>>>> Python 3’s
>>>>> io module doesn’t offer line buffered binary streams.
>>>> 
>>>> We use the underlying non-buffer thing though for std{err, out}
>>>> though, or so I thought. Are you observing  behavior that this
>>>> corrects?
>>> 
>>> pycompat says:
>>> 
>>>     stdout = sys.stdout.buffer
>>>     stderr = sys.stderr.buffer
>>> 
>>> They are <_io.BufferedWriter name='<stdout>'> and <_io.BufferedWriter
>>> name='<stderr>'>, respectively.
>>> 
>>> I observed the error in TortoiseHg, which is a long-running process.
>>> 
>>> To trigger the problem with the hg command, you would need to do
>>> something slow that prints on `self._fout`, e.g. `hg pull
>>> https://www.mercurial-scm.org/repo/hg --debug`.
>> 
>> Denis, do you have any idea? My assumption was sys.stdout.buffer was 
>> magically
>> line-buffered, but it appears not at least with Python 3.8.3 on Linux.
>> 
>> https://www.mercurial-scm.org/repo/hg/rev/227ba1afcb65
>> 
>>>>> --- a/mercurial/ui.py
>>>>> +++ b/mercurial/ui.py
>>>>> @@ -1198,9 +1198,14 @@ class ui(object):
>>>>>                    label = opts.get('label', b'')
>>>>>                    msg = self.label(msg, label)
>>>>>                dest.write(msg)
>>>>> +            # On Python 2, stdout and stderr are usually line
>>>>> buffered, but
>>>>>            # stderr may be buffered under win32 when redirected to
>>>>> files,
>>>>>            # including stdout.
>>>>> -            if dest is self._ferr and not getattr(self._ferr,
>>>>> 'closed', False):
>>>>> +            # On Python 3, we use the underlying binary buffer, 
>>>>> which
>>>>> does not
>>>>> +            # support line buffering.
>>>>> +            if (pycompat.ispy3 or dest is self._ferr) and not
>>>>> getattr(
>>>>> +                self._ferr, 'closed', False
>>>>> +            ):
>>>>>                dest.flush()
>> 
>> Flushing stdout per write() sounds bad. If that's the only way to work
>> around Py3 io module, we'll probably need to check if the stdout is
>> a tty or not.
> 
> Agreed. Maybe we could sniff for a \n in the output stream, but I
> dunno if that's likely to be a win.

For status output (in the general sense, not necessarily in the 
ui.status() sense), flushing the stream would IMHO be fine: the messages 
should be shown as fast as possible, often end with a '\n' (so the 
stream would be flushed anyway in case of line buffering) and often use 
string interpolation (therefore _writenobuf() gets called once per 
message), and the amount of output is limited (so printing the lines 
won’t dominate the total time spent).

For "data" output (like diffs), flushing the stream can indeed be 
wasteful: showing single lines immediately is often not as important, 
ui.write() is often called with small data (not complete lines), and the 
amount of output can be large. Maybe we can introduce special methods 
for this kind of data that uses fully buffered writes? Code that uses 
these methods would need to flush manually.

Just sniffing for '\n', as Augie suggested, is pretty much what the io 
module does when line buffering is enabled 
(https://github.com/python/cpython/blob/v3.8.3/Modules/_io/textio.c#L1597).

Of course, we should do that only if the stream is interactive. I’m 
concerned that calling isatty() on every write is too expensive, but I 
need to benchmark this. BTW, there’s ui._isatty() that checks the 
ui.nontty config. This config is not checked when setting sys.stdout to 
line-buffered on Python 2 (by the interpreter), so am I right that we 
should not care about this config?

BTW, I’m not sure I understand the comment about stderr on Windows. I 
couldn't find any further information on this topic. Is it the case 
that, when stderr is redirected to stdout on windows, Python fails to 
recognize that stderr is actually interactive and doesn't set line 
buffering?

Is there some command that calls ui.write() many times with small data 
and doesn’t do much else, to be used as a benchmark?
Denis Laxalde - June 4, 2020, 10:03 a.m.
Yuya Nishihara a écrit :
> On Wed, 03 Jun 2020 05:46:05 +0200, Manuel Jacob wrote:
> > On 2020-06-03 00:04, Augie Fackler wrote:
> > >> On Jun 2, 2020, at 14:07, Manuel Jacob <me@manueljacob.de> wrote:
> > >> 
> > >> # HG changeset patch
> > >> # User Manuel Jacob <me@manueljacob.de>
> > >> # Date 1591120869 -7200
> > >> #      Tue Jun 02 20:01:09 2020 +0200
> > >> # Branch stable
> > >> # Node ID ebbc45544673c33ea3beb887ed4d5230b015102a
> > >> # Parent  91e509a12dbc4cd6cf2dcb9dae3ed383932132ac
> > >> py3: always flush ui streams on Python 3
> > >> 
> > >> On Python 2, we rely on that stdout and stderr are line buffered. 
> > >> Python 3’s
> > >> io module doesn’t offer line buffered binary streams.
> > > 
> > > We use the underlying non-buffer thing though for std{err, out}
> > > though, or so I thought. Are you observing  behavior that this
> > > corrects?
> > 
> > pycompat says:
> > 
> >      stdout = sys.stdout.buffer
> >      stderr = sys.stderr.buffer
> > 
> > They are <_io.BufferedWriter name='<stdout>'> and <_io.BufferedWriter 
> > name='<stderr>'>, respectively.
> > 
> > I observed the error in TortoiseHg, which is a long-running process.
> > 
> > To trigger the problem with the hg command, you would need to do 
> > something slow that prints on `self._fout`, e.g. `hg pull 
> > https://www.mercurial-scm.org/repo/hg --debug`.
> 
> Denis, do you have any idea? My assumption was sys.stdout.buffer was magically
> line-buffered, but it appears not at least with Python 3.8.3 on Linux.

sys.stdout is documented to be line-buffered in interactive mode.
But looking at the implementation of TextIOWrapper, line buffering is
apparently not handled at the underlying buffer level (which we use
IIRC):

  https://github.com/python/cpython/blob/v3.8.3/Lib/_pyio.py#L2210

This would mean that sys.stdout.buffer is not line-buffered.

So perhaps flushing using a similar logic as in TextIOWrapper.write(),
but at the buffer level, is the way go.


> https://www.mercurial-scm.org/repo/hg/rev/227ba1afcb65
> 
> > >> --- a/mercurial/ui.py
> > >> +++ b/mercurial/ui.py
> > >> @@ -1198,9 +1198,14 @@ class ui(object):
> > >>                     label = opts.get('label', b'')
> > >>                     msg = self.label(msg, label)
> > >>                 dest.write(msg)
> > >> +            # On Python 2, stdout and stderr are usually line 
> > >> buffered, but
> > >>             # stderr may be buffered under win32 when redirected to 
> > >> files,
> > >>             # including stdout.
> > >> -            if dest is self._ferr and not getattr(self._ferr, 
> > >> 'closed', False):
> > >> +            # On Python 3, we use the underlying binary buffer, which 
> > >> does not
> > >> +            # support line buffering.
> > >> +            if (pycompat.ispy3 or dest is self._ferr) and not 
> > >> getattr(
> > >> +                self._ferr, 'closed', False
> > >> +            ):
> > >>                 dest.flush()
> 
> Flushing stdout per write() sounds bad. If that's the only way to work
> around Py3 io module, we'll probably need to check if the stdout is
> a tty or not.
Yuya Nishihara - June 4, 2020, 12:21 p.m.
On Thu, 04 Jun 2020 07:54:14 +0200, Manuel Jacob wrote:
> On 2020-06-03 22:40, Augie Fackler wrote:
> >> On Jun 3, 2020, at 07:13, Yuya Nishihara <yuya@tcha.org> wrote:
> >> Flushing stdout per write() sounds bad. If that's the only way to work
> >> around Py3 io module, we'll probably need to check if the stdout is
> >> a tty or not.
> > 
> > Agreed. Maybe we could sniff for a \n in the output stream, but I
> > dunno if that's likely to be a win.
> 
> For status output (in the general sense, not necessarily in the 
> ui.status() sense), flushing the stream would IMHO be fine: the messages 
> should be shown as fast as possible, often end with a '\n'

Yes, that should be fine.

> For "data" output (like diffs), flushing the stream can indeed be 
> wasteful: showing single lines immediately is often not as important, 
> ui.write() is often called with small data (not complete lines), and the 
> amount of output can be large. Maybe we can introduce special methods 
> for this kind of data that uses fully buffered writes? Code that uses 
> these methods would need to flush manually.

Actually there's a method for "data" output:

 - ui.write(), which is a subset of ui._write()
 - ui.status(), etc. -> ui._write() -> ui._writenobuf()

So it's probably okay for ui._writenobuf() to flush() per call. For "data"
output, we'll also need to call flush() occasionally so e.g.
"hg log -r 'grep(foobar)'" will emit data incrementally. It would be nice
if this flush() can be automated, but that's up to the overhead.

> Of course, we should do that only if the stream is interactive. I’m 
> concerned that calling isatty() on every write is too expensive, but I 
> need to benchmark this. BTW, there’s ui._isatty() that checks the 
> ui.nontty config. This config is not checked when setting sys.stdout to 
> line-buffered on Python 2 (by the interpreter), so am I right that we 
> should not care about this config?

That wouldn't matter, but I think ui functions should respect ui.nontty
option consistently.

> BTW, I’m not sure I understand the comment about stderr on Windows. I 
> couldn't find any further information on this topic. Is it the case 
> that, when stderr is redirected to stdout on windows, Python fails to 
> recognize that stderr is actually interactive and doesn't set line 
> buffering?

Maybe there would be weird behavior in C stdio.

> Is there some command that calls ui.write() many times with small data 
> and doesn’t do much else, to be used as a benchmark?

hg files?
Manuel Jacob - June 5, 2020, 10:38 a.m.
On 2020-06-04 14:21, Yuya Nishihara wrote:
> On Thu, 04 Jun 2020 07:54:14 +0200, Manuel Jacob wrote:
>> On 2020-06-03 22:40, Augie Fackler wrote:
>> >> On Jun 3, 2020, at 07:13, Yuya Nishihara <yuya@tcha.org> wrote:
>> >> Flushing stdout per write() sounds bad. If that's the only way to work
>> >> around Py3 io module, we'll probably need to check if the stdout is
>> >> a tty or not.
>> >
>> > Agreed. Maybe we could sniff for a \n in the output stream, but I
>> > dunno if that's likely to be a win.
>> 
>> For status output (in the general sense, not necessarily in the
>> ui.status() sense), flushing the stream would IMHO be fine: the 
>> messages
>> should be shown as fast as possible, often end with a '\n'
> 
> Yes, that should be fine.
> 
>> For "data" output (like diffs), flushing the stream can indeed be
>> wasteful: showing single lines immediately is often not as important,
>> ui.write() is often called with small data (not complete lines), and 
>> the
>> amount of output can be large. Maybe we can introduce special methods
>> for this kind of data that uses fully buffered writes? Code that uses
>> these methods would need to flush manually.
> 
> Actually there's a method for "data" output:
> 
>  - ui.write(), which is a subset of ui._write()
>  - ui.status(), etc. -> ui._write() -> ui._writenobuf()
> 
> So it's probably okay for ui._writenobuf() to flush() per call. For 
> "data"
> output, we'll also need to call flush() occasionally so e.g.
> "hg log -r 'grep(foobar)'" will emit data incrementally. It would be 
> nice
> if this flush() can be automated, but that's up to the overhead.

I missed one thing before: By default, ui uses the streams from 
procutil, which are supposed to be line-buffered or unbuffered. That 
logic was buggy and gets fixed in the new patch series I just sent. So 
the higher layers don’t need to flush. As suggested in a comment in the 
last patch, we could make the streams from procutil buffered and push 
the responsibility to flush to higher layers.

>> Of course, we should do that only if the stream is interactive. I’m
>> concerned that calling isatty() on every write is too expensive, but I
>> need to benchmark this. BTW, there’s ui._isatty() that checks the
>> ui.nontty config. This config is not checked when setting sys.stdout 
>> to
>> line-buffered on Python 2 (by the interpreter), so am I right that we
>> should not care about this config?
> 
> That wouldn't matter, but I think ui functions should respect ui.nontty
> option consistently.
> 
>> BTW, I’m not sure I understand the comment about stderr on Windows. I
>> couldn't find any further information on this topic. Is it the case
>> that, when stderr is redirected to stdout on windows, Python fails to
>> recognize that stderr is actually interactive and doesn't set line
>> buffering?
> 
> Maybe there would be weird behavior in C stdio.

Procutil claims "Windows doesn't support line buffering. I didn’t find 
an good independent source clearly indicating that this is the case, but 
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setvbuf?view=vs-2019#remarks 
suggests that it could be the case.

Luckily, on Python 3 we’re not dependent on such implementation details 
anymore.

I think the logic to flush stderr can be replaced by code in procutil 
that ensures that procutil.stderr is always unbuffered. I’ll send a 
patch once the current series is through.

>> Is there some command that calls ui.write() many times with small data
>> and doesn’t do much else, to be used as a benchmark?
> 
> hg files?

Patch

diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -1198,9 +1198,14 @@  class ui(object):
                     label = opts.get('label', b'')
                     msg = self.label(msg, label)
                 dest.write(msg)
+            # On Python 2, stdout and stderr are usually line buffered, but
             # stderr may be buffered under win32 when redirected to files,
             # including stdout.
-            if dest is self._ferr and not getattr(self._ferr, 'closed', False):
+            # On Python 3, we use the underlying binary buffer, which does not
+            # support line buffering.
+            if (pycompat.ispy3 or dest is self._ferr) and not getattr(
+                self._ferr, 'closed', False
+            ):
                 dest.flush()
         except IOError as err:
             if dest is self._ferr and err.errno in (