From patchwork Sun Mar 29 20:31:42 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: D8339: dispatch: force \n for newlines on sys.std* streams (BC)
From: phabricator
X-Patchwork-Id: 45939
Message-Id:
To: Phabricator
Cc: mercurial-devel@mercurial-scm.org
Date: Sun, 29 Mar 2020 20:31:42 +0000

indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The sys.std* streams behave differently on Python 3. On Python 3,
  these streams are an io.TextIOWrapper that wraps a binary buffer
  stored on a .buffer attribute. These TextIOWrapper instances
  normalize \n to os.linesep by default. On Windows, this means that
  \n is normalized to \r\n. So functions like print() which have an
  implicit end='\n' will actually emit \r\n for line endings.

  While most parts of Mercurial go through the ui.write() layer to
  print output, some code - notably in extensions and hooks - can
  use print(). If this code was using print() or otherwise writing
  to sys.std* on Windows, Mercurial would emit \r\n.

  In reality, pretty much everything on Windows reacts to \n just
  fine. Mercurial itself doesn't emit \r\n when going through the
  ui layer.

  Changing the sys.std* streams to not normalize line endings sounds
  like a scary change. But I think it is safe. It also makes Mercurial
  on Python 3 behave similarly to Python 2, which did not perform \r\n
  normalization in print() by default.

  .. bc::

     sys.{stdout, stderr, stdin} now use \n line endings on Python 3

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8339

AFFECTED FILES
  mercurial/dispatch.py

CHANGE DETAILS

To: indygreg, #hg-reviewers
Cc: mercurial-devel

diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
--- a/mercurial/dispatch.py
+++ b/mercurial/dispatch.py
@@ -10,6 +10,7 @@
 import difflib
 import errno
 import getopt
+import io
 import os
 import pdb
 import re
@@ -144,7 +145,50 @@
 if pycompat.ispy3:
 
     def initstdio():
-        pass
+        # stdio streams on Python 3 are io.TextIOWrapper instances proxying another
+        # buffer. These streams will normalize \n to \r\n by default. Mercurial's
+        # preferred mechanism for writing output (ui.write()) uses io.BufferedWriter
+        # instances, which write to the underlying stdio file descriptor in binary
+        # mode. ui.write() uses \n for line endings and no line ending normalization
+        # is attempted through this interface. This "just works," even if the system
+        # preferred line ending is not \n.
+        #
+        # But some parts of Mercurial (e.g. hooks) can still send data to sys.stdout
+        # and sys.stderr. They will inherit the line ending normalization settings,
+        # potentially causing e.g. \r\n to be emitted. Since emitting \n should
+        # "just work," here we change the sys.* streams to disable line ending
+        # normalization, ensuring compatibility with our ui type.
+
+        # write_through is new in Python 3.7.
+        kwargs = {
+            "newline": "\n",
+            "line_buffering": sys.stdout.line_buffering,
+        }
+        if util.safehasattr(sys.stdout, "write_through"):
+            kwargs["write_through"] = sys.stdout.write_through
+        sys.stdout = io.TextIOWrapper(
+            sys.stdout.buffer, sys.stdout.encoding, sys.stdout.errors, **kwargs
+        )
+
+        kwargs = {
+            "newline": "\n",
+            "line_buffering": sys.stderr.line_buffering,
+        }
+        if util.safehasattr(sys.stderr, "write_through"):
+            kwargs["write_through"] = sys.stderr.write_through
+        sys.stderr = io.TextIOWrapper(
+            sys.stderr.buffer, sys.stderr.encoding, sys.stderr.errors, **kwargs
+        )
+
+        # No write_through on read-only stream.
+        sys.stdin = io.TextIOWrapper(
+            sys.stdin.buffer,
+            sys.stdin.encoding,
+            sys.stdin.errors,
+            # None is universal newlines mode.
+            newline=None,
+            line_buffering=sys.stdin.line_buffering,
+        )
 
     def _silencestdio():
         for fp in (sys.stdout, sys.stderr):
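
The newline handling this change relies on can be checked in isolation. What follows is a minimal sketch and not part of the patch: it writes through an io.TextIOWrapper backed by an in-memory io.BytesIO instead of the real stdio streams. The names bytes_written and rewrap_with_lf are hypothetical, chosen only for illustration, and rewrap_with_lf merely mirrors the shape of the stdout/stderr code in the diff using plain hasattr() in place of util.safehasattr().

```python
import io
import os


def bytes_written(text, newline):
    """Return the bytes that reach the binary buffer when `text` is written
    through a TextIOWrapper created with the given `newline` argument."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    # Copy the bytes out before the wrapper (and with it the buffer) goes away.
    return raw.getvalue()


def rewrap_with_lf(stream):
    """Hypothetical helper: build a new TextIOWrapper around stream.buffer that
    leaves LF untouched, copying the other settings from `stream`."""
    kwargs = {"newline": "\n", "line_buffering": stream.line_buffering}
    # The write_through attribute only exists on Python 3.7 and later.
    if hasattr(stream, "write_through"):
        kwargs["write_through"] = stream.write_through
    return io.TextIOWrapper(stream.buffer, stream.encoding, stream.errors, **kwargs)


# newline="\n": what the patch asks for, no translation on output.
print(bytes_written("a\nb\n", "\n"))  # b'a\nb\n'

# newline="\r\n": what the default (newline=None) amounts to on Windows.
print(bytes_written("a\nb\n", "\r\n"))  # b'a\r\nb\r\n'

# newline=None: "\n" becomes os.linesep, so the result depends on the platform.
print(bytes_written("a\n", None) == b"a" + os.linesep.encode("ascii"))
```

Note that the caller of rewrap_with_lf should keep the original wrapper alive, because a TextIOWrapper closes its underlying buffer when it is garbage collected. In the patch this is presumably not a concern, since the interpreter keeps the original streams reachable as sys.__stdout__, sys.__stderr__ and sys.__stdin__.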