Patchwork [1,of,2,V2] pycompat: stop setting LC_CTYPE unconditionally

Submitter Manuel Jacob
Date June 28, 2020, 8:05 a.m.
Message ID <b28c05fb9ce3bd1bcc09.1593331520@tmp>
Permalink /patch/46583/
State Accepted

Comments

Manuel Jacob - June 28, 2020, 8:05 a.m.
# HG changeset patch
# User Manuel Jacob <me@manueljacob.de>
# Date 1593284746 -7200
#      Sat Jun 27 21:05:46 2020 +0200
# Node ID b28c05fb9ce3bd1bcc09681d3d069572151af383
# Parent  47a07bbf400a24ef48bcaf0e7a4c15365c3a000b
# EXP-Topic with_lc_type
pycompat: stop setting LC_CTYPE unconditionally

Changeset a25343d16ebe aimed to align how LC_CTYPE is initialized across Python
versions. However, as Yuya Nishihara pointed out, it changes the behavior of
some str methods on Python 2.

Curses requires that LC_CTYPE is initialized correctly. Therefore LC_CTYPE is
now set only while curses is in use (via the new util.with_lc_ctype() context
manager) and restored afterwards. It shouldn’t be a problem that some str
methods behave differently on Python 2 while curses is in use. At least it’s
not a regression compared to what was done before d2227d4c9e6b.

This change again breaks non-ASCII filenames passed to the Subversion bindings
on Python 2. However, since that didn’t work before a25343d16ebe either, it’s
not really a regression. A separate patch will be sent.

Yuya Nishihara - June 28, 2020, 9:06 a.m.
On Sun, 28 Jun 2020 10:05:20 +0200, Manuel Jacob wrote:
> # HG changeset patch
> # User Manuel Jacob <me@manueljacob.de>
> # Date 1593284746 -7200
> #      Sat Jun 27 21:05:46 2020 +0200
> # Node ID b28c05fb9ce3bd1bcc09681d3d069572151af383
> # Parent  47a07bbf400a24ef48bcaf0e7a4c15365c3a000b
> # EXP-Topic with_lc_type
> pycompat: stop setting LC_CTYPE unconditionally

Queued, thanks.

Patch

diff --git a/hgext/histedit.py b/hgext/histedit.py
--- a/hgext/histedit.py
+++ b/hgext/histedit.py
@@ -1710,7 +1710,8 @@ 
         ctxs = []
         for i, r in enumerate(revs):
             ctxs.append(histeditrule(ui, repo[r], i))
-        rc = curses.wrapper(functools.partial(_chisteditmain, repo, ctxs))
+        with util.with_lc_ctype():
+            rc = curses.wrapper(functools.partial(_chisteditmain, repo, ctxs))
         curses.echo()
         curses.endwin()
         if rc is False:
diff --git a/mercurial/crecord.py b/mercurial/crecord.py
--- a/mercurial/crecord.py
+++ b/mercurial/crecord.py
@@ -569,7 +569,8 @@ 
     if util.safehasattr(signal, b'SIGTSTP'):
         origsigtstp = signal.getsignal(signal.SIGTSTP)
     try:
-        curses.wrapper(chunkselector.main)
+        with util.with_lc_ctype():
+            curses.wrapper(chunkselector.main)
         if chunkselector.initexc is not None:
             raise chunkselector.initexc
         # ncurses does not restore signal handler for SIGTSTP
diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py
--- a/mercurial/pycompat.py
+++ b/mercurial/pycompat.py
@@ -13,7 +13,6 @@ 
 import getopt
 import inspect
 import json
-import locale
 import os
 import shlex
 import sys
@@ -94,26 +93,6 @@ 
     return _rapply(f, xs)
 
 
-# Passing the '' locale means that the locale should be set according to the
-# user settings (environment variables).
-# Python sometimes avoids setting the global locale settings. When interfacing
-# with C code (e.g. the curses module or the Subversion bindings), the global
-# locale settings must be initialized correctly. Python 2 does not initialize
-# the global locale settings on interpreter startup. Python 3 sometimes
-# initializes LC_CTYPE, but not consistently at least on Windows. Therefore we
-# explicitly initialize it to get consistent behavior if it's not already
-# initialized. Since CPython commit 177d921c8c03d30daa32994362023f777624b10d,
-# LC_CTYPE is always initialized. If we require Python 3.8+, we should re-check
-# if we can remove this code.
-if locale.setlocale(locale.LC_CTYPE, None) == 'C':
-    try:
-        locale.setlocale(locale.LC_CTYPE, '')
-    except locale.Error:
-        # The likely case is that the locale from the environment variables is
-        # unknown.
-        pass
-
-
 if ispy3:
     import builtins
     import codecs
diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -22,6 +22,7 @@ 
 import gc
 import hashlib
 import itertools
+import locale
 import mmap
 import os
 import platform as pyplatform
@@ -3596,3 +3597,32 @@ 
         if not (byte & 0x80):
             return result
         shift += 7
+
+
+# Passing the '' locale means that the locale should be set according to the
+# user settings (environment variables).
+# Python sometimes avoids setting the global locale settings. When interfacing
+# with C code (e.g. the curses module or the Subversion bindings), the global
+# locale settings must be initialized correctly. Python 2 does not initialize
+# the global locale settings on interpreter startup. Python 3 sometimes
+# initializes LC_CTYPE, but not consistently at least on Windows. Therefore we
+# explicitly initialize it to get consistent behavior if it's not already
+# initialized. Since CPython commit 177d921c8c03d30daa32994362023f777624b10d,
+# LC_CTYPE is always initialized. If we require Python 3.8+, we should re-check
+# if we can remove this code.
+@contextlib.contextmanager
+def with_lc_ctype():
+    oldloc = locale.setlocale(locale.LC_CTYPE, None)
+    if oldloc == 'C':
+        try:
+            try:
+                locale.setlocale(locale.LC_CTYPE, '')
+            except locale.Error:
+                # The likely case is that the locale from the environment
+                # variables is unknown.
+                pass
+            yield
+        finally:
+            locale.setlocale(locale.LC_CTYPE, oldloc)
+    else:
+        yield
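
The set-and-restore pattern used by with_lc_ctype() can be tried outside Mercurial with a minimal sketch like the one below. The name scoped_lc_ctype is hypothetical and not part of the patch, and unlike the patch it takes the restore path unconditionally, which should be a no-op when LC_CTYPE was not 'C' to begin with. It only illustrates that the previous LC_CTYPE value comes back even if the body of the block raises.

```python
import contextlib
import locale


@contextlib.contextmanager
def scoped_lc_ctype():
    """Hypothetical stand-in for util.with_lc_ctype() (illustration only)."""
    # Passing None queries the current LC_CTYPE setting without changing it.
    oldloc = locale.setlocale(locale.LC_CTYPE, None)
    try:
        if oldloc == 'C':
            try:
                # '' selects the locale from the user's environment variables.
                locale.setlocale(locale.LC_CTYPE, '')
            except locale.Error:
                # The locale named by the environment is unknown; keep 'C'.
                pass
        yield
    finally:
        # Runs on normal exit and on exceptions, restoring the old setting.
        locale.setlocale(locale.LC_CTYPE, oldloc)


before = locale.setlocale(locale.LC_CTYPE, None)
try:
    with scoped_lc_ctype():
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass
after = locale.setlocale(locale.LC_CTYPE, None)
print(before == after)
```

In the patch itself, the block body is the curses.wrapper(...) call, so the C library sees the user's LC_CTYPE only for the duration of the curses session.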