Patchwork [stable,v2] curses: do not initialize LC_ALL to user settings (issue6358)

login
register
mail settings
Submitter Manuel Jacob
Date June 26, 2020, 8:35 a.m.
Message ID <36dcc87c0c15d8757ef0.1593160524@tmp>
Download mbox | patch
Permalink /patch/46578/
State Accepted
Headers show

Comments

Manuel Jacob - June 26, 2020, 8:35 a.m.
# HG changeset patch
# User Manuel Jacob <me@manueljacob.de>
# Date 1593157054 -7200
#      Fri Jun 26 09:37:34 2020 +0200
# Branch stable
# Node ID 36dcc87c0c15d8757ef0322c820861b1d10350c1
# Parent  9bb2fc4ac22b0b986d56fd13be4e6b524af7b648
# EXP-Topic init_locale
curses: do not initialize LC_ALL to user settings (issue6358)

701341f57ceb moved the setlocale() call to right before curses was used. This
didn’t fully solve the problem it was supposed to solve (locale-dependent
functions, like date formatting/parsing), but only postponed it.

Initializing LC_CTYPE seems to be sufficient for curses to work correctly.
Luckily this is already done at interpreter startup on modern Python versions
and, since recently, by Mercurial in the pycompat module in all other cases.
Yuya Nishihara - June 26, 2020, 11:47 a.m.
On Fri, 26 Jun 2020 10:35:24 +0200, Manuel Jacob wrote:
> # HG changeset patch
> # User Manuel Jacob <me@manueljacob.de>
> # Date 1593157054 -7200
> #      Fri Jun 26 09:37:34 2020 +0200
> # Branch stable
> # Node ID 36dcc87c0c15d8757ef0322c820861b1d10350c1
> # Parent  9bb2fc4ac22b0b986d56fd13be4e6b524af7b648
> # EXP-Topic init_locale
> curses: do not initialize LC_ALL to user settings (issue6358)

Queued these to default, thanks. The issue to be fixed is relatively small,
but the setlocale() change might have unwanted side effect.
Manuel Jacob - June 26, 2020, 12:42 p.m.
On 2020-06-26 13:47, Yuya Nishihara wrote:
> On Fri, 26 Jun 2020 10:35:24 +0200, Manuel Jacob wrote:
>> # HG changeset patch
>> # User Manuel Jacob <me@manueljacob.de>
>> # Date 1593157054 -7200
>> #      Fri Jun 26 09:37:34 2020 +0200
>> # Branch stable
>> # Node ID 36dcc87c0c15d8757ef0322c820861b1d10350c1
>> # Parent  9bb2fc4ac22b0b986d56fd13be4e6b524af7b648
>> # EXP-Topic init_locale
>> curses: do not initialize LC_ALL to user settings (issue6358)
> 
> Queued these to default, thanks. The issue to be fixed is relatively 
> small,
> but the setlocale() change might have unwanted side effect.

As mentioned in the first patch, non-ASCII filenames in 
hgext.convert.subversion are fixed by it. If I make further non-ASCII 
fixes to hgext.convert.subversion, should do they go to stable (for 
testing, I can ensure that LC_CTYPE is initialized otherwise on Python 
2) or default? For me as a user, it doesn’t make a difference, as I 
don’t use Subversion (especially not with non-ASCII filenames).
Yuya Nishihara - June 26, 2020, 1 p.m.
On Fri, 26 Jun 2020 14:42:58 +0200, Manuel Jacob wrote:
> On 2020-06-26 13:47, Yuya Nishihara wrote:
> > On Fri, 26 Jun 2020 10:35:24 +0200, Manuel Jacob wrote:
> >> # HG changeset patch
> >> # User Manuel Jacob <me@manueljacob.de>
> >> # Date 1593157054 -7200
> >> #      Fri Jun 26 09:37:34 2020 +0200
> >> # Branch stable
> >> # Node ID 36dcc87c0c15d8757ef0322c820861b1d10350c1
> >> # Parent  9bb2fc4ac22b0b986d56fd13be4e6b524af7b648
> >> # EXP-Topic init_locale
> >> curses: do not initialize LC_ALL to user settings (issue6358)
> > 
> > Queued these to default, thanks. The issue to be fixed is relatively 
> > small,
> > but the setlocale() change might have unwanted side effect.
> 
> As mentioned in the first patch, non-ASCII filenames in 
> hgext.convert.subversion are fixed by it. If I make further non-ASCII 
> fixes to hgext.convert.subversion, should do they go to stable (for 
> testing, I can ensure that LC_CTYPE is initialized otherwise on Python 
> 2) or default? For me as a user, it doesn’t make a difference, as I 
> don’t use Subversion (especially not with non-ASCII filenames).

If the scope of the change is narrowed down to the conversion extension,
I think it may go stable. As a user, I never use non-ASCII filenames, too.
Yuya Nishihara - June 27, 2020, 3:27 a.m.
On Fri, 26 Jun 2020 10:35:24 +0200, Manuel Jacob wrote:
> # HG changeset patch
> # User Manuel Jacob <me@manueljacob.de>
> # Date 1593157054 -7200
> #      Fri Jun 26 09:37:34 2020 +0200
> # Branch stable
> # Node ID 36dcc87c0c15d8757ef0322c820861b1d10350c1
> # Parent  9bb2fc4ac22b0b986d56fd13be4e6b524af7b648
> # EXP-Topic init_locale
> curses: do not initialize LC_ALL to user settings (issue6358)
> 
> 701341f57ceb moved the setlocale() call to right before curses was used. This
> didn’t fully solve the problem it was supposed to solve (locale-dependent
> functions, like date formatting/parsing), but only postponed it.
> 
> Initializing LC_CTYPE seems to be sufficient for curses to work correctly.
> Luckily this is already done at interpreter startup on modern Python versions
> and, since recently, by Mercurial in the pycompat module in all other cases.

Sorry to be a bit late, but I noticed a problem on Python 2.

LC_CTYPE changes the character classification rules, which unfortunately
changes the behavior of bytes.is*() functions on Python 2.

https://man7.org/linux/man-pages/man7/locale.7.html
https://docs.python.org/2.7/library/stdtypes.html?highlight=isalpha#str.isalnum

  % python -c 'import locale; locale.setlocale(locale.LC_CTYPE, "de_DE"); print(b"\xe4".isalpha())'
  True
  % python3 -c 'import locale; locale.setlocale(locale.LC_CTYPE, "de_DE"); print(b"\xe4".isalpha())'
  False

So, on Python 2, this patch would make template syntax locale-dependent for
example.
Manuel Jacob - June 27, 2020, 4:42 a.m.
On 2020-06-27 05:27, Yuya Nishihara wrote:
> On Fri, 26 Jun 2020 10:35:24 +0200, Manuel Jacob wrote:
>> # HG changeset patch
>> # User Manuel Jacob <me@manueljacob.de>
>> # Date 1593157054 -7200
>> #      Fri Jun 26 09:37:34 2020 +0200
>> # Branch stable
>> # Node ID 36dcc87c0c15d8757ef0322c820861b1d10350c1
>> # Parent  9bb2fc4ac22b0b986d56fd13be4e6b524af7b648
>> # EXP-Topic init_locale
>> curses: do not initialize LC_ALL to user settings (issue6358)
>> 
>> 701341f57ceb moved the setlocale() call to right before curses was 
>> used. This
>> didn’t fully solve the problem it was supposed to solve 
>> (locale-dependent
>> functions, like date formatting/parsing), but only postponed it.
>> 
>> Initializing LC_CTYPE seems to be sufficient for curses to work 
>> correctly.
>> Luckily this is already done at interpreter startup on modern Python 
>> versions
>> and, since recently, by Mercurial in the pycompat module in all other 
>> cases.
> 
> Sorry to be a bit late, but I noticed a problem on Python 2.
> 
> LC_CTYPE changes the character classification rules, which 
> unfortunately
> changes the behavior of bytes.is*() functions on Python 2.
> 
> https://man7.org/linux/man-pages/man7/locale.7.html
> https://docs.python.org/2.7/library/stdtypes.html?highlight=isalpha#str.isalnum
> 
>   % python -c 'import locale; locale.setlocale(locale.LC_CTYPE,
> "de_DE"); print(b"\xe4".isalpha())'
>   True
>   % python3 -c 'import locale; locale.setlocale(locale.LC_CTYPE,
> "de_DE"); print(b"\xe4".isalpha())'
>   False
> 
> So, on Python 2, this patch would make template syntax locale-dependent 
> for
> example.

I spent quite some time trying to find out why Python 2 doesn’t 
initialize LC_CTYPE. Now I know!

For curses, we can probably work around the problem by resetting 
LC_CTYPE after calling initscr() or after we close the curses UI.

For Subversion it seems like LC_CTYPE must be set all the time instead 
of only during initialization, but I have to double-check.

I can prepare more patches during this weekend. If that’s too long, feel 
free to backout the existing changes.

Patch

diff --git a/hgext/histedit.py b/hgext/histedit.py
--- a/hgext/histedit.py
+++ b/hgext/histedit.py
@@ -201,7 +201,6 @@ 
     termios = None
 
 import functools
-import locale
 import os
 import struct
 
@@ -1710,10 +1709,6 @@ 
         ctxs = []
         for i, r in enumerate(revs):
             ctxs.append(histeditrule(ui, repo[r], i))
-        # Curses requires setting the locale or it will default to the C
-        # locale. This sets the locale to the user's default system
-        # locale.
-        locale.setlocale(locale.LC_ALL, '')
         rc = curses.wrapper(functools.partial(_chisteditmain, repo, ctxs))
         curses.echo()
         curses.endwin()
diff --git a/mercurial/crecord.py b/mercurial/crecord.py
--- a/mercurial/crecord.py
+++ b/mercurial/crecord.py
@@ -10,7 +10,6 @@ 
 
 from __future__ import absolute_import
 
-import locale
 import os
 import re
 import signal
@@ -574,9 +573,6 @@ 
     """
     ui.write(_(b'starting interactive selection\n'))
     chunkselector = curseschunkselector(headerlist, ui, operation)
-    # This is required for ncurses to display non-ASCII characters in
-    # default user locale encoding correctly.  --immerrr
-    locale.setlocale(locale.LC_ALL, '')
     origsigtstp = sentinel = object()
     if util.safehasattr(signal, b'SIGTSTP'):
         origsigtstp = signal.getsignal(signal.SIGTSTP)