Patchwork [py3] ui: construct _keepalnum list in a python3-friendly way

Submitter Augie Fackler
Date Feb. 16, 2017, 4:35 p.m.
Message ID <791b4e846a7b9a078344.1487262941@arthedain.pit.corp.google.com>
Permalink /patch/18547/
State Accepted

Comments

Augie Fackler - Feb. 16, 2017, 4:35 p.m.
# HG changeset patch
# User Augie Fackler <augie@google.com>
# Date 1487262890 18000
#      Thu Feb 16 11:34:50 2017 -0500
# Node ID 791b4e846a7b9a0783440b9504585438777fe2d2
# Parent  1ee685defe80117cf6aafea1ede6c33c478abceb
ui: construct _keepalnum list in a python3-friendly way

It'll be more expensive, but it preserves the behavior.
Yuya Nishihara - Feb. 18, 2017, 8:10 a.m.
On Thu, 16 Feb 2017 11:35:41 -0500, Augie Fackler wrote:
> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1487262890 18000
> #      Thu Feb 16 11:34:50 2017 -0500
> # Node ID 791b4e846a7b9a0783440b9504585438777fe2d2
> # Parent  1ee685defe80117cf6aafea1ede6c33c478abceb
> ui: construct _keepalnum list in a python3-friendly way

Looks good. Queued, thanks.
Martijn Pieters - Feb. 18, 2017, 10:58 p.m.
On 16 Feb 2017, at 16:35, Augie Fackler <raf@durin42.com> wrote:
> +if pycompat.ispy3:
> +    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
> +    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]

...
> +_keepalnum = ''.join(_notalnum)

This could be more cheaply calculated as

    _keepalnum = bytes(c for c in range(256) if not chr(c).isalnum())

This takes a third of the time.

Martijn
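
For reference, the two constructions can be compared side by side. This is
only a sketch for stock Python 3: the function names are hypothetical, and
b''.join is used where the patch writes ''.join, on the assumption that the
patch relies on Mercurial's own handling of string literals on Python 3.
The timing ratio is machine dependent, so none is asserted here.

```python
import timeit


def keepalnum_patch():
    # Construction from the patch: decode each byte as latin1, filter with
    # str.isalnum(), then encode back to single bytes and join them.
    unicodes = [bytes([c]).decode('latin1') for c in range(256)]
    notalnum = [s.encode('latin1') for s in unicodes if not s.isalnum()]
    return b''.join(notalnum)


def keepalnum_generator():
    # Single-pass construction suggested in the review.
    return bytes(c for c in range(256) if not chr(c).isalnum())


# Both build the same byte string, because decoding a byte as latin1
# yields the same character as chr() of that byte value.
same = keepalnum_patch() == keepalnum_generator()

# Timings vary by machine and interpreter; no particular ratio is assumed.
t_patch = timeit.timeit(keepalnum_patch, number=200)
t_generator = timeit.timeit(keepalnum_generator, number=200)
print(same, t_patch, t_generator)
```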
Yuya Nishihara - Feb. 19, 2017, 2:29 p.m.
On Sat, 18 Feb 2017 22:58:10 +0000, Martijn Pieters wrote:
> On 16 Feb 2017, at 16:35, Augie Fackler <raf@durin42.com> wrote:
> > +if pycompat.ispy3:
> > +    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
> > +    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
> 
> ...
> > +_keepalnum = ''.join(_notalnum)
> 
> This could be more cheaply calculated as
> 
>     _keepalnum = bytes(c for c in range(256) if not chr(c).isalnum())
> 
> This takes a third of the time.

Good catch, but I found that both versions are incorrect, since
str.isalnum() is Unicode-aware on Python 3 and so treats some non-ASCII
characters as alphanumeric. We'll need to use bytes.isalnum() or the
string.* constants.
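
A minimal sketch of the difference on Python 3, assuming ASCII-only
behavior is what is wanted. The variable names are hypothetical and this is
not the fix that was eventually committed; it only illustrates the two
alternatives named above (bytes.isalnum() and the string.* constants)
against the Unicode-aware version.

```python
import string

# ASCII letters and digits as a bytes object, built from string.* constants.
ascii_alnum = (string.ascii_letters + string.digits).encode('ascii')

# bytes.isalnum() only recognizes ASCII letters and digits.
keep_bytes = bytes(c for c in range(256) if not bytes([c]).isalnum())

# Same table built from the string.* constants.
keep_const = bytes(c for c in range(256) if c not in ascii_alnum)

# Unicode-aware variant from the thread: str.isalnum() also accepts
# latin1 characters such as U+00E9, so those bytes are left out of the
# delete table.
keep_unicode = bytes(c for c in range(256) if not chr(c).isalnum())

# Byte values on which the two approaches disagree (all are >= 0x80).
differing = sorted(set(keep_bytes) - set(keep_unicode))

# bytes.translate(None, delete) removes every byte listed in `delete`.
sample = b'foo-bar_1\xe9'
stripped_ascii = sample.translate(None, keep_bytes)
stripped_unicode = sample.translate(None, keep_unicode)
print(stripped_ascii, stripped_unicode)
```

With the ASCII-only tables the trailing 0xe9 byte is deleted along with the
punctuation; with the Unicode-aware table it survives.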

Patch

diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -36,7 +36,12 @@  from . import (
 urlreq = util.urlreq
 
 # for use with str.translate(None, _keepalnum), to keep just alphanumerics
-_keepalnum = ''.join(c for c in map(chr, range(256)) if not c.isalnum())
+if pycompat.ispy3:
+    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
+    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
+else:
+    _notalnum = [c for c in map(chr, range(256)) if not c.isalnum()]
+_keepalnum = ''.join(_notalnum)
 
 samplehgrcs = {
     'user':