Submitter | Augie Fackler |
---|---|
Date | Feb. 16, 2017, 4:35 p.m. |
Message ID | <791b4e846a7b9a078344.1487262941@arthedain.pit.corp.google.com> |
Permalink | /patch/18547/ |
State | Accepted |
Comments
On Thu, 16 Feb 2017 11:35:41 -0500, Augie Fackler wrote:
> # HG changeset patch
> # User Augie Fackler <augie@google.com>
> # Date 1487262890 18000
> #      Thu Feb 16 11:34:50 2017 -0500
> # Node ID 791b4e846a7b9a0783440b9504585438777fe2d2
> # Parent  1ee685defe80117cf6aafea1ede6c33c478abceb
> ui: construct _keepalnum list in a python3-friendly way

Looks good. Queued, thanks.

On 16 Feb 2017, at 16:35, Augie Fackler <raf@durin42.com> wrote:
> +if pycompat.ispy3:
> +    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
> +    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
...
> +_keepalnum = ''.join(_notalnum)

This could be more cheaply calculated as

    _keepalnum = bytes(c for c in range(256) if not chr(c).isalnum())

This takes a third of the time.

Martijn

On Sat, 18 Feb 2017 22:58:10 +0000, Martijn Pieters wrote:
> On 16 Feb 2017, at 16:35, Augie Fackler <raf@durin42.com> wrote:
> > +if pycompat.ispy3:
> > +    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
> > +    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
> > ...
> > +_keepalnum = ''.join(_notalnum)
>
> This could be more cheaply calculated as
>
>     _keepalnum = bytes(c for c in range(256) if not chr(c).isalnum())
>
> This takes a third of the time.

Good catch, but I found both of them are incorrect since str.isalnum() is
unicode aware on Python3. We'll need to use bytes.isalnum() or string.*
constants.
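
The difference raised in the last reply can be checked with a small standalone Python 3 sketch. It is not part of the patch and does not use Mercurial's `pycompat`; the variable names (`unicode_aware`, `ascii_only`, `from_constants`) are made up for illustration. It builds the "bytes to delete" table three ways: with `str.isalnum()` as in the thread, with `bytes.isalnum()`, and from the `string` constants.

```python
import string

# Delete table built with str.isalnum(), as proposed in the thread.
# On Python 3 this is Unicode-aware, so Latin-1 letters such as
# chr(0xe9) ('é') count as alphanumeric and are NOT put in the table.
unicode_aware = bytes(c for c in range(256) if not chr(c).isalnum())

# Delete table built with bytes.isalnum(), which only accepts
# ASCII letters and ASCII decimal digits.
ascii_only = bytes(c for c in range(256) if not bytes([c]).isalnum())

# The same table derived from the string.* constants.
alnum = (string.ascii_letters + string.digits).encode('ascii')
from_constants = bytes(c for c in range(256) if c not in alnum)

# Byte values that the str.isalnum() variant fails to delete.
missed = sorted(set(ascii_only) - set(unicode_aware))
print(len(unicode_aware), len(ascii_only), [hex(c) for c in missed[:5]])
```

Under these assumptions the `bytes.isalnum()` and `string.*` variants agree with each other, while the `str.isalnum()` variant leaves a number of high-bit byte values out of the delete table.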
Patch
diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -36,7 +36,12 @@ from . import (
 urlreq = util.urlreq
 
 # for use with str.translate(None, _keepalnum), to keep just alphanumerics
-_keepalnum = ''.join(c for c in map(chr, range(256)) if not c.isalnum())
+if pycompat.ispy3:
+    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
+    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
+else:
+    _notalnum = [c for c in map(chr, range(256)) if not c.isalnum()]
+_keepalnum = ''.join(_notalnum)
 
 samplehgrcs = {
     'user':
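
For readers unfamiliar with the `translate(None, ...)` idiom mentioned in the code comment, here is a minimal Python 3 sketch of how such a table is used. It is a standalone illustration, not Mercurial code: `_delete` and `keepalnum()` are hypothetical names, and the table is built with `bytes.isalnum()` (an assumption) rather than with the expression from the patch.

```python
# Hypothetical stand-in for the module-level table: every byte value
# that is not an ASCII letter or digit.
_delete = bytes(c for c in range(256) if not bytes([c]).isalnum())


def keepalnum(data: bytes) -> bytes:
    """Return data with every non-alphanumeric byte removed.

    bytes.translate(None, delete) applies no mapping table and simply
    drops each byte that appears in the delete argument.
    """
    return data.translate(None, _delete)


print(keepalnum(b'ui.debug = True!'))
```

The name `_keepalnum` in the patch can be read the same way: the table lists the bytes to drop, so that only alphanumerics are kept.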