Patchwork py3: make py3 compat.iterbytestr simpler and faster

Submitter Martin von Zweigbergk via Mercurial-devel
Date March 14, 2017, 9:55 p.m.
Message ID <375345ceabc9f14e780d.1489528539@martinvonz.mtv.corp.google.com>
Permalink /patch/19342/
State Changes Requested

Comments

Martin von Zweigbergk via Mercurial-devel - March 14, 2017, 9:55 p.m.
# HG changeset patch
# User Martin von Zweigbergk <martinvonz@google.com>
# Date 1489528461 25200
#      Tue Mar 14 14:54:21 2017 -0700
# Node ID 375345ceabc9f14e780d8a8efd00ed32ffc6396c
# Parent  3d3109339b57341b333c1112beb41dd281fa944a
py3: make py3 compat.iterbytestr simpler and faster

I don't have a py3.5 installed, so I have not run tests on this, but
maybe it works?

$ python3 -m timeit -s 's=b"a"*100' 'for x in iter(s[i:i + 1] for i in range(len(s))): pass'
100000 loops, best of 3: 10.3 usec per loop

$ python3 -m timeit -s 's=b"a"*100' 'for x in map(chr, s): pass'
100000 loops, best of 3: 6.16 usec per loop

(I picked the best "before" time and worst "after" time out of a few runs.)
Gregory Szorc - March 14, 2017, 10:19 p.m.
On Tue, Mar 14, 2017 at 2:55 PM, Martin von Zweigbergk via Mercurial-devel <mercurial-devel@mercurial-scm.org> wrote:

> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz@google.com>
> # Date 1489528461 25200
> #      Tue Mar 14 14:54:21 2017 -0700
> # Node ID 375345ceabc9f14e780d8a8efd00ed32ffc6396c
> # Parent  3d3109339b57341b333c1112beb41dd281fa944a
> py3: make py3 compat.iterbytestr simpler and faster
>
> I don't have a py3.5 installed, so I have not run tests on this, but
> maybe it works?
>
> $ python3 -m timeit -s 's=b"a"*100' 'for x in iter(s[i:i + 1] for i in range(len(s))): pass'
> 100000 loops, best of 3: 10.3 usec per loop
>
> $ python3 -m timeit -s 's=b"a"*100' 'for x in map(chr, s): pass'
> 100000 loops, best of 3: 6.16 usec per loop
>
> (I picked the best "before" time and worst "after" time out of a few runs.)
>
> diff -r 3d3109339b57 -r 375345ceabc9 mercurial/pycompat.py
> --- a/mercurial/pycompat.py     Mon Mar 13 11:19:24 2017 -0700
> +++ b/mercurial/pycompat.py     Tue Mar 14 14:54:21 2017 -0700
> @@ -78,7 +78,7 @@
>
>      def iterbytestr(s):
>          """Iterate bytes as if it were a str object of Python 2"""
> -        return iter(s[i:i + 1] for i in range(len(s)))
> +        return map(chr, s)
>
>
The change to map() makes sense to me and I can understand how it would be
faster. However, chr returns a Python 3 str instead of bytes, which is
wrong. Try the following:

return map(struct.Struct('>B').pack, s)
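To make the type difference concrete, here is a minimal sketch comparing the three variants discussed in this thread on Python 3. The helper names (iterbytestr_slice, iterbytestr_chr, iterbytestr_pack, _packbyte) and the sample value are made up for illustration and are not taken from mercurial/pycompat.py.

```python
import struct


def iterbytestr_slice(s):
    """Original approach: yield one-byte slices of a bytes object."""
    return iter(s[i:i + 1] for i in range(len(s)))


def iterbytestr_chr(s):
    """Variant from the patch: on Python 3, chr() turns each int into a str."""
    return map(chr, s)


# Hypothetical module-level name: bind the packer once so map() reuses it.
_packbyte = struct.Struct('>B').pack


def iterbytestr_pack(s):
    """Variant from the review: pack each int back into a one-byte bytes."""
    return map(_packbyte, s)


sample = b"a\x00\xff"
sliced = list(iterbytestr_slice(sample))
viachr = list(iterbytestr_chr(sample))
packed = list(iterbytestr_pack(sample))

print(sliced)  # one-byte bytes objects
print(viachr)  # one-character str objects, not bytes
print(packed)  # one-byte bytes objects, same as the slicing approach
```

Iterating a bytes object on Python 3 yields ints, which is why chr produces str and why packing each int as an unsigned byte restores the bytes type.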


>      def sysstr(s):
>          """Return a keyword str to be passed to Python functions such as
>
Martin von Zweigbergk via Mercurial-devel - March 14, 2017, 10:25 p.m.
On Tue, Mar 14, 2017 at 3:19 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
> On Tue, Mar 14, 2017 at 2:55 PM, Martin von Zweigbergk via Mercurial-devel
> <mercurial-devel@mercurial-scm.org> wrote:
>>
>> # HG changeset patch
>> # User Martin von Zweigbergk <martinvonz@google.com>
>> # Date 1489528461 25200
>> #      Tue Mar 14 14:54:21 2017 -0700
>> # Node ID 375345ceabc9f14e780d8a8efd00ed32ffc6396c
>> # Parent  3d3109339b57341b333c1112beb41dd281fa944a
>> py3: make py3 compat.iterbytestr simpler and faster
>>
>> I don't have a py3.5 installed, so I have not run tests on this, but
>> maybe it works?
>>
>> $ python3 -m timeit -s 's=b"a"*100' 'for x in iter(s[i:i + 1] for i in range(len(s))): pass'
>> 100000 loops, best of 3: 10.3 usec per loop
>>
>> $ python3 -m timeit -s 's=b"a"*100' 'for x in map(chr, s): pass'
>> 100000 loops, best of 3: 6.16 usec per loop
>>
>> (I picked the best "before" time and worst "after" time out of a few
>> runs.)
>>
>> diff -r 3d3109339b57 -r 375345ceabc9 mercurial/pycompat.py
>> --- a/mercurial/pycompat.py     Mon Mar 13 11:19:24 2017 -0700
>> +++ b/mercurial/pycompat.py     Tue Mar 14 14:54:21 2017 -0700
>> @@ -78,7 +78,7 @@
>>
>>      def iterbytestr(s):
>>          """Iterate bytes as if it were a str object of Python 2"""
>> -        return iter(s[i:i + 1] for i in range(len(s)))
>> +        return map(chr, s)
>>
>
> The change to map() makes sense to me and I can understand how it would be
> faster. However, chr returns a Python 3 str instead of bytes, which is
> wrong. Try the following:
>
> return map(struct.Struct('>B').pack, s)

Ah. That's still noticeably faster: ~7us. I'll send a v2. Thanks.
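For anyone wanting to repeat the comparison without the command line, the following is a rough sketch using the standard timeit module on the same 100-byte input. The function names and the best_usec helper are assumptions for illustration; absolute numbers will differ by machine and Python version, so none are shown here.

```python
import struct
import timeit

data = b"a" * 100
packbyte = struct.Struct('>B').pack


def consume_slices(s=data):
    """Iterate using one-byte slices (the code before the patch)."""
    for _ in iter(s[i:i + 1] for i in range(len(s))):
        pass


def consume_packed(s=data):
    """Iterate using map() with the struct packer (the reviewed suggestion)."""
    for _ in map(packbyte, s):
        pass


def best_usec(func, number=2000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6


timings = {
    "slice": best_usec(consume_slices),
    "struct pack": best_usec(consume_packed),
}
for name, usec in timings.items():
    print("%s: %.2f usec per loop" % (name, usec))
```

Taking the minimum of several repeats mirrors the "best of 3" figure that the timeit command line reports.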

>
>>
>>      def sysstr(s):
>>          """Return a keyword str to be passed to Python functions such as
>
>

Patch

diff -r 3d3109339b57 -r 375345ceabc9 mercurial/pycompat.py
--- a/mercurial/pycompat.py	Mon Mar 13 11:19:24 2017 -0700
+++ b/mercurial/pycompat.py	Tue Mar 14 14:54:21 2017 -0700
@@ -78,7 +78,7 @@ 
 
     def iterbytestr(s):
         """Iterate bytes as if it were a str object of Python 2"""
-        return iter(s[i:i + 1] for i in range(len(s)))
+        return map(chr, s)
 
     def sysstr(s):
         """Return a keyword str to be passed to Python functions such as