Patchwork performance: disable workaround for an old bug of Python

Submitter Maciej Fijalkowski
Date Aug. 10, 2016, 2:45 p.m.
Message ID <300f14ea21432face8d7.1470840358@brick.arcode.com>
Permalink /patch/16237/
State Accepted

Comments

Maciej Fijalkowski - Aug. 10, 2016, 2:45 p.m.
# HG changeset patch
# User Maciej Fijalkowski <fijall@gmail.com>
# Date 1469708281 -7200
#      Thu Jul 28 14:18:01 2016 +0200
# Node ID 300f14ea21432face8d7e6cdcf92ba9d2f1f92dc
# Parent  2f4cd91aaa9a2ad1e4de097a1ec7a59127d51e20
performance: disable workaround for an old bug of Python

Since disabling the gc makes things worse for pypy and the bug was
fixed in 2.7, only enable the workaround on Python < 2.7.
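The fix amounts to gating Mercurial's existing `nogc` decorator on the interpreter version. A minimal self-contained sketch of the idea (note it compares `sys.version_info`, a tuple, rather than `sys.version`, which is a string; `build_index` is a hypothetical example function, not Mercurial code):

```python
import gc
import sys

def nogc(func):
    """Disable the cyclic garbage collector around calls to func.

    This only helps on CPython < 2.7, where allocating many container
    objects triggered quadratic collector behavior; on 2.7+ (and on
    PyPy, where disabling the GC hurts) the function is returned as-is.
    """
    if sys.version_info >= (2, 7):
        return func
    def wrapper(*args, **kwargs):
        gcenabled = gc.isenabled()
        gc.disable()
        try:
            return func(*args, **kwargs)
        finally:
            # Only re-enable the GC if it was enabled on entry.
            if gcenabled:
                gc.enable()
    return wrapper

@nogc
def build_index(n):
    # Building a huge container of tracked objects -- the case the
    # workaround was originally meant to speed up.
    return [(i, str(i)) for i in range(n)]
```

On any modern interpreter the decorator is a no-op passthrough, so the wrapped function keeps its original behavior and signature.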
Jun Wu - Aug. 10, 2016, 2:54 p.m.
I guess some mercurial code is depending on this for perf reasons -
disabling it would make things slower for CPython.

Maybe we can just disable it for Pypy.

Excerpts from Maciej Fijalkowski's message of 2016-08-10 16:45:58 +0200:
> # HG changeset patch
> # User Maciej Fijalkowski <fijall@gmail.com>
> # Date 1469708281 -7200
> #      Thu Jul 28 14:18:01 2016 +0200
> # Node ID 300f14ea21432face8d7e6cdcf92ba9d2f1f92dc
> # Parent  2f4cd91aaa9a2ad1e4de097a1ec7a59127d51e20
> performance: disable workaround for an old bug of Python
> 
> Since disabling the gc makes things worse for pypy and the bug was
> fixed in 2.7, only enable the workaround on Python < 2.7.
> 
> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -852,6 +852,8 @@
>  
>      This garbage collector issue have been fixed in 2.7.
>      """
> +    if sys.version_info >= (2, 7):
> +        return func
>      def wrapper(*args, **kwargs):
>          gcenabled = gc.isenabled()
>          gc.disable()
Matt Mackall - Aug. 10, 2016, 10:36 p.m.
On Wed, 2016-08-10 at 15:54 +0100, Jun Wu wrote:
> I guess some mercurial code is depending on this for perf reasons -
> disabling it would make things slower for CPython.

To quote the comment from this function:

    Python's garbage collector triggers a GC each time a certain number of      
    container objects (the number being defined by gc.get_threshold()) are      
    allocated even when marked not to be tracked by the collector. Tracking has
    no effect on when GCs are triggered, only on what objects the GC looks      
    into. As a workaround, disable GC while building complex (huge)             
    containers.                                                                 
                                                                                
    This garbage collector issue have been fixed in 2.7.    

It's either fixed in 2.7 or it's not. The primary users of this code are
dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
a benchmark:

changeset:   25675:5817f71c2336
user:        Pierre-Yves David <pierre-yves.david@fb.com>
date:        Wed Nov 26 16:58:31 2014 -0800
files:       mercurial/obsolete.py
description:
obsstore: disable garbage collection during initialization (issue4456)

-- 
Mathematics is the supreme nostalgia of our time.
Maciej Fijalkowski - Aug. 12, 2016, 9:42 p.m.
Well, seems the comment is out of date. I know the issue - the GC got
triggered every X objects in 2.6, which caused O(n^2) performance when
allocating lots of objects. These days (since 2.7) it adapts the
threshold, which means it always amortizes to O(n).

Either way, I don't care if we disable it for pypy or for cpython too.
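The threshold behavior described above can be observed directly with the `gc` module (shown here on Python 3 for illustration; the 2.6-vs-2.7 contrast in the comments is the point of the thread):

```python
import gc

# CPython's cyclic GC fires whenever net allocations of container
# objects exceed the generation-0 threshold -- by default 700:
thresholds = gc.get_threshold()

# Building a huge container therefore triggers many collections.  On
# CPython 2.6 those could escalate to full collections at a fixed rate,
# so total GC work grew as O(n^2) in the number of live objects; since
# 2.7 a full (generation-2) collection is skipped unless the long-lived
# object count has grown by more than 25%, amortizing the cost to O(n).
gen2_before = gc.get_stats()[2]["collections"]
data = [(i,) for i in range(200_000)]
gen2_after = gc.get_stats()[2]["collections"]
```

With the adaptive heuristic, disabling the GC around bulk allocation buys little on 2.7+, which is what the benchmarks below end up showing.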

On Thu, Aug 11, 2016 at 12:36 AM, Matt Mackall <mpm@selenic.com> wrote:
> On Wed, 2016-08-10 at 15:54 +0100, Jun Wu wrote:
>> I guess some mercurial code is depending on this for perf reasons -
>> disabling it would make things slower for CPython.
>
> To quote the comment from this function:
>
>     Python's garbage collector triggers a GC each time a certain number of
>     container objects (the number being defined by gc.get_threshold()) are
>     allocated even when marked not to be tracked by the collector. Tracking has
>     no effect on when GCs are triggered, only on what objects the GC looks
>     into. As a workaround, disable GC while building complex (huge)
>     containers.
>
>     This garbage collector issue have been fixed in 2.7.
>
> It's either fixed in 2.7 or it's not. The primary users of this code are
> dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
> a benchmark:
>
> changeset:   25675:5817f71c2336
> user:        Pierre-Yves David <pierre-yves.david@fb.com>
> date:        Wed Nov 26 16:58:31 2014 -0800
> files:       mercurial/obsolete.py
> description:
> obsstore: disable garbage collection during initialization (issue4456)
>
> --
> Mathematics is the supreme nostalgia of our time.
>
Yuya Nishihara - Aug. 14, 2016, 2:14 a.m.
On Fri, 12 Aug 2016 23:42:38 +0200, Maciej Fijalkowski wrote:
> Well, seems the comment is out of date. I know the issue - the GC got
> triggered every X objects in 2.6, which caused O(n^2) performance when
> allocating lots of objects. These days (since 2.7) it adapts the
> threshold, which means it always amortizes to O(n).

> On Thu, Aug 11, 2016 at 12:36 AM, Matt Mackall <mpm@selenic.com> wrote:
> > It's either fixed in 2.7 or it's not. The primary users of this code are
> > dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
> > a benchmark:
> >
> > changeset:   25675:5817f71c2336
> > user:        Pierre-Yves David <pierre-yves.david@fb.com>
> > date:        Wed Nov 26 16:58:31 2014 -0800
> > files:       mercurial/obsolete.py
> > description:
> > obsstore: disable garbage collection during initialization (issue4456)

I can't say there's a measurable win on CPython 2.7, and we have a native
parser now. So I'll take this patch, thanks.

  $ hg debugobsolete | wc -l
  106437
  $ hg up 5817f71c2336  # use pure python parser
  $ python -m timeit -r10 \
  -s 'from mercurial import obsolete, scmutil; svfs = scmutil.vfs(".hg/store")' \
  'obsolete.obsstore(svfs)'

  (Python 2.6.9, GC disabled)
  10 loops, best of 10: 714 msec per loop
  (Python 2.6.9, GC enabled)
  10 loops, best of 10: 746 msec per loop

  (Python 2.7.12+, GC disabled)
  10 loops, best of 10: 699 msec per loop
  (Python 2.7.12+, GC enabled)
  10 loops, best of 10: 703 msec per loop

The result of timeit wasn't stable.
Maciej Fijalkowski - Aug. 20, 2016, 9:04 p.m.
On Sun, Aug 14, 2016 at 4:14 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> On Fri, 12 Aug 2016 23:42:38 +0200, Maciej Fijalkowski wrote:
>> Well, seems the comment is out of date. I know the issue - the GC got
>> triggered every X objects in 2.6, which caused O(n^2) performance when
>> allocating lots of objects. These days (since 2.7) it adapts the
>> threshold, which means it always amortizes to O(n).
>
>> On Thu, Aug 11, 2016 at 12:36 AM, Matt Mackall <mpm@selenic.com> wrote:
>> > It's either fixed in 2.7 or it's not. The primary users of this code are
>> > dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
>> > a benchmark:
>> >
>> > changeset:   25675:5817f71c2336
>> > user:        Pierre-Yves David <pierre-yves.david@fb.com>
>> > date:        Wed Nov 26 16:58:31 2014 -0800
>> > files:       mercurial/obsolete.py
>> > description:
>> > obsstore: disable garbage collection during initialization (issue4456)
>
> I can't say there's measurable win on CPython 2.7, and we have native parser
> now. So I'll take this patch, thanks.
>
>   $ hg debugobsolete | wc -l
>   106437
>   $ hg up 5817f71c2336  # use pure python parser
>   $ python -m timeit -r10 \
>   -s 'from mercurial import obsolete, scmutil; svfs = scmutil.vfs(".hg/store")' \
>   'obsolete.obsstore(svfs)'
>
>   (Python 2.6.9, GC disabled)
>   10 loops, best of 10: 714 msec per loop
>   (Python 2.6.9, GC enabled)
>   10 loops, best of 10: 746 msec per loop
>
>   (Python 2.7.12+, GC disabled)
>   10 loops, best of 10: 699 msec per loop
>   (Python 2.7.12+, GC enabled)
>   10 loops, best of 10: 703 msec per loop
>
> The result of timeit wasn't stable.

Well... you're using timeit to do GC benchmarks - that's a terrible
idea in general, but for GC especially. You're taking the minimum of 10
runs - of course it'll look random; you should take the average instead
and do more runs.

That said, this is what I would expect, but I still strongly disagree
with the methodology.

Cheers,
fijal
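The objection above can be addressed with the standard library alone: take many repeats and report mean and spread per loop, instead of the single minimum that `python -m timeit` prints. A small sketch (the `bench` helper is hypothetical, not from the thread):

```python
import statistics
import timeit

def bench(stmt, setup="pass", repeat=30, number=5):
    """Run `stmt` repeat batches of `number` loops and report the mean
    and standard deviation of the per-loop time.  timeit's default of
    printing only the minimum systematically hides GC pauses and other
    run-to-run variance, which is exactly what a GC benchmark needs to
    see."""
    raw = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_loop = [t / number for t in raw]
    return statistics.mean(per_loop), statistics.stdev(per_loop)

mean, stdev = bench("sorted(x)", setup="x = list(range(1000))")
```

Whether the GC-enabled and GC-disabled means differ by more than their standard deviations is then a far more defensible comparison than two minima.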
Yuya Nishihara - Aug. 21, 2016, 3:19 p.m.
On Sat, 20 Aug 2016 23:04:06 +0200, Maciej Fijalkowski wrote:
> On Sun, Aug 14, 2016 at 4:14 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> > On Fri, 12 Aug 2016 23:42:38 +0200, Maciej Fijalkowski wrote:
> >> Well, seems the comment is out of date. I know the issue - the GC got
> >> triggered every X objects in 2.6, which caused O(n^2) performance when
> >> allocating lots of objects. These days (since 2.7) it adapts the
> >> threshold, which means it always amortizes to O(n).
> >
> >> On Thu, Aug 11, 2016 at 12:36 AM, Matt Mackall <mpm@selenic.com> wrote:
> >> > It's either fixed in 2.7 or it's not. The primary users of this code are
> >> > dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
> >> > a benchmark:
> >> >
> >> > changeset:   25675:5817f71c2336
> >> > user:        Pierre-Yves David <pierre-yves.david@fb.com>
> >> > date:        Wed Nov 26 16:58:31 2014 -0800
> >> > files:       mercurial/obsolete.py
> >> > description:
> >> > obsstore: disable garbage collection during initialization (issue4456)
> >
> > I can't say there's measurable win on CPython 2.7, and we have native parser
> > now. So I'll take this patch, thanks.
> >
> >   $ hg debugobsolete | wc -l
> >   106437
> >   $ hg up 5817f71c2336  # use pure python parser
> >   $ python -m timeit -r10 \
> >   -s 'from mercurial import obsolete, scmutil; svfs = scmutil.vfs(".hg/store")' \
> >   'obsolete.obsstore(svfs)'
> >
> >   (Python 2.6.9, GC disabled)
> >   10 loops, best of 10: 714 msec per loop
> >   (Python 2.6.9, GC enabled)
> >   10 loops, best of 10: 746 msec per loop
> >
> >   (Python 2.7.12+, GC disabled)
> >   10 loops, best of 10: 699 msec per loop
> >   (Python 2.7.12+, GC enabled)
> >   10 loops, best of 10: 703 msec per loop
> >
> > The result of timeit wasn't stable.
> 
> Well.... you're using TIMEIT to do GC benchmarks - it's a terrible
> idea, ever, but for GC especially. You're taking a minimum of 10 runs
> - of course it'll be random, you should take average instead and do
> more runs.
> 
> That said, this is what I would expect, but I still strongly disagree
> with the methodology.

Good point. At least, I should try to calculate average of many runs.

I thought 100k markers would be large enough to see significant difference
even by timeit, on Python 2.6.9 as described in 5817f71c2336, but it seemed
not.
Maciej Fijalkowski - Aug. 21, 2016, 4:49 p.m.
On Sun, Aug 21, 2016 at 5:19 PM, Yuya Nishihara <yuya@tcha.org> wrote:
> On Sat, 20 Aug 2016 23:04:06 +0200, Maciej Fijalkowski wrote:
>> On Sun, Aug 14, 2016 at 4:14 AM, Yuya Nishihara <yuya@tcha.org> wrote:
>> > On Fri, 12 Aug 2016 23:42:38 +0200, Maciej Fijalkowski wrote:
>> >> Well, seems the comment is out of date. I know the issue - the GC got
>> >> triggered every X objects in 2.6, which caused O(n^2) performance when
>> >> allocating lots of objects. These days (since 2.7) it adapts the
>> >> threshold, which means it always amortizes to O(n).
>> >
>> >> On Thu, Aug 11, 2016 at 12:36 AM, Matt Mackall <mpm@selenic.com> wrote:
>> >> > It's either fixed in 2.7 or it's not. The primary users of this code are
>> >> > dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
>> >> > a benchmark:
>> >> >
>> >> > changeset:   25675:5817f71c2336
>> >> > user:        Pierre-Yves David <pierre-yves.david@fb.com>
>> >> > date:        Wed Nov 26 16:58:31 2014 -0800
>> >> > files:       mercurial/obsolete.py
>> >> > description:
>> >> > obsstore: disable garbage collection during initialization (issue4456)
>> >
>> > I can't say there's measurable win on CPython 2.7, and we have native parser
>> > now. So I'll take this patch, thanks.
>> >
>> >   $ hg debugobsolete | wc -l
>> >   106437
>> >   $ hg up 5817f71c2336  # use pure python parser
>> >   $ python -m timeit -r10 \
>> >   -s 'from mercurial import obsolete, scmutil; svfs = scmutil.vfs(".hg/store")' \
>> >   'obsolete.obsstore(svfs)'
>> >
>> >   (Python 2.6.9, GC disabled)
>> >   10 loops, best of 10: 714 msec per loop
>> >   (Python 2.6.9, GC enabled)
>> >   10 loops, best of 10: 746 msec per loop
>> >
>> >   (Python 2.7.12+, GC disabled)
>> >   10 loops, best of 10: 699 msec per loop
>> >   (Python 2.7.12+, GC enabled)
>> >   10 loops, best of 10: 703 msec per loop
>> >
>> > The result of timeit wasn't stable.
>>
>> Well.... you're using TIMEIT to do GC benchmarks - it's a terrible
>> idea, ever, but for GC especially. You're taking a minimum of 10 runs
>> - of course it'll be random, you should take average instead and do
>> more runs.
>>
>> That said, this is what I would expect, but I still strongly disagree
>> with the methodology.
>
> Good point. At least, I should try to calculate average of many runs.
>
> I thought 100k markers would be large enough to see significant difference
> even by timeit, on Python 2.6.9 as described in 5817f71c2336, but it seemed
> not.

never use timeit
Augie Fackler - Aug. 22, 2016, 2:13 p.m.
On Sun, Aug 21, 2016 at 06:49:45PM +0200, Maciej Fijalkowski wrote:
> On Sun, Aug 21, 2016 at 5:19 PM, Yuya Nishihara <yuya@tcha.org> wrote:
> > On Sat, 20 Aug 2016 23:04:06 +0200, Maciej Fijalkowski wrote:
> >> Well.... you're using TIMEIT to do GC benchmarks - it's a terrible
> >> idea, ever, but for GC especially. You're taking a minimum of 10 runs
> >> - of course it'll be random, you should take average instead and do
> >> more runs.
> >>
> >> That said, this is what I would expect, but I still strongly disagree
> >> with the methodology.
> >
> > Good point. At least, I should try to calculate average of many runs.
> >
> > I thought 100k markers would be large enough to see significant difference
> > even by timeit, on Python 2.6.9 as described in 5817f71c2336, but it seemed
> > not.
>
> never use timeit

You've got a lot more experience with Python performance than I - for
little microbenchmarks is timeit still not good? If so, what would you
recommend for (semi-disposable) microbenchmarks?

Thanks!

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Bryan O'Sullivan - Aug. 23, 2016, 4:13 p.m.
On Mon, Aug 22, 2016 at 7:13 AM, Augie Fackler <raf@durin42.com> wrote:

> You've got a lot more experience with Python performance than I - for
> little microbenchmarks is timeit still not good? If so, what would you
> recommend for (semi-disposable) microbenchmarks?
>

I don't believe there are any trustworthy Python benchmarking libraries.

The state of the art is advancing fairly quickly in this area in general.
Here's a good paper that was put up on arxiv just last week:
http://arxiv.org/pdf/1608.04295.pdf

Patch

diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -852,6 +852,8 @@ 
 
     This garbage collector issue have been fixed in 2.7.
     """
+    if sys.version_info >= (2, 7):
+        return func
     def wrapper(*args, **kwargs):
         gcenabled = gc.isenabled()
         gc.disable()