Patchwork [rfc] rfc: call gc at exit of mercurial

login
register
mail settings
Submitter Maciej Fijalkowski
Date April 1, 2016, 6:58 a.m.
Message ID <66b3737b62635691b5a2.1459493929@brick.arcode.com>
Download mbox | patch
Permalink /patch/14218/
State Changes Requested
Delegated to: Pierre-Yves David
Headers show

Comments

Maciej Fijalkowski - April 1, 2016, 6:58 a.m.
# HG changeset patch
# User Maciej Fijalkowski <fijall@gmail.com>
# Date 1459493914 -7200
#      Fri Apr 01 08:58:34 2016 +0200
# Node ID 66b3737b62635691b5a205dafc80e640880b77ca
# Parent  9ca7854b963295b8b56d276b903623ac8277f18d
rfc: call gc at exit of mercurial

This is not a proper patch, just a visualization of a hack I did.
Locks have a __del__, which test-devel-warnings rely on being
called at the exit of a program. Locks also have __exit__ which is
a more proper way to handle locks, but that test does not respect it.
I would like to know what's the proper way of handling this - should
I fix test, should I just call the GC (which will make the command
line tool slower) or what's the proposed solution
Gregory Szorc - April 1, 2016, 7:09 a.m.
> On Mar 31, 2016, at 23:58, Maciej Fijalkowski <fijall@gmail.com> wrote:
> 
> # HG changeset patch
> # User Maciej Fijalkowski <fijall@gmail.com>
> # Date 1459493914 -7200
> #      Fri Apr 01 08:58:34 2016 +0200
> # Node ID 66b3737b62635691b5a205dafc80e640880b77ca
> # Parent  9ca7854b963295b8b56d276b903623ac8277f18d
> rfc: call gc at exit of mercurial
> 
> This is not a proper patch, just a visualization of a hack I did.
> Locks have a __del__, which test-devel-warnings rely on being
> called at the exit of a program. Locks also have __exit__ which is
> a more proper way to handle locks, but that test does not respect it.
> I would like to know what's the proper way of handling this - should
> I fix test, should I just call the GC (which will make the command
> line tool slower) or what's the proposed solution

__del__ should be purged and banned from the code base because it undermines GC and can lead to memory leaks.

For holding on to temporary resources, context managers (preferred) or try..finally should be used.

> 
> diff -r 9ca7854b9632 -r 66b3737b6263 hg
> --- a/hg    Thu Mar 31 18:38:08 2016 +0200
> +++ b/hg    Fri Apr 01 08:58:34 2016 +0200
> @@ -40,4 +40,8 @@
> for fp in (sys.stdin, sys.stdout, sys.stderr):
>     mercurial.util.setbinary(fp)
> 
> -mercurial.dispatch.run()
> +try:
> +    mercurial.dispatch.run()
> +finally:
> +    import gc
> +    gc.collect()
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Maciej Fijalkowski - April 1, 2016, 7:10 a.m.
On Fri, Apr 1, 2016 at 9:09 AM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
>
>
>> On Mar 31, 2016, at 23:58, Maciej Fijalkowski <fijall@gmail.com> wrote:
>>
>> # HG changeset patch
>> # User Maciej Fijalkowski <fijall@gmail.com>
>> # Date 1459493914 -7200
>> #      Fri Apr 01 08:58:34 2016 +0200
>> # Node ID 66b3737b62635691b5a205dafc80e640880b77ca
>> # Parent  9ca7854b963295b8b56d276b903623ac8277f18d
>> rfc: call gc at exit of mercurial
>>
>> This is not a proper patch, just a visualization of a hack I did.
>> Locks have a __del__, which test-devel-warnings rely on being
>> called at the exit of a program. Locks also have __exit__ which is
>> a more proper way to handle locks, but that test does not respect it.
>> I would like to know what's the proper way of handling this - should
>> I fix test, should I just call the GC (which will make the command
>> line tool slower) or what's the proposed solution
>
> __del__ should be purged and banned from the code base because it undermines GC and can lead to memory leaks.
>
> For holding on to temporary resources, context managers (preferred) or try..finally should be used.

So what do we do with lock.__del__ and how tests rely on it?
Pierre-Yves David - April 1, 2016, 9:13 a.m.
On 04/01/2016 12:10 AM, Maciej Fijalkowski wrote:
> On Fri, Apr 1, 2016 at 9:09 AM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
>>
>>
>>> On Mar 31, 2016, at 23:58, Maciej Fijalkowski <fijall@gmail.com> wrote:
>>>
>>> # HG changeset patch
>>> # User Maciej Fijalkowski <fijall@gmail.com>
>>> # Date 1459493914 -7200
>>> #      Fri Apr 01 08:58:34 2016 +0200
>>> # Node ID 66b3737b62635691b5a205dafc80e640880b77ca
>>> # Parent  9ca7854b963295b8b56d276b903623ac8277f18d
>>> rfc: call gc at exit of mercurial
>>>
>>> This is not a proper patch, just a visualization of a hack I did.
>>> Locks have a __del__, which test-devel-warnings rely on being
>>> called at the exit of a program. Locks also have __exit__ which is
>>> a more proper way to handle locks, but that test does not respect it.
>>> I would like to know what's the proper way of handling this - should
>>> I fix test, should I just call the GC (which will make the command
>>> line tool slower) or what's the proposed solution
>>
>> __del__ should be purged and banned from the code base because it undermines GC and can lead to memory leaks.
>>
>> For holding on to temporary resources, context managers (preferred) or try..finally should be used.
>
> So what do we do with lock.__del__ and how tests rely on it?

Can we get details about this test failure and why it relies on it?
Maciej Fijalkowski - April 1, 2016, 9:43 a.m.
I think it does the locks in the wrong order which means some of them
abort, exiting the function "buggylocking". That means, because we're
not in a with x as lock, we're not cleaning up after the lock until
the __del__ is called. It's possible to fix the "buggylocking" test,
but it's called buggylocking for a reason :-)

On Fri, Apr 1, 2016 at 11:13 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 04/01/2016 12:10 AM, Maciej Fijalkowski wrote:
>>
>> On Fri, Apr 1, 2016 at 9:09 AM, Gregory Szorc <gregory.szorc@gmail.com>
>> wrote:
>>>
>>>
>>>
>>>> On Mar 31, 2016, at 23:58, Maciej Fijalkowski <fijall@gmail.com> wrote:
>>>>
>>>> # HG changeset patch
>>>> # User Maciej Fijalkowski <fijall@gmail.com>
>>>> # Date 1459493914 -7200
>>>> #      Fri Apr 01 08:58:34 2016 +0200
>>>> # Node ID 66b3737b62635691b5a205dafc80e640880b77ca
>>>> # Parent  9ca7854b963295b8b56d276b903623ac8277f18d
>>>> rfc: call gc at exit of mercurial
>>>>
>>>> This is not a proper patch, just a visualization of a hack I did.
>>>> Locks have a __del__, which test-devel-warnings rely on being
>>>> called at the exit of a program. Locks also have __exit__ which is
>>>> a more proper way to handle locks, but that test does not respect it.
>>>> I would like to know what's the proper way of handling this - should
>>>> I fix test, should I just call the GC (which will make the command
>>>> line tool slower) or what's the proposed solution
>>>
>>>
>>> __del__ should be purged and banned from the code base because it
>>> undermines GC and can lead to memory leaks.
>>>
>>> For holding on to temporary resources, context managers (preferred) or
>>> try..finally should be used.
>>
>>
>> So what do we do with lock.__del__ and how tests rely on it?
>
>
> Can we get details about this test failure and why it relies on it?
>
> --
> Pierre-Yves David
Jun Wu - April 1, 2016, 10:17 a.m.
On 04/01/2016 10:13 AM, Pierre-Yves David wrote:
> why it relies on it?

I guess it's because of the order of `__del__`s being called is undefined.

FYI chg has a different issue about `__del__`. See commit 5346e9b910fc.
Maciej Fijalkowski - April 3, 2016, 7:26 a.m.
On Fri, Apr 1, 2016 at 12:17 PM, Jun Wu <quark@fb.com> wrote:
> On 04/01/2016 10:13 AM, Pierre-Yves David wrote:
>>
>> why it relies on it?
>
>
> I guess it's because of the order of `__del__`s being called is undefined.
>
> FYI chg has a different issue about `__del__`. See commit 5346e9b910fc.

It has nothing to do with the order btw, only if it's called at the
program exit (it's not guaranteed to)

I would like some resolution before going forward on that
Gregory Szorc - April 3, 2016, 5:51 p.m.
On Fri, Apr 1, 2016 at 2:13 AM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 04/01/2016 12:10 AM, Maciej Fijalkowski wrote:
>
>> On Fri, Apr 1, 2016 at 9:09 AM, Gregory Szorc <gregory.szorc@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Mar 31, 2016, at 23:58, Maciej Fijalkowski <fijall@gmail.com> wrote:
>>>>
>>>> # HG changeset patch
>>>> # User Maciej Fijalkowski <fijall@gmail.com>
>>>> # Date 1459493914 -7200
>>>> #      Fri Apr 01 08:58:34 2016 +0200
>>>> # Node ID 66b3737b62635691b5a205dafc80e640880b77ca
>>>> # Parent  9ca7854b963295b8b56d276b903623ac8277f18d
>>>> rfc: call gc at exit of mercurial
>>>>
>>>> This is not a proper patch, just a visualization of a hack I did.
>>>> Locks have a __del__, which test-devel-warnings rely on being
>>>> called at the exit of a program. Locks also have __exit__ which is
>>>> a more proper way to handle locks, but that test does not respect it.
>>>> I would like to know what's the proper way of handling this - should
>>>> I fix test, should I just call the GC (which will make the command
>>>> line tool slower) or what's the proposed solution
>>>>
>>>
>>> __del__ should be purged and banned from the code base because it
>>> undermines GC and can lead to memory leaks.
>>>
>>> For holding on to temporary resources, context managers (preferred) or
>>> try..finally should be used.
>>>
>>
>> So what do we do with lock.__del__ and how tests rely on it?
>>
>
> Can we get details about this test failure and why it relies on it?
>

The existence of lock.__del__ is to catch bugs where people forget to call
lock.release. Unfortunately, the presence of __del__ does bad things for
garbage collection. So, we want to have a mechanism to detect failure to
call lock.release but it can't use __del__. What can we do?

We could establish a variable holding a set of lock instances. When
lock.release is called, the instance is removed from the set. After command
dispatch or during process exit, we could see if any locks weren't released
and issue warnings. The set of lock instances would need to be a global
variable because it can't be a member of e.g. repo instances because then
you'd need __del__ or a similar "leak checking" mechanism for repos. But
once you make it a global variable you have problems with e.g. multiple
threads. So you need to key locks to a thread/dispatch ID or only warn on
process exit. Yuck.

I'm tempted to say that the lock accounting should be implemented as an
extension that is always loaded when running tests. This gets us our
warning without making the run-time behavior too complicated.

Furthermore, I'd consider moving the logic from lock.release() into
lock._release() and have lock.release() issue a developer warning unless
called with a flag saying to suppress it. Most callers should be using
locks as context managers now. Context managers prevent bugs and we should
be prodding all consumers to use locks as context managers and explicitly
drawing attention to call sites that don't (by requiring a flag to suppress
the warning).
Augie Fackler - April 5, 2016, 5:23 p.m.
On Fri, Apr 01, 2016 at 12:09:19AM -0700, Gregory Szorc wrote:
>
>
> > On Mar 31, 2016, at 23:58, Maciej Fijalkowski <fijall@gmail.com> wrote:
> >
> > # HG changeset patch
> > # User Maciej Fijalkowski <fijall@gmail.com>
> > # Date 1459493914 -7200
> > #      Fri Apr 01 08:58:34 2016 +0200
> > # Node ID 66b3737b62635691b5a205dafc80e640880b77ca
> > # Parent  9ca7854b963295b8b56d276b903623ac8277f18d
> > rfc: call gc at exit of mercurial
> >
> > This is not a proper patch, just a visualization of a hack I did.
> > Locks have a __del__, which test-devel-warnings rely on being
> > called at the exit of a program. Locks also have __exit__ which is
> > a more proper way to handle locks, but that test does not respect it.
> > I would like to know what's the proper way of handling this - should
> > I fix test, should I just call the GC (which will make the command
> > line tool slower) or what's the proposed solution
>
> __del__ should be purged and banned from the code base because it undermines GC and can lead to memory leaks.
>
> For holding on to temporary resources, context managers (preferred) or try..finally should be used.

Having read most of the rest of this thread, I think this is the
direction I'd like us to go. Does anyone know how hard/impossible it
would be to move all our locks to only be context managers?

>
> >
> > diff -r 9ca7854b9632 -r 66b3737b6263 hg
> > --- a/hg    Thu Mar 31 18:38:08 2016 +0200
> > +++ b/hg    Fri Apr 01 08:58:34 2016 +0200
> > @@ -40,4 +40,8 @@
> > for fp in (sys.stdin, sys.stdout, sys.stderr):
> >     mercurial.util.setbinary(fp)
> >
> > -mercurial.dispatch.run()
> > +try:
> > +    mercurial.dispatch.run()
> > +finally:
> > +    import gc
> > +    gc.collect()
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel@mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Antoine Pitrou - April 26, 2016, 4:09 p.m.
Gregory Szorc <gregory.szorc <at> gmail.com> writes:
> 
> The existence of lock.__del__ is to catch bugs where people forget to call
lock.release. Unfortunately, the presence of __del__ does bad things for
garbage collection. So, we want to have a mechanism to detect failure to
call lock.release but it can't use __del__. What can we do?

weakref.finalize() is very handy:
https://docs.python.org/3/library/weakref.html#finalizer-objects

It's Python 3-only but backporting it to Python 2 should be almost trivial
(we did it for Numba).

Regards

Antoine.

Patch

diff -r 9ca7854b9632 -r 66b3737b6263 hg
--- a/hg	Thu Mar 31 18:38:08 2016 +0200
+++ b/hg	Fri Apr 01 08:58:34 2016 +0200
@@ -40,4 +40,8 @@ 
 for fp in (sys.stdin, sys.stdout, sys.stderr):
     mercurial.util.setbinary(fp)
 
-mercurial.dispatch.run()
+try:
+    mercurial.dispatch.run()
+finally:
+    import gc
+    gc.collect()