Patchwork [02,of,11] commitctx: stop using weakref proxy for transaction

login
register
mail settings
Submitter Pierre-Yves David
Date July 24, 2020, 2:38 p.m.
Message ID <76a585b26acdaf884e1c.1595601507@nodosa.octobus.net>
Download mbox | patch
Permalink /patch/46869/
State Accepted
Headers show

Comments

Pierre-Yves David - July 24, 2020, 2:38 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1595587952 -7200
#      Fri Jul 24 12:52:32 2020 +0200
# Node ID 76a585b26acdaf884e1c40252e351b1d45cbbcf1
# Parent  2727e91ffa6e9063bd9c29671b5008cfef22dd97
# EXP-Topic commitctx-cleanup-2
# Available At https://foss.heptapod.net/octobus/mercurial-devel/
#              hg pull https://foss.heptapod.net/octobus/mercurial-devel/ -r 76a585b26acd
commitctx: stop using weakref proxy for transaction

This weakref proxy was introduced in 2007 by 30d4d8985dd8.

If I understand it correctly, the logic at that time was relying on the
transaction destructor, triggered at garbage collection time to rollback failed
transaction. passing the object to sub function directly mean it would live in
the function scope and be trapped in the traceback on exception, leading to the
transaction rollback too late in some case.

Modern transaction usage use explicit opening and closing of transaction and no
longer rely on some internal reference counting details. So this weakref proxy
is no longer necessary. Absolutely no test are affected when we drop it.
Gregory Szorc - July 26, 2020, 10:18 p.m.
On Fri, Jul 24, 2020 at 8:49 AM Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1595587952 -7200
> #      Fri Jul 24 12:52:32 2020 +0200
> # Node ID 76a585b26acdaf884e1c40252e351b1d45cbbcf1
> # Parent  2727e91ffa6e9063bd9c29671b5008cfef22dd97
> # EXP-Topic commitctx-cleanup-2
> # Available At https://foss.heptapod.net/octobus/mercurial-devel/
> #              hg pull https://foss.heptapod.net/octobus/mercurial-devel/
> -r 76a585b26acd
> commitctx: stop using weakref proxy for transaction
>

I have concerns about this patch.

I believe the reason we continue to use a weak ref here is due to cycles
leading to leaked objects due to failure to garbage collect.

I believe there are still places where we leak transaction objects because
of cycles. Try doing a `hg convert` between 2 Mercurial repos involving
thousands of changesets. I suspect this change may make the memory leak
worse since the garbage collector isn't able to break cycles as easily.


>
> This weakref proxy was introduced in 2007 by 30d4d8985dd8.
>
> If I understand it correctly, the logic at that time was relying on the
> transaction destructor, triggered at garbage collection time to rollback
> failed
> transaction. passing the object to sub function directly mean it would
> live in
> the function scope and be trapped in the traceback on exception, leading
> to the
> transaction rollback too late in some case.
>
> Modern transaction usage use explicit opening and closing of transaction
> and no
> longer rely on some internal reference counting details. So this weakref
> proxy
> is no longer necessary. Absolutely no test are affected when we drop it.
>
> diff --git a/mercurial/commit.py b/mercurial/commit.py
> --- a/mercurial/commit.py
> +++ b/mercurial/commit.py
> @@ -6,7 +6,6 @@
>  from __future__ import absolute_import
>
>  import errno
> -import weakref
>
>  from .i18n import _
>  from .node import (
> @@ -62,8 +61,6 @@ def commitctx(repo, ctx, error=False, or
>          p2copies = ctx.p2copies()
>      filesadded, filesremoved = None, None
>      with repo.lock(), repo.transaction(b"commit") as tr:
> -        trp = weakref.proxy(tr)
> -
>          if ctx.manifestnode():
>              # reuse an existing manifest revision
>              repo.ui.debug(b'reusing known manifest\n')
> @@ -102,7 +99,7 @@ def commitctx(repo, ctx, error=False, or
>                      else:
>                          added.append(f)
>                          m[f], is_touched = _filecommit(
> -                            repo, fctx, m1, m2, linkrev, trp,
> writefilecopymeta,
> +                            repo, fctx, m1, m2, linkrev, tr,
> writefilecopymeta,
>                          )
>                          if is_touched:
>                              touched.append(f)
> @@ -156,7 +153,7 @@ def commitctx(repo, ctx, error=False, or
>                  # case where the merge has files outside of the
> narrowspec,
>                  # so this is safe.
>                  mn = mctx.write(
> -                    trp,
> +                    tr,
>                      linkrev,
>                      p1.manifestnode(),
>                      p2.manifestnode(),
> @@ -191,7 +188,7 @@ def commitctx(repo, ctx, error=False, or
>              mn,
>              files,
>              ctx.description(),
> -            trp,
> +            tr,
>              p1.node(),
>              p2.node(),
>              user,
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Manuel Jacob - July 26, 2020, 10:47 p.m.
On 2020-07-27 00:18, Gregory Szorc wrote:
> On Fri, Jul 24, 2020 at 8:49 AM Pierre-Yves David <
> pierre-yves.david@ens-lyon.org> wrote:
> 
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>> # Date 1595587952 -7200
>> #      Fri Jul 24 12:52:32 2020 +0200
>> # Node ID 76a585b26acdaf884e1c40252e351b1d45cbbcf1
>> # Parent  2727e91ffa6e9063bd9c29671b5008cfef22dd97
>> # EXP-Topic commitctx-cleanup-2
>> # Available At https://foss.heptapod.net/octobus/mercurial-devel/
>> #              hg pull 
>> https://foss.heptapod.net/octobus/mercurial-devel/
>> -r 76a585b26acd
>> commitctx: stop using weakref proxy for transaction
>> 
> 
> I have concerns about this patch.
> 
> I believe the reason we continue to use a weak ref here is due to 
> cycles
> leading to leaked objects due to failure to garbage collect.
> 
> I believe there are still places where we leak transaction objects 
> because
> of cycles. Try doing a `hg convert` between 2 Mercurial repos involving
> thousands of changesets. I suspect this change may make the memory leak
> worse since the garbage collector isn't able to break cycles as easily.

What exactly do you mean by "memory leak"? That objects are never freed 
(before the command terminates) or freed too late, increasing temporary 
memory usage more than necessary?

> 
>> 
>> This weakref proxy was introduced in 2007 by 30d4d8985dd8.
>> 
>> If I understand it correctly, the logic at that time was relying on 
>> the
>> transaction destructor, triggered at garbage collection time to 
>> rollback
>> failed
>> transaction. passing the object to sub function directly mean it would
>> live in
>> the function scope and be trapped in the traceback on exception, 
>> leading
>> to the
>> transaction rollback too late in some case.
>> 
>> Modern transaction usage use explicit opening and closing of 
>> transaction
>> and no
>> longer rely on some internal reference counting details. So this 
>> weakref
>> proxy
>> is no longer necessary. Absolutely no test are affected when we drop 
>> it.
>> 
>> diff --git a/mercurial/commit.py b/mercurial/commit.py
>> --- a/mercurial/commit.py
>> +++ b/mercurial/commit.py
>> @@ -6,7 +6,6 @@
>>  from __future__ import absolute_import
>> 
>>  import errno
>> -import weakref
>> 
>>  from .i18n import _
>>  from .node import (
>> @@ -62,8 +61,6 @@ def commitctx(repo, ctx, error=False, or
>>          p2copies = ctx.p2copies()
>>      filesadded, filesremoved = None, None
>>      with repo.lock(), repo.transaction(b"commit") as tr:
>> -        trp = weakref.proxy(tr)
>> -
>>          if ctx.manifestnode():
>>              # reuse an existing manifest revision
>>              repo.ui.debug(b'reusing known manifest\n')
>> @@ -102,7 +99,7 @@ def commitctx(repo, ctx, error=False, or
>>                      else:
>>                          added.append(f)
>>                          m[f], is_touched = _filecommit(
>> -                            repo, fctx, m1, m2, linkrev, trp,
>> writefilecopymeta,
>> +                            repo, fctx, m1, m2, linkrev, tr,
>> writefilecopymeta,
>>                          )
>>                          if is_touched:
>>                              touched.append(f)
>> @@ -156,7 +153,7 @@ def commitctx(repo, ctx, error=False, or
>>                  # case where the merge has files outside of the
>> narrowspec,
>>                  # so this is safe.
>>                  mn = mctx.write(
>> -                    trp,
>> +                    tr,
>>                      linkrev,
>>                      p1.manifestnode(),
>>                      p2.manifestnode(),
>> @@ -191,7 +188,7 @@ def commitctx(repo, ctx, error=False, or
>>              mn,
>>              files,
>>              ctx.description(),
>> -            trp,
>> +            tr,
>>              p1.node(),
>>              p2.node(),
>>              user,
>> 
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>> 
> 
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - July 27, 2020, 11:17 a.m.
On Sun, 26 Jul 2020 16:18:43 -0600, Gregory Szorc wrote:
> On Fri, Jul 24, 2020 at 8:49 AM Pierre-Yves David <
> pierre-yves.david@ens-lyon.org> wrote:
> 
> > # HG changeset patch
> > # User Pierre-Yves David <pierre-yves.david@octobus.net>
> > # Date 1595587952 -7200
> > #      Fri Jul 24 12:52:32 2020 +0200
> > # Node ID 76a585b26acdaf884e1c40252e351b1d45cbbcf1
> > # Parent  2727e91ffa6e9063bd9c29671b5008cfef22dd97
> > # EXP-Topic commitctx-cleanup-2
> > # Available At https://foss.heptapod.net/octobus/mercurial-devel/
> > #              hg pull https://foss.heptapod.net/octobus/mercurial-devel/
> > -r 76a585b26acd
> > commitctx: stop using weakref proxy for transaction
> >
> 
> I have concerns about this patch.
> 
> I believe the reason we continue to use a weak ref here is due to cycles
> leading to leaked objects due to failure to garbage collect.

AFAIK, trp is passed around as function arguments which wouldn't be captured
by lambda, so I don't see any cycles in this specific weakref usage.
Augie Fackler - July 27, 2020, 8:34 p.m.
On Mon, Jul 27, 2020 at 12:47:04AM +0200, Manuel Jacob wrote:
> On 2020-07-27 00:18, Gregory Szorc wrote:
> > On Fri, Jul 24, 2020 at 8:49 AM Pierre-Yves David <
> > pierre-yves.david@ens-lyon.org> wrote:
> >
> > > # HG changeset patch
> > > # User Pierre-Yves David <pierre-yves.david@octobus.net>
> > > # Date 1595587952 -7200
> > > #      Fri Jul 24 12:52:32 2020 +0200
> > > # Node ID 76a585b26acdaf884e1c40252e351b1d45cbbcf1
> > > # Parent  2727e91ffa6e9063bd9c29671b5008cfef22dd97
> > > # EXP-Topic commitctx-cleanup-2
> > > # Available At https://foss.heptapod.net/octobus/mercurial-devel/
> > > #              hg pull
> > > https://foss.heptapod.net/octobus/mercurial-devel/
> > > -r 76a585b26acd
> > > commitctx: stop using weakref proxy for transaction
> > >
> >
> > I have concerns about this patch.
> >
> > I believe the reason we continue to use a weak ref here is due to cycles
> > leading to leaked objects due to failure to garbage collect.
> >
> > I believe there are still places where we leak transaction objects
> > because
> > of cycles. Try doing a `hg convert` between 2 Mercurial repos involving
> > thousands of changesets. I suspect this change may make the memory leak
> > worse since the garbage collector isn't able to break cycles as easily.
>
> What exactly do you mean by "memory leak"? That objects are never freed
> (before the command terminates) or freed too late, increasing temporary
> memory usage more than necessary?
>

My recollection from the last time we had this discussion was that we
had actual /leaks/ where in the cycles were rendered uncollectable
until process exit (because some of the types in play have __del__
methods).
Pierre-Yves David - July 28, 2020, 7:24 p.m.
On 7/27/20 12:18 AM, Gregory Szorc wrote:
> On Fri, Jul 24, 2020 at 8:49 AM Pierre-Yves David 
> <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>> 
> wrote:
> 
>     # HG changeset patch
>     # User Pierre-Yves David <pierre-yves.david@octobus.net
>     <mailto:pierre-yves.david@octobus.net>>
>     # Date 1595587952 -7200
>     #      Fri Jul 24 12:52:32 2020 +0200
>     # Node ID 76a585b26acdaf884e1c40252e351b1d45cbbcf1
>     # Parent  2727e91ffa6e9063bd9c29671b5008cfef22dd97
>     # EXP-Topic commitctx-cleanup-2
>     # Available At https://foss.heptapod.net/octobus/mercurial-devel/
>     #              hg pull
>     https://foss.heptapod.net/octobus/mercurial-devel/ -r 76a585b26acd
>     commitctx: stop using weakref proxy for transaction
> 
> 
> I have concerns about this patch.
> 
> I believe the reason we continue to use a weak ref here is due to cycles 
> leading to leaked objects due to failure to garbage collect.

We need weakref usage at higher level to avoid cycle. However we are 
quite low level here and I could not found any justification for the 
weakref other than the one listed in the commit I pointed.

Do you have some concrete piece of code under this one that concerns you 
especially?

> I believe there are still places where we leak transaction objects 
> because of cycles. Try doing a `hg convert` between 2 Mercurial repos 
> involving thousands of changesets. I suspect this change may make the 
> memory leak worse since the garbage collector isn't able to break cycles 
> as easily.

I agree we still have memory leak in some place, however I don't think 
they would originate from this code.

Patch

diff --git a/mercurial/commit.py b/mercurial/commit.py
--- a/mercurial/commit.py
+++ b/mercurial/commit.py
@@ -6,7 +6,6 @@ 
 from __future__ import absolute_import
 
 import errno
-import weakref
 
 from .i18n import _
 from .node import (
@@ -62,8 +61,6 @@  def commitctx(repo, ctx, error=False, or
         p2copies = ctx.p2copies()
     filesadded, filesremoved = None, None
     with repo.lock(), repo.transaction(b"commit") as tr:
-        trp = weakref.proxy(tr)
-
         if ctx.manifestnode():
             # reuse an existing manifest revision
             repo.ui.debug(b'reusing known manifest\n')
@@ -102,7 +99,7 @@  def commitctx(repo, ctx, error=False, or
                     else:
                         added.append(f)
                         m[f], is_touched = _filecommit(
-                            repo, fctx, m1, m2, linkrev, trp, writefilecopymeta,
+                            repo, fctx, m1, m2, linkrev, tr, writefilecopymeta,
                         )
                         if is_touched:
                             touched.append(f)
@@ -156,7 +153,7 @@  def commitctx(repo, ctx, error=False, or
                 # case where the merge has files outside of the narrowspec,
                 # so this is safe.
                 mn = mctx.write(
-                    trp,
+                    tr,
                     linkrev,
                     p1.manifestnode(),
                     p2.manifestnode(),
@@ -191,7 +188,7 @@  def commitctx(repo, ctx, error=False, or
             mn,
             files,
             ctx.description(),
-            trp,
+            tr,
             p1.node(),
             p2.node(),
             user,