Patchwork [1,of,8] cache: make the cache updated callback easily accessible to extension

login
register
mail settings
Submitter Pierre-Yves David
Date May 19, 2017, 2:28 p.m.
Message ID <1fe18fcc122b4d8e1426.1495204080@nodosa.octopoid.net>
Download mbox | patch
Permalink /patch/20709/
State Accepted
Headers show

Comments

Pierre-Yves David - May 19, 2017, 2:28 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1495192163 -7200
#      Fri May 19 13:09:23 2017 +0200
# Node ID 1fe18fcc122b4d8e14264a3670d2d5f99f11bd75
# Parent  8a87bfc5bebbbe0ac996ac8e047a029eb931af45
# EXP-Topic obscache
# Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
#              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r 1fe18fcc122b
cache: make the cache updated callback easily accessible to extension

This will help extension to benefit from this new logic. As a side effect this
clarify the 'transaction' method a little bit.
Yuya Nishihara - May 20, 2017, 4:49 a.m.
On Fri, 19 May 2017 16:28:00 +0200, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1495192163 -7200
> #      Fri May 19 13:09:23 2017 +0200
> # Node ID 1fe18fcc122b4d8e14264a3670d2d5f99f11bd75
> # Parent  8a87bfc5bebbbe0ac996ac8e047a029eb931af45
> # EXP-Topic obscache
> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r 1fe18fcc122b
> cache: make the cache updated callback easily accessible to extension
> 
> This will help extension to benefit from this new logic. As a side effect this
> clarify the 'transaction' method a little bit.
> 
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1091,10 +1091,7 @@ class localrepository(object):
>                                 **pycompat.strkwargs(hookargs))
>              reporef()._afterlock(hook)
>          tr.addfinalize('txnclose-hook', txnclosehook)
> -        def warmscache(tr2):
> -            repo = reporef()
> -            repo.updatecaches(tr2)
> -        tr.addpostclose('warms-cache', warmscache)
> +        tr.addpostclose('warms-cache', self._buildcacheupdater(tr))
>          def txnaborthook(tr2):
>              """To be run if transaction is aborted
>              """
> @@ -1229,6 +1226,20 @@ class localrepository(object):
>          self.destroyed()
>          return 0
>  
> +    def _buildcacheupdater(self, newtransaction):
> +        """called during transaction to build the callback updating cache
> +
> +        Lives on the repository to help extension who might want to augment
> +        this logic. For this purpose, the created transaction is passed to the
> +        method.
> +        """
> +        # we must avoid cyclic reference between repo and transaction.
> +        reporef = weakref.ref(self)
> +        def updater(tr):
> +            repo = reporef()
> +            repo.updatecaches(tr)
> +        return updater

Out of curiosity, why do we need this whole weakref business? IIUC, repo holds
no strong reference to the transaction object.
Pierre-Yves David - May 20, 2017, 2:21 p.m.
On 05/20/2017 06:49 AM, Yuya Nishihara wrote:
> On Fri, 19 May 2017 16:28:00 +0200, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>> # Date 1495192163 -7200
>> #      Fri May 19 13:09:23 2017 +0200
>> # Node ID 1fe18fcc122b4d8e14264a3670d2d5f99f11bd75
>> # Parent  8a87bfc5bebbbe0ac996ac8e047a029eb931af45
>> # EXP-Topic obscache
>> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r 1fe18fcc122b
>> cache: make the cache updated callback easily accessible to extension
>>
>> This will help extension to benefit from this new logic. As a side effect this
>> clarify the 'transaction' method a little bit.
>>
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -1091,10 +1091,7 @@ class localrepository(object):
>>                                 **pycompat.strkwargs(hookargs))
>>              reporef()._afterlock(hook)
>>          tr.addfinalize('txnclose-hook', txnclosehook)
>> -        def warmscache(tr2):
>> -            repo = reporef()
>> -            repo.updatecaches(tr2)
>> -        tr.addpostclose('warms-cache', warmscache)
>> +        tr.addpostclose('warms-cache', self._buildcacheupdater(tr))
>>          def txnaborthook(tr2):
>>              """To be run if transaction is aborted
>>              """
>> @@ -1229,6 +1226,20 @@ class localrepository(object):
>>          self.destroyed()
>>          return 0
>>
>> +    def _buildcacheupdater(self, newtransaction):
>> +        """called during transaction to build the callback updating cache
>> +
>> +        Lives on the repository to help extension who might want to augment
>> +        this logic. For this purpose, the created transaction is passed to the
>> +        method.
>> +        """
>> +        # we must avoid cyclic reference between repo and transaction.
>> +        reporef = weakref.ref(self)
>> +        def updater(tr):
>> +            repo = reporef()
>> +            repo.updatecaches(tr)
>> +        return updater
>
> Out of curiosity, why do we need this whole weakref business? IIUC, repo holds
> no strong reference to the transaction object.

I've preserved the usage of the reporef from the 'transaction' method.

Tracking the introduction of the 'reporef' there trace back to 
db8679812f84 and usage of repository in "afterlock" call.

So I'm not sure it is useful in this context but it probably do not hurt.
Gregory Szorc - May 20, 2017, 5:38 p.m.
On Sat, May 20, 2017 at 7:21 AM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 05/20/2017 06:49 AM, Yuya Nishihara wrote:
>
>> On Fri, 19 May 2017 16:28:00 +0200, Pierre-Yves David wrote:
>>
>>> # HG changeset patch
>>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>>> # Date 1495192163 -7200
>>> #      Fri May 19 13:09:23 2017 +0200
>>> # Node ID 1fe18fcc122b4d8e14264a3670d2d5f99f11bd75
>>> # Parent  8a87bfc5bebbbe0ac996ac8e047a029eb931af45
>>> # EXP-Topic obscache
>>> # Available At https://www.mercurial-scm.org/
>>> repo/users/marmoute/mercurial/
>>> #              hg pull https://www.mercurial-scm.org/
>>> repo/users/marmoute/mercurial/ -r 1fe18fcc122b
>>> cache: make the cache updated callback easily accessible to extension
>>>
>>> This will help extension to benefit from this new logic. As a side
>>> effect this
>>> clarify the 'transaction' method a little bit.
>>>
>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>> --- a/mercurial/localrepo.py
>>> +++ b/mercurial/localrepo.py
>>> @@ -1091,10 +1091,7 @@ class localrepository(object):
>>>                                 **pycompat.strkwargs(hookargs))
>>>              reporef()._afterlock(hook)
>>>          tr.addfinalize('txnclose-hook', txnclosehook)
>>> -        def warmscache(tr2):
>>> -            repo = reporef()
>>> -            repo.updatecaches(tr2)
>>> -        tr.addpostclose('warms-cache', warmscache)
>>> +        tr.addpostclose('warms-cache', self._buildcacheupdater(tr))
>>>          def txnaborthook(tr2):
>>>              """To be run if transaction is aborted
>>>              """
>>> @@ -1229,6 +1226,20 @@ class localrepository(object):
>>>          self.destroyed()
>>>          return 0
>>>
>>> +    def _buildcacheupdater(self, newtransaction):
>>> +        """called during transaction to build the callback updating
>>> cache
>>> +
>>> +        Lives on the repository to help extension who might want to
>>> augment
>>> +        this logic. For this purpose, the created transaction is passed
>>> to the
>>> +        method.
>>> +        """
>>> +        # we must avoid cyclic reference between repo and transaction.
>>> +        reporef = weakref.ref(self)
>>> +        def updater(tr):
>>> +            repo = reporef()
>>> +            repo.updatecaches(tr)
>>> +        return updater
>>>
>>
>> Out of curiosity, why do we need this whole weakref business? IIUC, repo
>> holds
>> no strong reference to the transaction object.
>>
>
> I've preserved the usage of the reporef from the 'transaction' method.
>
> Tracking the introduction of the 'reporef' there trace back to
> db8679812f84 and usage of repository in "afterlock" call.
>
> So I'm not sure it is useful in this context but it probably do not hurt.


I believe I inserted some of these. There are circular references involving
repo instances. The ref arc is inserted through transactions, specifically
these transaction callbacks, which either store a ref to the repo
explicitly (as in this patch) or inherit one via the outer scope.

If a single repo undergoes multiple transactions, we effectively leak
transactions, repoviews, and associated objects because of the circular
refs. I believe we still have leaks today because I have a handful of old
patches sitting around attempting to break some cycles by inserting more
weakrefs and proxies. If you want to reproduce, run `hg convert` on a large
repo.
Yuya Nishihara - May 21, 2017, 12:14 p.m.
On Sat, 20 May 2017 10:38:59 -0700, Gregory Szorc wrote:
> On Sat, May 20, 2017 at 7:21 AM, Pierre-Yves David <
> pierre-yves.david@ens-lyon.org> wrote:
> > On 05/20/2017 06:49 AM, Yuya Nishihara wrote:
> >> On Fri, 19 May 2017 16:28:00 +0200, Pierre-Yves David wrote:
> >>
> >>> # HG changeset patch
> >>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> >>> # Date 1495192163 -7200
> >>> #      Fri May 19 13:09:23 2017 +0200
> >>> # Node ID 1fe18fcc122b4d8e14264a3670d2d5f99f11bd75
> >>> # Parent  8a87bfc5bebbbe0ac996ac8e047a029eb931af45
> >>> # EXP-Topic obscache
> >>> # Available At https://www.mercurial-scm.org/
> >>> repo/users/marmoute/mercurial/
> >>> #              hg pull https://www.mercurial-scm.org/
> >>> repo/users/marmoute/mercurial/ -r 1fe18fcc122b
> >>> cache: make the cache updated callback easily accessible to extension
> >>>
> >>> This will help extension to benefit from this new logic. As a side
> >>> effect this
> >>> clarify the 'transaction' method a little bit.
> >>>
> >>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> >>> --- a/mercurial/localrepo.py
> >>> +++ b/mercurial/localrepo.py
> >>> @@ -1091,10 +1091,7 @@ class localrepository(object):
> >>>                                 **pycompat.strkwargs(hookargs))
> >>>              reporef()._afterlock(hook)
> >>>          tr.addfinalize('txnclose-hook', txnclosehook)
> >>> -        def warmscache(tr2):
> >>> -            repo = reporef()
> >>> -            repo.updatecaches(tr2)
> >>> -        tr.addpostclose('warms-cache', warmscache)
> >>> +        tr.addpostclose('warms-cache', self._buildcacheupdater(tr))
> >>>          def txnaborthook(tr2):
> >>>              """To be run if transaction is aborted
> >>>              """
> >>> @@ -1229,6 +1226,20 @@ class localrepository(object):
> >>>          self.destroyed()
> >>>          return 0
> >>>
> >>> +    def _buildcacheupdater(self, newtransaction):
> >>> +        """called during transaction to build the callback updating
> >>> cache
> >>> +
> >>> +        Lives on the repository to help extension who might want to
> >>> augment
> >>> +        this logic. For this purpose, the created transaction is passed
> >>> to the
> >>> +        method.
> >>> +        """
> >>> +        # we must avoid cyclic reference between repo and transaction.
> >>> +        reporef = weakref.ref(self)
> >>> +        def updater(tr):
> >>> +            repo = reporef()
> >>> +            repo.updatecaches(tr)
> >>> +        return updater
> >>>
> >>
> >> Out of curiosity, why do we need this whole weakref business? IIUC, repo
> >> holds
> >> no strong reference to the transaction object.
> >>
> >
> > I've preserved the usage of the reporef from the 'transaction' method.
> >
> > Tracking the introduction of the 'reporef' there trace back to
> > db8679812f84 and usage of repository in "afterlock" call.
> >
> > So I'm not sure it is useful in this context but it probably do not hurt.
> 
> I believe I inserted some of these. There are circular references involving
> repo instances. The ref arc is inserted through transactions, specifically
> these transaction callbacks, which either store a ref to the repo
> explicitly (as in this patch) or inherit one via the outer scope.
> 
> If a single repo undergoes multiple transactions, we effectively leak
> transactions, repoviews, and associated objects because of the circular
> refs. I believe we still have leaks today because I have a handful of old
> patches sitting around attempting to break some cycles by inserting more
> weakrefs and proxies. If you want to reproduce, run `hg convert` on a large
> repo.

Thanks, but it's still unclear to me. Please correct me if I'm wrong:

 - the reference tr->repo should be allowed
 - but we have bad repo->..->tr refs somewhere, which could make cycles
 - so these tr->repo weakrefs can avoid some leaks

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1091,10 +1091,7 @@  class localrepository(object):
                                **pycompat.strkwargs(hookargs))
             reporef()._afterlock(hook)
         tr.addfinalize('txnclose-hook', txnclosehook)
-        def warmscache(tr2):
-            repo = reporef()
-            repo.updatecaches(tr2)
-        tr.addpostclose('warms-cache', warmscache)
+        tr.addpostclose('warms-cache', self._buildcacheupdater(tr))
         def txnaborthook(tr2):
             """To be run if transaction is aborted
             """
@@ -1229,6 +1226,20 @@  class localrepository(object):
         self.destroyed()
         return 0
 
+    def _buildcacheupdater(self, newtransaction):
+        """called during transaction to build the callback updating cache
+
+        Lives on the repository to help extension who might want to augment
+        this logic. For this purpose, the created transaction is passed to the
+        method.
+        """
+        # we must avoid cyclic reference between repo and transaction.
+        reporef = weakref.ref(self)
+        def updater(tr):
+            repo = reporef()
+            repo.updatecaches(tr)
+        return updater
+
     @unfilteredmethod
     def updatecaches(self, tr=None):
         """warm appropriate caches