Patchwork [5,of,5] changelog: rely on transaction for finalization

login
register
mail settings
Submitter Pierre-Yves David
Date Nov. 2, 2014, 2:20 p.m.
Message ID <3270e2ef0b7d74003e49.1414938043@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/6527/
State Accepted
Headers show

Comments

Pierre-Yves David - Nov. 2, 2014, 2:20 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1413619781 25200
#      Sat Oct 18 01:09:41 2014 -0700
# Node ID 3270e2ef0b7d74003e49fcdec588763c258a1038
# Parent  d434fffe10dac3cb137b3695426a69a4cef9ef35
changelog: rely on transaction for finalization

Instead of calling `cl.finalize()` by hand (possibly at a bogus time) we
register it to the transaction during `delayupdate` and rely on `tr.close()` to
call it at the right time.
Gregory Szorc - Nov. 3, 2014, 5:11 p.m.
On 11/2/14 6:20 AM, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1413619781 25200
> #      Sat Oct 18 01:09:41 2014 -0700
> # Node ID 3270e2ef0b7d74003e49fcdec588763c258a1038
> # Parent  d434fffe10dac3cb137b3695426a69a4cef9ef35
> changelog: rely on transaction for finalization
>
> Instead of calling `cl.finalize()` by hand (possibly at a bogus time) we
> register it to the transaction during `delayupdate` and rely on `tr.close()` to
> call it at the right time.
>
> diff --git a/mercurial/changelog.py b/mercurial/changelog.py
> --- a/mercurial/changelog.py
> +++ b/mercurial/changelog.py
> @@ -3,10 +3,11 @@
>   # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
>   #
>   # This software may be used and distributed according to the terms of the
>   # GNU General Public License version 2 or any later version.
>
> +import weakref
>   from node import bin, hex, nullid
>   from i18n import _
>   import util, error, revlog, encoding
>
>   _defaultextra = {'branch': 'default'}
> @@ -237,12 +238,14 @@ class changelog(revlog.revlog):
>                   self._delaybuf = []
>                   self.opener = _delayopener(self._realopener, self.indexfile,
>                                              self._delaybuf)
>           self._delayed = True
>           tr.addpending('cl-%i' % id(self), self._writepending)
> +        trp = weakref.proxy(tr)
> +        tr.addfinalize('cl-%i' % id(self), lambda: self._finalize(trp))

I tend to get suspicious any time I see weakref. Looking into the 
future, I could see other consumers of these callbacks also looking to 
use a transaction reference. Requiring the caller to do the weakref 
magic to avoid reference cycles seems fragile and prone to leaks.

Should we nip this in the bud by passing the transaction instance to the 
callback?

As an extension author, I may also want a repo instance in the callback. 
But that's not currently captured as part of transaction instances, so 
that's scope bloat. (The use case I care about is creating a SQL 
transaction that can be committed or rolled back with the lifecycle of 
the hg transaction. This patch series makes it much easier.)
Pierre-Yves David - Nov. 3, 2014, 7:15 p.m.
On 11/03/2014 05:11 PM, Gregory Szorc wrote:
> On 11/2/14 6:20 AM, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>> # Date 1413619781 25200
>> #      Sat Oct 18 01:09:41 2014 -0700
>> # Node ID 3270e2ef0b7d74003e49fcdec588763c258a1038
>> # Parent  d434fffe10dac3cb137b3695426a69a4cef9ef35
>> changelog: rely on transaction for finalization
>>
>> Instead of calling `cl.finalize()` by hand (possibly at a bogus time) we
>> register it to the transaction during `delayupdate` and rely on
>> `tr.close()` to
>> call it at the right time.
>>
>> diff --git a/mercurial/changelog.py b/mercurial/changelog.py
>> --- a/mercurial/changelog.py
>> +++ b/mercurial/changelog.py
>> @@ -3,10 +3,11 @@
>>   # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
>>   #
>>   # This software may be used and distributed according to the terms
>> of the
>>   # GNU General Public License version 2 or any later version.
>>
>> +import weakref
>>   from node import bin, hex, nullid
>>   from i18n import _
>>   import util, error, revlog, encoding
>>
>>   _defaultextra = {'branch': 'default'}
>> @@ -237,12 +238,14 @@ class changelog(revlog.revlog):
>>                   self._delaybuf = []
>>                   self.opener = _delayopener(self._realopener,
>> self.indexfile,
>>                                              self._delaybuf)
>>           self._delayed = True
>>           tr.addpending('cl-%i' % id(self), self._writepending)
>> +        trp = weakref.proxy(tr)
>> +        tr.addfinalize('cl-%i' % id(self), lambda: self._finalize(trp))
>
> I tend to get suspicious any time I see weakref. Looking into the
> future, I could see other consumers of these callbacks also looking to
> use a transaction reference. Requiring the caller to do the weakref
> magic to avoid reference cycles seems fragile and prone to leaks.

This code have been using a weakref for ever. It just become visible 
here because I'm mving code around.

> Should we nip this in the bud by passing the transaction instance to the
> callback?

It could make sense yes, it would help adding hooks data inside those 
callback.

> As an extension author, I may also want a repo instance in the callback.
> But that's not currently captured as part of transaction instances, so
> that's scope bloat. (The use case I care about is creating a SQL
> transaction that can be committed or rolled back with the lifecycle of
> the hg transaction. This patch series makes it much easier.)

Cheers,
Matt Mackall - Nov. 6, 2014, 12:28 a.m.
On Sun, 2014-11-02 at 14:20 +0000, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1413619781 25200
> #      Sat Oct 18 01:09:41 2014 -0700
> # Node ID 3270e2ef0b7d74003e49fcdec588763c258a1038
> # Parent  d434fffe10dac3cb137b3695426a69a4cef9ef35
> changelog: rely on transaction for finalization

I've queued the first four patches here, thanks. But I can't work out by
inspection why the weakref is essential, so it needs inline
justification. We'd really prefer to avoid the weakref if at all
possible, of course.

> Instead of calling `cl.finalize()` by hand (possibly at a bogus time) we
> register it to the transaction during `delayupdate` and rely on `tr.close()` to
> call it at the right time.
> 
> diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
> --- a/mercurial/changegroup.py
> +++ b/mercurial/changegroup.py
> @@ -720,12 +720,10 @@ def addchangegroup(repo, source, srctype
>              # publishing only alter behavior during push
>              #
>              # strip should not touch boundary at all
>              phases.retractboundary(repo, tr, targetphase, added)
>  
> -        # make changelog see real files again
> -        cl.finalize(trp)
>  
>          tr.close()
>  
>          if changesets > 0:
>              if srctype != 'strip':
> diff --git a/mercurial/changelog.py b/mercurial/changelog.py
> --- a/mercurial/changelog.py
> +++ b/mercurial/changelog.py
> @@ -3,10 +3,11 @@
>  # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
>  #
>  # This software may be used and distributed according to the terms of the
>  # GNU General Public License version 2 or any later version.
>  
> +import weakref
>  from node import bin, hex, nullid
>  from i18n import _
>  import util, error, revlog, encoding
>  
>  _defaultextra = {'branch': 'default'}
> @@ -237,12 +238,14 @@ class changelog(revlog.revlog):
>                  self._delaybuf = []
>                  self.opener = _delayopener(self._realopener, self.indexfile,
>                                             self._delaybuf)
>          self._delayed = True
>          tr.addpending('cl-%i' % id(self), self._writepending)
> +        trp = weakref.proxy(tr)
> +        tr.addfinalize('cl-%i' % id(self), lambda: self._finalize(trp))
>  
> -    def finalize(self, tr):
> +    def _finalize(self, tr):
>          "finalize index updates"
>          self._delayed = False
>          self.opener = self._realopener
>          # move redirected index data back into place
>          if self._divert:
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1443,11 +1443,10 @@ class localrepository(object):
>                                     user, ctx.date(), ctx.extra().copy())
>              p = lambda: tr.writepending() and self.root or ""
>              xp1, xp2 = p1.hex(), p2 and p2.hex() or ''
>              self.hook('pretxncommit', throw=True, node=hex(n), parent1=xp1,
>                        parent2=xp2, pending=p)
> -            self.changelog.finalize(trp)
>              # set the new commit is proper phase
>              targetphase = subrepo.newcommitphase(self.ui, ctx)
>              if targetphase:
>                  # retract boundary do not alter parent changeset.
>                  # if a parent have higher the resulting phase will
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Pierre-Yves David - Nov. 6, 2014, 12:41 a.m.
On 11/06/2014 12:28 AM, Matt Mackall wrote:
> On Sun, 2014-11-02 at 14:20 +0000, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>> # Date 1413619781 25200
>> #      Sat Oct 18 01:09:41 2014 -0700
>> # Node ID 3270e2ef0b7d74003e49fcdec588763c258a1038
>> # Parent  d434fffe10dac3cb137b3695426a69a4cef9ef35
>> changelog: rely on transaction for finalization
>
> I've queued the first four patches here, thanks. But I can't work out by
> inspection why the weakref is essential, so it needs inline
> justification. We'd really prefer to avoid the weakref if at all
> possible, of course.

TL;DR: the weakref is nothing new, just code movement.

The weakref is nothing new. The code to finalize a changelog is using a 
weakref. As we call this finalization from elsewhere, the elsewhere 
needs to create the weakref.

After more thinking, I subscribe to the idea of adding a transaction 
argument to the callback. That would remove the need for a weakref.

I'm planning to add such argument a bit later when some other clean up 
will be in.
Matt Mackall - Nov. 6, 2014, 12:53 a.m.
On Thu, 2014-11-06 at 00:41 +0000, Pierre-Yves David wrote:
> 
> On 11/06/2014 12:28 AM, Matt Mackall wrote:
> > On Sun, 2014-11-02 at 14:20 +0000, Pierre-Yves David wrote:
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david@fb.com>
> >> # Date 1413619781 25200
> >> #      Sat Oct 18 01:09:41 2014 -0700
> >> # Node ID 3270e2ef0b7d74003e49fcdec588763c258a1038
> >> # Parent  d434fffe10dac3cb137b3695426a69a4cef9ef35
> >> changelog: rely on transaction for finalization
> >
> > I've queued the first four patches here, thanks. But I can't work out by
> > inspection why the weakref is essential, so it needs inline
> > justification. We'd really prefer to avoid the weakref if at all
> > possible, of course.
> 
> TL;DR: the weakref is nothing new, just code movement.

Right, spotted over in localrepo.py where it's similarly mysterious. And
I see you already explained this to Greg. Ok, queued.

Patch

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -720,12 +720,10 @@  def addchangegroup(repo, source, srctype
             # publishing only alter behavior during push
             #
             # strip should not touch boundary at all
             phases.retractboundary(repo, tr, targetphase, added)
 
-        # make changelog see real files again
-        cl.finalize(trp)
 
         tr.close()
 
         if changesets > 0:
             if srctype != 'strip':
diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -3,10 +3,11 @@ 
 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
+import weakref
 from node import bin, hex, nullid
 from i18n import _
 import util, error, revlog, encoding
 
 _defaultextra = {'branch': 'default'}
@@ -237,12 +238,14 @@  class changelog(revlog.revlog):
                 self._delaybuf = []
                 self.opener = _delayopener(self._realopener, self.indexfile,
                                            self._delaybuf)
         self._delayed = True
         tr.addpending('cl-%i' % id(self), self._writepending)
+        trp = weakref.proxy(tr)
+        tr.addfinalize('cl-%i' % id(self), lambda: self._finalize(trp))
 
-    def finalize(self, tr):
+    def _finalize(self, tr):
         "finalize index updates"
         self._delayed = False
         self.opener = self._realopener
         # move redirected index data back into place
         if self._divert:
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1443,11 +1443,10 @@  class localrepository(object):
                                    user, ctx.date(), ctx.extra().copy())
             p = lambda: tr.writepending() and self.root or ""
             xp1, xp2 = p1.hex(), p2 and p2.hex() or ''
             self.hook('pretxncommit', throw=True, node=hex(n), parent1=xp1,
                       parent2=xp2, pending=p)
-            self.changelog.finalize(trp)
             # set the new commit is proper phase
             targetphase = subrepo.newcommitphase(self.ui, ctx)
             if targetphase:
                 # retract boundary do not alter parent changeset.
                 # if a parent have higher the resulting phase will