Patchwork [3,of,5] branchmap: acquires lock before writting the rev branch cache

login
register
mail settings
Submitter Pierre-Yves David
Date Aug. 5, 2016, 11:02 p.m.
Message ID <84885aeec2e442d18dcb.1470438175@nodosa.octopoid.net>
Download mbox | patch
Permalink /patch/16142/
State Changes Requested
Headers show

Comments

Pierre-Yves David - Aug. 5, 2016, 11:02 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@ens-lyon.org>
# Date 1470401836 -7200
#      Fri Aug 05 14:57:16 2016 +0200
# Node ID 84885aeec2e442d18dcb70fcce2d65dd9fafbf91
# Parent  fa3f05a4219c4ea6b9bc9682f258e4418b36c265
# EXP-Topic vfsward
branchmap: acquires lock before writting the rev branch cache

We now attempt to acquire a lock and write the branch cache within that lock.
This would prevent cache corruption when multiple processes try to write the cache
at the same time.
Yuya Nishihara - Aug. 7, 2016, 11:21 a.m.
On Sat, 06 Aug 2016 01:02:55 +0200, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@ens-lyon.org>
> # Date 1470401836 -7200
> #      Fri Aug 05 14:57:16 2016 +0200
> # Node ID 84885aeec2e442d18dcb70fcce2d65dd9fafbf91
> # Parent  fa3f05a4219c4ea6b9bc9682f258e4418b36c265
> # EXP-Topic vfsward
> branchmap: acquires lock before writting the rev branch cache
> 
> We now attempt to acquire a lock and write the branch cache within that lock.
> This would prevent cache corruption when multiple processes try to write the cache
> at the same time.

(+CC Durham, Mads)

I wonder if revbranchcache is designed to not take a lock. 9347c15d8136 says
corruption can be recovered.

> --- a/mercurial/branchmap.py	Fri Aug 05 14:54:46 2016 +0200
> +++ b/mercurial/branchmap.py	Fri Aug 05 14:57:16 2016 +0200
> @@ -470,8 +470,10 @@ class revbranchcache(object):
>      def write(self, tr=None):
>          """Save branch cache if it is dirty."""
>          repo = self._repo
> -        if True:
> +        wlock = None
> +        try:
>              if self._rbcnamescount < len(self._names):
> +                wlock = repo.wlock(wait=False)
>                  try:
>                      if self._rbcnamescount != 0:
>                          f = repo.vfs.open(_rbcnames, 'ab')
> @@ -501,6 +503,7 @@ class revbranchcache(object):
>  
>              start = self._rbcrevslen * _rbcrecsize
>              if start != len(self._rbcrevs):
> +                wlock = repo.wlock(wait=False)
>                  revs = min(len(repo.changelog),
>                             len(self._rbcrevs) // _rbcrecsize)
>                  try:
> @@ -521,3 +524,8 @@ class revbranchcache(object):
>                                    inst)
>                      return
>                  self._rbcrevslen = revs
> +        except error.LockHeld as inst:
> +            repo.ui.debug("couldn't write revision branch cache: %s\n" % inst)
> +        finally:
> +            if wlock is not None:
> +                wlock.release()

wlock.held wouldn't be decremented appropriately if you'd take two wlocks.
Pierre-Yves David - Aug. 7, 2016, 1:57 p.m.
On 08/07/2016 01:21 PM, Yuya Nishihara wrote:
> On Sat, 06 Aug 2016 01:02:55 +0200, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@ens-lyon.org>
>> # Date 1470401836 -7200
>> #      Fri Aug 05 14:57:16 2016 +0200
>> # Node ID 84885aeec2e442d18dcb70fcce2d65dd9fafbf91
>> # Parent  fa3f05a4219c4ea6b9bc9682f258e4418b36c265
>> # EXP-Topic vfsward
>> branchmap: acquires lock before writting the rev branch cache
>>
>> We now attempt to acquire a lock and write the branch cache within that lock.
>> This would prevent cache corruption when multiple processes try to write the cache
>> at the same time.
>
> (+CC Durham, Mads)
>
> I wonder if revbranchcache is designed to not take a lock. 9347c15d8136 says
> corruption can be recovered.

The lock taking is best effort, so I would say it is best to try to grab 
that lock. Even if corruption were recoverable it is better to avoid them.

>> --- a/mercurial/branchmap.py	Fri Aug 05 14:54:46 2016 +0200
>> +++ b/mercurial/branchmap.py	Fri Aug 05 14:57:16 2016 +0200
>> @@ -470,8 +470,10 @@ class revbranchcache(object):
>>      def write(self, tr=None):
>>          """Save branch cache if it is dirty."""
>>          repo = self._repo
>> -        if True:
>> +        wlock = None
>> +        try:
>>              if self._rbcnamescount < len(self._names):
>> +                wlock = repo.wlock(wait=False)
>>                  try:
>>                      if self._rbcnamescount != 0:
>>                          f = repo.vfs.open(_rbcnames, 'ab')
>> @@ -501,6 +503,7 @@ class revbranchcache(object):
>>
>>              start = self._rbcrevslen * _rbcrecsize
>>              if start != len(self._rbcrevs):
>> +                wlock = repo.wlock(wait=False)
>>                  revs = min(len(repo.changelog),
>>                             len(self._rbcrevs) // _rbcrecsize)
>>                  try:
>> @@ -521,3 +524,8 @@ class revbranchcache(object):
>>                                    inst)
>>                      return
>>                  self._rbcrevslen = revs
>> +        except error.LockHeld as inst:
>> +            repo.ui.debug("couldn't write revision branch cache: %s\n" % inst)
>> +        finally:
>> +            if wlock is not None:
>> +                wlock.release()
>
> wlock.held wouldn't be decremented appropriately if you'd take two wlocks.

Good catch, I'll send a V3.

Cheers,
Mads Kiilerich - Aug. 8, 2016, 2:50 p.m.
On Sun, Aug 7, 2016 at 1:21 PM, Yuya Nishihara <yuya@tcha.org> wrote:

> On Sat, 06 Aug 2016 01:02:55 +0200, Pierre-Yves David wrote:
> > # HG changeset patch
> > # User Pierre-Yves David <pierre-yves.david@ens-lyon.org>
> > # Date 1470401836 -7200
> > #      Fri Aug 05 14:57:16 2016 +0200
> > # Node ID 84885aeec2e442d18dcb70fcce2d65dd9fafbf91
> > # Parent  fa3f05a4219c4ea6b9bc9682f258e4418b36c265
> > # EXP-Topic vfsward
> > branchmap: acquires lock before writting the rev branch cache
> >
> > We now attempt to acquire a lock and write the branch cache within that
> lock.
> > This would prevent cache corruption when multiple processes try to write
> the cache
> > at the same time.
>
> (+CC Durham, Mads)
>
> I wonder if revbranchcache is designed to not take a lock. 9347c15d8136
> says
> corruption can be recovered.
>

Yes, I think it was given as an initial design constraint for acceptance
upstream that the cache had to be updated on the fly by read only
operations without locking. Hindsight perspective might have changed mine
and others memory and opinion ;-)

In my opinion, the revision branch cache should be treated exactly like the
branch head cache and updated the same way. It might be expensive to update
initially, but not doing it will end up being more expensive.

Proper transaction support would however also require that the design
constraint of doing in-place (non-atomic non-append) updates got lifted.

The existing algorithm is not perfect lockfree, but I doubt it was possible
to come up with a test case where simultaneous writes would give a
corruption that not would be detected and recovered automatically.

This change looks like a fine small improvement that will reduce the risk.
But still, like most other write locks in Mercurial, it has the problem
that data is read and checked before taking the lock and then "leaks" into
the locked zone.

I wonder but haven't investigated: How will this handle the case where the
repo can't be locked because it is on a read only filesystem? Will it get
LockHeld (which would be a misleading exception name) and skip the writing?

/Mads
Pierre-Yves David - Aug. 8, 2016, 2:53 p.m.
On 08/08/2016 04:50 PM, Mads Kiilerich wrote:
> On Sun, Aug 7, 2016 at 1:21 PM, Yuya Nishihara <yuya@tcha.org
> <mailto:yuya@tcha.org>> wrote:
>
>     On Sat, 06 Aug 2016 01:02:55 +0200, Pierre-Yves David wrote:
>     > # HG changeset patch
>     > # User Pierre-Yves David <pierre-yves.david@ens-lyon.org
>     <mailto:pierre-yves.david@ens-lyon.org>>
>     > # Date 1470401836 -7200
>     > #      Fri Aug 05 14:57:16 2016 +0200
>     > # Node ID 84885aeec2e442d18dcb70fcce2d65dd9fafbf91
>     > # Parent  fa3f05a4219c4ea6b9bc9682f258e4418b36c265
>     > # EXP-Topic vfsward
>     > branchmap: acquires lock before writting the rev branch cache
>     >
>     > We now attempt to acquire a lock and write the branch cache within
>     that lock.
>     > This would prevent cache corruption when multiple processes try to
>     write the cache
>     > at the same time.
>
>     (+CC Durham, Mads)
>
>     I wonder if revbranchcache is designed to not take a lock.
>     9347c15d8136 says
>     corruption can be recovered.
>
>
> Yes, I think it was given as an initial design constraint for acceptance
> upstream that the cache had to be updated on the fly by read only
> operations without locking. Hindsight perspective might have changed
> mine and others memory and opinion ;-)
>
> In my opinion, the revision branch cache should be treated exactly like
> the branch head cache and updated the same way. It might be expensive to
> update initially, but not doing it will end up being more expensive.
>
> Proper transaction support would however also require that the design
> constraint of doing in-place (non-atomic non-append) updates got lifted.
>
> The existing algorithm is not perfect lockfree, but I doubt it was
> possible to come up with a test case where simultaneous writes would
> give a corruption that not would be detected and recovered automatically.
>
> This change looks like a fine small improvement that will reduce the
> risk. But still, like most other write locks in Mercurial, it has the
> problem that data is read and checked before taking the lock and then
> "leaks" into the locked zone.

Yeah, it is not perfect.

> I wonder but haven't investigated: How will this handle the case where
> the repo can't be locked because it is on a read only filesystem? Will
> it get LockHeld (which would be a misleading exception name) and skip
> the writing?

It will raise LockUnavailable, yuya fixed it in flight

Cheers,

Patch

diff -r fa3f05a4219c -r 84885aeec2e4 mercurial/branchmap.py
--- a/mercurial/branchmap.py	Fri Aug 05 14:54:46 2016 +0200
+++ b/mercurial/branchmap.py	Fri Aug 05 14:57:16 2016 +0200
@@ -470,8 +470,10 @@  class revbranchcache(object):
     def write(self, tr=None):
         """Save branch cache if it is dirty."""
         repo = self._repo
-        if True:
+        wlock = None
+        try:
             if self._rbcnamescount < len(self._names):
+                wlock = repo.wlock(wait=False)
                 try:
                     if self._rbcnamescount != 0:
                         f = repo.vfs.open(_rbcnames, 'ab')
@@ -501,6 +503,7 @@  class revbranchcache(object):
 
             start = self._rbcrevslen * _rbcrecsize
             if start != len(self._rbcrevs):
+                wlock = repo.wlock(wait=False)
                 revs = min(len(repo.changelog),
                            len(self._rbcrevs) // _rbcrecsize)
                 try:
@@ -521,3 +524,8 @@  class revbranchcache(object):
                                   inst)
                     return
                 self._rbcrevslen = revs
+        except error.LockHeld as inst:
+            repo.ui.debug("couldn't write revision branch cache: %s\n" % inst)
+        finally:
+            if wlock is not None:
+                wlock.release()