Patchwork [3,of,3,RFC] localrepository: add caches to transaction (issue4255)

login
register
mail settings
Submitter Gregory Szorc
Date May 28, 2014, 6 a.m.
Message ID <12c2a7ae1d443e6ea89e.1401256847@vm-ubuntu-main.gateway.sonic.net>
Download mbox | patch
Permalink /patch/4882/
State Changes Requested
Headers show

Comments

Gregory Szorc - May 28, 2014, 6 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1401253147 25200
#      Tue May 27 21:59:07 2014 -0700
# Node ID 12c2a7ae1d443e6ea89ea6f8f1032bf2178af15d
# Parent  02939296b8052833d4c8bfa6e2e2f1371fe5c517
localrepository: add caches to transaction (issue4255)

localrepository.transaction() now creates a chained transaction bound
to the .hg opener in addition to the .hg/store opener. This new
transaction creates backups of most files under .hg/cache.

With this change, transaction rollbacks will restore caches, thus
preventing cache invalidation and recomputation. This will result
in considerable performance wins on repositories where cache
generation is slow.

On a repository with 12,421 heads and manifests with near 100,000
entries, branch caches can take ~17 minutes to generate
(259s for base, 68s for immutable, 723s for served). On a server
that had hooks causing branch cache updates, transaction rollbacks
resulted in branch cache invalidation and an expensive computation.
This patch makes the problem go away.

THIS PATCH CURRENTLY CAUSES A REGRESSION IN test-fncache.t.
THIS PATCH SHOULD ALSO HAVE EXPLICIT TEST COVERAGE THAT CACHES ARE
RESTORED.
Durham Goode - May 28, 2014, 5:40 p.m.
On 5/27/14, 11:00 PM, "Gregory Szorc" <gregory.szorc@gmail.com> wrote:

># HG changeset patch

># User Gregory Szorc <gregory.szorc@gmail.com>

># Date 1401253147 25200

>#      Tue May 27 21:59:07 2014 -0700

># Node ID 12c2a7ae1d443e6ea89ea6f8f1032bf2178af15d

># Parent  02939296b8052833d4c8bfa6e2e2f1371fe5c517

>localrepository: add caches to transaction (issue4255)

>

>localrepository.transaction() now creates a chained transaction bound

>to the .hg opener in addition to the .hg/store opener. This new

>transaction creates backups of most files under .hg/cache.

>

>With this change, transaction rollbacks will restore caches, thus

>preventing cache invalidation and recomputation. This will result

>in considerable performance wins on repositories where cache

>generation is slow.

>

>On a repository with 12,421 heads and manifests with near 100,000

>entries, branch caches can take ~17 minutes to generate

>(259s for base, 68s for immutable, 723s for served). On a server

>that had hooks causing branch cache updates, transaction rollbacks

>resulted in branch cache invalidation and an expensive computation.

>This patch makes the problem go away.

>

>THIS PATCH CURRENTLY CAUSES A REGRESSION IN test-fncache.t.

>THIS PATCH SHOULD ALSO HAVE EXPLICIT TEST COVERAGE THAT CACHES ARE

>RESTORED.

>

>diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py

>--- a/mercurial/localrepo.py

>+++ b/mercurial/localrepo.py

>@@ -848,8 +848,16 @@ class localrepository(object):

> 

>     def wwritedata(self, filename, data):

>         return self._filter(self._decodefilterpats, filename, data)

> 

>+    def _cachetransactionblacklist(self):

>+        """Set of cache files that are not safe to include in

>transactions.

>+

>+        Files in .hg/cache/ are included in transactions by default. To

>+        prevent that from happening, add an entry to this set.

>+        """

>+        return set(['storehash'])

>+

>     def transaction(self, desc, report=None):

>         tr = self._transref and self._transref() or None

>         if tr and tr.running():

>             return tr.nest()

>@@ -870,8 +878,27 @@ class localrepository(object):

>                                      "journal",

>                                      aftertrans(renames),

>                                      self.store.createmode,

>                                      onclose)

>+

>+        # Add caches to a chained transaction.

>+        if self.ui.configbool("format", "usestore", True):

>+            try:

>+                def noreport(msg):

>+                    pass

>+                tr2 = transaction.transaction(noreport, self.opener,

>+                                              "journal",

>+                 

>createmode=self.store.createmode)

>+                blacklist = self._cachetransactionblacklist()

>+                for filename, mode in self.vfs.readdir("cache"):

>+                    if filename not in blacklist:

>+                        tr2.addbackup("cache/%s" % filename,

>hardlink=False)

>+                tr.chaintransaction(tr2)

>+

>+            except OSError, e:

>+                if e.errno != errno.ENOENT:

>+                    raise

>+

>         self._transref = weakref.ref(tr)

>         return tr

> 

>     def _journalfiles(self):



I’m not sure I like the chained transaction concept.  How is the chain
represented on disk (i.e. if the user ctrl+c’s, how does rollback handle
it)? What happens if the _abort for one transaction fails or the process
is killed before all the chained transactions are aborted?  What if the
same file is added to two transactions (perhaps by a poorly written
extension)? These caches can be written outside the lock right (like
during read only operations)? So what happens if someone modifies a cache
outside the lock while another process is in the middle of transacting it?

I like how elegantly this addresses your problem, but I think the
transaction model is already fragile and difficult enough to reason about,
and I think adding multiple simultaneous transactions will make it even
harder to reason about transaction behavior in the future.  If we want the
caches to be transacted, I think we need to refactor the existing
transaction to support them, make sure they are never written outside of a
transaction, and make sure they are never written during read only
operations.

I have a local commit where I’ve started doing this for the phase cache,
but that’s one of the easier ones since it is mostly modified inside locks
already.
Gregory Szorc - May 28, 2014, 6:37 p.m.
On 5/28/14, 10:40 AM, Durham Goode wrote:
> On 5/27/14, 11:00 PM, "Gregory Szorc" <gregory.szorc@gmail.com> wrote:
>
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1401253147 25200
>> #      Tue May 27 21:59:07 2014 -0700
>> # Node ID 12c2a7ae1d443e6ea89ea6f8f1032bf2178af15d
>> # Parent  02939296b8052833d4c8bfa6e2e2f1371fe5c517
>> localrepository: add caches to transaction (issue4255)
>>
>> localrepository.transaction() now creates a chained transaction bound
>> to the .hg opener in addition to the .hg/store opener. This new
>> transaction creates backups of most files under .hg/cache.
>>
>> With this change, transaction rollbacks will restore caches, thus
>> preventing cache invalidation and recomputation. This will result
>> in considerable performance wins on repositories where cache
>> generation is slow.
>>
>> On a repository with 12,421 heads and manifests with near 100,000
>> entries, branch caches can take ~17 minutes to generate
>> (259s for base, 68s for immutable, 723s for served). On a server
>> that had hooks causing branch cache updates, transaction rollbacks
>> resulted in branch cache invalidation and an expensive computation.
>> This patch makes the problem go away.
>>
>> THIS PATCH CURRENTLY CAUSES A REGRESSION IN test-fncache.t.
>> THIS PATCH SHOULD ALSO HAVE EXPLICIT TEST COVERAGE THAT CACHES ARE
>> RESTORED.
>>
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -848,8 +848,16 @@ class localrepository(object):
>>
>>      def wwritedata(self, filename, data):
>>          return self._filter(self._decodefilterpats, filename, data)
>>
>> +    def _cachetransactionblacklist(self):
>> +        """Set of cache files that are not safe to include in
>> transactions.
>> +
>> +        Files in .hg/cache/ are included in transactions by default. To
>> +        prevent that from happening, add an entry to this set.
>> +        """
>> +        return set(['storehash'])
>> +
>>      def transaction(self, desc, report=None):
>>          tr = self._transref and self._transref() or None
>>          if tr and tr.running():
>>              return tr.nest()
>> @@ -870,8 +878,27 @@ class localrepository(object):
>>                                       "journal",
>>                                       aftertrans(renames),
>>                                       self.store.createmode,
>>                                       onclose)
>> +
>> +        # Add caches to a chained transaction.
>> +        if self.ui.configbool("format", "usestore", True):
>> +            try:
>> +                def noreport(msg):
>> +                    pass
>> +                tr2 = transaction.transaction(noreport, self.opener,
>> +                                              "journal",
>> +
>> createmode=self.store.createmode)
>> +                blacklist = self._cachetransactionblacklist()
>> +                for filename, mode in self.vfs.readdir("cache"):
>> +                    if filename not in blacklist:
>> +                        tr2.addbackup("cache/%s" % filename,
>> hardlink=False)
>> +                tr.chaintransaction(tr2)
>> +
>> +            except OSError, e:
>> +                if e.errno != errno.ENOENT:
>> +                    raise
>> +
>>          self._transref = weakref.ref(tr)
>>          return tr
>>
>>      def _journalfiles(self):
>
>
> I’m not sure I like the chained transaction concept.  How is the chain
> represented on disk (i.e. if the user ctrl+c’s, how does rollback handle
> it)? What happens if the _abort for one transaction fails or the process
> is killed before all the chained transactions are aborted?  What if the
> same file is added to two transactions (perhaps by a poorly written
> extension)? These caches can be written outside the lock right (like
> during read only operations)? So what happens if someone modifies a cache
> outside the lock while another process is in the middle of transacting it?
>
> I like how elegantly this addresses your problem, but I think the
> transaction model is already fragile and difficult enough to reason about,
> and I think adding multiple simultaneous transactions will make it even
> harder to reason about transaction behavior in the future.  If we want the
> caches to be transacted, I think we need to refactor the existing
> transaction to support them, make sure they are never written outside of a
> transaction, and make sure they are never written during read only
> operations.
>
> I have a local commit where I’ve started doing this for the phase cache,
> but that’s one of the easier ones since it is mostly modified inside locks
> already.

Thank you for the feedback and for confirming some of the concerns I had.

Since it sounds like you are already going down this road, I will hold 
off and build on top of whatever you land.

FWIW, the issue of cache population outside of locks is bothersome. If a 
cache is invalidated, you may have multiple processes (e.g. hgweb) 
rebuilding the cache simultaneously. This can easily bring a server to 
its knees. Perhaps I should work on a patch series to address that?
Durham Goode - May 28, 2014, 6:57 p.m.
On 5/28/14, 11:37 AM, "Gregory Szorc" <gregory.szorc@gmail.com> wrote:

>On 5/28/14, 10:40 AM, Durham Goode wrote:

>> On 5/27/14, 11:00 PM, "Gregory Szorc" <gregory.szorc@gmail.com> wrote:

>>

>>> # HG changeset patch

>>> # User Gregory Szorc <gregory.szorc@gmail.com>

>>> # Date 1401253147 25200

>>> #      Tue May 27 21:59:07 2014 -0700

>>> # Node ID 12c2a7ae1d443e6ea89ea6f8f1032bf2178af15d

>>> # Parent  02939296b8052833d4c8bfa6e2e2f1371fe5c517

>>> localrepository: add caches to transaction (issue4255)

>>>

>>> localrepository.transaction() now creates a chained transaction bound

>>> to the .hg opener in addition to the .hg/store opener. This new

>>> transaction creates backups of most files under .hg/cache.

>>>

>>> With this change, transaction rollbacks will restore caches, thus

>>> preventing cache invalidation and recomputation. This will result

>>> in considerable performance wins on repositories where cache

>>> generation is slow.

>>>

>>> On a repository with 12,421 heads and manifests with near 100,000

>>> entries, branch caches can take ~17 minutes to generate

>>> (259s for base, 68s for immutable, 723s for served). On a server

>>> that had hooks causing branch cache updates, transaction rollbacks

>>> resulted in branch cache invalidation and an expensive computation.

>>> This patch makes the problem go away.

>>>

>>> THIS PATCH CURRENTLY CAUSES A REGRESSION IN test-fncache.t.

>>> THIS PATCH SHOULD ALSO HAVE EXPLICIT TEST COVERAGE THAT CACHES ARE

>>> RESTORED.

>>>

>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py

>>> --- a/mercurial/localrepo.py

>>> +++ b/mercurial/localrepo.py

>>> @@ -848,8 +848,16 @@ class localrepository(object):

>>>

>>>      def wwritedata(self, filename, data):

>>>          return self._filter(self._decodefilterpats, filename, data)

>>>

>>> +    def _cachetransactionblacklist(self):

>>> +        """Set of cache files that are not safe to include in

>>> transactions.

>>> +

>>> +        Files in .hg/cache/ are included in transactions by default.

>>>To

>>> +        prevent that from happening, add an entry to this set.

>>> +        """

>>> +        return set(['storehash'])

>>> +

>>>      def transaction(self, desc, report=None):

>>>          tr = self._transref and self._transref() or None

>>>          if tr and tr.running():

>>>              return tr.nest()

>>> @@ -870,8 +878,27 @@ class localrepository(object):

>>>                                       "journal",

>>>                                       aftertrans(renames),

>>>                                       self.store.createmode,

>>>                                       onclose)

>>> +

>>> +        # Add caches to a chained transaction.

>>> +        if self.ui.configbool("format", "usestore", True):

>>> +            try:

>>> +                def noreport(msg):

>>> +                    pass

>>> +                tr2 = transaction.transaction(noreport, self.opener,

>>> +                                              "journal",

>>> +

>>> createmode=self.store.createmode)

>>> +                blacklist = self._cachetransactionblacklist()

>>> +                for filename, mode in self.vfs.readdir("cache"):

>>> +                    if filename not in blacklist:

>>> +                        tr2.addbackup("cache/%s" % filename,

>>> hardlink=False)

>>> +                tr.chaintransaction(tr2)

>>> +

>>> +            except OSError, e:

>>> +                if e.errno != errno.ENOENT:

>>> +                    raise

>>> +

>>>          self._transref = weakref.ref(tr)

>>>          return tr

>>>

>>>      def _journalfiles(self):

>>

>>

>> I’m not sure I like the chained transaction concept.  How is the chain

>> represented on disk (i.e. if the user ctrl+c’s, how does rollback handle

>> it)? What happens if the _abort for one transaction fails or the process

>> is killed before all the chained transactions are aborted?  What if the

>> same file is added to two transactions (perhaps by a poorly written

>> extension)? These caches can be written outside the lock right (like

>> during read only operations)? So what happens if someone modifies a

>>cache

>> outside the lock while another process is in the middle of transacting

>>it?

>>

>> I like how elegantly this addresses your problem, but I think the

>> transaction model is already fragile and difficult enough to reason

>>about,

>> and I think adding multiple simultaneous transactions will make it even

>> harder to reason about transaction behavior in the future.  If we want

>>the

>> caches to be transacted, I think we need to refactor the existing

>> transaction to support them, make sure they are never written outside

>>of a

>> transaction, and make sure they are never written during read only

>> operations.

>>

>> I have a local commit where I’ve started doing this for the phase cache,

>> but that’s one of the easier ones since it is mostly modified inside

>>locks

>> already.

>

>Thank you for the feedback and for confirming some of the concerns I had.

>

>Since it sounds like you are already going down this road, I will hold

>off and build on top of whatever you land.


My current patch (http://pastebin.com/qkRPDs30 - the tests still fail)
doesn’t add any new transactions features. So nothing should block you if
you wanted to start on other bits. It just adds the phase cache to the
transaction, makes it impossible to write the cache without a transaction,
and now I’m slowly finding all the places that was happening (via the
tests) and adding transactions.

>

>FWIW, the issue of cache population outside of locks is bothersome. If a

>cache is invalidated, you may have multiple processes (e.g. hgweb)

>rebuilding the cache simultaneously. This can easily bring a server to

>its knees. Perhaps I should work on a patch series to address that?


I’m not familiar with how all the cache stuff works unfortunately. Once we
get them all into transactions, I think they’ll cease to be caches
anymore, so they’ll never have to be rebuilt. I don’t have any ideas for
intermediate fixes though. We might get to them later this year, but
probably not in the next couple months.

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -848,8 +848,16 @@  class localrepository(object):
 
     def wwritedata(self, filename, data):
         return self._filter(self._decodefilterpats, filename, data)
 
+    def _cachetransactionblacklist(self):
+        """Set of cache files that are not safe to include in transactions.
+
+        Files in .hg/cache/ are included in transactions by default. To
+        prevent that from happening, add an entry to this set.
+        """
+        return set(['storehash'])
+
     def transaction(self, desc, report=None):
         tr = self._transref and self._transref() or None
         if tr and tr.running():
             return tr.nest()
@@ -870,8 +878,27 @@  class localrepository(object):
                                      "journal",
                                      aftertrans(renames),
                                      self.store.createmode,
                                      onclose)
+
+        # Add caches to a chained transaction.
+        if self.ui.configbool("format", "usestore", True):
+            try:
+                def noreport(msg):
+                    pass
+                tr2 = transaction.transaction(noreport, self.opener,
+                                              "journal",
+                                              createmode=self.store.createmode)
+                blacklist = self._cachetransactionblacklist()
+                for filename, mode in self.vfs.readdir("cache"):
+                    if filename not in blacklist:
+                        tr2.addbackup("cache/%s" % filename, hardlink=False)
+                tr.chaintransaction(tr2)
+
+            except OSError, e:
+                if e.errno != errno.ENOENT:
+                    raise
+
         self._transref = weakref.ref(tr)
         return tr
 
     def _journalfiles(self):