Patchwork [stable] revisionbranchcache: fall back to slow path if starting readonly (issue4531)

login
register
mail settings
Submitter Mads Kiilerich
Date Feb. 6, 2015, 1:53 a.m.
Message ID <83796637ce12f8d0e913.1423187611@ssl.google-analytics.com>
Download mbox | patch
Permalink /patch/7698/
State Accepted
Commit 5b4ed033390bf6e2879c8f5c28c84e1ee3b87231
Headers show

Comments

Mads Kiilerich - Feb. 6, 2015, 1:53 a.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1423187530 -3600
#      Fri Feb 06 02:52:10 2015 +0100
# Branch stable
# Node ID 83796637ce12f8d0e913f6db062eae5b835dbea9
# Parent  942a5a34b2d0611ab284380fbe45b9bb1897af98
revisionbranchcache: fall back to slow path if starting readonly (issue4531)

Transitioning to Mercurial versions with revision branch cache could be slow as
long as all operations were readonly (revset queries) and the cache would be
populated but not written back.

Instead, fall back to using the consistently slow path when readonly and the
cache doesn't exist yet. That avoids the overhead of populating the cache
without writing it back.

If not readonly, it will still populate all missing entries initially. That
avoids repeated writing of the cache file with small updates, and it also makes
sure a fully populated cache available for the readonly operations.
Augie Fackler - Feb. 10, 2015, 7:04 p.m.
On Fri, Feb 06, 2015 at 02:53:31AM +0100, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1423187530 -3600
> #      Fri Feb 06 02:52:10 2015 +0100
> # Branch stable
> # Node ID 83796637ce12f8d0e913f6db062eae5b835dbea9
> # Parent  942a5a34b2d0611ab284380fbe45b9bb1897af98
> revisionbranchcache: fall back to slow path if starting readonly (issue4531)
>
> Transitioning to Mercurial versions with revision branch cache could be slow as
> long as all operations were readonly (revset queries) and the cache would be
> populated but not written back.
>
> Instead, fall back to using the consistently slow path when readonly and the
> cache doesn't exist yet. That avoids the overhead of populating the cache
> without writing it back.
>
> If not readonly, it will still populate all missing entries initially. That
> avoids repeated writing of the cache file with small updates, and it also makes
> sure a fully populated cache available for the readonly operations.
>
> diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
> --- a/mercurial/branchmap.py
> +++ b/mercurial/branchmap.py
> @@ -330,7 +330,7 @@ class revbranchcache(object):
>      and will grow with it but be 1/8th of its size.
>      """
>
> -    def __init__(self, repo):
> +    def __init__(self, repo, readonly=True):
>          assert repo.filtername is None
>          self._names = [] # branch names in local encoding with static index
>          self._rbcrevs = array('c') # structs of type _rbcrecfmt
> @@ -342,6 +342,10 @@ class revbranchcache(object):
>          except (IOError, OSError), inst:
>              repo.ui.debug("couldn't read revision branch cache names: %s\n" %
>                            inst)
> +            if readonly:
> +                # don't try to use cache - fall back to the slow path
> +                self.branchinfo = self._branchinfo

Do we have any way we can use the cache if it's available (already
loaded), but skip loading/fixing it up if we're on the readonly path
and it's not already loaded?


> +
>          if self._names:
>              try:
>                  data = repo.vfs.read(_rbcrevs)
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -527,7 +527,7 @@ def branch(repo, subset, x):
>      import branchmap
>      urepo = repo.unfiltered()
>      ucl = urepo.changelog
> -    getbi = branchmap.revbranchcache(urepo).branchinfo
> +    getbi = branchmap.revbranchcache(urepo, readonly=True).branchinfo
>
>      try:
>          b = getstring(x, '')
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Mads Kiilerich - Feb. 21, 2015, 7:50 p.m.
On 02/10/2015 08:04 PM, Augie Fackler wrote:
> On Fri, Feb 06, 2015 at 02:53:31AM +0100, Mads Kiilerich wrote:
>> # HG changeset patch
>> # User Mads Kiilerich <madski@unity3d.com>
>> # Date 1423187530 -3600
>> #      Fri Feb 06 02:52:10 2015 +0100
>> # Branch stable
>> # Node ID 83796637ce12f8d0e913f6db062eae5b835dbea9
>> # Parent  942a5a34b2d0611ab284380fbe45b9bb1897af98
>> revisionbranchcache: fall back to slow path if starting readonly (issue4531)
>>
>> Transitioning to Mercurial versions with revision branch cache could be slow as
>> long as all operations were readonly (revset queries) and the cache would be
>> populated but not written back.
>>
>> Instead, fall back to using the consistently slow path when readonly and the
>> cache doesn't exist yet. That avoids the overhead of populating the cache
>> without writing it back.
>>
>> If not readonly, it will still populate all missing entries initially. That
>> avoids repeated writing of the cache file with small updates, and it also makes
>> sure a fully populated cache available for the readonly operations.
>>
>> diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
>> --- a/mercurial/branchmap.py
>> +++ b/mercurial/branchmap.py
>> @@ -330,7 +330,7 @@ class revbranchcache(object):
>>       and will grow with it but be 1/8th of its size.
>>       """
>>
>> -    def __init__(self, repo):
>> +    def __init__(self, repo, readonly=True):
>>           assert repo.filtername is None
>>           self._names = [] # branch names in local encoding with static index
>>           self._rbcrevs = array('c') # structs of type _rbcrecfmt
>> @@ -342,6 +342,10 @@ class revbranchcache(object):
>>           except (IOError, OSError), inst:
>>               repo.ui.debug("couldn't read revision branch cache names: %s\n" %
>>                             inst)
>> +            if readonly:
>> +                # don't try to use cache - fall back to the slow path
>> +                self.branchinfo = self._branchinfo
> Do we have any way we can use the cache if it's available (already
> loaded), but skip loading/fixing it up if we're on the readonly path
> and it's not already loaded?

The revision branch cache is currently only shortlived. It is not stored 
anywhere and will thus never already have been loaded. So no, there is 
currently no win there.

The same question could be asked for 'fully populated' instead of 
'loaded'. That is however also not feasible; all the revision entries 
are invalidated separately and there is no grand total that can be checked.

So: the best we can do is to make sure we don't spend time populating 
the cache when using it readonly.

/Mads
Matt Mackall - March 2, 2015, 6:52 a.m.
On Fri, 2015-02-06 at 02:53 +0100, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1423187530 -3600
> #      Fri Feb 06 02:52:10 2015 +0100
> # Branch stable
> # Node ID 83796637ce12f8d0e913f6db062eae5b835dbea9
> # Parent  942a5a34b2d0611ab284380fbe45b9bb1897af98
> revisionbranchcache: fall back to slow path if starting readonly (issue4531)

Queued for stable, thanks.

Patch

diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
--- a/mercurial/branchmap.py
+++ b/mercurial/branchmap.py
@@ -330,7 +330,7 @@  class revbranchcache(object):
     and will grow with it but be 1/8th of its size.
     """
 
-    def __init__(self, repo):
+    def __init__(self, repo, readonly=True):
         assert repo.filtername is None
         self._names = [] # branch names in local encoding with static index
         self._rbcrevs = array('c') # structs of type _rbcrecfmt
@@ -342,6 +342,10 @@  class revbranchcache(object):
         except (IOError, OSError), inst:
             repo.ui.debug("couldn't read revision branch cache names: %s\n" %
                           inst)
+            if readonly:
+                # don't try to use cache - fall back to the slow path
+                self.branchinfo = self._branchinfo
+
         if self._names:
             try:
                 data = repo.vfs.read(_rbcrevs)
diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -527,7 +527,7 @@  def branch(repo, subset, x):
     import branchmap
     urepo = repo.unfiltered()
     ucl = urepo.changelog
-    getbi = branchmap.revbranchcache(urepo).branchinfo
+    getbi = branchmap.revbranchcache(urepo, readonly=True).branchinfo
 
     try:
         b = getstring(x, '')