Patchwork [4,of,4,stable] rbc: fix superfluous rebuilding from scratch - don't abuse self._rbcnamescount

login
register
mail settings
Submitter Mads Kiilerich
Date July 18, 2016, 8:27 p.m.
Message ID <84b90b0dba99eb322c93.1468873624@localhost.localdomain>
Download mbox | patch
Permalink /patch/15953/
State Accepted
Headers show

Comments

Mads Kiilerich - July 18, 2016, 8:27 p.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1468873509 -7200
#      Mon Jul 18 22:25:09 2016 +0200
# Branch stable
# Node ID 84b90b0dba99eb322c93355e89d1e0ffdd0fa809
# Parent  80b80af907493879cae865de643de827bda9fdd1
rbc: fix superfluous rebuilding from scratch - don't abuse self._rbcnamescount

The code used self._rbcnamescount as if it was the length of self._names ...
but actually it is just the number of good entries on disk. This caused the
cache to be populated inefficiently. In some cases very inefficiently.

Instead of checking the length before lookup, just try a lookup in self._names
- that is also in most cases faster.

Comments and debug messages are tweaked to help understanding the issue
and the fix.
Matt Mackall - July 19, 2016, 4:20 a.m.
On Mon, 2016-07-18 at 22:27 +0200, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1468873509 -7200
> #      Mon Jul 18 22:25:09 2016 +0200
> # Branch stable
> # Node ID 84b90b0dba99eb322c93355e89d1e0ffdd0fa809
> # Parent  80b80af907493879cae865de643de827bda9fdd1
> rbc: fix superfluous rebuilding from scratch - don't abuse self._rbcnamescount

These are queued for default, thanks.

-- 
Mathematics is the supreme nostalgia of our time.
Matt Mackall - July 19, 2016, 4:30 a.m.
On Mon, 2016-07-18 at 23:20 -0500, Matt Mackall wrote:
> On Mon, 2016-07-18 at 22:27 +0200, Mads Kiilerich wrote:
> > 
> > # HG changeset patch
> > # User Mads Kiilerich <madski@unity3d.com>
> > # Date 1468873509 -7200
> > #      Mon Jul 18 22:25:09 2016 +0200
> > # Branch stable
> > # Node ID 84b90b0dba99eb322c93355e89d1e0ffdd0fa809
> > # Parent  80b80af907493879cae865de643de827bda9fdd1
> > rbc: fix superfluous rebuilding from scratch - don't abuse
> > self._rbcnamescount
> These are queued for default, thanks.

Got test failures for the last two, dropping for now.
Mads Kiilerich - July 19, 2016, 10:23 p.m.
On 07/19/2016 06:30 AM, Matt Mackall wrote:
> On Mon, 2016-07-18 at 23:20 -0500, Matt Mackall wrote:
>> On Mon, 2016-07-18 at 22:27 +0200, Mads Kiilerich wrote:
>>> # HG changeset patch
>>> # User Mads Kiilerich <madski@unity3d.com>
>>> # Date 1468873509 -7200
>>> #      Mon Jul 18 22:25:09 2016 +0200
>>> # Branch stable
>>> # Node ID 84b90b0dba99eb322c93355e89d1e0ffdd0fa809
>>> # Parent  80b80af907493879cae865de643de827bda9fdd1
>>> rbc: fix superfluous rebuilding from scratch - don't abuse
>>> self._rbcnamescount
>> These are queued for default, thanks.
> Got test failures for the last two, dropping for now.


I'm sorry about that. A "minor" test change done last minute after 
thorough testing was not completed. I forgot to accept the actual 
output. Fixed in v2.

/Mads

Patch

diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
--- a/mercurial/branchmap.py
+++ b/mercurial/branchmap.py
@@ -358,7 +358,7 @@  class revbranchcache(object):
         self._repo = repo
         self._names = [] # branch names in local encoding with static index
         self._rbcrevs = array('c') # structs of type _rbcrecfmt
-        self._rbcsnameslen = 0
+        self._rbcsnameslen = 0 # length of names read at _rbcsnameslen
         try:
             bndata = repo.vfs.read(_rbcnames)
             self._rbcsnameslen = len(bndata) # for verification before writing
@@ -380,7 +380,8 @@  class revbranchcache(object):
                                len(repo.changelog))
         if self._rbcrevslen == 0:
             self._names = []
-        self._rbcnamescount = len(self._names) # number of good names on disk
+        self._rbcnamescount = len(self._names) # number of names read at
+                                               # _rbcsnameslen
         self._namesreverse = dict((b, r) for r, b in enumerate(self._names))
 
     def _clear(self):
@@ -416,13 +417,17 @@  class revbranchcache(object):
         if cachenode == '\0\0\0\0':
             pass
         elif cachenode == reponode:
-            if branchidx < self._rbcnamescount:
+            try:
                 return self._names[branchidx], close
-            # referenced branch doesn't exist - rebuild is expensive but needed
-            self._repo.ui.debug("rebuilding corrupted revision branch cache\n")
-            self._clear()
+            except IndexError:
+                # recover from invalid reference to unknown branch
+                self._repo.ui.debug("referenced branch names not found"
+                    " - rebuilding revision branch cache from scratch\n")
+                self._clear()
         else:
             # rev/node map has changed, invalidate the cache from here up
+            self._repo.ui.debug("history modification detected - truncating "
+                "revision branch cache to revision %s\n" % rev)
             truncate = rbcrevidx + _rbcrecsize
             del self._rbcrevs[truncate:]
             self._rbcrevslen = min(self._rbcrevslen, truncate)
diff --git a/tests/test-branches.t b/tests/test-branches.t
--- a/tests/test-branches.t
+++ b/tests/test-branches.t
@@ -632,7 +632,7 @@  situation where the cache is out of sync
 cache is rebuilt when corruption is detected
   $ echo > .hg/cache/rbc-names-v1
   $ hg log -r '5:&branch(.)' -T '{rev} ' --debug
-  rebuilding corrupted revision branch cache
+  referenced branch names not found - rebuilding revision branch cache from scratch
   8 9 10 11 12 13 truncating cache/rbc-revs-v1 to 40
   $ f --size --hexdump .hg/cache/rbc-*
   .hg/cache/rbc-names-v1: size=79
@@ -692,7 +692,6 @@  Test for multiple incorrect branch cache
   rebuilding corrupted revision branch cache
   rebuilding corrupted revision branch cache
   truncating cache/rbc-revs-v1 to 8
-BUG: the cache was declared corrupt multiple times and not fully rebuilt:
   $ f --size --hexdump .hg/cache/rbc-*
   .hg/cache/rbc-names-v1: size=14
   0000: 64 65 66 61 75 6c 74 00 62 72 61 6e 63 68       |default.branch|
diff --git a/tests/test-rebase-conflicts.t b/tests/test-rebase-conflicts.t
--- a/tests/test-rebase-conflicts.t
+++ b/tests/test-rebase-conflicts.t
@@ -302,6 +302,7 @@  Check that the right ancestors is used w
   bundle2-input-part: total payload size 1713
   bundle2-input-bundle: 0 parts total
   invalid branchheads cache (served): tip differs
+  history modification detected - truncating revision branch cache to revision 9
   rebase completed
   updating the branch cache
   truncating cache/rbc-revs-v1 to 72