Patchwork [3,of,4,tags-cache-split] tags: extract .hgtags filenodes cache to a standalone file

Submitter Gregory Szorc
Date March 30, 2015, 6:14 a.m.
Message ID <9b396004b4338aa3e011.1427696045@vm-ubuntu-main.gateway.sonic.net>
Permalink /patch/8368/
State Superseded
Commit 07200e3332a169cbbcc8a9fdc1134196c1e1ce22

Comments

Gregory Szorc - March 30, 2015, 6:14 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1426392860 25200
#      Sat Mar 14 21:14:20 2015 -0700
# Node ID 9b396004b4338aa3e011cbf4f13800a3c55b1ab6
# Parent  c19ef55be215f68a0955c49473c2e07b721998ff
tags: extract .hgtags filenodes cache to a standalone file

Resolution of .hgtags filenode values has historically been
a performance pain point for large repositories, where reading
individual manifests can take over 100ms. Multiply that by hundreds
or even thousands of heads, and resolving .hgtags filenodes becomes
a performance issue.

Earlier work to split the tags cache into per-filter files
helped alleviate many of the pain points. However, it didn't
address the overall problem of redundant .hgtags filenode resolution.
In fact, it introduced a new one: redundant .hgtags filenode
resolution for each per-filter cache file.

This patch extracts the .hgtags filenode mapping out of tags
cache files and into a standalone and shared file. After this patch,
the .hgtags filenode for any particular changeset should only have to
be computed once during the lifetime of the repository, no matter
how many filters you have. And, if you add a new filter (e.g. when
obsolescence is enabled in core), clients won't have to spend time
recomputing .hgtags filenodes: they'll be able to seed the set from
an existing cache.
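
As a rough illustration of the "compute once, share across filters" idea, here is a minimal sketch in modern Python (the patch itself targets Python 2). The names resolve_fnodes and lookup are hypothetical; lookup stands in for the expensive manifest read and is assumed to return None when a head has no .hgtags file.

```python
def resolve_fnodes(heads, shared_cache, lookup):
    """Resolve .hgtags filenodes for heads, consulting the shared cache first.

    ``shared_cache`` maps changeset node -> .hgtags filenode and is shared
    by every filter. ``lookup`` is a hypothetical stand-in for the expensive
    manifest read; it returns None when the head has no .hgtags file.
    Returns (resolved mapping, number of expensive lookups performed).
    """
    resolved = {}
    lookups = 0
    for head in heads:
        fnode = shared_cache.get(head)
        if fnode is None:
            # Cache miss: pay for the manifest read and remember the result.
            fnode = lookup(head)
            lookups += 1
            if fnode is not None:
                shared_cache[head] = fnode
        if fnode is not None:
            resolved[head] = fnode
    return resolved, lookups
```

A second filter that calls resolve_fnodes() with the same shared_cache should perform no expensive lookups for heads the first filter already resolved.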

A limitation of this implementation is that the new cache
file will grow without bound. It is possible for every changeset
in a repository to have an entry in the cache. This will be addressed
in a subsequent commit.
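
The on-disk layout described in this patch (pairs of 20-byte nodes, appended over time) can be sketched as follows. This is a standalone approximation in modern Python that uses plain file I/O instead of Mercurial's vfs layer; read_fnodes and append_fnodes are hypothetical names, and unlike the patch this sketch does not swallow write errors.

```python
import os

RECORD_SIZE = 40  # 20-byte changeset node + 20-byte .hgtags filenode


def read_fnodes(path):
    """Return {changeset node: filenode}; a missing or ragged file yields {}."""
    try:
        with open(path, "rb") as fh:
            data = fh.read()
    except OSError:
        return {}
    if len(data) % RECORD_SIZE:
        # Leftover bytes mean a bad write: discard the file and its content.
        os.unlink(path)
        return {}
    return {
        data[off:off + 20]: data[off + 20:off + 40]
        for off in range(0, len(data), RECORD_SIZE)
    }


def append_fnodes(path, existing, fnodes):
    """Append entries absent from ``existing``; return how many were written."""
    # Sorting keeps the file content deterministic, which helps in tests.
    missing = sorted(set(fnodes) - set(existing))
    if missing:
        with open(path, "ab") as fh:
            fh.write(b"".join(node + fnodes[node] for node in missing))
    return len(missing)
```

Because entries are only ever appended, the file grows by 40 bytes per new changeset entry, which is the unbounded growth mentioned above.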
Matt Mackall - March 31, 2015, 1:19 p.m.
On Sun, 2015-03-29 at 23:14 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1426392860 25200
> #      Sat Mar 14 21:14:20 2015 -0700
> # Node ID 9b396004b4338aa3e011cbf4f13800a3c55b1ab6
> # Parent  c19ef55be215f68a0955c49473c2e07b721998ff
> tags: extract .hgtags filenodes cache to a standalone file
> 
> Resolution of .hgtags filenodes values has historically been
> a performance pain point for large repositories, where reading
> individual manifests can take over 100ms. Multiplied by hundreds
> or even thousands of heads and resolving .hgtags filenodes becomes
> a performance issue.
> 
> Earlier work to split the tags cache into per-filter files
> helped alleviate many of the pain points. However, it didn't
> address the overall problem of redundant .hgtags filenode resolution.
> In fact, it introduced a new one: redundant .hgtags filenode
> resolution for each per-filter cache file.

Again, please take a look at the new revbranchcache and either try to
parallel it or explain why you're doing something different. It's
already doing the same thing for rev->branch names that you're doing
for .hgtags nodes, compactly and quickly.
Pierre-Yves David - March 31, 2015, 9:31 p.m.
On 03/31/2015 06:19 AM, Matt Mackall wrote:
> On Sun, 2015-03-29 at 23:14 -0700, Gregory Szorc wrote:
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1426392860 25200
>> #      Sat Mar 14 21:14:20 2015 -0700
>> # Node ID 9b396004b4338aa3e011cbf4f13800a3c55b1ab6
>> # Parent  c19ef55be215f68a0955c49473c2e07b721998ff
>> tags: extract .hgtags filenodes cache to a standalone file
>>
>> Resolution of .hgtags filenodes values has historically been
>> a performance pain point for large repositories, where reading
>> individual manifests can take over 100ms. Multiplied by hundreds
>> or even thousands of heads and resolving .hgtags filenodes becomes
>> a performance issue.
>>
>> Earlier work to split the tags cache into per-filter files
>> helped alleviate many of the pain points. However, it didn't
>> address the overall problem of redundant .hgtags filenode resolution.
>> In fact, it introduced a new one: redundant .hgtags filenode
>> resolution for each per-filter cache file.
>
> Again, please take a look at the new revbranchcache and either try to
> parallel it or explain why you're doing something different. It's
> already doing the same thing for rev->branch names that you're doing
> for .hgtags nodes, compactly and quickly.

I've spent some time in VC with Greg, to talk about how the rev branch cache
collaborates and how such an approach could also be used by the tags cache.

Collaboration is now clearer in Greg's mind, while the way the tag cache
works is clearer in mine. There will be a new version of this series.
Gregory Szorc - April 12, 2015, 10:19 p.m.
On Tue, Mar 31, 2015 at 5:31 PM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 03/31/2015 06:19 AM, Matt Mackall wrote:
>
>> On Sun, 2015-03-29 at 23:14 -0700, Gregory Szorc wrote:
>>
>>> # HG changeset patch
>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>> # Date 1426392860 25200
>>> #      Sat Mar 14 21:14:20 2015 -0700
>>> # Node ID 9b396004b4338aa3e011cbf4f13800a3c55b1ab6
>>> # Parent  c19ef55be215f68a0955c49473c2e07b721998ff
>>> tags: extract .hgtags filenodes cache to a standalone file
>>>
>>> Resolution of .hgtags filenodes values has historically been
>>> a performance pain point for large repositories, where reading
>>> individual manifests can take over 100ms. Multiplied by hundreds
>>> or even thousands of heads and resolving .hgtags filenodes becomes
>>> a performance issue.
>>>
>>> Earlier work to split the tags cache into per-filter files
>>> helped alleviate many of the pain points. However, it didn't
>>> address the overall problem of redundant .hgtags filenode resolution.
>>> In fact, it introduced a new one: redundant .hgtags filenode
>>> resolution for each per-filter cache file.
>>>
>>
>> Again, please take a look at the new revbranchcache and either try to
>> parallel it or explain why you're doing something different. It's
>> already doing the same thing for rev->branch names that you're doing
>> for .hgtags nodes, compactly and quickly.
>>
>
> I've spent some time in VC with Greg, to talk about how the rev branch cache
> collaborates and how such an approach could also be used by the tags cache.
>
> Collaboration is now clearer in Greg's mind, while the way the tag cache
> works is clearer in mine. There will be a new version of this series.
>

We talked about this again today at PyCon. Current plan:

1) Establish a shared file for the .hgtags -> filenode mapping that works
like rev branch cache (binary sparse-populated array)
2) Establish per-filter tags cache files
3) Change format of tags cache files to a) remove .hgtags fnodes b)
properly incorporate hidden revs
4) (distant future) refactor the per-filter tags caches to interact better
to allow reuse, avoid extra computation, etc (similar to how branch head
caches work)

I'm going to try to do 1-3 during the sprint.
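
To make item 1 concrete, here is a minimal sketch of a sparsely populated binary array indexed by revision number. The record layout (a 4-byte changeset node prefix for validation followed by the 20-byte filenode, with 0xff filler for unpopulated slots) is an assumption for illustration only; it is not taken from revbranchcache or from any patch in this series, and set_entry and get_entry are hypothetical names.

```python
import struct

# Hypothetical record layout: 4-byte node prefix + 20-byte .hgtags filenode.
RECORD = struct.Struct(">4s20s")
EMPTY = b"\xff" * RECORD.size


def set_entry(buf, rev, node, fnode):
    """Store ``fnode`` for revision ``rev`` in bytearray ``buf``, growing it."""
    end = (rev + 1) * RECORD.size
    if len(buf) < end:
        # Pad skipped revisions with filler so offsets stay rev * RECORD.size.
        buf.extend(b"\xff" * (end - len(buf)))
    RECORD.pack_into(buf, rev * RECORD.size, node[:4], fnode)


def get_entry(buf, rev, node):
    """Return the cached filenode for ``rev``, or None if absent or stale."""
    start = rev * RECORD.size
    record = bytes(buf[start:start + RECORD.size])
    if len(record) < RECORD.size or record == EMPTY:
        return None
    prefix, fnode = RECORD.unpack(record)
    if prefix != node[:4]:
        # The revision number now names a different changeset (e.g. after
        # a strip), so the stored value cannot be trusted.
        return None
    return fnode
```

With fixed-size records, a lookup is a single offset computation, and the file size is bounded by the number of revisions instead of growing with every append.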

Patch

diff --git a/mercurial/tags.py b/mercurial/tags.py
--- a/mercurial/tags.py
+++ b/mercurial/tags.py
@@ -9,9 +9,9 @@ 
 # Currently this module only deals with reading and caching tags.
 # Eventually, it could take care of updating (adding/removing/moving)
 # tags too.
 
-from node import nullid, bin, hex, short
+from node import nullid, nullrev, bin, hex, short
 from i18n import _
 import util
 import encoding
 import error
@@ -29,20 +29,28 @@  import time
 #   <headrev> <headnode> [<hgtagsnode>]
 #
 # *OR* a single line of the form:
 #
-#   "external" <tiprev> <tipnode>
+#   "external" <tiprev> <tipnode> <filterhash>
 #
 # The first form is the historical method of storing the .hgtags filenode
-# mapping inline. The second form (which is reserved for future use) uses
-# a separate file for this data.
+# mapping inline. The second form utilizes a separate file for storing the
+# .hgtags filenode mapping.
 #
 # <headrev> is an integer revision and <headnode> is a 40 character hex
 # node for that changeset. These redundantly identify a repository
 # head from the time the cache was written.
 #
-# <tagnode> is the filenode of .hgtags on that head. Heads with no .hgtags
-# file will have no <hgtagsnode> (just 2 values per line).
+# <hgtagsnode> is the filenode of .hgtags on that head. Heads with no
+# .hgtags file will have no <hgtagsnode> (just 2 values per line).
+#
+# For external filenode caches, the first line contains the tip revision,
+# node, and hash of the filter. These are used for cache validation.
+#
+# The purpose of an external filenode cache is so caches for different
+# filters can share data. Without this shared cache, each filter/cache
+# would have to perform its own .hgtags filenode resolution. This can
+# be quite expensive and lead to significant performance issues.
 #
 # The filenode cache is ordered from tip to oldest (which is part of why
 # <headrev> is there: a quick check of the tip from when the cache was
 # written against the current tip is all that is needed to check whether
@@ -89,9 +97,10 @@  def findglobaltags(ui, repo, alltags, ta
     # tags when we pass it to _writetagcache().
     assert len(alltags) == len(tagtypes) == 0, \
            "findglobaltags() should be called first"
 
-    (heads, tagfnode, cachetags, shouldwrite) = _readtagcache(ui, repo)
+    heads, tagfnode, cachetags, shouldwrite, cacheinfo = \
+            _readtagcache(ui, repo)
     if cachetags is not None:
         assert not shouldwrite
         # XXX is this really 100% correct?  are there oddball special
         # cases where a global tag should outrank a local tag but won't,
@@ -118,9 +127,9 @@  def findglobaltags(ui, repo, alltags, ta
             _updatetags(filetags, 'global', alltags, tagtypes)
 
     # and update the cache (if necessary)
     if shouldwrite:
-        _writetagcache(ui, repo, heads, tagfnode, alltags)
+        _writetagcache(ui, repo, heads, tagfnode, alltags, cacheinfo)
 
 def readlocaltags(ui, repo, alltags, tagtypes):
     '''Read local tags in repo. Update alltags and tagtypes.'''
     try:
@@ -257,9 +266,9 @@  def _filename(repo):
 
 def _readtagcache(ui, repo):
     '''Read the tag cache.
 
-    Returns a tuple (heads, fnodes, cachetags, shouldwrite).
+    Returns a tuple (heads, fnodes, cachetags, shouldwrite, cacheinfo).
 
     If the cache is completely up-to-date, "cachetags" is a dict of the
     form returned by _readtags() and "heads" and "fnodes" are None and
     "shouldwrite" is False.
@@ -270,66 +279,105 @@  def _readtagcache(ui, repo):
     True.
 
     If the cache is not up to date, the caller is responsible for reading tag
     info from each returned head. (See findglobaltags().)
+
+    The "cacheinfo" returned should be treated as a black box and passed
+    to _writetagcache() for inclusion in the written cache file.
     '''
+    import repoview # avoid cycle
     try:
         cachefile = repo.vfs(_filename(repo), 'r')
         # force reading the file for static-http
         cachelines = iter(cachefile)
     except IOError:
         cachefile = None
 
-    cacherevs = []  # list of headrev
-    cacheheads = [] # list of headnode
+    cacheheads = set() # set of head nodes
     cachefnode = {} # map headnode to filenode
+    # Cache validation values.
+    lastrev = nullrev
+    lastnode = nullid
+    lastfilterhash = None
     if cachefile:
         try:
+            external = False
             for line in cachelines:
                 if line == "\n":
                     break
 
-                # Future version of cache encountered. Do nothing yet.
+                # The reference to external .hgtags filenodes also defines
+                # the cache validation data.
                 if line.startswith('external '):
+                    external = True
+                    cachekey = line.split()[1:]
+                    lastrev = int(cachekey[0])
+                    lastnode = bin(cachekey[1])
+                    # Value isn't written if repo is empty.
+                    try:
+                        lastfilterhash = bin(cachekey[2])
+                    except IndexError:
+                        pass
                     continue
 
+                if external:
+                    raise ValueError('should not encounter head nodes with '
+                                     'external hgtags filenode cache')
+
                 line = line.split()
-                cacherevs.append(int(line[0]))
+                if lastrev == nullrev:
+                    lastrev = int(line[0])
                 headnode = bin(line[1])
-                cacheheads.append(headnode)
+                if lastnode == nullid:
+                    lastnode = headnode
                 if len(line) == 3:
                     fnode = bin(line[2])
                     cachefnode[headnode] = fnode
         except Exception:
             # corruption of the tags cache, just recompute it
             ui.warn(_('.hg/%s is corrupt, rebuilding it\n') % _filename(repo))
-            cacheheads = []
-            cacherevs = []
             cachefnode = {}
 
     tipnode = repo.changelog.tip()
     tiprev = len(repo.changelog) - 1
 
-    # Case 1 (common): tip is the same, so nothing has changed.
-    # (Unchanged tip trivially means no changesets have been added.
-    # But, thanks to localrepository.destroyed(), it also means none
-    # have been destroyed by strip or rollback.)
-    if cacheheads and cacheheads[0] == tipnode and cacherevs[0] == tiprev:
+    # We compare the current state of the repo against what was recorded
+    # in the cache. If they are equivalent, the cache is up to date and
+    # no additional processing is required. This is similar to logic in
+    # the branch caches.
+    fresh = False
+    filteredhash = repoview.filteredhash(repo, tiprev)
+    try:
+        fresh = (cachefile
+                 and (tiprev == lastrev)
+                 and (lastnode == tipnode)
+                 and (lastfilterhash == filteredhash))
+    except IndexError:
+        pass
+
+    cacheinfo = (tiprev, tipnode, filteredhash, {})
+
+    # Case 1 (common): repository and filter unaltered since last time.
+    # Go ahead and use existing tags values.
+    if fresh:
         tags, warned = _readtags(ui, repo, cachelines, cachefile.name)
         cachefile.close()
         cachefile = None
         # If tag reading had issues, fall through and repair the cache.
         if not warned:
-            return (None, None, tags, False)
+            return None, None, tags, False, cacheinfo
         ui.warn(_('.hg/%s is corrupt; rebuilding it\n') % _filename(repo))
+
+    # Cache isn't fresh or is corrupt. Either way, we don't care about the
+    # cache file content any more.
     if cachefile:
-        cachefile.close()               # ignore rest of file
+        cachefile.close()
 
     repoheads = repo.heads()
     # Case 2 (uncommon): empty repo; get out quickly and don't bother
     # writing an empty cache.
     if repoheads == [nullid]:
-        return ([], {}, {}, False)
+        return [], {}, {}, False, cacheinfo
 
     # Case 3 (uncommon): cache file missing or empty.
 
     # Case 4 (uncommon): tip rev decreased.  This should only happen
@@ -345,69 +393,65 @@  def _readtagcache(ui, repo):
     # exposed".
     if not len(repo.file('.hgtags')):
         # No tags have ever been committed, so we can avoid a
         # potentially expensive search.
-        return (repoheads, cachefnode, None, True)
+        return repoheads, cachefnode, None, True, cacheinfo
 
     starttime = time.time()
 
     newheads = [head
                 for head in repoheads
-                if head not in set(cacheheads)]
+                if head not in cacheheads]
+
+    existingfnodes = _readhgtagsfnodescache(ui, repo)
+    # This gets carried through to _updatehgtagsfnodescache() so we only have
+    # to read the file once.
+    cacheinfo[3].update(existingfnodes)
+    manifestlookupcount = 0
 
     # Now we have to lookup the .hgtags filenode for every new head.
     # This is the most expensive part of finding tags, so performance
     # depends primarily on the size of newheads.  Worst case: no cache
     # file, so newheads == repoheads.
     for head in reversed(newheads):
+        # Look in the supplemental hgtags fnodes cache first.
+        fnode = existingfnodes.get(head)
+        if fnode:
+            cachefnode[head] = fnode
+            continue
+
         cctx = repo[head]
         try:
             fnode = cctx.filenode('.hgtags')
             cachefnode[head] = fnode
+            manifestlookupcount += 1
         except error.LookupError:
             # no .hgtags file on this head
             pass
 
     duration = time.time() - starttime
     ui.log('tagscache',
            'resolved %d tags cache entries from %d manifests in %0.4f '
            'seconds\n',
-           len(cachefnode), len(newheads), duration)
+           len(cachefnode), manifestlookupcount, duration)
 
     # Caller has to iterate over all heads, but can use the filenodes in
     # cachefnode to get to each .hgtags revision quickly.
-    return (repoheads, cachefnode, None, True)
+    return repoheads, cachefnode, None, True, cacheinfo
 
-def _writetagcache(ui, repo, heads, tagfnode, cachetags):
+def _writetagcache(ui, repo, heads, tagfnode, cachetags, cacheinfo):
     try:
         cachefile = repo.vfs(_filename(repo), 'w', atomictemp=True)
     except (OSError, IOError):
         return
 
     ui.log('tagscache', 'writing tags cache file with %d heads and %d tags\n',
             len(heads), len(cachetags))
 
-    realheads = repo.heads()            # for sanity checks below
-    for head in heads:
-        # temporary sanity checks; these can probably be removed
-        # once this code has been in crew for a few weeks
-        assert head in repo.changelog.nodemap, \
-               'trying to write non-existent node %s to tag cache' % short(head)
-        assert head in realheads, \
-               'trying to write non-head %s to tag cache' % short(head)
-        assert head != nullid, \
-               'trying to write nullid to tag cache'
-
-        # This can't fail because of the first assert above.  When/if we
-        # remove that assert, we might want to catch LookupError here
-        # and downgrade it to a warning.
-        rev = repo.changelog.rev(head)
-
-        fnode = tagfnode.get(head)
-        if fnode:
-            cachefile.write('%d %s %s\n' % (rev, hex(head), hex(fnode)))
-        else:
-            cachefile.write('%d %s\n' % (rev, hex(head)))
+    cachekey = [str(cacheinfo[0]), hex(cacheinfo[1])]
+    if cacheinfo[2] is not None:
+        cachekey.append(hex(cacheinfo[2]))
+    cachefile.write('external %s\n' % ' '.join(cachekey))
 
     # Tag names in the cache are in UTF-8 -- which is the whole reason
     # we keep them in UTF-8 throughout this module.  If we converted
     # them local encoding on input, we would lose info writing them to
@@ -421,4 +465,71 @@  def _writetagcache(ui, repo, heads, tagf
     try:
         cachefile.close()
     except (OSError, IOError):
         pass
+
+    _updatehgtagsfnodescache(ui, repo, cacheinfo[3], tagfnode)
+
+_fnodescachefile = 'cache/hgtagsfnodes1'
+
+def _readhgtagsfnodescache(ui, repo):
+    """Read the cache mapping changeset nodes to .hgtags filenodes.
+
+    The cache consists of pairs of 20-byte nodes.
+
+    No validation of the entries is performed other than a spot test
+    that the file doesn't contain extra data. Since nodes are derived
+    from content and are deterministic, mappings are constant. The
+    only thing that can change is that a changeset may be stripped.
+    Callers must perform their own checking to ensure "unknown"
+    entries don't leak out.
+    """
+    data = repo.vfs.tryread(_fnodescachefile)
+    nodes = {}
+    l = len(data)
+    offset = 0
+    while offset + 40 <= l:
+        node = data[offset:offset + 20]
+        fnode = data[offset + 20:offset + 40]
+        nodes[node] = fnode
+        offset += 40
+
+    ui.log('tagscache',
+           'read %d entries from hgtags filenodes cache\n',
+           len(nodes))
+
+    # If we have data left over, something wasn't written properly.
+    # We remove the invalid cache and throw away any data that was
+    # read since we can't trust it.
+    if offset != l:
+        ui.warn(_('.hg/%s is corrupt; it will be rebuilt\n') %
+                  _fnodescachefile)
+        repo.vfs.unlink(_fnodescachefile)
+        nodes = {}
+
+    return nodes
+
+def _updatehgtagsfnodescache(ui, repo, existing, fnodes):
+    """Update the cache file mapping changeset nodes to .hgtags fnodes.
+
+    For now, all entries are preserved for all of time. In the future, we
+    should consider pruning this cache so its growth isn't unbounded.
+    """
+    # Append new entries instead of re-writing existing content to reduce
+    # I/O.
+    missing = set(fnodes.keys()) - set(existing.keys())
+
+    entries = []
+    # sorted() is to make tests sane.
+    for node in sorted(missing):
+        fnode = fnodes[node]
+        entries.append('%s%s' % (node, fnode))
+
+    data = ''.join(entries)
+    try:
+        repo.vfs.append(_fnodescachefile, data)
+    except (IOError, OSError):
+        pass
+
+    ui.log('tagscache',
+           'appended %d entries to hgtags filenodes cache\n',
+           len(missing))
diff --git a/tests/test-blackbox.t b/tests/test-blackbox.t
--- a/tests/test-blackbox.t
+++ b/tests/test-blackbox.t
@@ -130,11 +130,13 @@  tags cache gets logged
   $ hg tag -m 'create test tag' test-tag
   $ hg tags
   tip                                3:5b5562c08298
   test-tag                           2:d02f48003e62
-  $ hg blackbox -l 3
+  $ hg blackbox -l 5
+  1970/01/01 00:00:00 bob> read 0 entries from hgtags filenodes cache
   1970/01/01 00:00:00 bob> resolved 1 tags cache entries from 1 manifests in ?.???? seconds (glob)
   1970/01/01 00:00:00 bob> writing tags cache file with 2 heads and 1 tags
+  1970/01/01 00:00:00 bob> appended 1 entries to hgtags filenodes cache
   1970/01/01 00:00:00 bob> tags exited 0 after ?.?? seconds (glob)
 
 extension and python hooks - use the eol extension for a pythonhook
 
diff --git a/tests/test-mq.t b/tests/test-mq.t
--- a/tests/test-mq.t
+++ b/tests/test-mq.t
@@ -316,9 +316,9 @@  Dump the tag cache to ensure that it has
 
 .hg/cache/tags1-visible (pre qpush):
 
   $ cat .hg/cache/tags1-visible
-  1 [\da-f]{40} (re)
+  external 1 [\da-f]{40} (re)
   
   $ hg qpush
   applying test.patch
   now at: test.patch
@@ -328,9 +328,9 @@  Dump the tag cache to ensure that it has
 
 .hg/cache/tags1-visible (post qpush):
 
   $ cat .hg/cache/tags1-visible
-  2 [\da-f]{40} (re)
+  external 2 [\da-f]{40} (re)
   
   $ checkundo qpush
   $ cd ..
 
diff --git a/tests/test-obsolete-tag-cache.t b/tests/test-obsolete-tag-cache.t
--- a/tests/test-obsolete-tag-cache.t
+++ b/tests/test-obsolete-tag-cache.t
@@ -36,10 +36,9 @@  Trigger tags cache population by doing s
   o  0:55482a6fb4b1 test1 initial
   
 
   $ cat .hg/cache/tags1-visible
-  4 042eb6bfcc4909bad84a1cbf6eb1ddf0ab587d41
-  3 c3cb30f2d2cd0aae008cc91a07876e3c5131fd22 b3bce87817fe7ac9dca2834366c1d7534c095cf1
+  external 4 042eb6bfcc4909bad84a1cbf6eb1ddf0ab587d41
   
   55482a6fb4b1881fa8f746fd52cf6f096bb21c89 test1
   d75775ffbc6bca1794d300f5571272879bd280da test2
 
@@ -81,17 +80,16 @@  Trigger population on unfiltered repo
 
 visible cache should only contain visible head (issue4550)
 
   $ cat .hg/cache/tags1-visible
-  7 eb610439e10e0c6b296f97b59624c2e24fc59e30 b3bce87817fe7ac9dca2834366c1d7534c095cf1
+  external 7 eb610439e10e0c6b296f97b59624c2e24fc59e30 2fce1eec33263d08a4d04293960fc73a555230e4
   
   55482a6fb4b1881fa8f746fd52cf6f096bb21c89 test1
   d75775ffbc6bca1794d300f5571272879bd280da test2
 
 unfiltered cache should contain all topological heads (issue4550)
 
   $ cat .hg/cache/tags1-unfiltered
-  7 eb610439e10e0c6b296f97b59624c2e24fc59e30 b3bce87817fe7ac9dca2834366c1d7534c095cf1
-  3 c3cb30f2d2cd0aae008cc91a07876e3c5131fd22 b3bce87817fe7ac9dca2834366c1d7534c095cf1
+  external 7 eb610439e10e0c6b296f97b59624c2e24fc59e30
   
   55482a6fb4b1881fa8f746fd52cf6f096bb21c89 test1
   d75775ffbc6bca1794d300f5571272879bd280da test2
diff --git a/tests/test-tags.t b/tests/test-tags.t
--- a/tests/test-tags.t
+++ b/tests/test-tags.t
@@ -2,8 +2,11 @@  Helper functions:
 
   $ cacheexists() {
   >   [ -f .hg/cache/tags1-visible ] && echo "tag cache exists" || echo "no tag cache"
   > }
+  $ fnodescacheexists() {
+  >   [ -f .hg/cache/hgtagsfnodes1 ] && echo "fnodes cache exists" || echo "no fnodes cache"
+  > }
 
   $ dumptags() {
   >     rev=$1
   >     echo "rev $rev: .hgtags:"
@@ -19,12 +22,16 @@  Setup:
   $ hg init t
   $ cd t
   $ cacheexists
   no tag cache
+  $ fnodescacheexists
+  no fnodes cache
   $ hg id
   000000000000 tip
   $ cacheexists
   no tag cache
+  $ fnodescacheexists
+  no fnodes cache
   $ echo a > a
   $ hg add a
   $ hg commit -m "test"
   $ hg co
@@ -32,8 +39,12 @@  Setup:
   $ hg identify
   acb14030fe0a tip
   $ cacheexists
   tag cache exists
+  $ fnodescacheexists
+  fnodes cache exists
+  $ f --size --hexdump .hg/cache/hgtagsfnodes1
+  .hg/cache/hgtagsfnodes1: size=0
 
 Try corrupting the cache
 
   $ printf 'a b' > .hg/cache/tags1-visible
@@ -66,18 +77,29 @@  Create a tag behind hg's back:
   first                              0:acb14030fe0a
   $ hg identify
   b9154636be93 tip
 
+  $ f --size --hexdump .hg/cache/hgtagsfnodes1
+  .hg/cache/hgtagsfnodes1: size=40
+  0000: b9 15 46 36 be 93 8d 3d 43 1e 75 a7 c9 06 50 4a |..F6...=C.u...PJ|
+  0010: 07 9b fe 07 26 b7 b4 a7 73 e0 9e e3 c5 2f 51 0e |....&...s..../Q.|
+  0020: 19 e0 5e 1f f9 66 d8 59                         |..^..f.Y|
+
 Repeat with cold tag cache:
 
-  $ rm -f .hg/cache/tags1-visible
+  $ rm -f .hg/cache/tags1-visible .hg/cache/hgtagsfnodes1
   $ hg identify
   b9154636be93 tip
+  $ f --size --hexdump .hg/cache/hgtagsfnodes1
+  .hg/cache/hgtagsfnodes1: size=40
+  0000: b9 15 46 36 be 93 8d 3d 43 1e 75 a7 c9 06 50 4a |..F6...=C.u...PJ|
+  0010: 07 9b fe 07 26 b7 b4 a7 73 e0 9e e3 c5 2f 51 0e |....&...s..../Q.|
+  0020: 19 e0 5e 1f f9 66 d8 59                         |..^..f.Y|
 
 And again, but now unable to write tag cache:
 
 #if unix-permissions
-  $ rm -f .hg/cache/tags1-visible
+  $ rm -f .hg/cache/tags1-visible .hg/cache/hgtagsfnodes1
   $ chmod 555 .hg
   $ hg identify
   b9154636be93 tip
   $ chmod 755 .hg
@@ -104,8 +126,15 @@  Create a branch:
   created new head
   $ hg id
   c8edf04160c7 tip
 
+No new .hgtags fnode should be written since we didn't access tags info
+  $ f --size --hexdump .hg/cache/hgtagsfnodes1
+  .hg/cache/hgtagsfnodes1: size=40
+  0000: b9 15 46 36 be 93 8d 3d 43 1e 75 a7 c9 06 50 4a |..F6...=C.u...PJ|
+  0010: 07 9b fe 07 26 b7 b4 a7 73 e0 9e e3 c5 2f 51 0e |....&...s..../Q.|
+  0020: 19 e0 5e 1f f9 66 d8 59                         |..^..f.Y|
+
 Merge the two heads:
 
   $ hg merge 1
   1 files updated, 0 files merged, 0 files removed, 0 files unresolved
@@ -119,8 +148,15 @@  Merge the two heads:
 Create a fake head, make sure tag not visible afterwards:
 
   $ cp .hgtags tags
   $ hg tag last
+  $ f --size --hexdump .hg/cache/hgtagsfnodes1
+  .hg/cache/hgtagsfnodes1: size=80
+  0000: b9 15 46 36 be 93 8d 3d 43 1e 75 a7 c9 06 50 4a |..F6...=C.u...PJ|
+  0010: 07 9b fe 07 26 b7 b4 a7 73 e0 9e e3 c5 2f 51 0e |....&...s..../Q.|
+  0020: 19 e0 5e 1f f9 66 d8 59 ac 5e 98 0c 4d c0 0f 2d |..^..f.Y.^..M..-|
+  0030: 1b 5c 60 72 2a 62 23 75 2b d4 a5 1d 26 b7 b4 a7 |.\`r*b#u+...&...|
+  0040: 73 e0 9e e3 c5 2f 51 0e 19 e0 5e 1f f9 66 d8 59 |s..../Q...^..f.Y|
   $ hg rm .hgtags
   $ hg commit -m "remove"
 
   $ mv tags .hgtags
@@ -129,8 +165,18 @@  Create a fake head, make sure tag not vi
   $ 
   $ hg tags
   tip                                6:35ff301afafe
   first                              0:acb14030fe0a
+  $ f --size --hexdump .hg/cache/hgtagsfnodes1
+  .hg/cache/hgtagsfnodes1: size=120
+  0000: b9 15 46 36 be 93 8d 3d 43 1e 75 a7 c9 06 50 4a |..F6...=C.u...PJ|
+  0010: 07 9b fe 07 26 b7 b4 a7 73 e0 9e e3 c5 2f 51 0e |....&...s..../Q.|
+  0020: 19 e0 5e 1f f9 66 d8 59 ac 5e 98 0c 4d c0 0f 2d |..^..f.Y.^..M..-|
+  0030: 1b 5c 60 72 2a 62 23 75 2b d4 a5 1d 26 b7 b4 a7 |.\`r*b#u+...&...|
+  0040: 73 e0 9e e3 c5 2f 51 0e 19 e0 5e 1f f9 66 d8 59 |s..../Q...^..f.Y|
+  0050: 35 ff 30 1a fa fe 30 af ed 40 e3 7f 41 10 1b 7a |5.0...0..@..A..z|
+  0060: 5c 7b 88 d8 26 b7 b4 a7 73 e0 9e e3 c5 2f 51 0e |\{..&...s..../Q.|
+  0070: 19 e0 5e 1f f9 66 d8 59                         |..^..f.Y|
 
 Add invalid tags:
 
   $ echo "spam" >> .hgtags
@@ -159,10 +205,9 @@  Report tag parse error on other head:
   tip                                8:c4be69a18c11
   first                              0:acb14030fe0a
 
   $ cat .hg/cache/tags1-visible
-  8 c4be69a18c11e8bc3a5fdbb576017c25f7d84663 9876b1193cfc564b1518d1f1b4459028ec75bf18
-  7 75d9f02dfe2874aa938ee8c18fa27c1328cfb023 7371bc5168f70e1b7c8dbf7c8bedf9d79f51dd82
+  external 8 c4be69a18c11e8bc3a5fdbb576017c25f7d84663
   
   acb14030fe0a21b60322c440ad2d20cf7685a376 first
 
   $ hg tip
@@ -178,10 +223,9 @@  Test a similar corruption issue in the t
 We should see a warning when loading the tags cache and a repaired version
 should be written automatically.
 
   $ cat > .hg/cache/tags1-visible << EOF
-  > 8 c4be69a18c11e8bc3a5fdbb576017c25f7d84663 9876b1193cfc564b1518d1f1b4459028ec75bf18
-  > 7 75d9f02dfe2874aa938ee8c18fa27c1328cfb023 7371bc5168f70e1b7c8dbf7c8bedf9d79f51dd82
+  > external 8 c4be69a18c11e8bc3a5fdbb576017c25f7d84663
   > 
   > zzabc123 first
   > EOF
 
@@ -199,10 +243,9 @@  should be written automatically.
   summary:     head
   
 
   $ cat .hg/cache/tags1-visible
-  8 c4be69a18c11e8bc3a5fdbb576017c25f7d84663 9876b1193cfc564b1518d1f1b4459028ec75bf18
-  7 75d9f02dfe2874aa938ee8c18fa27c1328cfb023 7371bc5168f70e1b7c8dbf7c8bedf9d79f51dd82
+  external 8 c4be69a18c11e8bc3a5fdbb576017c25f7d84663
   
   acb14030fe0a21b60322c440ad2d20cf7685a376 first
 
   $ hg tip
@@ -263,20 +306,20 @@  Detailed dump of tag info:
 
 Dump cache:
 
   $ cat .hg/cache/tags1-visible
-  4 0c192d7d5e6b78a714de54a2e9627952a877e25a 0c04f2a8af31de17fab7422878ee5a2dadbc943d
-  3 6fa450212aeb2a21ed616a54aea39a4a27894cd7 7d3b718c964ef37b89e550ebdafd5789e76ce1b0
-  2 7a94127795a33c10a370c93f731fd9fea0b79af6 0c04f2a8af31de17fab7422878ee5a2dadbc943d
+  external 4 0c192d7d5e6b78a714de54a2e9627952a877e25a
   
   bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   78391a272241d70354aa14c874552cad6b51bb42 bar
 
-External .hgtags filenode cache marker is handled
+Old caches with inline .hgtags filenodes are recognized
 
   $ cat > .hg/cache/tags1-visible << EOF
-  > external 4 0c192d7d5e6b78a714de54a2e9627952a877e25a 2e21d3312350ce63785cda82526c951211e76bab
+  > 4 0c192d7d5e6b78a714de54a2e9627952a877e25a 0c04f2a8af31de17fab7422878ee5a2dadbc943d
+  > 3 6fa450212aeb2a21ed616a54aea39a4a27894cd7 7d3b718c964ef37b89e550ebdafd5789e76ce1b0
+  > 7a94127795a33c10a370c93f731fd9fea0b79af6 0c04f2a8af31de17fab7422878ee5a2dadbc943d
   > 
   > bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   > bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   > 78391a272241d70354aa14c874552cad6b51bb42 bar
@@ -285,19 +328,63 @@  External .hgtags filenode cache marker i
   $ hg tags
   tip                                4:0c192d7d5e6b
   bar                                1:78391a272241
 
-We should get an old style cache again
+The cache is not re-written until a write occurs
 
   $ cat .hg/cache/tags1-visible
   4 0c192d7d5e6b78a714de54a2e9627952a877e25a 0c04f2a8af31de17fab7422878ee5a2dadbc943d
   3 6fa450212aeb2a21ed616a54aea39a4a27894cd7 7d3b718c964ef37b89e550ebdafd5789e76ce1b0
-  2 7a94127795a33c10a370c93f731fd9fea0b79af6 0c04f2a8af31de17fab7422878ee5a2dadbc943d
+  7a94127795a33c10a370c93f731fd9fea0b79af6 0c04f2a8af31de17fab7422878ee5a2dadbc943d
   
   bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   78391a272241d70354aa14c874552cad6b51bb42 bar
 
+  $ echo dummy > foo
+  $ hg commit -m throwaway
+  $ hg tags
+  tip                                5:a967e26c62ab
+  bar                                1:78391a272241
+
+  $ cat .hg/cache/tags1-visible
+  external 5 a967e26c62abddd3e1e1fb7fb35f6accc2b5e94d
+  
+  bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  78391a272241d70354aa14c874552cad6b51bb42 bar
+
+Corrupt the .hgtags fnodes cache
+
+  $ echo 'extra' >> .hg/cache/hgtagsfnodes1
+  $ echo dummy2 > foo
+  $ hg commit -m throwaway2
+  $ hg tags
+  .hg/cache/hgtagsfnodes1 is corrupt; it will be rebuilt
+  tip                                6:039af0ff94d0
+  bar                                1:78391a272241
+
+#if unix-permissions no-root
+Errors writing to .hgtags fnodes cache are silently ignored
+
+  $ rm .hg/cache/hgtagsfnodes1 && touch .hg/cache/hgtagsfnodes1
+  $ chmod a-w .hg/cache/hgtagsfnodes1
+  $ rm .hg/cache/tags1-visible
+  $ hg tags
+  tip                                6:039af0ff94d0
+  bar                                1:78391a272241
+  $ cat .hg/cache/tags1-visible
+  external 6 039af0ff94d0d4827b5c22fc5248db229b17cf29
+  
+  bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  78391a272241d70354aa14c874552cad6b51bb42 bar
+
+  $ chmod a+w .hg/cache/hgtagsfnodes1
+#endif
+
+  $ hg -q --config extensions.strip= strip -r 5: --no-backup
+
 Test tag removal:
 
   $ hg tag --remove bar     # rev 5
   $ hg tip -vp
@@ -399,9 +486,9 @@  Strip 2: destroy whole branch, no old he
   saved backup bundle to $TESTTMP/t3/.hg/strip-backup/*-backup.hg (glob)
   $ hg tags                  # partly stale
   tip                                4:735c3ca72986
   bar                                0:bbd179dfa0a7
-  $ rm -f .hg/cache/tags1-visible
+  $ rm -f .hg/cache/tags1-visible .hg/cache/hgtagsfnodes1
   $ hg tags                  # cold cache
   tip                                4:735c3ca72986
   bar                                0:bbd179dfa0a7