Patchwork D6711: branchheads: store wdir-dependent caches in wcache (issue6181)

login
register
mail settings
Submitter phabricator
Date Aug. 3, 2019, 1:27 a.m.
Message ID <differential-rev-PHID-DREV-globmffs4dlfltoytlck-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/41134/
State New
Headers show

Comments

phabricator - Aug. 3, 2019, 1:27 a.m.
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously, all branch2-* caches were stored in .hg/cache, which is shared
  across repos when using `hg share`. This can cause cache thrashing when the
  repos are accessed concurrently, since some of the caches depend on wdir state
  (which is NOT shared across the repos).
  
  There's already a cache directory for caches that depend on wdir state, wcache,
  so let's just put the caches there.
  
  This change does not clean up any existing caches in .hg/cache that are now
  moved to the new location, and it does not re-use the caches from that location
  when constructing them in the wcache directory. Upgrades to versions of
  Mercurial that have this commit will need to regenerate their branchheads once, and
  this may be an expensive operation.
  
  The format of these cache files is not changing - forwards/backwards
  compatibility remains; if a newer repo (with these in wcache) is accessed by an
  older hg, the older hg will either see the files that remain in .hg/cache, and
  update them there, or will not see the files at all and will fall back to a
  slower path (and then likely cache them in .hg/cache). If it's then accessed by
  a newer hg, the work done by the older hg is ignored, and the wcache copies are
  brought up to date.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6711

AFFECTED FILES
  mercurial/branchmap.py
  mercurial/utils/repoviewutil.py
  tests/test-branchmap-cache.t

CHANGE DETAILS




To: spectral, #hg-reviewers
Cc: mercurial-devel
phabricator - Aug. 3, 2019, 11:49 a.m.
marmoute added a comment.


  The change looks good to me, However we probably want to introduce a new filter level 'wdir-independent-visible' to ensure we a good branchcache cache in .hg/cache that most share can use and that will be kept up to date. This also means we need to make sure it is warmed after transaction.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6711/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6711

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Aug. 6, 2019, 1:11 a.m.
spectral added a comment.


  In D6711#98248 <https://phab.mercurial-scm.org/D6711#98248>, @marmoute wrote:
  
  > The change looks good to me, However we probably want to introduce a new filter level 'wdir-independent-visible' to ensure we a good branchcache cache in .hg/cache that most share can use and that will be kept up to date. This also means we need to make sure it is warmed after transaction.
  
  I agree such a thing would be useful, and pretty easy to add (just needs to not remove items considered "pinned" during computehidden, but I'm a little concerned about the proliferation of these things. I think ideally we'd end up with:
  
  visible-hidden > visible > wdir-independent-visible > wdir-independent-served > wdir-independent-served.hidden > immutable > base
  and
  served > wdir-independent-served > wdir-independent-served.hidden > immutable > base
  and
  served.hidden > wdir-independent-served.hidden > immutable > base
  
  I think?
  
  (Not sure if we need a wdir-independent-visible-hidden, I still don't really understand visible-hidden, I just know that updating it actually breaks a Lot of stuff, so I avoided doing so in the first commit in the stack. :))

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6711/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6711

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Aug. 8, 2019, 5:01 p.m.
marmoute added a comment.


  In D6711#98318 <https://phab.mercurial-scm.org/D6711#98318>, @spectral wrote:
  
  > In D6711#98248 <https://phab.mercurial-scm.org/D6711#98248>, @marmoute wrote:
  >
  >> The change looks good to me, However we probably want to introduce a new filter level 'wdir-independent-visible' to ensure we a good branchcache cache in .hg/cache that most share can use and that will be kept up to date. This also means we need to make sure it is warmed after transaction.
  >
  > I agree such a thing would be useful, and pretty easy to add (just needs to not remove items considered "pinned" during computehidden, but I'm a little concerned about the proliferation of these things. I think ideally we'd end up with:
  > visible-hidden > visible > wdir-independent-visible > wdir-independent-served > wdir-independent-served.hidden > immutable > base
  > and
  > served > wdir-independent-served > wdir-independent-served.hidden > immutable > base
  > and
  > served.hidden > wdir-independent-served.hidden > immutable > base
  > I think?
  
  Yeah, this makes a lot of them, but the alternative (having non-up-to-date cache in the share-source would a significant issue). In the current situation you fix one bug, reintroducing another. I wonder if we have a "simpler or saner" way to introduce these various disctinction. Let me think about it a bit.
  
  > (Not sure if we need a wdir-independent-visible-hidden, I still don't really understand visible-hidden, I just know that updating it actually breaks a Lot of stuff, so I avoided doing so in the first commit in the stack. :))
  
  `visible-hidden` still exclude all the `internal` phase commit that we never want to expose to anyone. (yeah the name is … confusing)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6711/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6711

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Sept. 5, 2019, 7:06 p.m.
spectral added a comment.
spectral planned changes to this revision.


  I'm not sure what the current state of this is, but I don't want it lingering in the review queue while I'm not actively working on it.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6711/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6711

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel

Patch

diff --git a/tests/test-branchmap-cache.t b/tests/test-branchmap-cache.t
new file mode 100644
--- /dev/null
+++ b/tests/test-branchmap-cache.t
@@ -0,0 +1,38 @@ 
+  $ cat >> $HGRCPATH << EOF
+  > [extensions]
+  > share=
+  > [experimental]
+  > evolution=createmarkers
+  > EOF
+  $ hg init -q orig
+  $ cd orig
+  $ echo hi > foo
+  $ hg ci -qAm initial_rev0
+  $ echo "hi again" >> foo
+  $ hg ci -qAm "hi again rev1"
+  $ hg commit -q --amend -m "obsoleted 1, this is rev2"
+  $ cd ..
+  $ hg share -q orig other
+  $ cd other
+  $ hg co -qr 1 --hidden
+  updated to hidden changeset a3571a6d8234
+  (hidden revision 'a3571a6d8234' was rewritten as: 0444ce380f11)
+  $ cd ..
+
+Forcefully update the caches in each repo once; we're not using --debug since we
+know that these are going to update some/all of the caches, we don't need to
+know which.
+  $ hg -R orig debugupdatecache
+  $ hg -R other debugupdatecache
+
+The branchheads caches should not change just because we're accessing from a
+different repo each time. If they were stale, we'd get another line in the
+output per filtername that's stale.
+  $ hg -R orig debugupdatecache --debug
+  updating the branch cache
+  $ hg -R other debugupdatecache --debug
+  updating the branch cache
+  $ hg -R orig debugupdatecache --debug
+  updating the branch cache
+  $ hg -R other debugupdatecache --debug
+  updating the branch cache
diff --git a/mercurial/utils/repoviewutil.py b/mercurial/utils/repoviewutil.py
--- a/mercurial/utils/repoviewutil.py
+++ b/mercurial/utils/repoviewutil.py
@@ -20,3 +20,8 @@ 
                'served.hidden': 'served',
                'served': 'immutable',
                'immutable': 'base'}
+
+# List of views that are potentially dependent upon wdir state (such as the
+# current parents of the working directory) and need to be kept in wcache
+# instead of cache.
+wdirdependent = frozenset(['served', 'visible', 'visible-hidden'])
diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
--- a/mercurial/branchmap.py
+++ b/mercurial/branchmap.py
@@ -221,7 +221,7 @@ 
     def fromfile(cls, repo):
         f = None
         try:
-            f = repo.cachevfs(cls._filename(repo))
+            f = cls._opencachefile(repo)
             lineiter = iter(f)
             cachekey = next(lineiter).rstrip('\n').split(" ", 2)
             last, lrev = cachekey[:2]
@@ -271,12 +271,15 @@ 
                 self._closednodes.add(node)
 
     @staticmethod
-    def _filename(repo):
+    def _opencachefile(repo, *args, **kwargs):
         """name of a branchcache file for a given repo or repoview"""
         filename = "branch2"
+        vfs = repo.cachevfs
         if repo.filtername:
             filename = '%s-%s' % (filename, repo.filtername)
-        return filename
+            if repo.filtername in repoviewutil.wdirdependent:
+                vfs = repo.wcachevfs
+        return vfs(filename, *args, **kwargs)
 
     def validfor(self, repo):
         """Is the cache content valid regarding a repo
@@ -335,7 +338,7 @@ 
 
     def write(self, repo):
         try:
-            f = repo.cachevfs(self._filename(repo), "w", atomictemp=True)
+            f = self._opencachefile(repo, "w", atomictemp=True)
             cachekey = [hex(self.tipnode), '%d' % self.tiprev]
             if self.filteredhash is not None:
                 cachekey.append(hex(self.filteredhash))