Patchwork D6710: branchmap: properly refresh/warm all branchmap caches

login
register
mail settings
Submitter phabricator
Date Aug. 3, 2019, 1:27 a.m.
Message ID <differential-rev-PHID-DREV-5jm22bum7o7lbwcbdxki-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/41135/
State Superseded
Headers show

Comments

phabricator - Aug. 3, 2019, 1:27 a.m.
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The comment about 'served' refreshing all the others only applies when served is
  the head of the branchmap cache subsets. This was already no longer the case
  (and is presumably why served.hidden was added) but the comment was not
  adjusted. Currently, the "heads" of the branchmap cache subsets are the three
  specified here: visible, visible-hidden, served.hidden.
  
  Additionally, 'full' claims it will warm all of the caches that are known about,
  but this is not the case - it does not actually warm the branchmap caches for
  subsets that we haven't requested or for subsets that are still considered
  "valid".

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

AFFECTED FILES
  mercurial/localrepo.py

CHANGE DETAILS




To: spectral, #hg-reviewers
Cc: mercurial-devel
phabricator - Aug. 3, 2019, 1:30 a.m.
spectral added a comment.
spectral planned changes to this revision.


  Needs some test updates.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6710/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

To: spectral, #hg-reviewers
Cc: mercurial-devel
phabricator - Aug. 3, 2019, 11:45 a.m.
marmoute added a comment.


  Overall principle seems good. I made couple of inline comment.

INLINE COMMENTS

> localrepo.py:2200
>              self.ui.debug('updating the branch cache\n')
> -            self.filtered('served').branchmap()
> -            self.filtered('served.hidden').branchmap()
> +            for filt in ['visible', 'visible-hidden', 'served.hidden']:
> +                self.filtered(filt).branchmap()

Should we have this list explicitly stored in a list next to the filtermap ? That would seems more robust to future changes.

> localrepo.py:2224
> +                filtered = self.filtered(filt)
> +                filtered.branchmap().write(filtered)
> +

Why the explicite write here ? We don't seems to need it for the previous section. Is this because if the cache of the previous subset is valid, the write would be skipped ?
If so, consider clarifying it in your comment.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6710/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Aug. 6, 2019, 1:12 a.m.
spectral added inline comments.

INLINE COMMENTS

> marmoute wrote in localrepo.py:2224
> Why the explicite write here ? We don't seems to need it for the previous section. Is this because if the cache of the previous subset is valid, the write would be skipped ?
> If so, consider clarifying it in your comment.

Actually it's because the documentation for the function states that it will "warm the caches", "even the ones usually loaded more lazily".

If nothing in hg actually explicitly requests the subset, it won't be written:

  $ hg init; echo hi > foo; hg ci -qAm foo; ls .hg/cache
  branch2-served  evoext-obscache-00  rbc-names-v1  rbc-revs-v1

This would have, I'd thought, written out -served, -immutable, and -base, since -immutable and -base are subsets of -served, but that doesn't seem to happen.  Even if I run `hg debugupdatecache` (before this change) they don't get written:

  branch2-served  evoext-obscache-00  hgtagsfnodes1  rbc-names-v1  rbc-revs-v1  tags2  tags2-served

If the intent of `hg debugupdatecache` is to actually warm all levels of cache, it should probably warm -immutable and -base, so that they're kept up to date? Or is that undesirable for some reason (maybe it causes additional computation every time the cache for -served is updated if -immutable and -base exist, since they'd also possibly have to be updated? I'd think it'd be the opposite (-base is very cheap to calculate, and unlikely to go stale, can be used to make calculating immutable quicker, and that can be used to make calculating served quicker.. without them, then served has to start from scratch each time; this seems to be the *reason* for the subsettable :)), but I'm not super familiar with the caching code (and uses of it) to know if this is actually true in practice.

That said, I agree that these are two separate concerns, and the number of tests that need to be changed is pretty significant for this one, so I've split this change into two.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6710/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel
Pierre-Yves David - Aug. 6, 2019, 10:01 a.m.
Forcing this write seems like a good idea. Having it in its own 
changeset seems like a good idea (and please add a comment about forcing 
the write).

On 8/6/19 3:12 AM, spectral (Kyle Lippincott) wrote:
> spectral added inline comments.
> 
> INLINE COMMENTS
> 
>> marmoute wrote in localrepo.py:2224
>> Why the explicite write here ? We don't seems to need it for the previous section. Is this because if the cache of the previous subset is valid, the write would be skipped ?
>> If so, consider clarifying it in your comment.
> 
> Actually it's because the documentation for the function states that it will "warm the caches", "even the ones usually loaded more lazily".
> 
> If nothing in hg actually explicitly requests the subset, it won't be written:
> 
>    $ hg init; echo hi > foo; hg ci -qAm foo; ls .hg/cache
>    branch2-served  evoext-obscache-00  rbc-names-v1  rbc-revs-v1
> 
> This would have, I'd thought, written out -served, -immutable, and -base, since -immutable and -base are subsets of -served, but that doesn't seem to happen.  Even if I run `hg debugupdatecache` (before this change) they don't get written:
> 
>    branch2-served  evoext-obscache-00  hgtagsfnodes1  rbc-names-v1  rbc-revs-v1  tags2  tags2-served
> 
> If the intent of `hg debugupdatecache` is to actually warm all levels of cache, it should probably warm -immutable and -base, so that they're kept up to date? Or is that undesirable for some reason (maybe it causes additional computation every time the cache for -served is updated if -immutable and -base exist, since they'd also possibly have to be updated? I'd think it'd be the opposite (-base is very cheap to calculate, and unlikely to go stale, can be used to make calculating immutable quicker, and that can be used to make calculating served quicker.. without them, then served has to start from scratch each time; this seems to be the *reason* for the subsettable :)), but I'm not super familiar with the caching code (and uses of it) to know if this is actually true in practice.
> 
> That said, I agree that these are two separate concerns, and the number of tests that need to be changed is pretty significant for this one, so I've split this change into two.
> 
> REPOSITORY
>    rHG Mercurial
> 
> CHANGES SINCE LAST ACTION
>    https://phab.mercurial-scm.org/D6710/new/
> 
> REVISION DETAIL
>    https://phab.mercurial-scm.org/D6710
> 
> To: spectral, #hg-reviewers
> Cc: marmoute, mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
phabricator - Aug. 7, 2019, 1:07 a.m.
spectral added a comment.


  In D6710#98322 <https://phab.mercurial-scm.org/D6710#98322>, @marmoute wrote:
  
  > Forcing this write seems like a good idea. Having it in its own 
  > changeset seems like a good idea (and please add a comment about forcing 
  > the write).
  
  I've split the 'full' change from the one changing what subsets we invalidate already, this one will be used for the 'full' change since it had the most comments. Comment has been added. Please take another look at the whole stack.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6710/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

To: spectral, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Aug. 8, 2019, 4:50 p.m.
marmoute added a comment.
marmoute accepted this revision.


  We could warm them in increasing order to improve efficiency. However this is for the full cache warming so this looks good enough. (consider doing them in order in a follow up)
  
  Note: this change seems independant from the previous one, so one might be able to take it on its own

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6710/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

To: spectral, #hg-reviewers, marmoute
Cc: marmoute, mercurial-devel
phabricator - Aug. 8, 2019, 8:42 p.m.
pulkit added a comment.


  Thanks @marmoute for the review.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6710/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6710

To: spectral, #hg-reviewers, marmoute, pulkit
Cc: marmoute, mercurial-devel

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -2192,10 +2192,13 @@ 
             return
 
         if tr is None or tr.changes['origrepolen'] < len(self):
-            # accessing the 'ser ved' branchmap should refresh all the others,
+            # There are three "heads" to the cache hierarchy: visible,
+            # visible-hidden, and served.hidden. Updating any of these three
+            # should cause all of the others (currently: served, immutable,
+            # base) that are stale to be updated.
             self.ui.debug('updating the branch cache\n')
-            self.filtered('served').branchmap()
-            self.filtered('served.hidden').branchmap()
+            for filt in ['visible', 'visible-hidden', 'served.hidden']:
+                self.filtered(filt).branchmap()
 
         if full:
             unfi = self.unfiltered()
@@ -2214,6 +2217,12 @@ 
             self.tags()
             self.filtered('served').tags()
 
+            # Warm the branchmap caches even for caches we haven't needed yet,
+            # including forcing a write to disk.
+            for filt in repoview.filtertable.keys():
+                filtered = self.filtered(filt)
+                filtered.branchmap().write(filtered)
+
     def invalidatecaches(self):
 
         if r'_tagscache' in vars(self):