Patchwork D9573: branchmap: update rev-branch-cache automatically [POC]

Submitter: phabricator
Date: Dec. 13, 2020, 6:50 p.m.
Message ID: <differential-rev-PHID-DREV-svyak4gagqt5ia57q4ti-req@mercurial-scm.org>
Permalink: /patch/47871/
State: Superseded

Comments

phabricator - Dec. 13, 2020, 6:50 p.m.
joerg.sonnenberger created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  Introduce an optional callback in changelog.add and provide it in
  localrepo to update the revbranchcache for all new changes. Ignore the
  now redundant bundle part.
  
  Performance regression for "hg bundle" of the whole hg repository is
  2.7%, for the NetBSD src test bundle it is around 1%.
  
  XXX stop advertising the capability
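
The callback mechanism described in the summary can be sketched in isolation as follows. This is a minimal illustration, not the patch itself: `Log`, `BranchCache` and `on_add` are hypothetical stand-ins for the changelog, the rev-branch-cache and `localrepo._addchangeset`, and the real classes carry far more state.

```python
class Log:
    """Toy append-only log standing in for the changelog (hypothetical)."""

    def __init__(self, addcallback=None):
        self._entries = []
        # Optional hook, invoked once for every entry that gets appended.
        self._addcallback = addcallback

    def add(self, node, extra=None):
        self._entries.append(node)
        rev = len(self._entries) - 1
        if self._addcallback is not None:
            self._addcallback(node, rev, extra)
        return node


class BranchCache:
    """Toy per-revision branch cache (hypothetical)."""

    def __init__(self):
        self.data = {}

    def on_add(self, node, rev, extra=None):
        # Fall back to the default branch when no branch is recorded.
        branch = (extra or {}).get(b"branch") or b"default"
        closed = bool(extra) and b"close" in extra
        self.data[rev] = (branch, node, closed)


cache = BranchCache()
log = Log(addcallback=cache.on_add)
log.add(b"n0")
log.add(b"n1", {b"branch": b"stable"})
log.add(b"n2", {b"branch": b"stable", b"close": b"1"})
```

Because the cache is filled as each entry is appended, no separate pass over the new revisions is needed afterwards, which is why the bundle part becomes redundant.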

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D9573

AFFECTED FILES
  mercurial/bundle2.py
  mercurial/changelog.py
  mercurial/localrepo.py
  mercurial/store.py
  tests/test-acl.t
  tests/test-inherit-mode.t
  tests/test-rebase-conflicts.t

CHANGE DETAILS

To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-patches, mercurial-devel
Joerg Sonnenberger - Dec. 13, 2020, 7:10 p.m.
On Sun, Dec 13, 2020 at 06:50:45PM +0000, joerg.sonnenberger (Joerg Sonnenberger) wrote:
> REVISION SUMMARY
>   Introduce an optional callback in changelog.add and provide it in
>   localrepo to update the revbranchcache for all new changes. Ignore the
>   now redundant bundle part.

The general goal here is to provide a stable base for property caches
like the topic map. I also think it can allow making the whole
branchmap update less expensive. The downside is that we need to
actually look at the changeset, which we tried to avoid in the past.

Joerg
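
"Looking at the changeset" here means reading the extra fields from the third line of the raw changelog entry. The sketch below mirrors the shape of `extractextra` from the patch, with one loudly stated simplification: `parse_extra` is a hypothetical name, and it splits the `key:value` pairs directly instead of calling Mercurial's `decodeextra`, so it does not handle escaped bytes and does not fill in a default branch.

```python
def parse_extra(text):
    """Return the extra dict of a raw changelog entry, or None if absent.

    Assumed entry layout: manifest hash, user, "date tz [extra]",
    file names, an empty line, then the description.
    """
    nl1 = text.index(b"\n")
    nl2 = text.index(b"\n", nl1 + 1)
    nl3 = text.index(b"\n", nl2 + 1)
    # Third line: timestamp, timezone offset, then the optional extra blob.
    fields = text[nl2 + 1 : nl3].split(b" ", 2)
    if len(fields) != 3:
        return None
    extra = {}
    # Simplification: pairs are NUL-separated, escaping is ignored.
    for item in fields[2].split(b"\0"):
        if item:
            key, value = item.split(b":", 1)
            extra[key] = value
    return extra


with_extra = b"\n".join([
    b"0" * 40,
    b"Jane Doe <jane@example.com>",
    b"1607885445 0 branch:stable\x00close:1",
    b"file.txt",
    b"",
    b"commit message",
])
without_extra = b"\n".join([
    b"0" * 40,
    b"Jane Doe <jane@example.com>",
    b"1607885445 0",
    b"file.txt",
    b"",
    b"commit message",
])
```

Only the first three lines of the entry are scanned, so the cost stays small even for changesets that touch many files.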

Patch

diff --git a/tests/test-rebase-conflicts.t b/tests/test-rebase-conflicts.t
--- a/tests/test-rebase-conflicts.t
+++ b/tests/test-rebase-conflicts.t
@@ -318,10 +318,10 @@ 
   bundle2-input-part: total payload size 1686
   bundle2-input-part: "cache:rev-branch-cache" (advisory) supported
   bundle2-input-part: total payload size 74
-  truncating cache/rbc-revs-v1 to 56
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 3 parts total
+  truncating cache/rbc-revs-v1 to 72
   added 2 changesets with 2 changes to 1 files
   updating the branch cache
   invalid branch cache (served): tip differs
diff --git a/tests/test-inherit-mode.t b/tests/test-inherit-mode.t
--- a/tests/test-inherit-mode.t
+++ b/tests/test-inherit-mode.t
@@ -134,6 +134,8 @@ 
   00660 ../push/.hg/00changelog.i
   00770 ../push/.hg/cache/
   00660 ../push/.hg/cache/branch2-base
+  00660 ../push/.hg/cache/rbc-names-v1
+  00660 ../push/.hg/cache/rbc-revs-v1
   00660 ../push/.hg/dirstate
   00660 ../push/.hg/requires
   00770 ../push/.hg/store/
diff --git a/tests/test-acl.t b/tests/test-acl.t
--- a/tests/test-acl.t
+++ b/tests/test-acl.t
@@ -204,6 +204,7 @@ 
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 5 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 3 changesets with 3 changes to 3 files
   bundle2-output-bundle: "HG20", 1 parts total
@@ -283,6 +284,7 @@ 
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 5 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 3 changesets with 3 changes to 3 files
   bundle2-output-bundle: "HG20", 1 parts total
@@ -806,6 +808,7 @@ 
   acl: acl.deny.bookmarks not enabled
   acl: bookmark access granted: "ef1ea85a6374b77d6da9dcda9541f498f2d17df7" on bookmark "moving-bookmark"
   bundle2-input-bundle: 7 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 1 changesets with 1 changes to 1 files
   bundle2-output-bundle: "HG20", 1 parts total
@@ -981,6 +984,7 @@ 
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 5 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 3 changesets with 3 changes to 3 files
   bundle2-output-bundle: "HG20", 1 parts total
@@ -1317,6 +1321,7 @@ 
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 5 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 3 changesets with 3 changes to 3 files
   bundle2-output-bundle: "HG20", 1 parts total
@@ -1407,6 +1412,7 @@ 
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 5 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 3 changesets with 3 changes to 3 files
   bundle2-output-bundle: "HG20", 1 parts total
@@ -1576,6 +1582,7 @@ 
   bundle2-input-part: "phase-heads" supported
   bundle2-input-part: total payload size 24
   bundle2-input-bundle: 5 parts total
+  truncating cache/rbc-revs-v1 to 8
   updating the branch cache
   added 3 changesets with 3 changes to 3 files
   bundle2-output-bundle: "HG20", 1 parts total
diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -429,8 +429,10 @@ 
         l.sort()
         return l
 
-    def changelog(self, trypending):
-        return changelog.changelog(self.vfs, trypending=trypending)
+    def changelog(self, trypending, addcallback=None):
+        return changelog.changelog(
+            self.vfs, trypending=trypending, addcallback=addcallback
+        )
 
     def manifestlog(self, repo, storenarrowmatch):
         rootstore = manifest.manifestrevlog(self.vfs)
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1499,6 +1499,20 @@ 
         if 'changelog' in vars(self) and self.currenttransaction() is None:
             del self.changelog
 
+    def _addchangeset(
+        self,
+        node,
+        rev,
+        extra=None,
+    ):
+        if extra and extra.get(b"branch"):
+            branch = extra[b"branch"]
+        else:
+            branch = b'default'
+        branch = encoding.tolocal(branch)
+        closed = extra and b'close' in extra
+        self.revbranchcache().setdata(branch, rev, node, closed)
+
     @property
     def _activebookmark(self):
         return self._bookmarks.active
@@ -1518,7 +1532,9 @@ 
     def changelog(self):
         # load dirstate before changelog to avoid race see issue6303
         self.dirstate.prefetch_parents()
-        return self.store.changelog(txnutil.mayhavepending(self.root))
+        return self.store.changelog(
+            txnutil.mayhavepending(self.root), addcallback=self._addchangeset
+        )
 
     @storecache(b'00manifest.i')
     def manifestlog(self):
diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -202,6 +202,16 @@ 
     description = attr.ib(default=b'')
 
 
+def extractextra(text):
+    nl1 = text.index(b'\n')
+    nl2 = text.index(b'\n', nl1 + 1)
+    nl3 = text.index(b'\n', nl2 + 1)
+    dateextra = text[nl2 + 1 : nl3]
+    fields = dateextra.split(b' ', 2)
+    if len(fields) != 3:
+        return None
+    return decodeextra(fields[2])
+
 class changelogrevision(object):
     """Holds results of a parsed changelog revision.
 
@@ -374,7 +384,7 @@ 
 
 
 class changelog(revlog.revlog):
-    def __init__(self, opener, trypending=False):
+    def __init__(self, opener, trypending=False, addcallback=None):
         """Load a changelog revlog using an opener.
 
         If ``trypending`` is true, we attempt to load the index from a
@@ -418,6 +428,7 @@ 
         self._filteredrevs = frozenset()
         self._filteredrevs_hashcache = {}
         self._copiesstorage = opener.options.get(b'copies-storage')
+        self._addcallback = addcallback
 
     @property
     def filteredrevs(self):
@@ -588,13 +599,21 @@ 
             sidedata = metadata.encode_files_sidedata(files)
 
         if extra:
-            extra = encodeextra(extra)
-            parseddate = b"%s %s" % (parseddate, extra)
+            encodedextra = encodeextra(extra)
+            parseddate = b"%s %s" % (parseddate, encodedextra)
         l = [hex(manifest), user, parseddate] + sortedfiles + [b"", desc]
         text = b"\n".join(l)
-        return self.addrevision(
-            text, transaction, len(self), p1, p2, sidedata=sidedata, flags=flags
-        )
+        return self.addrevision(text, transaction, len(self), p1, p2, sidedata=sidedata, flags=flags)
+
+    def _addrevision(self, *args, **kwargs):
+        node = super(changelog, self)._addrevision(*args, **kwargs)
+        if self._addcallback:
+            self._addcallback(
+                node,
+                len(self) - 1,
+                extractextra(args[1] or self.rawdata(node))
+            )
+        return node
 
     def branchinfo(self, rev):
         """return the branch name and open/close state of a revision
diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
--- a/mercurial/bundle2.py
+++ b/mercurial/bundle2.py
@@ -2476,35 +2476,10 @@ 
 
 @parthandler(b'cache:rev-branch-cache')
 def handlerbc(op, inpart):
-    """receive a rev-branch-cache payload and update the local cache
-
-    The payload is a series of data related to each branch
-
-    1) branch name length
-    2) number of open heads
-    3) number of closed heads
-    4) open heads nodes
-    5) closed heads nodes
-    """
-    total = 0
-    rawheader = inpart.read(rbcstruct.size)
-    cache = op.repo.revbranchcache()
-    cl = op.repo.unfiltered().changelog
-    while rawheader:
-        header = rbcstruct.unpack(rawheader)
-        total += header[1] + header[2]
-        utf8branch = inpart.read(header[0])
-        branch = encoding.tolocal(utf8branch)
-        for x in pycompat.xrange(header[1]):
-            node = inpart.read(20)
-            rev = cl.rev(node)
-            cache.setdata(branch, rev, node, False)
-        for x in pycompat.xrange(header[2]):
-            node = inpart.read(20)
-            rev = cl.rev(node)
-            cache.setdata(branch, rev, node, True)
-        rawheader = inpart.read(rbcstruct.size)
-    cache.write()
+    # Ignored for compatibility with older bundles.
+    # The cache is now updated incrementally when the
+    # changes are added.
+    pass
 
 
 @parthandler(b'pushvars')