Patchwork D9631: branchmap: avoid ancestor computations for absent non-continous branches

login
register
mail settings
Submitter phabricator
Date Dec. 18, 2020, 1:45 p.m.
Message ID <differential-rev-PHID-DREV-g4af5rhnu64gcknzjn3n-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/47935/
State Superseded
Headers show

Comments

phabricator - Dec. 18, 2020, 1:45 p.m.
joerg.sonnenberger created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  The branchhead computation is one of the more heavy operations for
  bigger repositories as it has to scan all changesets and potentially
  involves the expensive computation of the ancestor sets. Redo the
  computation to handle the common cases directly and use tighter
  conditions for when the ancestor scan is necessary. Most importantly,
  avoid it completely if the non-continous branches are processed in one
  update as seen in the initial computation after a clone.
  
  For the Mercurial repository, it gives a small 2-3% performance boost.
  For the NetBSD test repository, it cuts the time in half.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D9631

AFFECTED FILES
  mercurial/branchmap.py

CHANGE DETAILS




To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
--- a/mercurial/branchmap.py
+++ b/mercurial/branchmap.py
@@ -443,33 +443,65 @@ 
             if closesbranch:
                 self._closednodes.add(cl.node(r))
 
-        # fetch current topological heads to speed up filtering
-        topoheads = set(cl.headrevs())
-
         # new tip revision which we found after iterating items from new
         # branches
         ntiprev = self.tiprev
 
-        # if older branchheads are reachable from new ones, they aren't
-        # really branchheads. Note checking parents is insufficient:
-        # 1 (branch a) -> 2 (branch b) -> 3 (branch a)
+        # Delay fetching the topological heads until they are needed.
+        # A repository without non-continous branches can skip this part.
+        topoheads = None
+
+        # If a changeset is visible, its parents must be visible too, so
+        # use the faster unfiltered parent accessor.
+        parentrevs = repo.unfiltered().changelog.parentrevs
+
         for branch, newheadrevs in pycompat.iteritems(newbranches):
+            # The set of branchheads is the union of the existing branchheads
+            # with the heads of new revisions of that branch, but without
+            # existing branchheads that are ancestors of new revisions.
+            # The latter condition is necessary for non-continous branches,
+            # i.e. 1 (branch a) -> 2 (branch b) -> 3 (branch a).
+            #
+            # The newrev loop processes all new revisions in order and updates
+            # the branchheads for the simple case of continous branches.
+            # The sorting ensures that parents are processed first and the root
+            # of a potential non-continous branch is seen first.
+            # It followes that all revisions that are not a child of the branch
+            # are candidates for such branches and therefore kept on the
+            # uncertain set. The exception is a local branch root with no
+            # pre-existing branchheads. This is the initial start of a branch
+            # and safe.
+            #
+            # If the newrev loop left any uncertain candidates for potential
+            # non-continous branches around, further checks are necessary.
+            # If all the remaining pre-existing branchheads (i.e. those without
+            # a child in the new revision set) are still topological heads,
+            # they are automatically also branchheads. Otherwise a full
+            # ancestor check is necessary to filter out obsoleted branchheads.
+
             bheads = self._entries.setdefault(branch, [])
             bheadset = {cl.rev(node) for node in bheads}
-
-            # This have been tested True on all internal usage of this function.
-            # run it again in case of doubt
-            # assert not (set(bheadrevs) & set(newheadrevs))
-            bheadset.update(newheadrevs)
+            uncertain = set()
+            for newrev in sorted(newheadrevs):
+                parents = [p for p in parentrevs(newrev) if p != nullrev]
+                gotit = False
+                for p in parents:
+                    if p in bheadset:
+                        bheadset.remove(p)
+                        gotit = True
+                    elif getbranchinfo(p)[0] == branch:
+                        gotit = True
+                if not gotit and bheadset:
+                    uncertain.add(newrev)
+                bheadset.add(newrev)
 
-            # This prunes out two kinds of heads - heads that are superseded by
-            # a head in newheadrevs, and newheadrevs that are not heads because
-            # an existing head is their descendant.
-            uncertain = bheadset - topoheads
             if uncertain:
-                floorrev = min(uncertain)
-                ancestors = set(cl.ancestors(newheadrevs, floorrev))
-                bheadset -= ancestors
+                if topoheads is None:
+                    topoheads = set(cl.headrevs())
+                if bheadset - topoheads:
+                    floorrev = min(bheadset)
+                    ancestors = set(cl.ancestors(newheadrevs, floorrev))
+                    bheadset -= ancestors
             bheadrevs = sorted(bheadset)
             self[branch] = [cl.node(rev) for rev in bheadrevs]
             tiprev = bheadrevs[-1]