Patchwork D7127: sidedatacopies: only fetch information once for merge

Submitter phabricator
Date Oct. 19, 2019, 5:38 p.m.
Message ID <6d3a78b63d852b7fc1b0cf9f701a7541@localhost.localdomain>
Permalink /patch/42507/
State Not Applicable

Comments

phabricator - Oct. 19, 2019, 5:38 p.m.
Closed by commit rHG90213d027154: sidedatacopies: only fetch information once for merge (authored by marmoute).
This revision was automatically updated to reflect the committed changes.
This revision was not accepted when it landed; it landed in state "Needs Review".

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7127?vs=17332&id=17362

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7127/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7127

AFFECTED FILES
  mercurial/copies.py

CHANGE DETAILS




To: marmoute, #hg-reviewers
Cc: martinvonz, mercurial-devel
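
The caching idea in the patch below can be illustrated in isolation. This is only a minimal sketch under stated assumptions, not Mercurial's code: NULLREV, make_revinfo, load_copies and the toy PARENTS table are hypothetical names introduced for illustration, and the REVIDX_SIDEDATA flag check from the real revinfo is left out.

```python
# Hypothetical stand-alone sketch of a "fetch once per merge" cache.
NULLREV = -1  # stand-in for node.nullrev


def make_revinfo(parents, load_copies):
    """Return (revinfo, cache); `load_copies` is the expensive lookup."""
    cache = {}

    def revinfo(rev):
        # Second request for a merge: hand back the stored entry and drop it.
        entry = cache.pop(rev, None)
        if entry is not None:
            return entry
        p1, p2 = parents(rev)
        entry = (p1, p2) + load_copies(rev)
        # Only merges are expected to be asked about twice, so only
        # their entries are kept around.
        if p1 != NULLREV and p2 != NULLREV:
            cache[rev] = entry
        return entry

    return revinfo, cache


# Toy history: revision 2 merges revisions 0 and 1.
PARENTS = {0: (NULLREV, NULLREV), 1: (NULLREV, NULLREV), 2: (0, 1)}
calls = []


def load_copies(rev):
    calls.append(rev)  # record each expensive fetch
    return ({}, {}, ())  # (p1copies, p2copies, removed)


revinfo, cache = make_revinfo(PARENTS.__getitem__, load_copies)
first = revinfo(2)   # fetched, then cached because it is a merge
second = revinfo(2)  # served from the cache, which is emptied again
revinfo(0)           # non-merge: fetched, never cached
```

Because the entry is popped on the second request, the cache holds at most one entry per merge whose second parent has not been reached yet. A merge that is only ever walked from one side keeps its entry until the cache itself is discarded.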

Patch

diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -193,13 +193,44 @@ 
         changelogrevision = cl.changelogrevision
         flags = cl.flags
 
+        # A small cache to avoid doing the work twice for merges
+        #
+        # In the vast majority of cases, if we ask for information about a
+        # revision for one parent, we'll later ask for it for the other. So it
+        # makes sense to keep the information around when reaching the first
+        # parent of a merge and to drop it once it was provided for the second.
+        #
+        # There are cases where only one parent of the merge will be walked. It
+        # happens when the "destination" of the copy tracing descends from a
+        # new root, not common with the "source". In that case, we will only
+        # walk through merge parents that are descendants of changesets common
+        # to "source" and "destination".
+        #
+        # With the current implementation, if such changesets have copy
+        # information, we'll keep it in memory until the end of
+        # _changesetforwardcopies. We don't expect the case to be frequent
+        # enough to matter.
+        #
+        # In addition, it would be possible to reach a pathological case, where
+        # many first parents are met before any second parent is reached. In
+        # that case the cache could grow. If this ever becomes an issue, one can
+        # safely introduce a maximum cache size. This would trade extra CPU/IO
+        # time to save memory.
+        merge_caches = {}
+
         def revinfo(rev):
             p1, p2 = parents(rev)
             if flags(rev) & REVIDX_SIDEDATA:
+                e = merge_caches.pop(rev, None)
+                if e is not None:
+                    return e
                 c = changelogrevision(rev)
                 p1copies = c.p1copies
                 p2copies = c.p2copies
                 removed = c.filesremoved
+                if p1 != node.nullrev and p2 != node.nullrev:
+                    # XXX in some cases we over-cache; ignored for now
+                    merge_caches[rev] = (p1, p2, p1copies, p2copies, removed)
             else:
                 p1copies = {}
                 p2copies = {}