Patchwork D8588: files: extract code for extra filtering of the `removed` entry into copies

login
register
mail settings
Submitter phabricator
Date May 27, 2020, 1:17 p.m.
Message ID <differential-rev-PHID-DREV-g5xrk45i3ea3q4ob3ggw-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/46381/
State Superseded
Headers show

Comments

phabricator - May 27, 2020, 1:17 p.m.
marmoute created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  We want to reduce the set of `removed` files that to the set of files actually
  removed. That `removed` set is used as of the changeset centric algorithm,
  having smaller sets means less processing and faster computation.
  
  In this changeset we extract the code to be a function of it own. We will make
  use of it in the next changesets.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8588

AFFECTED FILES
  mercurial/localrepo.py
  mercurial/metadata.py

CHANGE DETAILS




To: marmoute, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/mercurial/metadata.py b/mercurial/metadata.py
--- a/mercurial/metadata.py
+++ b/mercurial/metadata.py
@@ -11,6 +11,7 @@ 
 
 from . import (
     error,
+    node,
     pycompat,
     util,
 )
@@ -31,6 +32,61 @@ 
     return added
 
 
+def get_removal_filter(ctx, x=None):
+    """return a function to detect files "wrongly" detected as `removed`
+
+    When a file is removed relative to p1 in a merge, this
+    function determines whether the absence is due to a
+    deletion from a parent, or whether the merge commit
+    itself deletes the file. We decide this by doing a
+    simplified three way merge of the manifest entry for
+    the file. There are two ways we decide the merge
+    itself didn't delete a file:
+    - neither parent (nor the merge) contain the file
+    - exactly one parent contains the file, and that
+      parent has the same filelog entry as the merge
+      ancestor (or all of them if there two). In other
+      words, that parent left the file unchanged while the
+      other one deleted it.
+    One way to think about this is that deleting a file is
+    similar to emptying it, so the list of changed files
+    should be similar either way. The computation
+    described above is not done directly in _filecommit
+    when creating the list of changed files, however
+    it does something very similar by comparing filelog
+    nodes.
+    """
+
+    if x is not None:
+        p1, p2, m1, m2 = x
+    else:
+        p1 = ctx.p1()
+        p2 = ctx.p2()
+        m1 = p1.manifest()
+        m2 = p2.manifest()
+
+    @util.cachefunc
+    def mas():
+        p1n = p1.node()
+        p2n = p2.node()
+        cahs = ctx.repo().changelog.commonancestorsheads(p1n, p2n)
+        if not cahs:
+            cahs = [node.nullrev]
+        return [ctx.repo()[r].manifest() for r in cahs]
+
+    def deletionfromparent(f):
+        if f in m1:
+            return f not in m2 and all(
+                f in ma and ma.find(f) == m1.find(f) for ma in mas()
+            )
+        elif f in m2:
+            return all(f in ma and ma.find(f) == m2.find(f) for ma in mas())
+        else:
+            return True
+
+    return deletionfromparent
+
+
 def computechangesetfilesremoved(ctx):
     """return the list of files removed in a changeset
     """
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -46,6 +46,7 @@ 
     match as matchmod,
     merge as mergemod,
     mergeutil,
+    metadata,
     namespaces,
     narrowspec,
     obsolete,
@@ -3143,51 +3144,8 @@ 
                 for f in drop:
                     del m[f]
                 if p2.rev() != nullrev:
-
-                    @util.cachefunc
-                    def mas():
-                        p1n = p1.node()
-                        p2n = p2.node()
-                        cahs = self.changelog.commonancestorsheads(p1n, p2n)
-                        if not cahs:
-                            cahs = [nullrev]
-                        return [self[r].manifest() for r in cahs]
-
-                    def deletionfromparent(f):
-                        # When a file is removed relative to p1 in a merge, this
-                        # function determines whether the absence is due to a
-                        # deletion from a parent, or whether the merge commit
-                        # itself deletes the file. We decide this by doing a
-                        # simplified three way merge of the manifest entry for
-                        # the file. There are two ways we decide the merge
-                        # itself didn't delete a file:
-                        # - neither parent (nor the merge) contain the file
-                        # - exactly one parent contains the file, and that
-                        #   parent has the same filelog entry as the merge
-                        #   ancestor (or all of them if there two). In other
-                        #   words, that parent left the file unchanged while the
-                        #   other one deleted it.
-                        # One way to think about this is that deleting a file is
-                        # similar to emptying it, so the list of changed files
-                        # should be similar either way. The computation
-                        # described above is not done directly in _filecommit
-                        # when creating the list of changed files, however
-                        # it does something very similar by comparing filelog
-                        # nodes.
-                        if f in m1:
-                            return f not in m2 and all(
-                                f in ma and ma.find(f) == m1.find(f)
-                                for ma in mas()
-                            )
-                        elif f in m2:
-                            return all(
-                                f in ma and ma.find(f) == m2.find(f)
-                                for ma in mas()
-                            )
-                        else:
-                            return True
-
-                    removed = [f for f in removed if not deletionfromparent(f)]
+                    rf = metadata.get_removal_filter(ctx, (p1, p2, m1, m2))
+                    removed = [f for f in removed if not rf(f)]
 
                 files = changed + removed
                 md = None