Patchwork [1,of,3] dirstate: add a method to efficiently filter by match

Submitter Siddharth Agarwal
Date Aug. 2, 2014, 5:19 a.m.
Message ID <9c198be44e46807eda6e.1406956773@dev1738.prn1.facebook.com>
Permalink /patch/5225/
State Accepted

Comments

Siddharth Agarwal - Aug. 2, 2014, 5:19 a.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1406955916 25200
#      Fri Aug 01 22:05:16 2014 -0700
# Node ID 9c198be44e46807eda6e0ca749e05e881045de65
# Parent  43413d440fe6ea54930a51d76de2f91129fc726b
dirstate: add a method to efficiently filter by match

Current callers that require just this data call workingctx.walk, which calls
dirstate.walk, which stats all the files. Even worse, workingctx.walk looks for
unknown files, significantly slowing things down, even though callers might not
be interested in them at all.

Patch

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -873,3 +873,21 @@ 
 
         return (lookup, modified, added, removed, deleted, unknown, ignored,
                 clean)
+
+    def matches(self, match):
+        '''
+        return files in the dirstate (in whatever state) filtered by match
+        '''
+        dmap = self._map
+        if match.always():
+            return dmap.keys()
+        files = match.files()
+        if match.matchfn == match.exact:
+            # fast path -- filter the other way around, since typically files is
+            # much smaller than dmap
+            return [f for f in files if f in dmap]
+        if not match.anypats() and util.all(fn in dmap for fn in files):
+            # fast path -- all the values are known to be files, so just return
+            # that
+            return list(files)
+        return [f for f in dmap if match(f)]
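
The dispatch order in the new method can be tried outside Mercurial with a small stand-in. The sketch below is only an illustration under stated assumptions: FakeMatcher and the free function matches(dmap, match) are hypothetical and are not Mercurial's match or dirstate classes, the dirstate map is modeled as a plain dict, glob handling is borrowed from the standard fnmatch module, and the builtin all() stands in for util.all.

```python
import fnmatch


class FakeMatcher:
    """Hypothetical stand-in exposing only the calls the patch relies on."""

    def __init__(self, files=(), patterns=(), exact=False):
        self._files = list(files)
        self._patterns = list(patterns)
        self._exact = exact
        # Mirrors the patch's check "match.matchfn == match.exact".
        self.matchfn = self.exact if exact else self._match

    def always(self):
        # True only when nothing restricts the match.
        return not self._exact and not self._files and not self._patterns

    def files(self):
        return self._files

    def anypats(self):
        return bool(self._patterns)

    def exact(self, f):
        return f in self._files

    def _match(self, f):
        if self.always():
            return True
        # A listed name matches itself or, as a directory, anything below it.
        if any(f == n or f.startswith(n + '/') for n in self._files):
            return True
        return any(fnmatch.fnmatch(f, p) for p in self._patterns)

    def __call__(self, f):
        return self.matchfn(f)


def matches(dmap, match):
    """Return the names in dmap selected by match, cheapest path first."""
    if match.always():
        return list(dmap)
    files = match.files()
    if match.matchfn == match.exact:
        # Exact matcher: walk the (usually short) file list, not dmap.
        return [f for f in files if f in dmap]
    if not match.anypats() and all(fn in dmap for fn in files):
        # Every listed name is a tracked file, so the list is the answer.
        return list(files)
    # General case: test every tracked name against the matcher.
    return [f for f in dmap if match(f)]


# Usage with a toy dirstate map (state letters are placeholders).
dmap = {'a.py': 'n', 'b.py': 'a', 'src/c.py': 'n'}
exact_hits = matches(dmap, FakeMatcher(files=['a.py', 'missing.py'], exact=True))
dir_hits = matches(dmap, FakeMatcher(files=['src']))
```

Neither fast path iterates over dmap, which is the point of the patch: when the caller names a handful of files, the cost scales with that list and not with the number of tracked files. Naming a directory ('src' above) fails the membership test and falls through to the full scan.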