Patchwork match: make subinclude construction lazy

Submitter Durham Goode
Date May 3, 2017, 5:34 p.m.
Message ID <69fee32d094c234b3011.1493832850@dev111.prn1.facebook.com>
Permalink /patch/20406/
State Accepted

Comments

Durham Goode - May 3, 2017, 5:34 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1493832657 25200
#      Wed May 03 10:30:57 2017 -0700
# Node ID 69fee32d094c234b30114977b0dcce1b45401f5a
# Parent  6cacc271ee0a9385be483314dc73be176a13c891
match: make subinclude construction lazy

The matcher subinclude functionality allows us to have .hgignore files that
include .hgignore files from subdirectories. Today it builds the matcher for
every subinclude up front, even if we only need to test a file in one
subdirectory. This patch defers building each subinclude matcher until a file
under its prefix is first tested, which speeds up matcher creation
significantly in large repos with very large trees of ignore patterns.
Yuya Nishihara - May 4, 2017, 3:51 a.m.
On Wed, 3 May 2017 10:34:10 -0700, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1493832657 25200
> #      Wed May 03 10:30:57 2017 -0700
> # Node ID 69fee32d094c234b30114977b0dcce1b45401f5a
> # Parent  6cacc271ee0a9385be483314dc73be176a13c891
> match: make subinclude construction lazy

Looks good. Queued, thanks.

> diff --git a/mercurial/match.py b/mercurial/match.py
> --- a/mercurial/match.py
> +++ b/mercurial/match.py
> @@ -64,12 +64,12 @@ def _expandsubinclude(kindpats, root):
>              path = pathutil.join(sourceroot, pat)
>  
>              newroot = pathutil.dirname(path)
> -            relmatcher = match(newroot, '', [], ['include:%s' % path])
> +            matcherargs = (newroot, '', [], ['include:%s' % path])
>  
>              prefix = pathutil.canonpath(root, root, newroot)
>              if prefix:
>                  prefix += '/'
> -            relmatchers.append((prefix, relmatcher))
> +            relmatchers.append((prefix, matcherargs))

I've updated the docstring as s/matchers/matcher args/.

Patch

diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -64,12 +64,12 @@  def _expandsubinclude(kindpats, root):
             path = pathutil.join(sourceroot, pat)
 
             newroot = pathutil.dirname(path)
-            relmatcher = match(newroot, '', [], ['include:%s' % path])
+            matcherargs = (newroot, '', [], ['include:%s' % path])
 
             prefix = pathutil.canonpath(root, root, newroot)
             if prefix:
                 prefix += '/'
-            relmatchers.append((prefix, relmatcher))
+            relmatchers.append((prefix, matcherargs))
         else:
             other.append((kind, pat, source))
 
@@ -584,10 +584,17 @@  def _buildmatch(ctx, kindpats, globsuffi
 
     subincludes, kindpats = _expandsubinclude(kindpats, root)
     if subincludes:
+        submatchers = {}
         def matchsubinclude(f):
-            for prefix, mf in subincludes:
-                if f.startswith(prefix) and mf(f[len(prefix):]):
-                    return True
+            for prefix, matcherargs in subincludes:
+                if f.startswith(prefix):
+                    mf = submatchers.get(prefix)
+                    if mf is None:
+                        mf = match(*matcherargs)
+                        submatchers[prefix] = mf
+
+                    if mf(f[len(prefix):]):
+                        return True
             return False
         matchfuncs.append(matchsubinclude)
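
The lazy construction in the second hunk can be tried outside Mercurial with a small standalone sketch. It is not the code from match.py: `make_lazy_subinclude_matcher` and `fake_build_matcher` are hypothetical names, and the fake factory takes two simplified arguments in place of the four-element `matcherargs` tuple that the patch passes to `match()`. Only the caching pattern (build on first matching prefix, then reuse) mirrors the patch.

```python
def make_lazy_subinclude_matcher(subincludes, build_matcher):
    """Return a function f -> bool that builds each sub-matcher on first use.

    subincludes: list of (prefix, matcherargs) pairs, shaped like the patch's.
    build_matcher: hypothetical stand-in for mercurial.match.match.
    """
    submatchers = {}  # prefix -> matcher that has already been constructed

    def matchsubinclude(f):
        for prefix, matcherargs in subincludes:
            if not f.startswith(prefix):
                continue
            mf = submatchers.get(prefix)
            if mf is None:
                # First file seen under this prefix: build and cache.
                mf = build_matcher(*matcherargs)
                submatchers[prefix] = mf
            if mf(f[len(prefix):]):
                return True
        return False

    return matchsubinclude


built = []  # records which sub-matchers were actually constructed


def fake_build_matcher(root, ignored_names):
    # Hypothetical factory: the expensive parse is simulated by logging root.
    built.append(root)
    return lambda relpath: relpath in ignored_names


subincludes = [
    ("lib/", ("lib", {"build.log"})),
    ("docs/", ("docs", {"out.html"})),
]
matcher = make_lazy_subinclude_matcher(subincludes, fake_build_matcher)
```

In this sketch, calling `matcher("lib/build.log")` constructs only the `lib/` sub-matcher; the `docs/` one stays unbuilt until a path starting with `docs/` is tested, and repeated calls reuse the cached matcher.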