Patchwork [1,of,2,v2] match: adding support for matching files inside a directory

Submitter via Mercurial-devel
Date Feb. 14, 2017, 1:05 a.m.
Message ID <94264a6e6672c917d425.1487034323@rdamazio.mtv.corp.google.com>
Permalink /patch/18458/
State Accepted

Comments

via Mercurial-devel - Feb. 14, 2017, 1:05 a.m.
# HG changeset patch
# User Rodrigo Damazio Bovendorp <rdamazio@google.com>
# Date 1487029169 28800
#      Mon Feb 13 15:39:29 2017 -0800
# Node ID 94264a6e6672c917d42518f7ae9322445868d067
# Parent  72f25e17af9d6a206ea374c30f229ae9513f3f23
match: adding support for matching files inside a directory

This adds a new "rootfilesin" matcher type which matches files inside a
directory, but not any subdirectories (so it matches non-recursively).
This has the "root" prefix per foozy's plan for other matchers (rootglob,
rootpath, cwdre, etc.).
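
To make the non-recursive behaviour concrete, here is a minimal standalone sketch of the regular expression the patch builds for ``rootfilesin:``, next to the one it builds for ``path:``. The function names and the sample file list are illustrative only, and the standard-library re.escape stands in for Mercurial's util.re.escape, so treat this as an approximation rather than the code in match.py.

```python
import re

def rootfilesin_regex(pat):
    """Regex for 'rootfilesin:<pat>': files directly inside pat only."""
    if pat == '.':
        escaped = ''                      # repository root: no directory prefix
    else:
        escaped = re.escape(pat) + '/'    # pattern is a directory name
    # Whatever follows the directory must not contain another '/'.
    return '^' + escaped + '[^/]+$'

def path_regex(pat):
    """Regex for 'path:<pat>': pat itself or anything underneath it."""
    if pat == '.':
        return ''
    return '^' + re.escape(pat) + '(?:/|$)'

def matching(regex, names):
    rx = re.compile(regex)
    return [n for n in names if rx.match(n)]

# A few of the file names used in tests/test-walk.t.
files = ['fennel', 'beans/black', 'mammals/skunk',
         'mammals/Procyonidae/raccoon']

direct = matching(rootfilesin_regex('mammals'), files)
recursive = matching(path_regex('mammals'), files)
toplevel = matching(rootfilesin_regex('.'), files)

print(direct)     # ['mammals/skunk']
print(recursive)  # ['mammals/skunk', 'mammals/Procyonidae/raccoon']
print(toplevel)   # ['fennel']
```

The sample output agrees with the expectations in test-walk.t: ``rootfilesin:mammals`` lists mammals/skunk but nothing under mammals/Procyonidae.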
via Mercurial-devel - Feb. 16, 2017, 10:07 p.m.
Foozy, how does this version of the series look to you?

Yuya, since it is in Google's interest to get this in, I'm reluctant
to queue it myself. Would you be able to do that (if it looks good to
you, of course)? Thanks.


On Mon, Feb 13, 2017 at 5:05 PM, Rodrigo Damazio Bovendorp
<rdamazio@google.com> wrote:
> # HG changeset patch
> # User Rodrigo Damazio Bovendorp <rdamazio@google.com>
> # Date 1487029169 28800
> #      Mon Feb 13 15:39:29 2017 -0800
> # Node ID 94264a6e6672c917d42518f7ae9322445868d067
> # Parent  72f25e17af9d6a206ea374c30f229ae9513f3f23
> match: adding support for matching files inside a directory
>
> This adds a new "rootfilesin" matcher type which matches files inside a
> directory, but not any subdirectories (so it matches non-recursively).
> This has the "root" prefix per foozy's plan for other matchers (rootglob,
> rootpath, cwdre, etc.).
Yuya Nishihara - Feb. 18, 2017, 6:14 a.m.
On Mon, 13 Feb 2017 17:05:23 -0800, Rodrigo Damazio Bovendorp via Mercurial-devel wrote:
> # HG changeset patch
> # User Rodrigo Damazio Bovendorp <rdamazio@google.com>
> # Date 1487029169 28800
> #      Mon Feb 13 15:39:29 2017 -0800
> # Node ID 94264a6e6672c917d42518f7ae9322445868d067
> # Parent  72f25e17af9d6a206ea374c30f229ae9513f3f23
> match: adding support for matching files inside a directory

Looks good per foozy's comments on V1, queued. Thanks for the hard work on
consistent pattern naming.

> +def _explicitfiles(kindpats):
> +    '''Returns the potential explicit filenames from the patterns.
> +
> +    >>> _explicitfiles([('path', 'foo/bar', '')])
> +    ['foo/bar']
> +    >>> _explicitfiles([('rootfilesin', 'foo/bar', '')])
> +    []
> +    '''
> +    # Keep only the pattern kinds where one can specify filenames (vs only
> +    # directory names).
> +    filable = [kp for kp in kindpats if kp[0] not in ('rootfilesin')]
                                                        ^^^^^^^^^^^^^^^

Fixed this as "kp[0] not in ('rootfilesin',)".
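
For readers wondering why the trailing comma matters: without it, ('rootfilesin') is a parenthesized string, so "in" does a substring test instead of a tuple membership test. The sketch below shows the difference. The helper name keep_filable and the kind 'root' are made up for illustration; 'root' is not a pattern kind defined by this patch.

```python
buggy = ('rootfilesin')    # parentheses only: still a plain str
fixed = ('rootfilesin',)   # trailing comma: a one-element tuple

def keep_filable(kindpats, excluded):
    """Drop patterns whose kind is "in" excluded, as the list comprehension does."""
    return [kp for kp in kindpats if kp[0] not in excluded]

kindpats = [
    ('path', 'foo/bar', ''),
    ('rootfilesin', 'foo/bar', ''),
    ('root', 'baz', ''),   # hypothetical kind, a substring of 'rootfilesin'
]

with_str = keep_filable(kindpats, buggy)
with_tuple = keep_filable(kindpats, fixed)

print(type(buggy).__name__, type(fixed).__name__)  # str tuple
print(with_str)    # [('path', 'foo/bar', '')]
print(with_tuple)  # [('path', 'foo/bar', ''), ('root', 'baz', '')]
```

Both doctests in _explicitfiles pass with either spelling, since 'path' is not a substring of 'rootfilesin'. The string form would only misbehave for a kind whose name happens to be contained in 'rootfilesin'.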

Patch

diff -r 72f25e17af9d -r 94264a6e6672 mercurial/help/patterns.txt
--- a/mercurial/help/patterns.txt	Mon Feb 13 02:31:56 2017 -0800
+++ b/mercurial/help/patterns.txt	Mon Feb 13 15:39:29 2017 -0800
@@ -13,7 +13,10 @@ 
 
 To use a plain path name without any pattern matching, start it with
 ``path:``. These path names must completely match starting at the
-current repository root.
+current repository root, and when the path points to a directory, it is matched
+recursively. To match all files in a directory non-recursively (not including
+any files in subdirectories), ``rootfilesin:`` can be used, specifying an
+absolute path (relative to the repository root).
 
 To use an extended glob, start a name with ``glob:``. Globs are rooted
 at the current directory; a glob such as ``*.c`` will only match files
@@ -39,12 +42,15 @@ 
 All patterns, except for ``glob:`` specified in command line (not for
 ``-I`` or ``-X`` options), can match also against directories: files
 under matched directories are treated as matched.
+For ``-I`` and ``-X`` options, ``glob:`` will match directories recursively.
 
 Plain examples::
 
-  path:foo/bar   a name bar in a directory named foo in the root
-                 of the repository
-  path:path:name a file or directory named "path:name"
+  path:foo/bar        a name bar in a directory named foo in the root
+                      of the repository
+  path:path:name      a file or directory named "path:name"
+  rootfilesin:foo/bar the files in a directory called foo/bar, but not any files
+                      in its subdirectories and not a file bar in directory foo
 
 Glob examples::
 
@@ -52,6 +58,8 @@ 
   *.c            any name ending in ".c" in the current directory
   **.c           any name ending in ".c" in any subdirectory of the
                  current directory including itself.
+  foo/*          any file in directory foo plus all its subdirectories,
+                 recursively
   foo/*.c        any name ending in ".c" in the directory foo
   foo/**.c       any name ending in ".c" in any subdirectory of foo
                  including itself.
diff -r 72f25e17af9d -r 94264a6e6672 mercurial/match.py
--- a/mercurial/match.py	Mon Feb 13 02:31:56 2017 -0800
+++ b/mercurial/match.py	Mon Feb 13 15:39:29 2017 -0800
@@ -104,7 +104,10 @@ 
         a pattern is one of:
         'glob:<glob>' - a glob relative to cwd
         're:<regexp>' - a regular expression
-        'path:<path>' - a path relative to repository root
+        'path:<path>' - a path relative to repository root, which is matched
+                        recursively
+        'rootfilesin:<path>' - a path relative to repository root, which is
+                        matched non-recursively (will not match subdirectories)
         'relglob:<glob>' - an unrooted glob (*.c matches C files in all dirs)
         'relpath:<path>' - a path relative to cwd
         'relre:<regexp>' - a regexp that needn't match the start of a name
@@ -153,7 +156,7 @@ 
         elif patterns:
             kindpats = self._normalize(patterns, default, root, cwd, auditor)
             if not _kindpatsalwaysmatch(kindpats):
-                self._files = _roots(kindpats)
+                self._files = _explicitfiles(kindpats)
                 self._anypats = self._anypats or _anypats(kindpats)
                 self.patternspat, pm = _buildmatch(ctx, kindpats, '$',
                                                    listsubrepos, root)
@@ -286,7 +289,7 @@ 
         for kind, pat in [_patsplit(p, default) for p in patterns]:
             if kind in ('glob', 'relpath'):
                 pat = pathutil.canonpath(root, cwd, pat, auditor)
-            elif kind in ('relglob', 'path'):
+            elif kind in ('relglob', 'path', 'rootfilesin'):
                 pat = util.normpath(pat)
             elif kind in ('listfile', 'listfile0'):
                 try:
@@ -447,7 +450,8 @@ 
     if ':' in pattern:
         kind, pat = pattern.split(':', 1)
         if kind in ('re', 'glob', 'path', 'relglob', 'relpath', 'relre',
-                    'listfile', 'listfile0', 'set', 'include', 'subinclude'):
+                    'listfile', 'listfile0', 'set', 'include', 'subinclude',
+                    'rootfilesin'):
             return kind, pat
     return default, pattern
 
@@ -540,6 +544,14 @@ 
         if pat == '.':
             return ''
         return '^' + util.re.escape(pat) + '(?:/|$)'
+    if kind == 'rootfilesin':
+        if pat == '.':
+            escaped = ''
+        else:
+            # Pattern is a directory name.
+            escaped = util.re.escape(pat) + '/'
+        # Anything after the pattern must be a non-directory.
+        return '^' + escaped + '[^/]+$'
     if kind == 'relglob':
         return '(?:|.*/)' + _globre(pat) + globsuffix
     if kind == 'relpath':
@@ -614,6 +626,8 @@ 
 
     >>> _roots([('glob', 'g/*', ''), ('glob', 'g', ''), ('glob', 'g*', '')])
     ['g', 'g', '.']
+    >>> _roots([('rootfilesin', 'g', ''), ('rootfilesin', '', '')])
+    ['g', '.']
     >>> _roots([('relpath', 'r', ''), ('path', 'p/p', ''), ('path', '', '')])
     ['r', 'p/p', '.']
     >>> _roots([('relglob', 'rg*', ''), ('re', 're/', ''), ('relre', 'rr', '')])
@@ -628,15 +642,28 @@ 
                     break
                 root.append(p)
             r.append('/'.join(root) or '.')
-        elif kind in ('relpath', 'path'):
+        elif kind in ('relpath', 'path', 'rootfilesin'):
             r.append(pat or '.')
         else: # relglob, re, relre
             r.append('.')
     return r
 
+def _explicitfiles(kindpats):
+    '''Returns the potential explicit filenames from the patterns.
+
+    >>> _explicitfiles([('path', 'foo/bar', '')])
+    ['foo/bar']
+    >>> _explicitfiles([('rootfilesin', 'foo/bar', '')])
+    []
+    '''
+    # Keep only the pattern kinds where one can specify filenames (vs only
+    # directory names).
+    filable = [kp for kp in kindpats if kp[0] not in ('rootfilesin')]
+    return _roots(filable)
+
 def _anypats(kindpats):
     for kind, pat, source in kindpats:
-        if kind in ('glob', 're', 'relglob', 'relre', 'set'):
+        if kind in ('glob', 're', 'relglob', 'relre', 'set', 'rootfilesin'):
             return True
 
 _commentre = None
diff -r 72f25e17af9d -r 94264a6e6672 tests/test-walk.t
--- a/tests/test-walk.t	Mon Feb 13 02:31:56 2017 -0800
+++ b/tests/test-walk.t	Mon Feb 13 15:39:29 2017 -0800
@@ -112,6 +112,74 @@ 
   f  beans/navy      ../beans/navy
   f  beans/pinto     ../beans/pinto
   f  beans/turtle    ../beans/turtle
+
+  $ hg debugwalk 'rootfilesin:'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk -I 'rootfilesin:'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk 'rootfilesin:.'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk -I 'rootfilesin:.'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk -X 'rootfilesin:'
+  f  beans/black                     ../beans/black
+  f  beans/borlotti                  ../beans/borlotti
+  f  beans/kidney                    ../beans/kidney
+  f  beans/navy                      ../beans/navy
+  f  beans/pinto                     ../beans/pinto
+  f  beans/turtle                    ../beans/turtle
+  f  mammals/Procyonidae/cacomistle  Procyonidae/cacomistle
+  f  mammals/Procyonidae/coatimundi  Procyonidae/coatimundi
+  f  mammals/Procyonidae/raccoon     Procyonidae/raccoon
+  f  mammals/skunk                   skunk
+  $ hg debugwalk 'rootfilesin:fennel'
+  $ hg debugwalk -I 'rootfilesin:fennel'
+  $ hg debugwalk 'rootfilesin:skunk'
+  $ hg debugwalk -I 'rootfilesin:skunk'
+  $ hg debugwalk 'rootfilesin:beans'
+  f  beans/black     ../beans/black
+  f  beans/borlotti  ../beans/borlotti
+  f  beans/kidney    ../beans/kidney
+  f  beans/navy      ../beans/navy
+  f  beans/pinto     ../beans/pinto
+  f  beans/turtle    ../beans/turtle
+  $ hg debugwalk -I 'rootfilesin:beans'
+  f  beans/black     ../beans/black
+  f  beans/borlotti  ../beans/borlotti
+  f  beans/kidney    ../beans/kidney
+  f  beans/navy      ../beans/navy
+  f  beans/pinto     ../beans/pinto
+  f  beans/turtle    ../beans/turtle
+  $ hg debugwalk 'rootfilesin:mammals'
+  f  mammals/skunk  skunk
+  $ hg debugwalk -I 'rootfilesin:mammals'
+  f  mammals/skunk  skunk
+  $ hg debugwalk 'rootfilesin:mammals/'
+  f  mammals/skunk  skunk
+  $ hg debugwalk -I 'rootfilesin:mammals/'
+  f  mammals/skunk  skunk
+  $ hg debugwalk -X 'rootfilesin:mammals'
+  f  beans/black                     ../beans/black
+  f  beans/borlotti                  ../beans/borlotti
+  f  beans/kidney                    ../beans/kidney
+  f  beans/navy                      ../beans/navy
+  f  beans/pinto                     ../beans/pinto
+  f  beans/turtle                    ../beans/turtle
+  f  fennel                          ../fennel
+  f  fenugreek                       ../fenugreek
+  f  fiddlehead                      ../fiddlehead
+  f  mammals/Procyonidae/cacomistle  Procyonidae/cacomistle
+  f  mammals/Procyonidae/coatimundi  Procyonidae/coatimundi
+  f  mammals/Procyonidae/raccoon     Procyonidae/raccoon
+
   $ hg debugwalk .
   f  mammals/Procyonidae/cacomistle  Procyonidae/cacomistle
   f  mammals/Procyonidae/coatimundi  Procyonidae/coatimundi
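
As a companion to the _patsplit hunk in match.py, the following standalone sketch shows how a pattern string is split into a kind and a pattern once 'rootfilesin' is a recognized prefix. It copies the logic visible in the hunk's context lines. The names patsplit and KNOWN_KINDS and the sample inputs are illustrative, not Mercurial's own.

```python
# Kinds recognized after this patch, as listed in the _patsplit hunk.
KNOWN_KINDS = ('re', 'glob', 'path', 'relglob', 'relpath', 'relre',
               'listfile', 'listfile0', 'set', 'include', 'subinclude',
               'rootfilesin')

def patsplit(pattern, default):
    """Split 'kind:pat' into (kind, pat); fall back to the default kind."""
    if ':' in pattern:
        # Split on the first colon only, so the pattern may contain colons.
        kind, pat = pattern.split(':', 1)
        if kind in KNOWN_KINDS:
            return kind, pat
    return default, pattern

a = patsplit('rootfilesin:foo/bar', 'glob')
b = patsplit('path:path:name', 'glob')
c = patsplit('c:/tmp', 'glob')
d = patsplit('*.c', 'glob')

print(a)  # ('rootfilesin', 'foo/bar')
print(b)  # ('path', 'path:name')
print(c)  # ('glob', 'c:/tmp')
print(d)  # ('glob', '*.c')
```

The second call corresponds to the ``path:path:name`` entry in the help text: only the first colon separates the kind, so the rest is taken as a literal name.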