Patchwork match: do not weirdly include explicit files excluded by -X option

login
register
mail settings
Submitter Yuya Nishihara
Date Jan. 16, 2018, 2:27 p.m.
Message ID <ca4d61bf46160a31c665.1516112864@mimosa>
Download mbox | patch
Permalink /patch/26788/
State Accepted
Headers show

Comments

Yuya Nishihara - Jan. 16, 2018, 2:27 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1516108473 -32400
#      Tue Jan 16 22:14:33 2018 +0900
# Node ID ca4d61bf46160a31c6653c4f0794cd6e8ae788a4
# Parent  14f119c9ff3630859b64e207b389c49d68840b96
match: do not weirdly include explicit files excluded by -X option

Actually, this was the original behavior. Before a83a7d27911e, "log" and
"files" showed nothing if "FILE -X FILE" was specified, whereas "debugwalk"
got confused by an explicit FILE pattern. Under the hood, "log" and "files"
use m() and ctx.matches(m) respectively, and "debugwalk" uses ctx.walk(m).
I suspect dirstate.walk() goes wrong in _walkexplicit(), which seems to
blindly trust m.files().

I reckon the original "log"/"files" behavior is correct, and drop the hack
from the differencematcher.
via Mercurial-devel - Jan. 16, 2018, 5:42 p.m.
On Tue, Jan 16, 2018 at 6:27 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1516108473 -32400
> #      Tue Jan 16 22:14:33 2018 +0900
> # Node ID ca4d61bf46160a31c6653c4f0794cd6e8ae788a4
> # Parent  14f119c9ff3630859b64e207b389c49d68840b96
> match: do not weirdly include explicit files excluded by -X option
>
> Actually, this was the original behavior. Before a83a7d27911e, "log" and
> "files" showed nothing if "FILE -X FILE" was specified, whereas "debugwalk"
> got confused by an explicit FILE pattern. Under the hood, "log" and "files"
> use m() and ctx.matches(m) respectively, and "debugwalk" uses ctx.walk(m).
> I suspect dirstate.walk() goes wrong in _walkexplicit(), which seems to
> blindly trust m.files().

Aha, that's how it was working! Nice find, and thanks for the patch.
Great to see this weird case gone. Queued.

I'm surprised there are no tests that need updating. I wonder how I
even noticed that debugwalk was behaving the way it was. I checked
that there were no test failures even at a83a7d27911e with this patch
on top.

>
> I reckon the original "log"/"files" behavior is correct, and drop the hack
> from the differencematcher.
>
> diff --git a/mercurial/match.py b/mercurial/match.py
> --- a/mercurial/match.py
> +++ b/mercurial/match.py
> @@ -457,17 +457,10 @@ class exactmatcher(basematcher):
>
>  class differencematcher(basematcher):
>      '''Composes two matchers by matching if the first matches and the second
> -    does not. Well, almost... If the user provides a pattern like "-X foo foo",
> -    Mercurial actually does match "foo" against that. That's because exact
> -    matches are treated specially. So, since this differencematcher is used for
> -    excludes, it needs to special-case exact matching.
> +    does not.
>
>      The second matcher's non-matching-attributes (root, cwd, bad, explicitdir,
>      traversedir) are ignored.
> -
> -    TODO: If we want to keep the behavior described above for exact matches, we
> -    should consider instead treating the above case something like this:
> -    union(exact(foo), difference(pattern(foo), include(foo)))
>      '''
>      def __init__(self, m1, m2):
>          super(differencematcher, self).__init__(m1._root, m1._cwd)
> @@ -478,7 +471,7 @@ class differencematcher(basematcher):
>          self.traversedir = m1.traversedir
>
>      def matchfn(self, f):
> -        return self._m1(f) and (not self._m2(f) or self._m1.exact(f))
> +        return self._m1(f) and not self._m2(f)
>
>      @propertycache
>      def _files(self):

There's also a comment in visitdir() that should be removed now. I'll
do that in flight. For reference: here's that hunk:

@@ -486,9 +486,6 @@ class differencematcher(basematcher):

     def visitdir(self, dir):
         if self._m2.visitdir(dir) == 'all':
-            # There's a bug here: If m1 matches file 'dir/file' and m2 excludes
-            # 'dir' (recursively), we should still visit 'dir' due to the
-            # exception we have for exact matches.
             return False
         return bool(self._m1.visitdir(dir))

> diff --git a/tests/test-locate.t b/tests/test-locate.t
> --- a/tests/test-locate.t
> +++ b/tests/test-locate.t
> @@ -103,6 +103,11 @@ Issue294: hg remove --after dir fails wh
>    $ hg files b
>    b
>
> +-X with explicit path:
> +
> +  $ hg files b -X b
> +  [1]
> +
>    $ mkdir otherdir
>    $ cd otherdir
>
> diff --git a/tests/test-log.t b/tests/test-log.t
> --- a/tests/test-log.t
> +++ b/tests/test-log.t
> @@ -102,6 +102,10 @@ log on directory
>    summary:     c
>
>
> +-X, with explicit path
> +
> +  $ hg log a -X a
> +
>  -f, non-existent directory
>
>    $ hg log -f dir
> diff --git a/tests/test-walk.t b/tests/test-walk.t
> --- a/tests/test-walk.t
> +++ b/tests/test-walk.t
> @@ -344,6 +344,21 @@
>    abort: path 'beans/.hg' is inside nested repo 'beans'
>    [255]
>
> +Test explicit paths and excludes:
> +(BROKEN: nothing should be included, but wctx.walk() does)
> +
> +  $ hg debugwalk fennel -X fennel
> +  matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:fennel(?:/|$))'>>
> +  f  fennel  fennel  exact
> +  $ hg debugwalk fennel -X 'f*'
> +  matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:f[^/]*(?:/|$))'>>
> +  f  fennel  fennel  exact
> +  $ hg debugwalk beans/black -X 'path:beans'
> +  matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>>
> +  f  beans/black  beans/black  exact
> +  $ hg debugwalk -I 'path:beans/black' -X 'path:beans'
> +  matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>>
> +
>  Test absolute paths:
>
>    $ hg debugwalk `pwd`/beans

Patch

diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -457,17 +457,10 @@  class exactmatcher(basematcher):
 
 class differencematcher(basematcher):
     '''Composes two matchers by matching if the first matches and the second
-    does not. Well, almost... If the user provides a pattern like "-X foo foo",
-    Mercurial actually does match "foo" against that. That's because exact
-    matches are treated specially. So, since this differencematcher is used for
-    excludes, it needs to special-case exact matching.
+    does not.
 
     The second matcher's non-matching-attributes (root, cwd, bad, explicitdir,
     traversedir) are ignored.
-
-    TODO: If we want to keep the behavior described above for exact matches, we
-    should consider instead treating the above case something like this:
-    union(exact(foo), difference(pattern(foo), include(foo)))
     '''
     def __init__(self, m1, m2):
         super(differencematcher, self).__init__(m1._root, m1._cwd)
@@ -478,7 +471,7 @@  class differencematcher(basematcher):
         self.traversedir = m1.traversedir
 
     def matchfn(self, f):
-        return self._m1(f) and (not self._m2(f) or self._m1.exact(f))
+        return self._m1(f) and not self._m2(f)
 
     @propertycache
     def _files(self):
diff --git a/tests/test-locate.t b/tests/test-locate.t
--- a/tests/test-locate.t
+++ b/tests/test-locate.t
@@ -103,6 +103,11 @@  Issue294: hg remove --after dir fails wh
   $ hg files b
   b
 
+-X with explicit path:
+
+  $ hg files b -X b
+  [1]
+
   $ mkdir otherdir
   $ cd otherdir
 
diff --git a/tests/test-log.t b/tests/test-log.t
--- a/tests/test-log.t
+++ b/tests/test-log.t
@@ -102,6 +102,10 @@  log on directory
   summary:     c
   
 
+-X, with explicit path
+
+  $ hg log a -X a
+
 -f, non-existent directory
 
   $ hg log -f dir
diff --git a/tests/test-walk.t b/tests/test-walk.t
--- a/tests/test-walk.t
+++ b/tests/test-walk.t
@@ -344,6 +344,21 @@ 
   abort: path 'beans/.hg' is inside nested repo 'beans'
   [255]
 
+Test explicit paths and excludes:
+(BROKEN: nothing should be included, but wctx.walk() does)
+
+  $ hg debugwalk fennel -X fennel
+  matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:fennel(?:/|$))'>>
+  f  fennel  fennel  exact
+  $ hg debugwalk fennel -X 'f*'
+  matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:f[^/]*(?:/|$))'>>
+  f  fennel  fennel  exact
+  $ hg debugwalk beans/black -X 'path:beans'
+  matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>>
+  f  beans/black  beans/black  exact
+  $ hg debugwalk -I 'path:beans/black' -X 'path:beans'
+  matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>>
+
 Test absolute paths:
 
   $ hg debugwalk `pwd`/beans