Submitter | Yuya Nishihara |
---|---|
Date | Feb. 11, 2018, 4:09 a.m. |
Message ID | <ba8846546795996af10f.1518322190@mimosa> |
Download | mbox | patch |
Permalink | /patch/27523/ |
State | Accepted |
Headers | show |
Comments
On Sun, Feb 11, 2018 at 01:09:50PM +0900, Yuya Nishihara wrote: > # HG changeset patch > # User Yuya Nishihara <yuya@tcha.org> > # Date 1516963719 -32400 > # Fri Jan 26 19:48:39 2018 +0900 > # Node ID ba8846546795996af10fe93db4317b855018d749 > # Parent 91aac8e6604d1aa08b2683c1d4c7d1936f226e48 > dirstate: drop explicitly-specified files that shouldn't match (BC) Hmm. I'm not sure I like this behavior change. The current behavior is consistent with how an explicitly-tracked file shows up in status even if it's matched by .hgignore. Any particular reason to go this route? > > Before, wctx.walk() could include files excluded by -X pattern, which > disagrees with wctx.matches() and ctx.walk()/matches() behavior. This patch > fixes the problem by testing stat results against the matcher if the matcher > may contain false paths. > > I have no idea if the fix should be made before the workaround for case- > insensitive filesystems, but that shouldn't matter since match.anypats() > means 'not match.isexact()'. > > This patch also makes sparse.py to not exclude explicit paths on walk() > because it appears that sparse depends on the buggy behavior. > > .. bc:: > > Working-directory commands now respect ``-X PATTERN`` no matter if PATTERN > matches explicitly-specified FILEs. For example, ``hg add foo -X foo`` no > longer add the file ``foo``. > > diff --git a/hgext/sparse.py b/hgext/sparse.py > --- a/hgext/sparse.py > +++ b/hgext/sparse.py > @@ -194,7 +194,11 @@ def _setupdirstate(ui): > """ > > def walk(orig, self, match, subrepos, unknown, ignored, full=True): > - match = matchmod.intersectmatchers(match, self._sparsematcher) > + # hack to not exclude explicitly-specified paths so that they can > + # be warned later on e.g. dirstate.add() > + em = matchmod.exact(match._root, match._cwd, match.files()) > + sm = matchmod.unionmatcher([self._sparsematcher, em]) > + match = matchmod.intersectmatchers(match, sm) > return orig(self, match, subrepos, unknown, ignored, full) > > extensions.wrapfunction(dirstate.dirstate, 'walk', walk) > diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py > --- a/mercurial/dirstate.py > +++ b/mercurial/dirstate.py > @@ -787,6 +787,17 @@ class dirstate(object): > else: > badfn(ff, encoding.strtolocal(inst.strerror)) > > + # match.files() may contain explicitly-specified paths that shouldn't > + # be taken; drop them from the list of files found. dirsfound/notfound > + # aren't filtered here because they will be tested later. > + if match.anypats(): > + for f in list(results): > + if f == '.hg' or f in subrepos: > + # keep sentinel to disable further out-of-repo walks > + continue > + if not match(f): > + del results[f] > + > # Case insensitive filesystems cannot rely on lstat() failing to detect > # a case-only rename. Prune the stat object for any file that does not > # match the case in the filesystem, if there are multiple files that > diff --git a/tests/test-add.t b/tests/test-add.t > --- a/tests/test-add.t > +++ b/tests/test-add.t > @@ -146,6 +146,13 @@ Issue683: peculiarity with hg revert of > M a > ? a.orig > > +excluded file shouldn't be added even if it is explicitly specified > + > + $ hg add a.orig -X '*.orig' > + $ hg st > + M a > + ? a.orig > + > Forgotten file can be added back (as either clean or modified) > > $ hg forget b > diff --git a/tests/test-sparse.t b/tests/test-sparse.t > --- a/tests/test-sparse.t > +++ b/tests/test-sparse.t > @@ -129,6 +129,10 @@ Adding an excluded file should fail > (include file with `hg debugsparse --include <pattern>` or use `hg add -s <file>` to include file directory while adding) > [255] > > +But adding a truly excluded file shouldn't count > + > + $ hg add hide3 -X hide3 > + > Verify deleting sparseness while a file has changes fails > > $ hg debugsparse --delete 'show*' > diff --git a/tests/test-walk.t b/tests/test-walk.t > --- a/tests/test-walk.t > +++ b/tests/test-walk.t > @@ -304,12 +304,10 @@ > f beans/turtle beans/turtle > $ hg debugwalk -Xbeans/black beans/black > matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans\\/black(?:/|$))'>> > - f beans/black beans/black exact > $ hg debugwalk -Xbeans/black -Ibeans/black > matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans\\/black(?:/|$))'>> > $ hg debugwalk -Xbeans beans/black > matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> > - f beans/black beans/black exact > $ hg debugwalk -Xbeans -Ibeans/black > matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> > $ hg debugwalk 'glob:mammals/../beans/b*' > @@ -345,17 +343,13 @@ > [255] > > Test explicit paths and excludes: > -(BROKEN: nothing should be included, but wctx.walk() does) > > $ hg debugwalk fennel -X fennel > matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:fennel(?:/|$))'>> > - f fennel fennel exact > $ hg debugwalk fennel -X 'f*' > matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:f[^/]*(?:/|$))'>> > - f fennel fennel exact > $ hg debugwalk beans/black -X 'path:beans' > matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> > - f beans/black beans/black exact > $ hg debugwalk -I 'path:beans/black' -X 'path:beans' > matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
On Sun, 11 Feb 2018 08:58:48 -0500, Augie Fackler wrote: > On Sun, Feb 11, 2018 at 01:09:50PM +0900, Yuya Nishihara wrote: > > # HG changeset patch > > # User Yuya Nishihara <yuya@tcha.org> > > # Date 1516963719 -32400 > > # Fri Jan 26 19:48:39 2018 +0900 > > # Node ID ba8846546795996af10fe93db4317b855018d749 > > # Parent 91aac8e6604d1aa08b2683c1d4c7d1936f226e48 > > dirstate: drop explicitly-specified files that shouldn't match (BC) > > Hmm. I'm not sure I like this behavior change. The current behavior is > consistent with how an explicitly-tracked file shows up in status even > if it's matched by .hgignore. > > Any particular reason to go this route? The current behavior is inconsistent with other commands, notably log and files. $ hg files hg -X hg Even more, it's inconsistent with status --change. $ hg status --all hg -X hg C hg $ hg status --all hg -X hg --change . So I believe it's a bug of dirsate.walk().
On Sun, 11 Feb 2018 09:17:42 -0500, Yuya Nishihara <yuya@tcha.org> wrote: > On Sun, 11 Feb 2018 08:58:48 -0500, Augie Fackler wrote: >> On Sun, Feb 11, 2018 at 01:09:50PM +0900, Yuya Nishihara wrote: >> > # HG changeset patch >> > # User Yuya Nishihara <yuya@tcha.org> >> > # Date 1516963719 -32400 >> > # Fri Jan 26 19:48:39 2018 +0900 >> > # Node ID ba8846546795996af10fe93db4317b855018d749 >> > # Parent 91aac8e6604d1aa08b2683c1d4c7d1936f226e48 >> > dirstate: drop explicitly-specified files that shouldn't match (BC) >> >> Hmm. I'm not sure I like this behavior change. The current behavior is >> consistent with how an explicitly-tracked file shows up in status even >> if it's matched by .hgignore. >> >> Any particular reason to go this route? > > The current behavior is inconsistent with other commands, notably log and > files. > > $ hg files hg -X hg > > Even more, it's inconsistent with status --change. > > $ hg status --all hg -X hg > C hg > $ hg status --all hg -X hg --change . > > So I believe it's a bug of dirsate.walk(). Related: https://bz.mercurial-scm.org/show_bug.cgi?id=4679
I agree with Yuya (and apparently with former me; thanks for digging that up, Matt). I think it's just an unintended consequence of how the dirstate walk works, but I'm not sure. The exception for explicit files also bothered me when I was working on the matcher code a year or so ago. I actually added the exception to the matcher code because I thought it was always working like that (not just for dirstate) in a83a7d27911e (match: handle excludes using new differencematcher, 2017-05-16). It was only recently that Yuya realized that it used to be inconsistent and that I probably made it consistently bad because I didn't realize it was inconsistent to start with, see 821d8a5ab4ff (match: do not weirdly include explicit files excluded by -X option, 2018-01-16). So I guess I just think it feels wrong to include explicit matches. I wish I could point to something that would actually crash (e.g. narrowhg), but I can't think of anything. Also, thanks for working on this, Yuya! I feel a bit guilty for never getting around to it. On Sun, Feb 11, 2018 at 12:54 PM Matt Harbison <mharbison72@gmail.com> wrote: > On Sun, 11 Feb 2018 09:17:42 -0500, Yuya Nishihara <yuya@tcha.org> wrote: > > > On Sun, 11 Feb 2018 08:58:48 -0500, Augie Fackler wrote: > >> On Sun, Feb 11, 2018 at 01:09:50PM +0900, Yuya Nishihara wrote: > >> > # HG changeset patch > >> > # User Yuya Nishihara <yuya@tcha.org> > >> > # Date 1516963719 -32400 > >> > # Fri Jan 26 19:48:39 2018 +0900 > >> > # Node ID ba8846546795996af10fe93db4317b855018d749 > >> > # Parent 91aac8e6604d1aa08b2683c1d4c7d1936f226e48 > >> > dirstate: drop explicitly-specified files that shouldn't match (BC) > >> > >> Hmm. I'm not sure I like this behavior change. The current behavior is > >> consistent with how an explicitly-tracked file shows up in status even > >> if it's matched by .hgignore. > >> > >> Any particular reason to go this route? > > > > The current behavior is inconsistent with other commands, notably log and > > files. > > > > $ hg files hg -X hg > > > > Even more, it's inconsistent with status --change. > > > > $ hg status --all hg -X hg > > C hg > > $ hg status --all hg -X hg --change . > > > > So I believe it's a bug of dirsate.walk(). > > Related: https://bz.mercurial-scm.org/show_bug.cgi?id=4679 > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel >
Queued, per the very helpful explanations. Thanks all! > On Feb 13, 2018, at 12:28, Martin von Zweigbergk <martinvonz@google.com> wrote: > > I agree with Yuya (and apparently with former me; thanks for digging that up, Matt). I think it's just an unintended consequence of how the dirstate walk works, but I'm not sure. The exception for explicit files also bothered me when I was working on the matcher code a year or so ago. I actually added the exception to the matcher code because I thought it was always working like that (not just for dirstate) in a83a7d27911e (match: handle excludes using new differencematcher, 2017-05-16). It was only recently that Yuya realized that it used to be inconsistent and that I probably made it consistently bad because I didn't realize it was inconsistent to start with, see 821d8a5ab4ff (match: do not weirdly include explicit files excluded by -X option, 2018-01-16). > > So I guess I just think it feels wrong to include explicit matches. I wish I could point to something that would actually crash (e.g. narrowhg), but I can't think of anything. > > Also, thanks for working on this, Yuya! I feel a bit guilty for never getting around to it. > > On Sun, Feb 11, 2018 at 12:54 PM Matt Harbison <mharbison72@gmail.com <mailto:mharbison72@gmail.com>> wrote: > On Sun, 11 Feb 2018 09:17:42 -0500, Yuya Nishihara <yuya@tcha.org <mailto:yuya@tcha.org>> wrote: > > > On Sun, 11 Feb 2018 08:58:48 -0500, Augie Fackler wrote: > >> On Sun, Feb 11, 2018 at 01:09:50PM +0900, Yuya Nishihara wrote: > >> > # HG changeset patch > >> > # User Yuya Nishihara <yuya@tcha.org <mailto:yuya@tcha.org>> > >> > # Date 1516963719 -32400 > >> > # Fri Jan 26 19:48:39 2018 +0900 > >> > # Node ID ba8846546795996af10fe93db4317b855018d749 > >> > # Parent 91aac8e6604d1aa08b2683c1d4c7d1936f226e48 > >> > dirstate: drop explicitly-specified files that shouldn't match (BC) > >> > >> Hmm. I'm not sure I like this behavior change. The current behavior is > >> consistent with how an explicitly-tracked file shows up in status even > >> if it's matched by .hgignore. > >> > >> Any particular reason to go this route? > > > > The current behavior is inconsistent with other commands, notably log and > > files. > > > > $ hg files hg -X hg > > > > Even more, it's inconsistent with status --change. > > > > $ hg status --all hg -X hg > > C hg > > $ hg status --all hg -X hg --change . > > > > So I believe it's a bug of dirsate.walk(). > > Related: https://bz.mercurial-scm.org/show_bug.cgi?id=4679 <https://bz.mercurial-scm.org/show_bug.cgi?id=4679> > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org <mailto:Mercurial-devel@mercurial-scm.org> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel <https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel>
Patch
diff --git a/hgext/sparse.py b/hgext/sparse.py --- a/hgext/sparse.py +++ b/hgext/sparse.py @@ -194,7 +194,11 @@ def _setupdirstate(ui): """ def walk(orig, self, match, subrepos, unknown, ignored, full=True): - match = matchmod.intersectmatchers(match, self._sparsematcher) + # hack to not exclude explicitly-specified paths so that they can + # be warned later on e.g. dirstate.add() + em = matchmod.exact(match._root, match._cwd, match.files()) + sm = matchmod.unionmatcher([self._sparsematcher, em]) + match = matchmod.intersectmatchers(match, sm) return orig(self, match, subrepos, unknown, ignored, full) extensions.wrapfunction(dirstate.dirstate, 'walk', walk) diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py --- a/mercurial/dirstate.py +++ b/mercurial/dirstate.py @@ -787,6 +787,17 @@ class dirstate(object): else: badfn(ff, encoding.strtolocal(inst.strerror)) + # match.files() may contain explicitly-specified paths that shouldn't + # be taken; drop them from the list of files found. dirsfound/notfound + # aren't filtered here because they will be tested later. + if match.anypats(): + for f in list(results): + if f == '.hg' or f in subrepos: + # keep sentinel to disable further out-of-repo walks + continue + if not match(f): + del results[f] + # Case insensitive filesystems cannot rely on lstat() failing to detect # a case-only rename. Prune the stat object for any file that does not # match the case in the filesystem, if there are multiple files that diff --git a/tests/test-add.t b/tests/test-add.t --- a/tests/test-add.t +++ b/tests/test-add.t @@ -146,6 +146,13 @@ Issue683: peculiarity with hg revert of M a ? a.orig +excluded file shouldn't be added even if it is explicitly specified + + $ hg add a.orig -X '*.orig' + $ hg st + M a + ? a.orig + Forgotten file can be added back (as either clean or modified) $ hg forget b diff --git a/tests/test-sparse.t b/tests/test-sparse.t --- a/tests/test-sparse.t +++ b/tests/test-sparse.t @@ -129,6 +129,10 @@ Adding an excluded file should fail (include file with `hg debugsparse --include <pattern>` or use `hg add -s <file>` to include file directory while adding) [255] +But adding a truly excluded file shouldn't count + + $ hg add hide3 -X hide3 + Verify deleting sparseness while a file has changes fails $ hg debugsparse --delete 'show*' diff --git a/tests/test-walk.t b/tests/test-walk.t --- a/tests/test-walk.t +++ b/tests/test-walk.t @@ -304,12 +304,10 @@ f beans/turtle beans/turtle $ hg debugwalk -Xbeans/black beans/black matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans\\/black(?:/|$))'>> - f beans/black beans/black exact $ hg debugwalk -Xbeans/black -Ibeans/black matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans\\/black(?:/|$))'>> $ hg debugwalk -Xbeans beans/black matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> - f beans/black beans/black exact $ hg debugwalk -Xbeans -Ibeans/black matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> $ hg debugwalk 'glob:mammals/../beans/b*' @@ -345,17 +343,13 @@ [255] Test explicit paths and excludes: -(BROKEN: nothing should be included, but wctx.walk() does) $ hg debugwalk fennel -X fennel matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:fennel(?:/|$))'>> - f fennel fennel exact $ hg debugwalk fennel -X 'f*' matcher: <differencematcher m1=<patternmatcher patterns='(?:fennel(?:/|$))'>, m2=<includematcher includes='(?:f[^/]*(?:/|$))'>> - f fennel fennel exact $ hg debugwalk beans/black -X 'path:beans' matcher: <differencematcher m1=<patternmatcher patterns='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>> - f beans/black beans/black exact $ hg debugwalk -I 'path:beans/black' -X 'path:beans' matcher: <differencematcher m1=<includematcher includes='(?:beans\\/black(?:/|$))'>, m2=<includematcher includes='(?:beans(?:/|$))'>>