Patchwork [2,of,2] localrepo: check for case collisions at commit time (issue4665) (BC)

login
register
mail settings
Submitter Gregory Szorc
Date Aug. 18, 2016, 5:28 a.m.
Message ID <e8d4db665bf172344d23.1471498130@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/16346/
State Changes Requested
Headers show

Comments

Gregory Szorc - Aug. 18, 2016, 5:28 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1471497059 25200
#      Wed Aug 17 22:10:59 2016 -0700
# Node ID e8d4db665bf172344d2354c17a0eb8f69ef265c7
# Parent  04ba5f1f6c28674ed2df750a8acca02021b2698c
localrepo: check for case collisions at commit time (issue4665) (BC)

Before, case collisions were only reported when performing `hg add`.
This meant that many commands that could add or rename files (including
graft, rebase, histedit, backout, etc) weren't audited for case
collisions.

This patch adds case collision detection to the lower-level
repo.commit() so that any time a commit is performed from the working
directory state we look for case collisions.

As the test changes show, this results in a warning or abort at
`hg commit` time.

It would arguably be even better to perform case collision detection
in localrepository.commitctx(), which is the lowest level function
for adding a new changeset. But I'm not sure if that's appropriate.

The added code in scmutil.py could likely be consolidated with existing
code. It's short enough that I'm fine with the DRY violation. But if
someone wants to push back, I'll understand.
Gregory Szorc - Aug. 18, 2016, 5:48 a.m.
> On Aug 17, 2016, at 22:28, Gregory Szorc <gregory.szorc@gmail.com> wrote:
> 
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1471497059 25200
> #      Wed Aug 17 22:10:59 2016 -0700
> # Node ID e8d4db665bf172344d2354c17a0eb8f69ef265c7
> # Parent  04ba5f1f6c28674ed2df750a8acca02021b2698c
> localrepo: check for case collisions at commit time (issue4665) (BC)
> 
> Before, case collisions were only reported when performing `hg add`.
> This meant that many commands that could add or rename files (including
> graft, rebase, histedit, backout, etc) weren't audited for case
> collisions.
> 
> This patch adds case collision detection to the lower-level
> repo.commit() so that any time a commit is performed from the working
> directory state we look for case collisions.
> 
> As the test changes show, this results in a warning or abort at
> `hg commit` time.
> 
> It would arguably be even better to perform case collision detection
> in localrepository.commitctx(), which is the lowest level function
> for adding a new changeset. But I'm not sure if that's appropriate.
> 
> The added code in scmutil.py could likely be consolidated with existing
> code. It's short enough that I'm fine with the DRY violation. But if
> someone wants to push back, I'll understand.
> 
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1581,16 +1581,24 @@ class localrepository(object):
>                     '.hgsubstate' not in (status.modified + status.added +
>                                           status.removed)):
>                     status.removed.insert(0, '.hgsubstate')
> 
>             # make sure all explicit patterns are matched
>             if not force:
>                 self.checkcommitpatterns(wctx, vdirs, match, status, fail)
> 
> +            # Audit added files for case collisions.
> +            caseabort, casewarn = scmutil.checkportabilityalert(self.ui)
> +            if caseabort or casewarn:
> +                cca = scmutil.ctxcasecollisionauditor(self.ui, wctx.p1(),
> +                                                      caseabort)
> +                for f in status.added:
> +                    cca(f)
> +
>             cctx = context.workingcommitctx(self, status,
>                                             text, user, date, extra)
> 
>             # internal config: ui.allowemptycommit
>             allowemptycommit = (wctx.branch() != wctx.p1().branch()
>                                 or extra.get('close') or merge or cctx.files()
>                                 or self.ui.configbool('ui', 'allowemptycommit'))
>             if not allowemptycommit:
> diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
> --- a/mercurial/scmutil.py
> +++ b/mercurial/scmutil.py
> @@ -202,16 +202,33 @@ class casecollisionauditor(object):
>         if fl in self._loweredfiles and f not in self._dirstate:
>             msg = _('possible case-folding collision for %s') % f
>             if self._abort:
>                 raise error.Abort(msg)
>             self._ui.warn(_("warning: %s\n") % msg)
>         self._loweredfiles.add(fl)
>         self._newfiles.add(f)
> 
> +class ctxcasecollisionauditor(object):
> +    """A case collision auditor that works on a changectx."""
> +    def __init__(self, ui, ctx, abort):
> +        self._ui = ui
> +        self._abort = abort
> +        self._loweredfiles = set(encoding.lower(p) for p in ctx.manifest())

This line has scalability issues. I'm open for suggestions on how to deal with this.

> +
> +    def __call__(self, path):
> +        l = encoding.lower(path)
> +        if l in self._loweredfiles:
> +            msg = _('possible case-folding collision for %s') % path
> +            if self._abort:
> +                raise error.Abort(msg)
> +            self._ui.warn(_('warning: %s\n') % msg)
> +
> +        self._loweredfiles.add(l)
> +
> def filteredhash(repo, maxrev):
>     """build hash of filtered revisions in the current repoview.
> 
>     Multiple caches perform up-to-date validation by checking that the
>     tiprev and tipnode stored in the cache file match the current repository.
>     However, this is not sufficient for validating repoviews because the set
>     of revisions in the view may change without the repository tiprev and
>     tipnode changing.
> diff --git a/tests/test-casecollision.t b/tests/test-casecollision.t
> --- a/tests/test-casecollision.t
> +++ b/tests/test-casecollision.t
> @@ -24,65 +24,88 @@ test file addition with colliding case
>   $ hg forget A
>   $ hg st
>   A a
>   ? A
>   $ hg add --config ui.portablefilenames=no A
>   $ hg st
>   A A
>   A a
> +
> +Abort at commit time when case-folding collisions aren't allowed
> +
> +  $ hg --config ui.portablefilenames=abort commit -m 'add A a'
> +  abort: possible case-folding collision for a
> +  [255]
> +
>   $ hg commit -m 'add A a'
> +  warning: possible case-folding collision for a
> 
>   $ mkdir b
>   $ touch b/c b/D
>   $ hg add b
>   adding b/D
>   adding b/c
>   $ hg commit -m 'add b/c b/D'
> 
>   $ touch b/d b/C
>   $ hg add b/C
>   warning: possible case-folding collision for b/C
>   $ hg add b/d
>   warning: possible case-folding collision for b/d
> -  $ hg commit -m 'add b/C b/d'
> +  $ hg --config ui.portablefilenames=no commit -m 'add b/C b/d'
> 
>   $ touch b/a1 b/a2
>   $ hg add b
>   adding b/a1
>   adding b/a2
>   $ hg commit -m 'add b/a1 b/a2'
> 
>   $ touch b/A2 b/a1.1
>   $ hg add b/a1.1 b/A2
>   warning: possible case-folding collision for b/A2
>   $ hg commit -m 'add b/A2 b/a1.1'
> +  warning: possible case-folding collision for b/A2
> 
>   $ touch b/f b/F
>   $ hg add b/f b/F
>   warning: possible case-folding collision for b/f
>   $ hg commit -m 'add b/f b/F'
> +  warning: possible case-folding collision for b/f
> 
>   $ touch g G
>   $ hg add g G
>   warning: possible case-folding collision for g
>   $ hg commit -m 'add g G'
> +  warning: possible case-folding collision for g
> 
>   $ mkdir h H
>   $ touch h/x H/x
>   $ hg add h/x H/x
>   warning: possible case-folding collision for h/x
>   $ hg commit -m 'add h/x H/x'
> +  warning: possible case-folding collision for h/x
> 
>   $ touch h/s H/s
>   $ hg add h/s
>   $ hg add H/s
>   warning: possible case-folding collision for H/s
>   $ hg commit -m 'add h/s H/s'
> +  warning: possible case-folding collision for h/s
> +
> +commit -A will not double alert
> +
> +  $ mkdir i I
> +  $ touch i/s I/s
> +  $ hg commit -A -m 'add i/s I/s'
> +  adding I/s
> +  adding i/s
> +  warning: possible case-folding collision for i/s
> 
> case changing rename must not warn or abort
> 
>   $ echo c > c
>   $ hg ci -qAmx
>   $ hg mv c C
>   $ hg commit -m 'mv c C'
> +  warning: possible case-folding collision for C
> 
>   $ cd ..
Yuya Nishihara - Aug. 21, 2016, 2:41 p.m.
On Wed, 17 Aug 2016 22:28:50 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1471497059 25200
> #      Wed Aug 17 22:10:59 2016 -0700
> # Node ID e8d4db665bf172344d2354c17a0eb8f69ef265c7
> # Parent  04ba5f1f6c28674ed2df750a8acca02021b2698c
> localrepo: check for case collisions at commit time (issue4665) (BC)

> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1581,16 +1581,24 @@ class localrepository(object):
>                      '.hgsubstate' not in (status.modified + status.added +
>                                            status.removed)):
>                      status.removed.insert(0, '.hgsubstate')
>  
>              # make sure all explicit patterns are matched
>              if not force:
>                  self.checkcommitpatterns(wctx, vdirs, match, status, fail)
>  
> +            # Audit added files for case collisions.
> +            caseabort, casewarn = scmutil.checkportabilityalert(self.ui)
> +            if caseabort or casewarn:
> +                cca = scmutil.ctxcasecollisionauditor(self.ui, wctx.p1(),
> +                                                      caseabort)

Maybe it wouldn't detect case collisions at merge commit.

> +                for f in status.added:
> +                    cca(f)

I think status.added isn't good enough to detect case collision at commit time.
For example, case-changing rename shouldn't be warned nor aborted.

  $ hg mv a A
  $ hg ci

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1581,16 +1581,24 @@  class localrepository(object):
                     '.hgsubstate' not in (status.modified + status.added +
                                           status.removed)):
                     status.removed.insert(0, '.hgsubstate')
 
             # make sure all explicit patterns are matched
             if not force:
                 self.checkcommitpatterns(wctx, vdirs, match, status, fail)
 
+            # Audit added files for case collisions.
+            caseabort, casewarn = scmutil.checkportabilityalert(self.ui)
+            if caseabort or casewarn:
+                cca = scmutil.ctxcasecollisionauditor(self.ui, wctx.p1(),
+                                                      caseabort)
+                for f in status.added:
+                    cca(f)
+
             cctx = context.workingcommitctx(self, status,
                                             text, user, date, extra)
 
             # internal config: ui.allowemptycommit
             allowemptycommit = (wctx.branch() != wctx.p1().branch()
                                 or extra.get('close') or merge or cctx.files()
                                 or self.ui.configbool('ui', 'allowemptycommit'))
             if not allowemptycommit:
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -202,16 +202,33 @@  class casecollisionauditor(object):
         if fl in self._loweredfiles and f not in self._dirstate:
             msg = _('possible case-folding collision for %s') % f
             if self._abort:
                 raise error.Abort(msg)
             self._ui.warn(_("warning: %s\n") % msg)
         self._loweredfiles.add(fl)
         self._newfiles.add(f)
 
+class ctxcasecollisionauditor(object):
+    """A case collision auditor that works on a changectx."""
+    def __init__(self, ui, ctx, abort):
+        self._ui = ui
+        self._abort = abort
+        self._loweredfiles = set(encoding.lower(p) for p in ctx.manifest())
+
+    def __call__(self, path):
+        l = encoding.lower(path)
+        if l in self._loweredfiles:
+            msg = _('possible case-folding collision for %s') % path
+            if self._abort:
+                raise error.Abort(msg)
+            self._ui.warn(_('warning: %s\n') % msg)
+
+        self._loweredfiles.add(l)
+
 def filteredhash(repo, maxrev):
     """build hash of filtered revisions in the current repoview.
 
     Multiple caches perform up-to-date validation by checking that the
     tiprev and tipnode stored in the cache file match the current repository.
     However, this is not sufficient for validating repoviews because the set
     of revisions in the view may change without the repository tiprev and
     tipnode changing.
diff --git a/tests/test-casecollision.t b/tests/test-casecollision.t
--- a/tests/test-casecollision.t
+++ b/tests/test-casecollision.t
@@ -24,65 +24,88 @@  test file addition with colliding case
   $ hg forget A
   $ hg st
   A a
   ? A
   $ hg add --config ui.portablefilenames=no A
   $ hg st
   A A
   A a
+
+Abort at commit time when case-folding collisions aren't allowed
+
+  $ hg --config ui.portablefilenames=abort commit -m 'add A a'
+  abort: possible case-folding collision for a
+  [255]
+
   $ hg commit -m 'add A a'
+  warning: possible case-folding collision for a
 
   $ mkdir b
   $ touch b/c b/D
   $ hg add b
   adding b/D
   adding b/c
   $ hg commit -m 'add b/c b/D'
 
   $ touch b/d b/C
   $ hg add b/C
   warning: possible case-folding collision for b/C
   $ hg add b/d
   warning: possible case-folding collision for b/d
-  $ hg commit -m 'add b/C b/d'
+  $ hg --config ui.portablefilenames=no commit -m 'add b/C b/d'
 
   $ touch b/a1 b/a2
   $ hg add b
   adding b/a1
   adding b/a2
   $ hg commit -m 'add b/a1 b/a2'
 
   $ touch b/A2 b/a1.1
   $ hg add b/a1.1 b/A2
   warning: possible case-folding collision for b/A2
   $ hg commit -m 'add b/A2 b/a1.1'
+  warning: possible case-folding collision for b/A2
 
   $ touch b/f b/F
   $ hg add b/f b/F
   warning: possible case-folding collision for b/f
   $ hg commit -m 'add b/f b/F'
+  warning: possible case-folding collision for b/f
 
   $ touch g G
   $ hg add g G
   warning: possible case-folding collision for g
   $ hg commit -m 'add g G'
+  warning: possible case-folding collision for g
 
   $ mkdir h H
   $ touch h/x H/x
   $ hg add h/x H/x
   warning: possible case-folding collision for h/x
   $ hg commit -m 'add h/x H/x'
+  warning: possible case-folding collision for h/x
 
   $ touch h/s H/s
   $ hg add h/s
   $ hg add H/s
   warning: possible case-folding collision for H/s
   $ hg commit -m 'add h/s H/s'
+  warning: possible case-folding collision for h/s
+
+commit -A will not double alert
+
+  $ mkdir i I
+  $ touch i/s I/s
+  $ hg commit -A -m 'add i/s I/s'
+  adding I/s
+  adding i/s
+  warning: possible case-folding collision for i/s
 
 case changing rename must not warn or abort
 
   $ echo c > c
   $ hg ci -qAmx
   $ hg mv c C
   $ hg commit -m 'mv c C'
+  warning: possible case-folding collision for C
 
   $ cd ..