Patchwork [2,of,7,STABLE] merge: check collision between files and directories before updating (issue29)

login
register
mail settings
Submitter Katsunori FUJIWARA
Date May 6, 2013, 8:35 p.m.
Message ID <1bede396d3f4dfd7763e.1367872502@juju>
Download mbox | patch
Permalink /patch/1566/
State Superseded, archived
Commit 9bfa86746c9c1f6ab51deb8f174ffc482417d09f
Headers show

Comments

Katsunori FUJIWARA - May 6, 2013, 8:35 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1367870651 -32400
#      Tue May 07 05:04:11 2013 +0900
# Branch stable
# Node ID 1bede396d3f4dfd7763e214befd12fea9df63130
# Parent  127d7233b406699617f6dbcbc34e5ea00c2aa223
merge: check collision between files and directories before updating (issue29)

For example, it is assumed that the revision X consists file 'A', and
the revision Y consists of file 'A/A'. Merging between X and Y causes
collision between file 'A' (on X) and directory 'A' of 'A/A' (on Y).

Before this patch, such collision is detected via failure of "mkdir()"
on existing file 'A' (merging Y into X) or "unlink()" on directory 'A'
of 'A/A' (merging X into Y) while updating working directory.

This causes incomplete working directory status.

This patch checks collision between files and directories before
updating working directory.

Collision check logic is a little redundant to fix collision problem
with largefiles extension easily in succeeding patch.

This patch skips collision check only for clean updating on case
sensitive filesystem, because the files and the directories in the
merged manifest may collide each other, even on case sensitive
filesystem.

This patch also adds "no-icasefs" to "#if" directive in
test-audit-path.t, because collision between symbolic-link 'back' and
directory 'back' of 'back/test' is detected before check by
pathauditor on case insensitive filesystem, and this changes detail
of abort message.
Siddharth Agarwal - May 7, 2013, 9:14 p.m.
On 05/06/2013 01:35 PM, FUJIWARA Katsunori wrote:
> +
> +    # check collision in provisional merged manifest
> +    if normcase:
> +        def normiter():
> +            for f in pmmf:
> +                yield normcase(f)
> +        dirs = scmutil.dirs(normiter())
> +        for f in sorted(pmmf):
> +            check(f, normcase(f))
> +    else:
> +        dirs = scmutil.dirs(pmmf)
> +        for f in sorted(pmmf):
> +            check(f, f)

This is going to be super expensive for large repositories, involving:
- listing all the dirs, which is faster than it used to be but still 
fairly slow.
- a sorted call, which is even more expensive than dirs!
- a call to check() for each file.

Having the list of dirs may be necessary, but maybe there's a way of 
reusing the dirs from the dirstate that we may already have? The sorted 
call on the full manifest can be replaced with building up a list of 
files you need to print (which will be much smaller) and sorting that 
instead. The check call can be inlined.
Matt Mackall - May 7, 2013, 9:36 p.m.
On Tue, 2013-05-07 at 14:14 -0700, Siddharth Agarwal wrote:
> On 05/06/2013 01:35 PM, FUJIWARA Katsunori wrote:
> > +
> > +    # check collision in provisional merged manifest
> > +    if normcase:
> > +        def normiter():
> > +            for f in pmmf:
> > +                yield normcase(f)
> > +        dirs = scmutil.dirs(normiter())
> > +        for f in sorted(pmmf):
> > +            check(f, normcase(f))
> > +    else:
> > +        dirs = scmutil.dirs(pmmf)
> > +        for f in sorted(pmmf):
> > +            check(f, f)
> 
> This is going to be super expensive for large repositories, involving:
> - listing all the dirs, which is faster than it used to be but still 
> fairly slow.
> - a sorted call, which is even more expensive than dirs!
> - a call to check() for each file.

I've asked Siddarth to take a look at this patch because he's been
trying to optimize this code for a repo that has half a million files
where this stuff becomes very noticeable.

One optimization that suggests itself: we might not need to build dirs
if we're not creating any new files. We currently don't have a
distinction in the action list for getting a new file and overriding an
old one, but we could add one.

> Having the list of dirs may be necessary, but maybe there's a way of 
> reusing the dirs from the dirstate that we may already have? The sorted 
> call on the full manifest can be replaced with building up a list of 
> files you need to print (which will be much smaller) and sorting that 
> instead. The check call can be inlined.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -8,7 +8,7 @@ 
 from node import nullid, nullrev, hex, bin
 from i18n import _
 from mercurial import obsolete
-import error, util, filemerge, copies, subrepo, worker, dicthelpers
+import error, util, filemerge, copies, subrepo, worker, dicthelpers, scmutil
 import errno, os, shutil
 
 class mergestate(object):
@@ -138,7 +138,7 @@ 
 
     return actions
 
-def _checkcollision(repo, wmf, actions, prompts):
+def _checkcollision(repo, wmf, actions, prompts, casesensitive):
     # build provisional merged manifest up
     pmmf = set(wmf)
 
@@ -185,14 +185,36 @@ 
         assert op, m
         op(f, None)
 
-    # check case-folding collision in provisional merged manifest
-    foldmap = {}
-    for f in sorted(pmmf):
-        fold = util.normcase(f)
-        if fold in foldmap:
-            raise util.Abort(_("case-folding collision between %s and %s")
-                             % (f, foldmap[fold]))
-        foldmap[fold] = f
+    if casesensitive:
+        normcase = None
+        def check(f, nf):
+            if nf in dirs:
+                raise util.Abort(_("collision between file %s and "
+                                   "other directory") % (f))
+    else:
+        normcase = util.normcase
+        foldmap = {}
+        def check(f, nf):
+            if nf in dirs:
+                raise util.Abort(_("collision between file %s and "
+                                   "other directory") % (f))
+            if nf in foldmap:
+                raise util.Abort(_("case-folding collision between %s and %s")
+                                 % (f, foldmap[nf]))
+            foldmap[nf] = f
+
+    # check collision in provisional merged manifest
+    if normcase:
+        def normiter():
+            for f in pmmf:
+                yield normcase(f)
+        dirs = scmutil.dirs(normiter())
+        for f in sorted(pmmf):
+            check(f, normcase(f))
+    else:
+        dirs = scmutil.dirs(pmmf)
+        for f in sorted(pmmf):
+            check(f, f)
 
 def manifestmerge(repo, wctx, p2, pa, branchmerge, force, partial,
                   acceptremote=False):
@@ -350,13 +372,15 @@ 
         raise util.Abort(_("untracked files in working directory differ "
                            "from files in requested revision"))
 
-    if not util.checkcase(repo.path):
-        # check collision between files only in p2 for clean update
-        if (not branchmerge and
-            (force or not wctx.dirty(missing=True, branch=False))):
-            _checkcollision(repo, m2, [], [])
-        else:
-            _checkcollision(repo, m1, actions, prompts)
+    checkcase = util.checkcase(repo.path)
+    if (not branchmerge and
+        (force or not wctx.dirty(missing=True, branch=False))):
+        if not checkcase:
+            # check is not needed on case sensitive filesystem,
+            # because clean updating should not cause collision
+            _checkcollision(repo, m2, [], [], checkcase)
+    else:
+        _checkcollision(repo, m1, actions, prompts, checkcase)
 
     for f, m in sorted(prompts):
         if m == "cd":
diff --git a/tests/test-audit-path.t b/tests/test-audit-path.t
--- a/tests/test-audit-path.t
+++ b/tests/test-audit-path.t
@@ -62,7 +62,7 @@ 
   $ hg manifest -r2
   back
   back/test
-#if symlink
+#if symlink no-icasefs
   $ hg update -Cr2
   abort: path 'back/test' traverses symbolic link 'back'
   [255]
diff --git a/tests/test-casecollision-merge.t b/tests/test-casecollision-merge.t
--- a/tests/test-casecollision-merge.t
+++ b/tests/test-casecollision-merge.t
@@ -191,6 +191,29 @@ 
   M x
   C A
 
+Test for issue29: collision between file and directory, of which names
+are different from each other only in the letter case, should be
+detected on case insensitive filesystem.
+
+  $ hg update -q --clean 2
+  $ mkdir X
+  $ echo X/X > X/X
+  $ hg add X/X
+  $ hg commit -m '#5'
+  $ hg manifest -r 5
+  A
+  X/X
+  $ hg manifest -r 3
+  a
+  x
+  $ hg merge 3
+  abort: collision between file x and other directory
+  [255]
+  $ hg update -q --clean 3
+  $ hg merge 5
+  abort: collision between file x and other directory
+  [255]
+
   $ cd ..
 
 
diff --git a/tests/test-merge1.t b/tests/test-merge1.t
--- a/tests/test-merge1.t
+++ b/tests/test-merge1.t
@@ -173,4 +173,27 @@ 
   $ hg revert -r -2 b
   $ hg up -q -- -2
 
+Test for issue29
+
+  $ hg update -q -C 2
+  $ mkdir c
+  $ echo c/c > c/c
+  $ hg add c/c
+  $ hg ci -m "add c/c"
+  $ hg manifest -r 5
+  a
+  b
+  c/c
+  $ hg manifest -r 3
+  a
+  b
+  c
+  $ hg merge 3
+  abort: collision between file c and other directory
+  [255]
+  $ hg update -q -C 3
+  $ hg merge 5
+  abort: collision between file c and other directory
+  [255]
+
   $ cd ..