Patchwork [STABLE] status: don't crash if a lookup file disappears

login
register
mail settings
Submitter Siddharth Agarwal
Date June 3, 2017, 5:28 a.m.
Message ID <d39f934da0c80a568486.1496467687@devvm31800.prn1.facebook.com>
Download mbox | patch
Permalink /patch/21145/
State Accepted
Headers show

Comments

Siddharth Agarwal - June 3, 2017, 5:28 a.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1496467672 25200
#      Fri Jun 02 22:27:52 2017 -0700
# Branch stable
# Node ID d39f934da0c80a568486cd8645eb6bbe513f1f03
# Parent  62e42e2897502bcbaa3a57d3301c789309596391
status: don't crash if a lookup file disappears

This can happen if another process (even another hg process!) comes along and
removes the file at that time.

This partly resolves issue5584, but not completely -- a bogus dirstate update
can still happen. However, the full fix is too involved for stable.
Yuya Nishihara - June 3, 2017, 2:26 p.m.
On Fri, 2 Jun 2017 22:28:07 -0700, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1496467672 25200
> #      Fri Jun 02 22:27:52 2017 -0700
> # Branch stable
> # Node ID d39f934da0c80a568486cd8645eb6bbe513f1f03
> # Parent  62e42e2897502bcbaa3a57d3301c789309596391
> status: don't crash if a lookup file disappears

Looks good to me. A few nits.

> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -1613,18 +1613,30 @@ class workingctx(committablectx):
>      def _checklookup(self, files):
>          # check for any possibly clean files
>          if not files:
> -            return [], []
> +            return [], [], []
>  
>          modified = []
> +        deleted = []
>          fixup = []
>          pctx = self._parents[0]
>          # do a full compare of any files that might have changed
>          for f in sorted(files):
> -            if (f not in pctx or self.flags(f) != pctx.flags(f)
> -                or pctx[f].cmp(self[f])):
> -                modified.append(f)
> -            else:
> -                fixup.append(f)
> +            try:
> +                # This will return True for a file that got replaced by a
> +                # directory in the interim, but fixing that is pretty hard.
> +                if (f not in pctx or self.flags(f) != pctx.flags(f)
> +                    or pctx[f].cmp(self[f])):

If cmp() falls back to fctx.data() path, maybe IOError could be raised.

> +                    modified.append(f)
> +                else:
> +                    fixup.append(f)
> +            except OSError:

Can we check errno strictly? Perhaps ENOENT and EISDIR should be caught.
Siddharth Agarwal - June 3, 2017, 3:58 p.m.
On 6/3/17 7:26 AM, Yuya Nishihara wrote:
> On Fri, 2 Jun 2017 22:28:07 -0700, Siddharth Agarwal wrote:
>> # HG changeset patch
>> # User Siddharth Agarwal <sid0@fb.com>
>> # Date 1496467672 25200
>> #      Fri Jun 02 22:27:52 2017 -0700
>> # Branch stable
>> # Node ID d39f934da0c80a568486cd8645eb6bbe513f1f03
>> # Parent  62e42e2897502bcbaa3a57d3301c789309596391
>> status: don't crash if a lookup file disappears
> Looks good to me. A few nits.
>
>> diff --git a/mercurial/context.py b/mercurial/context.py
>> --- a/mercurial/context.py
>> +++ b/mercurial/context.py
>> @@ -1613,18 +1613,30 @@ class workingctx(committablectx):
>>       def _checklookup(self, files):
>>           # check for any possibly clean files
>>           if not files:
>> -            return [], []
>> +            return [], [], []
>>   
>>           modified = []
>> +        deleted = []
>>           fixup = []
>>           pctx = self._parents[0]
>>           # do a full compare of any files that might have changed
>>           for f in sorted(files):
>> -            if (f not in pctx or self.flags(f) != pctx.flags(f)
>> -                or pctx[f].cmp(self[f])):
>> -                modified.append(f)
>> -            else:
>> -                fixup.append(f)
>> +            try:
>> +                # This will return True for a file that got replaced by a
>> +                # directory in the interim, but fixing that is pretty hard.
>> +                if (f not in pctx or self.flags(f) != pctx.flags(f)
>> +                    or pctx[f].cmp(self[f])):
> If cmp() falls back to fctx.data() path, maybe IOError could be raised.

That sounds like a good idea. (Sorry, I've been writing some Python 3 
where OSError and IOError are the same.)

>
>> +                    modified.append(f)
>> +                else:
>> +                    fixup.append(f)
>> +            except OSError:
> Can we check errno strictly? Perhaps ENOENT and EISDIR should be caught.

The dirstate doesn't check errno strictly, and I think both the dirstate 
and this should do the same thing.

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - June 3, 2017, 4:05 p.m.
On Sat, 3 Jun 2017 08:58:05 -0700, Siddharth Agarwal wrote:
> On 6/3/17 7:26 AM, Yuya Nishihara wrote:
> > On Fri, 2 Jun 2017 22:28:07 -0700, Siddharth Agarwal wrote:
> >> # HG changeset patch
> >> # User Siddharth Agarwal <sid0@fb.com>
> >> # Date 1496467672 25200
> >> #      Fri Jun 02 22:27:52 2017 -0700
> >> # Branch stable
> >> # Node ID d39f934da0c80a568486cd8645eb6bbe513f1f03
> >> # Parent  62e42e2897502bcbaa3a57d3301c789309596391
> >> status: don't crash if a lookup file disappears
> > Looks good to me. A few nits.
> >
> >> diff --git a/mercurial/context.py b/mercurial/context.py
> >> --- a/mercurial/context.py
> >> +++ b/mercurial/context.py
> >> @@ -1613,18 +1613,30 @@ class workingctx(committablectx):
> >>       def _checklookup(self, files):
> >>           # check for any possibly clean files
> >>           if not files:
> >> -            return [], []
> >> +            return [], [], []
> >>   
> >>           modified = []
> >> +        deleted = []
> >>           fixup = []
> >>           pctx = self._parents[0]
> >>           # do a full compare of any files that might have changed
> >>           for f in sorted(files):
> >> -            if (f not in pctx or self.flags(f) != pctx.flags(f)
> >> -                or pctx[f].cmp(self[f])):
> >> -                modified.append(f)
> >> -            else:
> >> -                fixup.append(f)
> >> +            try:
> >> +                # This will return True for a file that got replaced by a
> >> +                # directory in the interim, but fixing that is pretty hard.
> >> +                if (f not in pctx or self.flags(f) != pctx.flags(f)
> >> +                    or pctx[f].cmp(self[f])):
> > If cmp() falls back to fctx.data() path, maybe IOError could be raised.
> 
> That sounds like a good idea. (Sorry, I've been writing some Python 3 
> where OSError and IOError are the same.)
> 
> >
> >> +                    modified.append(f)
> >> +                else:
> >> +                    fixup.append(f)
> >> +            except OSError:
> > Can we check errno strictly? Perhaps ENOENT and EISDIR should be caught.
> 
> The dirstate doesn't check errno strictly, and I think both the dirstate 
> and this should do the same thing.

Got it, thanks. I've added IOError and queued for stable.

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -1613,18 +1613,30 @@  class workingctx(committablectx):
     def _checklookup(self, files):
         # check for any possibly clean files
         if not files:
-            return [], []
+            return [], [], []
 
         modified = []
+        deleted = []
         fixup = []
         pctx = self._parents[0]
         # do a full compare of any files that might have changed
         for f in sorted(files):
-            if (f not in pctx or self.flags(f) != pctx.flags(f)
-                or pctx[f].cmp(self[f])):
-                modified.append(f)
-            else:
-                fixup.append(f)
+            try:
+                # This will return True for a file that got replaced by a
+                # directory in the interim, but fixing that is pretty hard.
+                if (f not in pctx or self.flags(f) != pctx.flags(f)
+                    or pctx[f].cmp(self[f])):
+                    modified.append(f)
+                else:
+                    fixup.append(f)
+            except OSError:
+                # A file become inaccessible in between? Mark it as deleted,
+                # matching dirstate behavior (issue5584).
+                # The dirstate has more complex behavior around whether a
+                # missing file matches a directory, etc, but we don't need to
+                # bother with that: if f has made it to this point, we're sure
+                # it's in the dirstate.
+                deleted.append(f)
 
         # update dirstate for files that are actually clean
         if fixup:
@@ -1644,7 +1656,7 @@  class workingctx(committablectx):
                     self._repo.dirstate.write(self._repo.currenttransaction())
             except error.LockError:
                 pass
-        return modified, fixup
+        return modified, deleted, fixup
 
     def _dirstatestatus(self, match=None, ignored=False, clean=False,
                         unknown=False):
@@ -1659,8 +1671,9 @@  class workingctx(committablectx):
 
         # check for any possibly clean files
         if cmp:
-            modified2, fixup = self._checklookup(cmp)
+            modified2, deleted2, fixup = self._checklookup(cmp)
             s.modified.extend(modified2)
+            s.deleted.extend(deleted2)
 
             # update dirstate for files that are actually clean
             if fixup and listclean:
diff --git a/tests/test-dirstate-race.t b/tests/test-dirstate-race.t
--- a/tests/test-dirstate-race.t
+++ b/tests/test-dirstate-race.t
@@ -1,4 +1,5 @@ 
-  $ hg init
+  $ hg init repo
+  $ cd repo
   $ echo a > a
   $ hg add a
   $ hg commit -m test
@@ -31,3 +32,62 @@  Do we ever miss a sub-second change?:
   M a
   M a
 
+  $ echo test > b
+  $ mkdir dir1
+  $ echo test > dir1/c
+  $ echo test > d
+
+  $ echo test > e
+#if execbit
+A directory will typically have the execute bit -- make sure it doesn't get
+confused with a file with the exec bit set
+  $ chmod +x e
+#endif
+
+  $ hg add b dir1 d e
+  adding dir1/c
+  $ hg commit -m test2
+
+  $ cat >> $TESTTMP/dirstaterace.py << EOF
+  > from mercurial import (
+  >     context,
+  >     extensions,
+  > )
+  > def extsetup():
+  >     extensions.wrapfunction(context.workingctx, '_checklookup', overridechecklookup)
+  > def overridechecklookup(orig, self, files):
+  >     # make an update that changes the dirstate from underneath
+  >     self._repo.ui.system(self._repo.ui.config('dirstaterace', 'command'), cwd=self._repo.root)
+  >     return orig(self, files)
+  > EOF
+
+  $ hg debugrebuilddirstate
+  $ hg debugdirstate
+  n   0         -1 unset               a
+  n   0         -1 unset               b
+  n   0         -1 unset               d
+  n   0         -1 unset               dir1/c
+  n   0         -1 unset               e
+
+XXX Note that this returns M for files that got replaced by directories. This is
+definitely a bug, but the fix for that is hard and the next status run is fine
+anyway.
+
+  $ hg status --config extensions.dirstaterace=$TESTTMP/dirstaterace.py \
+  >   --config dirstaterace.command='rm b && rm -r dir1 && rm d && mkdir d && rm e && mkdir e'
+  M d
+  M e
+  ! b
+  ! dir1/c
+  $ hg debugdirstate
+  n 644          2 * a (glob)
+  n   0         -1 unset               b
+  n   0         -1 unset               d
+  n   0         -1 unset               dir1/c
+  n   0         -1 unset               e
+
+  $ hg status
+  ! b
+  ! d
+  ! dir1/c
+  ! e