Patchwork [7,of,7] context: avoid writing outdated dirstate out (issue5584)

login
register
mail settings
Submitter Katsunori FUJIWARA
Date June 9, 2017, 4:08 a.m.
Message ID <664e3e012535c9d161f0.1496981307@speaknoevil>
Download mbox | patch
Permalink /patch/21267/
State Accepted
Headers show

Comments

Katsunori FUJIWARA - June 9, 2017, 4:08 a.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1496981269 -32400
#      Fri Jun 09 13:07:49 2017 +0900
# Node ID 664e3e012535c9d161f0ecb95c6dc0bfdfe9c672
# Parent  9b4ff10f9db00ae7d91a28067d6b50655f5f957d
context: avoid writing outdated dirstate out (issue5584)

Before this patch, workingctx.status() may cause writing outdated
dirstate out, if:

  - .hg/dirstate is changed simultaneously after last loading it,
  - there is any file, which should be dirstate.normal()-ed

Typical issue case is:

  - the working directory is updated by "hg update"
  - .hg/dirstate is updated in background (e.g. fsmonitor)

This patch compares identities of dirstate before and after
acquisition of wlock, and avoids writing outdated dirstate out, if
change of .hg/dirstate is detected.
Augie Fackler - June 9, 2017, 6:19 p.m.
On Fri, Jun 09, 2017 at 01:08:27PM +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1496981269 -32400
> #      Fri Jun 09 13:07:49 2017 +0900
> # Node ID 664e3e012535c9d161f0ecb95c6dc0bfdfe9c672
> # Parent  9b4ff10f9db00ae7d91a28067d6b50655f5f957d
> context: avoid writing outdated dirstate out (issue5584)

queued, thanks
Siddharth Agarwal - June 9, 2017, 6:38 p.m.
On 6/8/17 9:08 PM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1496981269 -32400
> #      Fri Jun 09 13:07:49 2017 +0900
> # Node ID 664e3e012535c9d161f0ecb95c6dc0bfdfe9c672
> # Parent  9b4ff10f9db00ae7d91a28067d6b50655f5f957d
> context: avoid writing outdated dirstate out (issue5584)

This series looks great, thanks.

>
> Before this patch, workingctx.status() may cause writing outdated
> dirstate out, if:
>
>    - .hg/dirstate is changed simultaneously after last loading it,
>    - there is any file, which should be dirstate.normal()-ed
>
> Typical issue case is:
>
>    - the working directory is updated by "hg update"
>    - .hg/dirstate is updated in background (e.g. fsmonitor)
>
> This patch compares identities of dirstate before and after
> acquisition of wlock, and avoids writing outdated dirstate out, if
> change of .hg/dirstate is detected.
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -1763,19 +1763,30 @@ class workingctx(committablectx):
>           # update dirstate for files that are actually clean
>           if fixup:
>               try:
> +                oldid = self._repo.dirstate.identity()
> +
>                   # updating the dirstate is optional
>                   # so we don't wait on the lock
>                   # wlock can invalidate the dirstate, so cache normal _after_
>                   # taking the lock
>                   with self._repo.wlock(False):
> -                    normal = self._repo.dirstate.normal
> -                    for f in fixup:
> -                        normal(f)
> -                    # write changes out explicitly, because nesting
> -                    # wlock at runtime may prevent 'wlock.release()'
> -                    # after this block from doing so for subsequent
> -                    # changing files
> -                    self._repo.dirstate.write(self._repo.currenttransaction())
> +                    if self._repo.dirstate.identity() == oldid:
> +                        normal = self._repo.dirstate.normal
> +                        for f in fixup:
> +                            normal(f)
> +                        # write changes out explicitly, because nesting
> +                        # wlock at runtime may prevent 'wlock.release()'
> +                        # after this block from doing so for subsequent
> +                        # changing files
> +                        tr = self._repo.currenttransaction()
> +                        self._repo.dirstate.write(tr)
> +                    else:
> +                        # in this case, writing changes out breaks
> +                        # consistency, because .hg/dirstate was
> +                        # already changed simultaneously after last
> +                        # caching (see also issue5584 for detail)
> +                        self._repo.ui.debug('skip updating dirstate: '
> +                                            'identity mismatch\n')
>               except error.LockError:
>                   pass
>           return modified, deleted, fixup
> diff --git a/tests/test-dirstate-race.t b/tests/test-dirstate-race.t
> --- a/tests/test-dirstate-race.t
> +++ b/tests/test-dirstate-race.t
> @@ -95,3 +95,65 @@ anyway.
>     ! d
>     ! dir1/c
>     ! e
> +
> +  $ rmdir d e
> +  $ hg update -C -q .
> +
> +Test that dirstate changes aren't written out at the end of "hg
> +status", if .hg/dirstate is already changed simultaneously before
> +acquisition of wlock in workingctx._checklookup().
> +
> +This avoidance is important to keep consistency of dirstate in race
> +condition (see issue5584 for detail).
> +
> +  $ hg parents -q
> +  1:* (glob)
> +
> +  $ hg debugrebuilddirstate
> +  $ hg debugdirstate
> +  n   0         -1 unset               a
> +  n   0         -1 unset               b
> +  n   0         -1 unset               d
> +  n   0         -1 unset               dir1/c
> +  n   0         -1 unset               e
> +
> +  $ cat > $TESTTMP/dirstaterace.sh <<EOF
> +  > # This script assumes timetable of typical issue5584 case below:
> +  > #
> +  > # 1. "hg status" loads .hg/dirstate
> +  > # 2. "hg status" confirms clean-ness of FILE
> +  > # 3. "hg update -C 0" updates the working directory simultaneously
> +  > #    (FILE is removed, and FILE is dropped from .hg/dirstate)
> +  > # 4. "hg status" acquires wlock
> +  > #    (.hg/dirstate is re-loaded = no FILE entry in dirstate)
> +  > # 5. "hg status" marks FILE in dirstate as clean
> +  > #    (FILE entry is added to in-memory dirstate)
> +  > # 6. "hg status" writes dirstate changes into .hg/dirstate
> +  > #    (FILE entry is written into .hg/dirstate)
> +  > #
> +  > # To reproduce similar situation easily and certainly, #2 and #3
> +  > # are swapped.  "hg cat" below ensures #2 on "hg status" side.
> +  >
> +  > hg update -q -C 0
> +  > hg cat -r 1 b > b
> +  > EOF
> +
> +"hg status" below should excludes "e", of which exec flag is set, for
> +portability of test scenario, because unsure but missing "e" is
> +treated differently in _checklookup() according to runtime platform.
> +
> +- "missing(!)" on POSIX, "pctx[f].cmp(self[f])" raises ENOENT
> +- "modified(M)" on Windows, "self.flags(f) != pctx.flags(f)" is True
> +
> +  $ hg status --config extensions.dirstaterace=$TESTTMP/dirstaterace.py --debug -X path:e
> +  skip updating dirstate: identity mismatch
> +  M a
> +  ! d
> +  ! dir1/c
> +
> +  $ hg parents -q
> +  0:* (glob)
> +  $ hg files
> +  a
> +  $ hg debugdirstate
> +  n * * * a (glob)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -1763,19 +1763,30 @@  class workingctx(committablectx):
         # update dirstate for files that are actually clean
         if fixup:
             try:
+                oldid = self._repo.dirstate.identity()
+
                 # updating the dirstate is optional
                 # so we don't wait on the lock
                 # wlock can invalidate the dirstate, so cache normal _after_
                 # taking the lock
                 with self._repo.wlock(False):
-                    normal = self._repo.dirstate.normal
-                    for f in fixup:
-                        normal(f)
-                    # write changes out explicitly, because nesting
-                    # wlock at runtime may prevent 'wlock.release()'
-                    # after this block from doing so for subsequent
-                    # changing files
-                    self._repo.dirstate.write(self._repo.currenttransaction())
+                    if self._repo.dirstate.identity() == oldid:
+                        normal = self._repo.dirstate.normal
+                        for f in fixup:
+                            normal(f)
+                        # write changes out explicitly, because nesting
+                        # wlock at runtime may prevent 'wlock.release()'
+                        # after this block from doing so for subsequent
+                        # changing files
+                        tr = self._repo.currenttransaction()
+                        self._repo.dirstate.write(tr)
+                    else:
+                        # in this case, writing changes out breaks
+                        # consistency, because .hg/dirstate was
+                        # already changed simultaneously after last
+                        # caching (see also issue5584 for detail)
+                        self._repo.ui.debug('skip updating dirstate: '
+                                            'identity mismatch\n')
             except error.LockError:
                 pass
         return modified, deleted, fixup
diff --git a/tests/test-dirstate-race.t b/tests/test-dirstate-race.t
--- a/tests/test-dirstate-race.t
+++ b/tests/test-dirstate-race.t
@@ -95,3 +95,65 @@  anyway.
   ! d
   ! dir1/c
   ! e
+
+  $ rmdir d e
+  $ hg update -C -q .
+
+Test that dirstate changes aren't written out at the end of "hg
+status", if .hg/dirstate is already changed simultaneously before
+acquisition of wlock in workingctx._checklookup().
+
+This avoidance is important to keep consistency of dirstate in race
+condition (see issue5584 for detail).
+
+  $ hg parents -q
+  1:* (glob)
+
+  $ hg debugrebuilddirstate
+  $ hg debugdirstate
+  n   0         -1 unset               a
+  n   0         -1 unset               b
+  n   0         -1 unset               d
+  n   0         -1 unset               dir1/c
+  n   0         -1 unset               e
+
+  $ cat > $TESTTMP/dirstaterace.sh <<EOF
+  > # This script assumes timetable of typical issue5584 case below:
+  > #
+  > # 1. "hg status" loads .hg/dirstate
+  > # 2. "hg status" confirms clean-ness of FILE
+  > # 3. "hg update -C 0" updates the working directory simultaneously
+  > #    (FILE is removed, and FILE is dropped from .hg/dirstate)
+  > # 4. "hg status" acquires wlock
+  > #    (.hg/dirstate is re-loaded = no FILE entry in dirstate)
+  > # 5. "hg status" marks FILE in dirstate as clean
+  > #    (FILE entry is added to in-memory dirstate)
+  > # 6. "hg status" writes dirstate changes into .hg/dirstate
+  > #    (FILE entry is written into .hg/dirstate)
+  > #
+  > # To reproduce similar situation easily and certainly, #2 and #3
+  > # are swapped.  "hg cat" below ensures #2 on "hg status" side.
+  > 
+  > hg update -q -C 0
+  > hg cat -r 1 b > b
+  > EOF
+
+"hg status" below should excludes "e", of which exec flag is set, for
+portability of test scenario, because unsure but missing "e" is
+treated differently in _checklookup() according to runtime platform.
+
+- "missing(!)" on POSIX, "pctx[f].cmp(self[f])" raises ENOENT
+- "modified(M)" on Windows, "self.flags(f) != pctx.flags(f)" is True
+
+  $ hg status --config extensions.dirstaterace=$TESTTMP/dirstaterace.py --debug -X path:e
+  skip updating dirstate: identity mismatch
+  M a
+  ! d
+  ! dir1/c
+
+  $ hg parents -q
+  0:* (glob)
+  $ hg files
+  a
+  $ hg debugdirstate
+  n * * * a (glob)