Patchwork fsmonitor: be robust in the face of bad state

login
register
mail settings
Submitter Simon Farnsworth
Date Nov. 25, 2016, 3:30 p.m.
Message ID <0ca34f1b83da754246ee.1480087853@devvm022.lla2.facebook.com>
Download mbox | patch
Permalink /patch/17756/
State Accepted
Headers show

Comments

Simon Farnsworth - Nov. 25, 2016, 3:30 p.m.
# HG changeset patch
# User Simon Farnsworth <simonfar@fb.com>
# Date 1480087846 28800
#      Fri Nov 25 07:30:46 2016 -0800
# Node ID 0ca34f1b83da754246ee33e01c4f7d6652061f5d
# Parent  a3163433647108b7bec8fc45896db1c20b18ab21
fsmonitor: be robust in the face of bad state

fsmonitor could write out bad state if interrupted part way through, and
would then crash when it tried to read it back in.

Make both sides of the operation more robust - reading state should fail
cleanly, and we can use atomictemp to write out cleanly as the file is
small. Between the two, we shouldn't crash with an IndexError any more.
Mateusz Kwapich - Nov. 25, 2016, 3:59 p.m.
The logic looks correct. It definitely makes the fsmonitor more robust.

Excerpts from Simon Farnsworth's message of 2016-11-25 07:30:53 -0800:
> # HG changeset patch
> # User Simon Farnsworth <simonfar@fb.com>
> # Date 1480087846 28800
> #      Fri Nov 25 07:30:46 2016 -0800
> # Node ID 0ca34f1b83da754246ee33e01c4f7d6652061f5d
> # Parent  a3163433647108b7bec8fc45896db1c20b18ab21
> fsmonitor: be robust in the face of bad state
> 
> fsmonitor could write out bad state if interrupted part way through, and
> would then crash when it tried to read it back in.
> 
> Make both sides of the operation more robust - reading state should fail
> cleanly, and we can use atomictemp to write out cleanly as the file is
> small. Between the two, we shouldn't crash with an IndexError any more.
> 
> diff --git a/hgext/fsmonitor/state.py b/hgext/fsmonitor/state.py
> --- a/hgext/fsmonitor/state.py
> +++ b/hgext/fsmonitor/state.py
> @@ -59,6 +59,12 @@
>              state = file.read().split('\0')
>              # state = hostname\0clock\0ignorehash\0 + list of files, each
>              # followed by a \0
> +            if len(state) < 3:
> +                self._ui.log(
> +                    'fsmonitor', 'fsmonitor: state file truncated (expected '
> +                    '3 chunks, found %d), nuking state\n' % len(state))
> +                self.invalidate()
> +                return None, None, None
>              diskhostname = state[0]
>              hostname = socket.gethostname()
>              if diskhostname != hostname:
> @@ -85,12 +91,12 @@
>              return
>  
>          try:
> -            file = self._opener('fsmonitor.state', 'wb')
> +            file = self._opener('fsmonitor.state', 'wb', atomictemp=True)
>          except (IOError, OSError):
>              self._ui.warn(_("warning: unable to write out fsmonitor state\n"))
>              return
>  
> -        try:
> +        with file:
>              file.write(struct.pack(_versionformat, _version))
>              file.write(socket.gethostname() + '\0')
>              file.write(clock + '\0')
> @@ -98,8 +104,6 @@
>              if notefiles:
>                  file.write('\0'.join(notefiles))
>                  file.write('\0')
> -        finally:
> -            file.close()
>  
>      def invalidate(self):
>          try:
Martijn Pieters - Nov. 25, 2016, 5:18 p.m.
On 25 November 2016 at 15:30, Simon Farnsworth <simonfar@fb.com> wrote:

> # HG changeset patch
> # User Simon Farnsworth <simonfar@fb.com>
> # Date 1480087846 28800
> #      Fri Nov 25 07:30:46 2016 -0800
> # Node ID 0ca34f1b83da754246ee33e01c4f7d6652061f5d
> # Parent  a3163433647108b7bec8fc45896db1c20b18ab21
> fsmonitor: be robust in the face of bad state
>
> fsmonitor could write out bad state if interrupted part way through, and
> would then crash when it tried to read it back in.
>
> Make both sides of the operation more robust - reading state should fail
> cleanly, and we can use atomictemp to write out cleanly as the file is
> small. Between the two, we shouldn't crash with an IndexError any more.
>

I stepped through this with Simon; specifically using the atomic file
object as a context manager ensures that it never replaces the old state
file if an exception occurs; the previous version would close the new file
no matter what happened (and so leaving an incomplete file).

I traced what would happen if there are exactly 3 elements (so notefiles
would be an empty list); the code that uses state.get() can set notefiles
to an empty list in various places *anyway*, so the list being empty is a
valid edgecase.

In other words, looks great to me.



> diff --git a/hgext/fsmonitor/state.py b/hgext/fsmonitor/state.py
> --- a/hgext/fsmonitor/state.py
> +++ b/hgext/fsmonitor/state.py
> @@ -59,6 +59,12 @@
>              state = file.read().split('\0')
>              # state = hostname\0clock\0ignorehash\0 + list of files, each
>              # followed by a \0
> +            if len(state) < 3:
> +                self._ui.log(
> +                    'fsmonitor', 'fsmonitor: state file truncated
> (expected '
> +                    '3 chunks, found %d), nuking state\n' % len(state))
> +                self.invalidate()
> +                return None, None, None
>              diskhostname = state[0]
>              hostname = socket.gethostname()
>              if diskhostname != hostname:
> @@ -85,12 +91,12 @@
>              return
>
>          try:
> -            file = self._opener('fsmonitor.state', 'wb')
> +            file = self._opener('fsmonitor.state', 'wb', atomictemp=True)
>          except (IOError, OSError):
>              self._ui.warn(_("warning: unable to write out fsmonitor
> state\n"))
>              return
>
> -        try:
> +        with file:
>              file.write(struct.pack(_versionformat, _version))
>              file.write(socket.gethostname() + '\0')
>              file.write(clock + '\0')
> @@ -98,8 +104,6 @@
>              if notefiles:
>                  file.write('\0'.join(notefiles))
>                  file.write('\0')
> -        finally:
> -            file.close()
>
>      def invalidate(self):
>          try:
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Yuya Nishihara - Nov. 27, 2016, 9:35 a.m.
On Fri, 25 Nov 2016 07:30:53 -0800, Simon Farnsworth wrote:
> # HG changeset patch
> # User Simon Farnsworth <simonfar@fb.com>
> # Date 1480087846 28800
> #      Fri Nov 25 07:30:46 2016 -0800
> # Node ID 0ca34f1b83da754246ee33e01c4f7d6652061f5d
> # Parent  a3163433647108b7bec8fc45896db1c20b18ab21
> fsmonitor: be robust in the face of bad state

Queued per reviews, thanks.

> --- a/hgext/fsmonitor/state.py
> +++ b/hgext/fsmonitor/state.py
> @@ -59,6 +59,12 @@
>              state = file.read().split('\0')
>              # state = hostname\0clock\0ignorehash\0 + list of files, each
>              # followed by a \0
> +            if len(state) < 3:
> +                self._ui.log(
> +                    'fsmonitor', 'fsmonitor: state file truncated (expected '
> +                    '3 chunks, found %d), nuking state\n' % len(state))

ui.log() takes (event, fmtstr, fmtargs...), updated in flight.

Patch

diff --git a/hgext/fsmonitor/state.py b/hgext/fsmonitor/state.py
--- a/hgext/fsmonitor/state.py
+++ b/hgext/fsmonitor/state.py
@@ -59,6 +59,12 @@ 
             state = file.read().split('\0')
             # state = hostname\0clock\0ignorehash\0 + list of files, each
             # followed by a \0
+            if len(state) < 3:
+                self._ui.log(
+                    'fsmonitor', 'fsmonitor: state file truncated (expected '
+                    '3 chunks, found %d), nuking state\n' % len(state))
+                self.invalidate()
+                return None, None, None
             diskhostname = state[0]
             hostname = socket.gethostname()
             if diskhostname != hostname:
@@ -85,12 +91,12 @@ 
             return
 
         try:
-            file = self._opener('fsmonitor.state', 'wb')
+            file = self._opener('fsmonitor.state', 'wb', atomictemp=True)
         except (IOError, OSError):
             self._ui.warn(_("warning: unable to write out fsmonitor state\n"))
             return
 
-        try:
+        with file:
             file.write(struct.pack(_versionformat, _version))
             file.write(socket.gethostname() + '\0')
             file.write(clock + '\0')
@@ -98,8 +104,6 @@ 
             if notefiles:
                 file.write('\0'.join(notefiles))
                 file.write('\0')
-        finally:
-            file.close()
 
     def invalidate(self):
         try: