Patchwork [1,of,9,V2] cmdutil: add the class to restore dirstate at unexpected failure easily

login
register
mail settings
Submitter Katsunori FUJIWARA
Date May 2, 2015, 3:59 p.m.
Message ID <fac0b879377d014c4081.1430582376@feefifofum>
Download mbox | patch
Permalink /patch/8854/
State Superseded
Headers show

Comments

Katsunori FUJIWARA - May 2, 2015, 3:59 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1430582197 -32400
#      Sun May 03 00:56:37 2015 +0900
# Node ID fac0b879377d014c4081940040d092469635447a
# Parent  e9edd53770fb77a9787a3e6592a3bf0a29c1bd80
cmdutil: add the class to restore dirstate at unexpected failure easily

Before this patch, after "dirstate.write()" execution, there is no way
to restore dirstate to the original status before "dirstate.write()".

In some code paths, "dirstate.invalidate()" is used as a kind of
"restore .hg/dirstate to the original status". But it just avoids
writing changes in memory out, and doesn't actually restore
".hg/dirstate" file. "dirstate.write()" prevents it from working as
expected.

To fix the issue that recent (in memory) dirstate isn't visible to
external process (e.g. "precommit" hooks), "dirstate.write()" should
be invoked before invocation of external process. But at the same
time, ".hg/dirstate" should be restored to the status before
"dirstate.write()" at unexpected failure in some cases.

This patch adds the class "dirstateguard" to easily restore
".hg/dirstate" at unexpected failure. Typical usecase of it is:

    # (1) build dirstate up
    ....

    # (2) write dirstate out, and backup ".hg/dirstate"
    dsguard = dirstateguard(repo, 'scopename')
    try:
        # (3) execute somethig to do:
        #     this may imply making some additional changes on dirstate
        ....

        # (4) unlink backup-ed dirstate file at the end of dsguard scope
        dsguard.close()
    finally:
        # (5) if execution is aborted before "dsguard.close()",
        #     ".hg/dirstate" is restored from the backup
        dsguard.release()

For this kind of issue, "extending transaction" approach (in
https://titanpad.com/mercurial32-sprint) seems not to be suitable,
because:

  - transaction nesting occurs in some cases (e.g. "shelve => rebase"), and

  - "dirstate" may be already modified since the beginning of OUTER
    transaction scope, then

  - dirstate should be backed up into the file other than
    "dirstate.journal" at the beginning of INNER transaction scope, but

  - such alternative backup files are useless for transaction itself,
    and increases complication of its implementation

"transaction" and "dirstateguard" differ from each other also in "what
it should do for .hg/dirstate" in cases other than success.

  ============== ======= ======== =============
  type           success fail     "hg rollback"
  ============== ======= ======== =============
  transaction    keep     keep     restore
  dirstateguard  keep     restore  (not implied)
  ============== ======= ======== =============
Yuya Nishihara - May 3, 2015, 10:16 a.m.
On Sun, 03 May 2015 00:59:36 +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1430582197 -32400
> #      Sun May 03 00:56:37 2015 +0900
> # Node ID fac0b879377d014c4081940040d092469635447a
> # Parent  e9edd53770fb77a9787a3e6592a3bf0a29c1bd80
> cmdutil: add the class to restore dirstate at unexpected failure easily
> 
> Before this patch, after "dirstate.write()" execution, there is no way
> to restore dirstate to the original status before "dirstate.write()".
> 
> In some code paths, "dirstate.invalidate()" is used as a kind of
> "restore .hg/dirstate to the original status". But it just avoids
> writing changes in memory out, and doesn't actually restore
> ".hg/dirstate" file. "dirstate.write()" prevents it from working as
> expected.
> 
> To fix the issue that recent (in memory) dirstate isn't visible to
> external process (e.g. "precommit" hooks), "dirstate.write()" should
> be invoked before invocation of external process. But at the same
> time, ".hg/dirstate" should be restored to the status before
> "dirstate.write()" at unexpected failure in some cases.
> 
> This patch adds the class "dirstateguard" to easily restore
> ".hg/dirstate" at unexpected failure. Typical usecase of it is:
> 
>     # (1) build dirstate up
>     ....
> 
>     # (2) write dirstate out, and backup ".hg/dirstate"
>     dsguard = dirstateguard(repo, 'scopename')
>     try:
>         # (3) execute somethig to do:
>         #     this may imply making some additional changes on dirstate
>         ....
> 
>         # (4) unlink backup-ed dirstate file at the end of dsguard scope
>         dsguard.close()
>     finally:
>         # (5) if execution is aborted before "dsguard.close()",
>         #     ".hg/dirstate" is restored from the backup
>         dsguard.release()
[...]
> +    def _abort(self):
> +        # this "invalidate()" prevents "wlock.release()" from writing
> +        # changes of dirstate out after restoring to original status
> +        self.repo.dirstate.invalidate()
> +
> +        self.repo.vfs.write('dirstate', self.repo.vfs.read(self.name))
> +        self.repo.vfs.unlink(self.name)
> +        self.active = False

Do you avoid atomic rename for some reason?
Another process could read 'dirstate' in the middle of vfs.write() even if
wlock was taken.
Katsunori FUJIWARA - May 5, 2015, 5:19 p.m.
At Sun, 3 May 2015 19:16:39 +0900,
Yuya Nishihara wrote:
> 
> On Sun, 03 May 2015 00:59:36 +0900, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > # Date 1430582197 -32400
> > #      Sun May 03 00:56:37 2015 +0900
> > # Node ID fac0b879377d014c4081940040d092469635447a
> > # Parent  e9edd53770fb77a9787a3e6592a3bf0a29c1bd80
> > cmdutil: add the class to restore dirstate at unexpected failure easily
> > 
> > Before this patch, after "dirstate.write()" execution, there is no way
> > to restore dirstate to the original status before "dirstate.write()".
> > 
> > In some code paths, "dirstate.invalidate()" is used as a kind of
> > "restore .hg/dirstate to the original status". But it just avoids
> > writing changes in memory out, and doesn't actually restore
> > ".hg/dirstate" file. "dirstate.write()" prevents it from working as
> > expected.
> > 
> > To fix the issue that recent (in memory) dirstate isn't visible to
> > external process (e.g. "precommit" hooks), "dirstate.write()" should
> > be invoked before invocation of external process. But at the same
> > time, ".hg/dirstate" should be restored to the status before
> > "dirstate.write()" at unexpected failure in some cases.
> > 
> > This patch adds the class "dirstateguard" to easily restore
> > ".hg/dirstate" at unexpected failure. Typical usecase of it is:
> > 
> >     # (1) build dirstate up
> >     ....
> > 
> >     # (2) write dirstate out, and backup ".hg/dirstate"
> >     dsguard = dirstateguard(repo, 'scopename')
> >     try:
> >         # (3) execute somethig to do:
> >         #     this may imply making some additional changes on dirstate
> >         ....
> > 
> >         # (4) unlink backup-ed dirstate file at the end of dsguard scope
> >         dsguard.close()
> >     finally:
> >         # (5) if execution is aborted before "dsguard.close()",
> >         #     ".hg/dirstate" is restored from the backup
> >         dsguard.release()
> [...]
> > +    def _abort(self):
> > +        # this "invalidate()" prevents "wlock.release()" from writing
> > +        # changes of dirstate out after restoring to original status
> > +        self.repo.dirstate.invalidate()
> > +
> > +        self.repo.vfs.write('dirstate', self.repo.vfs.read(self.name))
> > +        self.repo.vfs.unlink(self.name)
> > +        self.active = False
> 
> Do you avoid atomic rename for some reason?
> Another process could read 'dirstate' in the middle of vfs.write() even if
> wlock was taken.
> 

I just forgot that normal "vfs.write()" doesn't use "atomictemp=True".

I'll write dirstate out via fp opened with "atomictemp=True()" in
revised series.

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy@lares.dti.ne.jp

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -3259,3 +3259,60 @@ 
     for f, clearable, allowcommit, msg, hint in unfinishedstates:
         if clearable and repo.vfs.exists(f):
             util.unlink(repo.join(f))
+
+class dirstateguard(object):
+    '''Restore dirstate at unexpected failure.
+
+    At the construction, this class does:
+
+    - write current ``repo.dirstate`` out, and
+    - save ``.hg/dirstate`` into the backup file
+
+    This restores ``.hg/dirstate`` from backup file, if ``release()``
+    is invoked before ``close()``.
+
+    This just removes the backup file at ``close()`` before ``release()``.
+    '''
+
+    def __init__(self, repo, name):
+        repo.dirstate.write()
+        self.repo = repo
+        self.name = 'dirstate.backup.%s' % name
+        repo.vfs.write(self.name, repo.vfs.tryread('dirstate'))
+        self.active = True
+        self.closed = False
+
+    def __del__(self):
+        if self.active: # still active
+            # this may occur, even if this class is used correctly:
+            # for example, releasing other resources like transaction
+            # may raise exception before ``dirstateguard.release`` in
+            # ``release(tr, ....)``.
+            self._abort()
+
+    def close(self):
+        if not self.active: # already inactivated
+            msg = (_("can't close already inactivated backup: %s")
+                   % self.name)
+            raise util.Abort(msg)
+
+        self.repo.vfs.unlink(self.name)
+        self.active = False
+        self.closed = True
+
+    def _abort(self):
+        # this "invalidate()" prevents "wlock.release()" from writing
+        # changes of dirstate out after restoring to original status
+        self.repo.dirstate.invalidate()
+
+        self.repo.vfs.write('dirstate', self.repo.vfs.read(self.name))
+        self.repo.vfs.unlink(self.name)
+        self.active = False
+
+    def release(self):
+        if not self.closed:
+            if not self.active: # already inactivated
+                msg = (_("can't release already inactivated backup: %s")
+                       % self.name)
+                raise util.Abort(msg)
+            self._abort()