Patchwork [4,of,8] transaction: avoid file stat ambiguity only for files in blacklist

login
register
mail settings
Submitter Katsunori FUJIWARA
Date June 29, 2017, 4:53 p.m.
Message ID <d4d47784511456e6cf77.1498755191@speaknoevil>
Download mbox | patch
Permalink /patch/21828/
State Accepted
Headers show

Comments

Katsunori FUJIWARA - June 29, 2017, 4:53 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1498754869 -32400
#      Fri Jun 30 01:47:49 2017 +0900
# Node ID d4d47784511456e6cf77b194822f23586e052312
# Parent  9114ae36f40b733912705ef0bacb2961941f18d6
transaction: avoid file stat ambiguity only for files in blacklist

Advancing mtime by os.utime() fails for EPERM, if the target file is
owned by another. bff5ccbe5ead and related changes made some code
paths give advancing mtime up in such case, to fix issue5418.

This causes file stat ambiguity (again), if it is owned by another.

    https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan

To avoid file stat ambiguity in such case, especially for
.hg/dirstate, ed66ec39933f made vfs.rename() copy the target file, and
advance mtime of renamed one again, if EPERM (see issue5584 for detail).

But straightforward "copy if EPERM" isn't reasonable for truncation of
append-only files at rollbacking, because rollbacking might cost much
for truncation of many filelogs, even though filelogs aren't
filecache-ed.

Therefore, this patch introduces blacklist "noambigstatfiles", and
avoids file stat ambiguity only for files specified in this blacklist.

This patch consists of two parts below, which should be applied at
once in order to avoid regression.

  - specify 'checkambig=True' at vfs.open(mode='a') in _playback()
    according to noambigstatfiles

  - invoke _playback() with noambigstatfiles
    - add transaction.__init__() noambigstatfiles argument, for _abort()
      - make localrepo instantiate transaction with _cachedfiles
    - add rollback() noambigstatfiles argument, for "hg rollback/recover"
      - make localrepo invoke rollback() with _cachedfiles

After this patch, straightforward "copy if EPERM" will be reasonable
at closing the file opened with checkambig=True, because this policy
is applied only on files, which are listed in blacklist
"noambigstatfiles".
Yuya Nishihara - July 2, 2017, 2:06 p.m.
On Fri, 30 Jun 2017 01:53:11 +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1498754869 -32400
> #      Fri Jun 30 01:47:49 2017 +0900
> # Node ID d4d47784511456e6cf77b194822f23586e052312
> # Parent  9114ae36f40b733912705ef0bacb2961941f18d6
> transaction: avoid file stat ambiguity only for files in blacklist

> diff --git a/mercurial/transaction.py b/mercurial/transaction.py
> --- a/mercurial/transaction.py
> +++ b/mercurial/transaction.py
> @@ -44,11 +44,12 @@ def active(func):
>      return _active
>  
>  def _playback(journal, report, opener, vfsmap, entries, backupentries,
> -              unlink=True):
> +              unlink=True, noambigstatfiles=None):
>      for f, o, _ignore in entries:
>          if o or not unlink:
> +            checkambig = noambigstatfiles and (f, '') in noambigstatfiles
>              try:
> -                fp = opener(f, 'a', checkambig=True)
> +                fp = opener(f, 'a', checkambig=checkambig)
>                  fp.truncate(o)
>                  fp.close()
>              except IOError:
> @@ -101,7 +102,8 @@ def _playback(journal, report, opener, v
>  
>  class transaction(object):
>      def __init__(self, report, opener, vfsmap, journalname, undoname=None,
> -                 after=None, createmode=None, validator=None, releasefn=None):
> +                 after=None, createmode=None, validator=None, releasefn=None,
> +                 noambigstatfiles=None):
>          """Begin a new transaction
>  
>          Begins a new transaction that allows rolling back writes in the event of
> @@ -110,6 +112,10 @@ class transaction(object):
>          * `after`: called after the transaction has been committed
>          * `createmode`: the mode of the journal file that will be created
>          * `releasefn`: called after releasing (with transaction and result)
> +
> +        `noambigstatfiles` is a set of (path, vfs-location) tuples,
> +        which determine whether file stat ambiguity should be avoided
> +        for corresponded files.

IIUC, this is a set of files that could get ambiguous, but "noambig" semed to
say the opposite.

Other than that, the code looks fine. FWIW, I'm getting a feeling that we've
failed to design a sensible cache-validation API because it's quite hard to
rely only on mtime/ctime.
Katsunori FUJIWARA - July 2, 2017, 2:55 p.m.
At Sun, 2 Jul 2017 23:06:04 +0900,
Yuya Nishihara wrote:
> 
> On Fri, 30 Jun 2017 01:53:11 +0900, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > # Date 1498754869 -32400
> > #      Fri Jun 30 01:47:49 2017 +0900
> > # Node ID d4d47784511456e6cf77b194822f23586e052312
> > # Parent  9114ae36f40b733912705ef0bacb2961941f18d6
> > transaction: avoid file stat ambiguity only for files in blacklist
> 
> > diff --git a/mercurial/transaction.py b/mercurial/transaction.py
> > --- a/mercurial/transaction.py
> > +++ b/mercurial/transaction.py
> > @@ -44,11 +44,12 @@ def active(func):
> >      return _active
> >  
> >  def _playback(journal, report, opener, vfsmap, entries, backupentries,
> > -              unlink=True):
> > +              unlink=True, noambigstatfiles=None):
> >      for f, o, _ignore in entries:
> >          if o or not unlink:
> > +            checkambig = noambigstatfiles and (f, '') in noambigstatfiles
> >              try:
> > -                fp = opener(f, 'a', checkambig=True)
> > +                fp = opener(f, 'a', checkambig=checkambig)
> >                  fp.truncate(o)
> >                  fp.close()
> >              except IOError:
> > @@ -101,7 +102,8 @@ def _playback(journal, report, opener, v
> >  
> >  class transaction(object):
> >      def __init__(self, report, opener, vfsmap, journalname, undoname=None,
> > -                 after=None, createmode=None, validator=None, releasefn=None):
> > +                 after=None, createmode=None, validator=None, releasefn=None,
> > +                 noambigstatfiles=None):
> >          """Begin a new transaction
> >  
> >          Begins a new transaction that allows rolling back writes in the event of
> > @@ -110,6 +112,10 @@ class transaction(object):
> >          * `after`: called after the transaction has been committed
> >          * `createmode`: the mode of the journal file that will be created
> >          * `releasefn`: called after releasing (with transaction and result)
> > +
> > +        `noambigstatfiles` is a set of (path, vfs-location) tuples,
> > +        which determine whether file stat ambiguity should be avoided
> > +        for corresponded files.
> 
> IIUC, this is a set of files that could get ambiguous, but "noambig" semed to
> say the opposite.

I named this as "(caller requires) no ambiguity of stat for files (in
this set)". But I agree that this name is not understandable :-)

How about "avoidambigfiles" or "avoidstatambigfiles" ? or any ideas ?


> Other than that, the code looks fine. FWIW, I'm getting a feeling that we've
> failed to design a sensible cache-validation API because it's quite hard to
> rely only on mtime/ctime.

I agree, too. But it also is fact that there is no other better (and
portable) information to achieve exact cache validation :-<
Yuya Nishihara - July 3, 2017, 12:44 p.m.
On Sun, 02 Jul 2017 23:55:15 +0900, FUJIWARA Katsunori wrote:
> At Sun, 2 Jul 2017 23:06:04 +0900,
> Yuya Nishihara wrote:
> > 
> > On Fri, 30 Jun 2017 01:53:11 +0900, FUJIWARA Katsunori wrote:
> > > # HG changeset patch
> > > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > > # Date 1498754869 -32400
> > > #      Fri Jun 30 01:47:49 2017 +0900
> > > # Node ID d4d47784511456e6cf77b194822f23586e052312
> > > # Parent  9114ae36f40b733912705ef0bacb2961941f18d6
> > > transaction: avoid file stat ambiguity only for files in blacklist
> > 
> > > diff --git a/mercurial/transaction.py b/mercurial/transaction.py
> > > --- a/mercurial/transaction.py
> > > +++ b/mercurial/transaction.py
> > > @@ -44,11 +44,12 @@ def active(func):
> > >      return _active
> > >  
> > >  def _playback(journal, report, opener, vfsmap, entries, backupentries,
> > > -              unlink=True):
> > > +              unlink=True, noambigstatfiles=None):
> > >      for f, o, _ignore in entries:
> > >          if o or not unlink:
> > > +            checkambig = noambigstatfiles and (f, '') in noambigstatfiles
> > >              try:
> > > -                fp = opener(f, 'a', checkambig=True)
> > > +                fp = opener(f, 'a', checkambig=checkambig)
> > >                  fp.truncate(o)
> > >                  fp.close()
> > >              except IOError:
> > > @@ -101,7 +102,8 @@ def _playback(journal, report, opener, v
> > >  
> > >  class transaction(object):
> > >      def __init__(self, report, opener, vfsmap, journalname, undoname=None,
> > > -                 after=None, createmode=None, validator=None, releasefn=None):
> > > +                 after=None, createmode=None, validator=None, releasefn=None,
> > > +                 noambigstatfiles=None):
> > >          """Begin a new transaction
> > >  
> > >          Begins a new transaction that allows rolling back writes in the event of
> > > @@ -110,6 +112,10 @@ class transaction(object):
> > >          * `after`: called after the transaction has been committed
> > >          * `createmode`: the mode of the journal file that will be created
> > >          * `releasefn`: called after releasing (with transaction and result)
> > > +
> > > +        `noambigstatfiles` is a set of (path, vfs-location) tuples,
> > > +        which determine whether file stat ambiguity should be avoided
> > > +        for corresponded files.
> > 
> > IIUC, this is a set of files that could get ambiguous, but "noambig" semed to
> > say the opposite.
> 
> I named this as "(caller requires) no ambiguity of stat for files (in
> this set)". But I agree that this name is not understandable :-)
> 
> How about "avoidambigfiles" or "avoidstatambigfiles" ? or any ideas ?

or "checkambigfiles" as the boolean flag is called as "checkambig"? I don't
know which is preferable, but any of these seem better than "noambig".
Katsunori FUJIWARA - July 3, 2017, 4:18 p.m.
At Mon, 3 Jul 2017 21:44:18 +0900,
Yuya Nishihara wrote:
> 
> On Sun, 02 Jul 2017 23:55:15 +0900, FUJIWARA Katsunori wrote:
> > At Sun, 2 Jul 2017 23:06:04 +0900,
> > Yuya Nishihara wrote:
> > > 
> > > On Fri, 30 Jun 2017 01:53:11 +0900, FUJIWARA Katsunori wrote:
> > > > # HG changeset patch
> > > > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > > > # Date 1498754869 -32400
> > > > #      Fri Jun 30 01:47:49 2017 +0900
> > > > # Node ID d4d47784511456e6cf77b194822f23586e052312
> > > > # Parent  9114ae36f40b733912705ef0bacb2961941f18d6
> > > > transaction: avoid file stat ambiguity only for files in blacklist
> > > 
> > > > diff --git a/mercurial/transaction.py b/mercurial/transaction.py
> > > > --- a/mercurial/transaction.py
> > > > +++ b/mercurial/transaction.py
> > > > @@ -44,11 +44,12 @@ def active(func):
> > > >      return _active
> > > >  
> > > >  def _playback(journal, report, opener, vfsmap, entries, backupentries,
> > > > -              unlink=True):
> > > > +              unlink=True, noambigstatfiles=None):
> > > >      for f, o, _ignore in entries:
> > > >          if o or not unlink:
> > > > +            checkambig = noambigstatfiles and (f, '') in noambigstatfiles
> > > >              try:
> > > > -                fp = opener(f, 'a', checkambig=True)
> > > > +                fp = opener(f, 'a', checkambig=checkambig)
> > > >                  fp.truncate(o)
> > > >                  fp.close()
> > > >              except IOError:
> > > > @@ -101,7 +102,8 @@ def _playback(journal, report, opener, v
> > > >  
> > > >  class transaction(object):
> > > >      def __init__(self, report, opener, vfsmap, journalname, undoname=None,
> > > > -                 after=None, createmode=None, validator=None, releasefn=None):
> > > > +                 after=None, createmode=None, validator=None, releasefn=None,
> > > > +                 noambigstatfiles=None):
> > > >          """Begin a new transaction
> > > >  
> > > >          Begins a new transaction that allows rolling back writes in the event of
> > > > @@ -110,6 +112,10 @@ class transaction(object):
> > > >          * `after`: called after the transaction has been committed
> > > >          * `createmode`: the mode of the journal file that will be created
> > > >          * `releasefn`: called after releasing (with transaction and result)
> > > > +
> > > > +        `noambigstatfiles` is a set of (path, vfs-location) tuples,
> > > > +        which determine whether file stat ambiguity should be avoided
> > > > +        for corresponded files.
> > > 
> > > IIUC, this is a set of files that could get ambiguous, but "noambig" semed to
> > > say the opposite.
> > 
> > I named this as "(caller requires) no ambiguity of stat for files (in
> > this set)". But I agree that this name is not understandable :-)
> > 
> > How about "avoidambigfiles" or "avoidstatambigfiles" ? or any ideas ?
> 
> or "checkambigfiles" as the boolean flag is called as "checkambig"? I don't
> know which is preferable, but any of these seem better than "noambig".
> 

I'll use it, thanks!

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1093,7 +1093,8 @@  class localrepository(object):
                                      aftertrans(renames),
                                      self.store.createmode,
                                      validator=validate,
-                                     releasefn=releasefn)
+                                     releasefn=releasefn,
+                                     noambigstatfiles=_cachedfiles)
         tr.changes['revs'] = set()
 
         tr.hookargs['txnid'] = txnid
@@ -1158,7 +1159,8 @@  class localrepository(object):
                 vfsmap = {'': self.svfs,
                           'plain': self.vfs,}
                 transaction.rollback(self.svfs, vfsmap, "journal",
-                                     self.ui.warn)
+                                     self.ui.warn,
+                                     noambigstatfiles=_cachedfiles)
                 self.invalidate()
                 return True
             else:
@@ -1214,7 +1216,8 @@  class localrepository(object):
         parents = self.dirstate.parents()
         self.destroying()
         vfsmap = {'plain': self.vfs, '': self.svfs}
-        transaction.rollback(self.svfs, vfsmap, 'undo', ui.warn)
+        transaction.rollback(self.svfs, vfsmap, 'undo', ui.warn,
+                             noambigstatfiles=_cachedfiles)
         if self.vfs.exists('undo.bookmarks'):
             self.vfs.rename('undo.bookmarks', 'bookmarks', checkambig=True)
         if self.svfs.exists('undo.phaseroots'):
diff --git a/mercurial/transaction.py b/mercurial/transaction.py
--- a/mercurial/transaction.py
+++ b/mercurial/transaction.py
@@ -44,11 +44,12 @@  def active(func):
     return _active
 
 def _playback(journal, report, opener, vfsmap, entries, backupentries,
-              unlink=True):
+              unlink=True, noambigstatfiles=None):
     for f, o, _ignore in entries:
         if o or not unlink:
+            checkambig = noambigstatfiles and (f, '') in noambigstatfiles
             try:
-                fp = opener(f, 'a', checkambig=True)
+                fp = opener(f, 'a', checkambig=checkambig)
                 fp.truncate(o)
                 fp.close()
             except IOError:
@@ -101,7 +102,8 @@  def _playback(journal, report, opener, v
 
 class transaction(object):
     def __init__(self, report, opener, vfsmap, journalname, undoname=None,
-                 after=None, createmode=None, validator=None, releasefn=None):
+                 after=None, createmode=None, validator=None, releasefn=None,
+                 noambigstatfiles=None):
         """Begin a new transaction
 
         Begins a new transaction that allows rolling back writes in the event of
@@ -110,6 +112,10 @@  class transaction(object):
         * `after`: called after the transaction has been committed
         * `createmode`: the mode of the journal file that will be created
         * `releasefn`: called after releasing (with transaction and result)
+
+        `noambigstatfiles` is a set of (path, vfs-location) tuples,
+        which determine whether file stat ambiguity should be avoided
+        for corresponded files.
         """
         self.count = 1
         self.usages = 1
@@ -137,6 +143,10 @@  class transaction(object):
             releasefn = lambda tr, success: None
         self.releasefn = releasefn
 
+        self.noambigstatfiles = set()
+        if noambigstatfiles:
+            self.noambigstatfiles.update(noambigstatfiles)
+
         # A dict dedicated to precisely tracking the changes introduced in the
         # transaction.
         self.changes = {}
@@ -564,7 +574,8 @@  class transaction(object):
                 # Prevent double usage and help clear cycles.
                 self._abortcallback = None
                 _playback(self.journal, self.report, self.opener, self._vfsmap,
-                          self.entries, self._backupentries, False)
+                          self.entries, self._backupentries, False,
+                          noambigstatfiles=self.noambigstatfiles)
                 self.report(_("rollback completed\n"))
             except BaseException:
                 self.report(_("rollback failed - please run hg recover\n"))
@@ -573,7 +584,7 @@  class transaction(object):
             self.releasefn(self, False) # notify failure of transaction
             self.releasefn = None # Help prevent cycles.
 
-def rollback(opener, vfsmap, file, report):
+def rollback(opener, vfsmap, file, report, noambigstatfiles=None):
     """Rolls back the transaction contained in the given file
 
     Reads the entries in the specified file, and the corresponding
@@ -584,6 +595,10 @@  def rollback(opener, vfsmap, file, repor
     file\0offset pairs, delimited by newlines. The corresponding
     '*.backupfiles' file should contain a list of file\0backupfile
     pairs, delimited by \0.
+
+    `noambigstatfiles` is a set of (path, vfs-location) tuples,
+    which determine whether file stat ambiguity should be avoided at
+    restoring corresponded files.
     """
     entries = []
     backupentries = []
@@ -615,4 +630,5 @@  def rollback(opener, vfsmap, file, repor
                 report(_("journal was created by a different version of "
                          "Mercurial\n"))
 
-    _playback(file, report, opener, vfsmap, entries, backupentries)
+    _playback(file, report, opener, vfsmap, entries, backupentries,
+              noambigstatfiles=noambigstatfiles)
diff --git a/tests/test-mq-qpush-fail.t b/tests/test-mq-qpush-fail.t
--- a/tests/test-mq-qpush-fail.t
+++ b/tests/test-mq-qpush-fail.t
@@ -39,8 +39,9 @@  test qpush on empty series
   > from mercurial import extensions, transaction
   > def wrapplayback(orig,
   >                  journal, report, opener, vfsmap, entries, backupentries,
-  >                  unlink=True):
-  >     orig(journal, report, opener, vfsmap, entries, backupentries, unlink)
+  >                  unlink=True, noambigstatfiles=None):
+  >     orig(journal, report, opener, vfsmap, entries, backupentries, unlink,
+  >          noambigstatfiles)
   >     # Touching files truncated at "transaction.abort" causes
   >     # forcible re-loading invalidated filecache properties
   >     # (including repo.changelog)