Patchwork D10725: revlog: update data file record before index rename

login
register
mail settings
Submitter phabricator
Date May 18, 2021, 2:27 a.m.
Message ID <differential-rev-PHID-DREV-odndjshi22du67dx7c4d-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/49045/
State Superseded
Headers show

Comments

phabricator - May 18, 2021, 2:27 a.m.
joerg.sonnenberger created this revision.
Herald added a reviewer: indygreg.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  Whe migrating from inline to non-inline data storage, the data file is
  recorded initially as zero sized so that it is removed on failure. But
  the record has to be updated before the index is renamed since otherwise
  data is lost on rollback.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D10725

AFFECTED FILES
  mercurial/revlog.py
  tests/test-transaction-rollback-on-revlog-split.t

CHANGE DETAILS




To: joerg.sonnenberger, indygreg, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/tests/test-transaction-rollback-on-revlog-split.t b/tests/test-transaction-rollback-on-revlog-split.t
--- a/tests/test-transaction-rollback-on-revlog-split.t
+++ b/tests/test-transaction-rollback-on-revlog-split.t
@@ -1,7 +1,27 @@ 
 Test correctness of revlog inline -> non-inline transition
 ----------------------------------------------------------
 
+Helper extension to intercept renames.
+
+  $ cat > $TESTTMP/intercept_rename.py << EOF
+  > from mercurial.extensions import wrapfunction
+  > from mercurial.util import atomictempfile
+  > import os, sys
+  > 
+  > def extsetup(ui):
+  >     def close(orig, *args, **kwargs):
+  >         path = args[0]._atomictempfile__name
+  >         if path.endswith(b'/.hg/store/data/file.i'):
+  >             os._exit(80)
+  >         return orig(*args, **kwargs)
+  >     wrapfunction(atomictempfile, 'close', close)
+  > EOF
+
+
 Test offset computation to correctly factor in the index entries themselve.
+Also test that the new data size has the correct size if the transaction is aborted
+after the index has been replaced.
+
 Test repo has one small, one moderate and one big change. The clone has
 the small and moderate change and will transition to non-inline storage when
 adding the big change.
@@ -18,6 +38,12 @@ 
 
   $ hg clone -r 1 troffset-computation troffset-computation-copy --config format.revlog-compression=none -q
   $ cd troffset-computation-copy
+
+Reference size:
+
+  $ f -s .hg/store/data/file*
+  .hg/store/data/file.i: size=1174
+
   $ cat > .hg/hgrc <<EOF
   > [hooks]
   > pretxnchangegroup = python:$TESTDIR/helper-killhook.py:killme
@@ -25,5 +51,36 @@ 
   $ hg pull ../troffset-computation
   pulling from ../troffset-computation
   [80]
-  $ cat .hg/store/journal | tr -s '\000' ' ' | grep data/file | tail -1
+
+The first file.i entry should match the size above.
+The first file.d entry is the temporary record during the split,
+the second entry after the split happened. The sum of the second file.d
+and the second file.i entry should match the first file.i entry.
+
+  $ cat .hg/store/journal | tr -s '\000' ' ' | grep data/file
+  data/file.i 1174
+  data/file.d 0
+  data/file.d 1046
   data/file.i 128
+  $ cd ..
+
+Now retry the same but intercept the rename of the index and check that
+the journal does not contain the new index size. This demonstrates the edge case
+where the data file is left as garbage.
+
+  $ hg clone -r 1 troffset-computation troffset-computation-copy2 --config format.revlog-compression=none -q
+  $ cd troffset-computation-copy2
+  $ cat > .hg/hgrc <<EOF
+  > [extensions]
+  > intercept_rename = $TESTTMP/intercept_rename.py
+  > [hooks]
+  > pretxnchangegroup = python:$TESTDIR/helper-killhook.py:killme
+  > EOF
+  $ hg pull ../troffset-computation
+  pulling from ../troffset-computation
+  [80]
+  $ cat .hg/store/journal | tr -s '\000' ' ' | grep data/file
+  data/file.i 1174
+  data/file.d 0
+  data/file.d 1046
+  $ cd ..
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1999,6 +1999,12 @@ 
                     e = header + e
                 fp.write(e)
 
+            # There is a small transactional race here. If the rename of
+            # the index fails, we should remove the datafile. It is more
+            # important to ensure that the data file is not truncated
+            # when the index is replaced as otherwise data is lost.
+            tr.replace(self._datafile, self.start(trindex))
+
             # the temp file replace the real index when we exit the context
             # manager