Submitter | Pierre-Yves David |
---|---|
Date | Aug. 1, 2019, 5:55 p.m. |
Message ID | <56f34aeebffda197bacd.1564682140@nodosa.octopoid.net> |
Download | mbox | patch |
Permalink | /patch/41109/ |
State | Accepted |
Headers | show |
Comments
queued for stable, thanks > On Aug 1, 2019, at 13:55, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > > # HG changeset patch > # User Pierre-Yves David <pierre-yves.david@octobus.net> > # Date 1561081840 -7200 > # Fri Jun 21 03:50:40 2019 +0200 > # Branch stable > # Node ID 56f34aeebffda197bacd77cc2345386d9dd1f073 > # Parent df77a77a8b1744d93de9823fb8ac2d5c5d8c07d8 > # EXP-Topic debug-book > # Available At https://bitbucket.org/octobus/mercurial-devel/ > # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 56f34aeebffd > bookmarks: actual fix for race condition deleting bookmark > > This is a simple but efficient fix to prevent the issue tested in > `test-bookmarks-corner-case.t`. It might be worth pursuing a more generic > approach where filecache learn to depend on each other, but that would not be > suitable for stable. > > The issue is complicated enough that I documented the race and its current > solution as inline comment. See this comment for details on the fix. > > diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py > --- a/mercurial/localrepo.py > +++ b/mercurial/localrepo.py > @@ -1227,6 +1227,55 @@ class localrepository(object): > @mixedrepostorecache(('bookmarks', 'plain'), ('bookmarks.current', 'plain'), > ('bookmarks', ''), ('00changelog.i', '')) > def _bookmarks(self): > + # Since the multiple files involved in the transaction cannot be > + # written atomically (with current repository format), there is a race > + # condition here. > + # > + # 1) changelog content A is read > + # 2) outside transaction update changelog to content B > + # 3) outside transaction update bookmark file referring to content B > + # 4) bookmarks file content is read and filtered against changelog-A > + # > + # When this happens, bookmarks against nodes missing from A are dropped. > + # > + # Having this happening during read is not great, but it become worse > + # when this happen during write because the bookmarks to the "unknown" > + # nodes will be dropped for good. However, writes happen within locks. > + # This locking makes it possible to have a race free consistent read. > + # For this purpose data read from disc before locking are > + # "invalidated" right after the locks are taken. This invalidations are > + # "light", the `filecache` mechanism keep the data in memory and will > + # reuse them if the underlying files did not changed. Not parsing the > + # same data multiple times helps performances. > + # > + # Unfortunately in the case describe above, the files tracked by the > + # bookmarks file cache might not have changed, but the in-memory > + # content is still "wrong" because we used an older changelog content > + # to process the on-disk data. So after locking, the changelog would be > + # refreshed but `_bookmarks` would be preserved. > + # Adding `00changelog.i` to the list of tracked file is not > + # enough, because at the time we build the content for `_bookmarks` in > + # (4), the changelog file has already diverged from the content used > + # for loading `changelog` in (1) > + # > + # To prevent the issue, we force the changelog to be explicitly > + # reloaded while computing `_bookmarks`. The data race can still happen > + # without the lock (with a narrower window), but it would no longer go > + # undetected during the lock time refresh. > + # > + # The new schedule is as follow > + # > + # 1) filecache logic detect that `_bookmarks` needs to be computed > + # 2) cachestat for `bookmarks` and `changelog` are captured (for book) > + # 3) We force `changelog` filecache to be tested > + # 4) cachestat for `changelog` are captured (for changelog) > + # 5) `_bookmarks` is computed and cached > + # > + # The step in (3) ensure we have a changelog at least as recent as the > + # cache stat computed in (1). As a result at locking time: > + # * if the changelog did not changed since (1) -> we can reuse the data > + # * otherwise -> the bookmarks get refreshed. > + self._refreshchangelog() > return bookmarks.bmstore(self) > > def _refreshchangelog(self): > diff --git a/tests/test-bookmarks-corner-case.t b/tests/test-bookmarks-corner-case.t > --- a/tests/test-bookmarks-corner-case.t > +++ b/tests/test-bookmarks-corner-case.t > @@ -200,6 +200,7 @@ Check raced push output. > $ cat push-output.txt > pushing to ssh://user@dummy/bookrace-server > searching for changes > + remote has heads on branch 'default' that are not known locally: f26c3b5167d1 > remote: setting raced push up > remote: adding changesets > remote: adding manifests > @@ -219,7 +220,7 @@ Check result of the push. > | summary: A1 > | > | o changeset: 3:f26c3b5167d1 > - | | bookmark: book-B (false !) > + | | bookmark: book-B > | | user: test > | | date: Thu Jan 01 00:00:00 1970 +0000 > | | summary: B1 > @@ -242,4 +243,4 @@ Check result of the push. > > $ hg -R bookrace-server book > book-A 4:9ce3b28c16de > - book-B 3:f26c3b5167d1 (false !) > + book-B 3:f26c3b5167d1 > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
I don't see them in any repositories. Did you push them somewhere ? On 8/1/19 8:54 PM, Augie Fackler wrote: > queued for stable, thanks > >> On Aug 1, 2019, at 13:55, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: >> >> # HG changeset patch >> # User Pierre-Yves David <pierre-yves.david@octobus.net> >> # Date 1561081840 -7200 >> # Fri Jun 21 03:50:40 2019 +0200 >> # Branch stable >> # Node ID 56f34aeebffda197bacd77cc2345386d9dd1f073 >> # Parent df77a77a8b1744d93de9823fb8ac2d5c5d8c07d8 >> # EXP-Topic debug-book >> # Available At https://bitbucket.org/octobus/mercurial-devel/ >> # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 56f34aeebffd >> bookmarks: actual fix for race condition deleting bookmark >> >> This is a simple but efficient fix to prevent the issue tested in >> `test-bookmarks-corner-case.t`. It might be worth pursuing a more generic >> approach where filecache learn to depend on each other, but that would not be >> suitable for stable. >> >> The issue is complicated enough that I documented the race and its current >> solution as inline comment. See this comment for details on the fix. >> >> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py >> --- a/mercurial/localrepo.py >> +++ b/mercurial/localrepo.py >> @@ -1227,6 +1227,55 @@ class localrepository(object): >> @mixedrepostorecache(('bookmarks', 'plain'), ('bookmarks.current', 'plain'), >> ('bookmarks', ''), ('00changelog.i', '')) >> def _bookmarks(self): >> + # Since the multiple files involved in the transaction cannot be >> + # written atomically (with current repository format), there is a race >> + # condition here. >> + # >> + # 1) changelog content A is read >> + # 2) outside transaction update changelog to content B >> + # 3) outside transaction update bookmark file referring to content B >> + # 4) bookmarks file content is read and filtered against changelog-A >> + # >> + # When this happens, bookmarks against nodes missing from A are dropped. >> + # >> + # Having this happening during read is not great, but it become worse >> + # when this happen during write because the bookmarks to the "unknown" >> + # nodes will be dropped for good. However, writes happen within locks. >> + # This locking makes it possible to have a race free consistent read. >> + # For this purpose data read from disc before locking are >> + # "invalidated" right after the locks are taken. This invalidations are >> + # "light", the `filecache` mechanism keep the data in memory and will >> + # reuse them if the underlying files did not changed. Not parsing the >> + # same data multiple times helps performances. >> + # >> + # Unfortunately in the case describe above, the files tracked by the >> + # bookmarks file cache might not have changed, but the in-memory >> + # content is still "wrong" because we used an older changelog content >> + # to process the on-disk data. So after locking, the changelog would be >> + # refreshed but `_bookmarks` would be preserved. >> + # Adding `00changelog.i` to the list of tracked file is not >> + # enough, because at the time we build the content for `_bookmarks` in >> + # (4), the changelog file has already diverged from the content used >> + # for loading `changelog` in (1) >> + # >> + # To prevent the issue, we force the changelog to be explicitly >> + # reloaded while computing `_bookmarks`. The data race can still happen >> + # without the lock (with a narrower window), but it would no longer go >> + # undetected during the lock time refresh. >> + # >> + # The new schedule is as follow >> + # >> + # 1) filecache logic detect that `_bookmarks` needs to be computed >> + # 2) cachestat for `bookmarks` and `changelog` are captured (for book) >> + # 3) We force `changelog` filecache to be tested >> + # 4) cachestat for `changelog` are captured (for changelog) >> + # 5) `_bookmarks` is computed and cached >> + # >> + # The step in (3) ensure we have a changelog at least as recent as the >> + # cache stat computed in (1). As a result at locking time: >> + # * if the changelog did not changed since (1) -> we can reuse the data >> + # * otherwise -> the bookmarks get refreshed. >> + self._refreshchangelog() >> return bookmarks.bmstore(self) >> >> def _refreshchangelog(self): >> diff --git a/tests/test-bookmarks-corner-case.t b/tests/test-bookmarks-corner-case.t >> --- a/tests/test-bookmarks-corner-case.t >> +++ b/tests/test-bookmarks-corner-case.t >> @@ -200,6 +200,7 @@ Check raced push output. >> $ cat push-output.txt >> pushing to ssh://user@dummy/bookrace-server >> searching for changes >> + remote has heads on branch 'default' that are not known locally: f26c3b5167d1 >> remote: setting raced push up >> remote: adding changesets >> remote: adding manifests >> @@ -219,7 +220,7 @@ Check result of the push. >> | summary: A1 >> | >> | o changeset: 3:f26c3b5167d1 >> - | | bookmark: book-B (false !) >> + | | bookmark: book-B >> | | user: test >> | | date: Thu Jan 01 00:00:00 1970 +0000 >> | | summary: B1 >> @@ -242,4 +243,4 @@ Check result of the push. >> >> $ hg -R bookrace-server book >> book-A 4:9ce3b28c16de >> - book-B 3:f26c3b5167d1 (false !) >> + book-B 3:f26c3b5167d1 >> _______________________________________________ >> Mercurial-devel mailing list >> Mercurial-devel@mercurial-scm.org >> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel >
Patch
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py --- a/mercurial/localrepo.py +++ b/mercurial/localrepo.py @@ -1227,6 +1227,55 @@ class localrepository(object): @mixedrepostorecache(('bookmarks', 'plain'), ('bookmarks.current', 'plain'), ('bookmarks', ''), ('00changelog.i', '')) def _bookmarks(self): + # Since the multiple files involved in the transaction cannot be + # written atomically (with current repository format), there is a race + # condition here. + # + # 1) changelog content A is read + # 2) outside transaction update changelog to content B + # 3) outside transaction update bookmark file referring to content B + # 4) bookmarks file content is read and filtered against changelog-A + # + # When this happens, bookmarks against nodes missing from A are dropped. + # + # Having this happening during read is not great, but it become worse + # when this happen during write because the bookmarks to the "unknown" + # nodes will be dropped for good. However, writes happen within locks. + # This locking makes it possible to have a race free consistent read. + # For this purpose data read from disc before locking are + # "invalidated" right after the locks are taken. This invalidations are + # "light", the `filecache` mechanism keep the data in memory and will + # reuse them if the underlying files did not changed. Not parsing the + # same data multiple times helps performances. + # + # Unfortunately in the case describe above, the files tracked by the + # bookmarks file cache might not have changed, but the in-memory + # content is still "wrong" because we used an older changelog content + # to process the on-disk data. So after locking, the changelog would be + # refreshed but `_bookmarks` would be preserved. + # Adding `00changelog.i` to the list of tracked file is not + # enough, because at the time we build the content for `_bookmarks` in + # (4), the changelog file has already diverged from the content used + # for loading `changelog` in (1) + # + # To prevent the issue, we force the changelog to be explicitly + # reloaded while computing `_bookmarks`. The data race can still happen + # without the lock (with a narrower window), but it would no longer go + # undetected during the lock time refresh. + # + # The new schedule is as follow + # + # 1) filecache logic detect that `_bookmarks` needs to be computed + # 2) cachestat for `bookmarks` and `changelog` are captured (for book) + # 3) We force `changelog` filecache to be tested + # 4) cachestat for `changelog` are captured (for changelog) + # 5) `_bookmarks` is computed and cached + # + # The step in (3) ensure we have a changelog at least as recent as the + # cache stat computed in (1). As a result at locking time: + # * if the changelog did not changed since (1) -> we can reuse the data + # * otherwise -> the bookmarks get refreshed. + self._refreshchangelog() return bookmarks.bmstore(self) def _refreshchangelog(self): diff --git a/tests/test-bookmarks-corner-case.t b/tests/test-bookmarks-corner-case.t --- a/tests/test-bookmarks-corner-case.t +++ b/tests/test-bookmarks-corner-case.t @@ -200,6 +200,7 @@ Check raced push output. $ cat push-output.txt pushing to ssh://user@dummy/bookrace-server searching for changes + remote has heads on branch 'default' that are not known locally: f26c3b5167d1 remote: setting raced push up remote: adding changesets remote: adding manifests @@ -219,7 +220,7 @@ Check result of the push. | summary: A1 | | o changeset: 3:f26c3b5167d1 - | | bookmark: book-B (false !) + | | bookmark: book-B | | user: test | | date: Thu Jan 01 00:00:00 1970 +0000 | | summary: B1 @@ -242,4 +243,4 @@ Check result of the push. $ hg -R bookrace-server book book-A 4:9ce3b28c16de - book-B 3:f26c3b5167d1 (false !) + book-B 3:f26c3b5167d1