Patchwork D7835: nodemap: write nodemap data on disk

Submitter phabricator
Date Jan. 11, 2020, 5:05 p.m.
Message ID <differential-rev-PHID-DREV-oig2edvheavm4sdln4i7-req@mercurial-scm.org>
Permalink /patch/44265/
State Superseded

Comments

phabricator - Jan. 11, 2020, 5:05 p.m.
marmoute created this revision.
Herald added a reviewer: indygreg.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Let us start writing data on disk (so that we can read it from there later).
  This series of changesets will focus first on having the data on disk and
  updating it.
  
  Right now the data is written right next to the revlog data, in the store. We
  might move it to the cache (with a proper cache validation mechanism) later,
  but for now the revlog has a storevfs instance and it is simpler to use it.
  The right location for this data is not the focus of this series.
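
A minimal sketch of the "right next to the revlog data" naming, assuming the index file name always ends in `.i`. The helper name `nodemap_path` is hypothetical and is not part of Mercurial; the patch performs the same suffix swap inline in `revlog.py`.

```python
def nodemap_path(indexfile: bytes) -> bytes:
    """Return the nodemap file name that sits next to a revlog index.

    Hypothetical helper for illustration: it swaps the two-byte ".i"
    suffix of the index file name for ".n".
    """
    if not indexfile.endswith(b".i"):
        raise ValueError("expected an index file name ending in .i")
    return indexfile[:-2] + b".n"


print(nodemap_path(b"00changelog.i"))
```

For the changelog this gives `00changelog.n`, which is the file the test reads from `.hg/store/`.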

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D7835

AFFECTED FILES
  mercurial/changelog.py
  mercurial/configitems.py
  mercurial/localrepo.py
  mercurial/revlog.py
  mercurial/revlogutils/nodemap.py
  tests/test-persistent-nodemap.t

CHANGE DETAILS




To: marmoute, indygreg, #hg-reviewers
Cc: mercurial-devel
phabricator - Jan. 23, 2020, 6:51 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> nodemap.py:42
> +        raise error.ProgrammingError(
> +            "cannot persist nodemap of a filtered changelog"
> +        )

need b'' here and below for py3?

> nodemap.py:49
> +    # EXP-TODO: if this is a cache, this should use a cache vfs, not a
> +    # store vfs
> +    with revlog.opener(revlog.nodemap_file, 'w') as f:

And here? This series introduces a lot of new code, so try running all (previously passing) py3 tests at least at the end of the series.

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7835/new/

To: marmoute, indygreg, #hg-reviewers
Cc: martinvonz, mercurial-devel
phabricator - Jan. 23, 2020, 6:54 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> nodemap.py:50-51
> +    # store vfs
> +    with revlog.opener(revlog.nodemap_file, 'w') as f:
> +        f.write(data)
> +    # EXP-TODO: if the transaction aborts, we should remove the new data and

Actually, does `revlog.opener.write(revlog.nodemap_file, data)` work?

phabricator - Jan. 23, 2020, 9:14 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> martinvonz wrote in nodemap.py:42
> need b'' here and below for py3?

not needed here, it seems, because `ProgrammingError` converts it for us
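
Why the `b''` prefix matters elsewhere on Python 3 can be shown with a small standalone sketch. It assumes nothing about Mercurial beyond the fact, visible in the `localrepo.py` hunk, that option keys are stored as bytes.

```python
# Option keys are stored as bytes, as in the localrepo.py hunk of this patch.
options = {b'exp-persistent-nodemap': True}

# On Python 3 a str key never equals a bytes key, so this lookup silently
# falls back to the default value.
str_lookup = options.get('exp-persistent-nodemap', False)

# The bytes key finds the stored value.
bytes_lookup = options.get(b'exp-persistent-nodemap', False)

print(str_lookup, bytes_lookup)
```

A lookup written with a plain string literal therefore does not fail loudly; it just behaves as if the option were never set.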

phabricator - Jan. 31, 2020, 9:24 a.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in nodemap.py:49
> And here? This series introduces a lot of new code, so try running all (previously passing) py3 tests at least at the end of the series.

If you can take the series starting at D8011 <https://phab.mercurial-scm.org/D8011>, it would help me make sure all angles are covered.

> martinvonz wrote in nodemap.py:50-51
> Actually, does `revlog.opener.write(revlog.nodemap_file, data)` work?

Probably, but we will need the context manager approach later in the series, so I don't think it is worth updating.
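
The two forms discussed in this thread can be compared with a rough sketch. `TinyVfs` is a hypothetical stand-in for the revlog opener, not Mercurial's vfs class; it uses `str` paths and modes for brevity and only models the two calls in question.

```python
import os
import tempfile


class TinyVfs:
    """Hypothetical stand-in for a vfs/opener rooted at a directory."""

    def __init__(self, base):
        self.base = base

    def __call__(self, path, mode="rb"):
        # Open a file relative to the vfs root; usable as a context manager.
        return open(os.path.join(self.base, path), mode)

    def write(self, path, data):
        # One-shot helper: open, write everything, close.
        with self(path, "wb") as fp:
            return fp.write(data)


with tempfile.TemporaryDirectory() as root:
    opener = TinyVfs(root)
    data = b"\xff" * 16

    # Context-manager form, as written in the patch.
    with opener("a.n", "wb") as f:
        f.write(data)

    # One-shot form, as suggested in review.
    opener.write("b.n", data)

    # Both files should hold identical bytes.
    with opener("a.n") as fa, opener("b.n") as fb:
        same = fa.read() == fb.read()

print(same)
```

Under these assumptions both forms write the same bytes. The context-manager form keeps the file handle available, which matters if later changes need more than a single `write` call.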


Patch

diff --git a/tests/test-persistent-nodemap.t b/tests/test-persistent-nodemap.t
--- a/tests/test-persistent-nodemap.t
+++ b/tests/test-persistent-nodemap.t
@@ -5,8 +5,14 @@ 
 
   $ hg init test-repo
   $ cd test-repo
+  $ cat << EOF >> .hg/hgrc
+  > [experimental]
+  > exp-persistent-nodemap=yes
+  > EOF
   $ hg debugbuilddag .+5000
-  $ hg debugnodemap --dump | f --sha256 --bytes=256 --hexdump --size
+  $ hg debugnodemap --dump | f --sha256 --size
+  size=245760, sha256=5dbe62ab98a26668b544063d4d674ac4452ba903ee8895c52fd21d9bbd771e09
+  $ f --sha256 --bytes=256 --hexdump --size < .hg/store/00changelog.n
   size=245760, sha256=5dbe62ab98a26668b544063d4d674ac4452ba903ee8895c52fd21d9bbd771e09
   0000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
   0010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
diff --git a/mercurial/revlogutils/nodemap.py b/mercurial/revlogutils/nodemap.py
--- a/mercurial/revlogutils/nodemap.py
+++ b/mercurial/revlogutils/nodemap.py
@@ -21,6 +21,35 @@ 
         raise error.RevlogError(b'unknown node: %s' % x)
 
 
+def setup_persistent_nodemap(tr, revlog):
+    if revlog.nodemap_file is None:
+        return  # we do not use persistent_nodemap on this revlog
+    callback_id = b"revlog-persistent-nodemap-%s" % revlog.nodemap_file
+    if tr.hasfinalize(callback_id):
+        return  # no need to register again
+    tr.addfinalize(callback_id, lambda tr: _persist_nodemap(tr, revlog))
+
+
+def _persist_nodemap(tr, revlog):
+    """Write nodemap data on disk for a given revlog
+    """
+    if getattr(revlog, 'filteredrevs', ()):
+        raise error.ProgrammingError(
+            "cannot persist nodemap of a filtered changelog"
+        )
+    if revlog.nodemap_file is None:
+        msg = "calling persist nodemap on a revlog without the feature enabled"
+        raise error.ProgrammingError(msg)
+    data = persistent_data(revlog.index)
+    # EXP-TODO: if this is a cache, this should use a cache vfs, not a
+    # store vfs
+    with revlog.opener(revlog.nodemap_file, 'w') as f:
+        f.write(data)
+    # EXP-TODO: if the transaction aborts, we should remove the new data and
+    # reinstall the old one. (This will be simpler when the file format gets a
+    # bit more advanced)
+
+
 ### Nodemap Trie
 #
 # This is a simple reference implementation to compute and serialise a nodemap
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -407,6 +407,7 @@ 
         mmaplargeindex=False,
         censorable=False,
         upperboundcomp=None,
+        persistentnodemap=False,
     ):
         """
         create a revlog object
@@ -418,6 +419,10 @@ 
         self.upperboundcomp = upperboundcomp
         self.indexfile = indexfile
         self.datafile = datafile or (indexfile[:-2] + b".d")
+        self.nodemap_file = None
+        if persistentnodemap:
+            self.nodemap_file = indexfile[:-2] + b".n"
+
         self.opener = opener
         #  When True, indexfile is opened with checkambig=True at writing, to
         #  avoid file stat ambiguity.
@@ -2286,6 +2291,7 @@ 
             ifh.write(data[0])
             ifh.write(data[1])
             self._enforceinlinesize(transaction, ifh)
+        nodemaputil.setup_persistent_nodemap(transaction, self)
 
     def addgroup(self, deltas, linkmapper, transaction, addrevisioncb=None):
         """
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -929,6 +929,8 @@ 
 
     if ui.configbool(b'experimental', b'rust.index'):
         options[b'rust.index'] = True
+    if ui.configbool(b'experimental', b'exp-persistent-nodemap'):
+        options[b'exp-persistent-nodemap'] = True
 
     return options
 
diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -660,6 +660,9 @@ 
     b'experimental', b'rust.index', default=False,
 )
 coreconfigitem(
+    b'experimental', b'exp-persistent-nodemap', default=False,
+)
+coreconfigitem(
     b'experimental', b'server.filesdata.recommended-batch-size', default=50000,
 )
 coreconfigitem(
diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -385,6 +385,9 @@ 
             datafile=datafile,
             checkambig=True,
             mmaplargeindex=True,
+            persistentnodemap=opener.options.get(
+                b'exp-persistent-nodemap', False
+            ),
         )
 
         if self._initempty and (self.version & 0xFFFF == revlog.REVLOGV1):