Patchwork D10785: revlog: use dedicated code for reading sidedata

Submitter phabricator
Date May 28, 2021, 10:58 p.m.
Message ID <differential-rev-PHID-DREV-mgym57j5nlftrv6q26ut-req@mercurial-scm.org>
Permalink /patch/49108/
State Superseded

Comments

phabricator - May 28, 2021, 10:58 p.m.
marmoute created this revision.
Herald added a reviewer: indygreg.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  We are about to introduce a new, dedicated file to store sidedata. Before doing so, we make sidedata reading go through a different code path than the one used for reading data chunks. This will reduce the complexity of the next changesets.
  
  The reading code is very simple right now and will need improvement later to
  reuse some of the caching strategy we use for the data file.
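
The uncached read described above can be sketched outside of Mercurial as follows. This is a minimal illustration, not the revlog code itself: `read_segment`, `PartialReadError`, and the error message text are hypothetical names chosen for this example, and an in-memory `io.BytesIO` stands in for the file object that `_sidedatareadfp()` would provide.

```python
import io


class PartialReadError(Exception):
    """Hypothetical stand-in for Mercurial's error.RevlogError."""


def read_segment(fp, offset, size, name="<sidedata>"):
    """Read exactly `size` bytes at `offset` from `fp`, or raise.

    No caching is done: every call seeks and reads from the file object.
    """
    if size == 0:
        # Nothing stored for this revision, so there is nothing to read.
        return b""
    fp.seek(offset)
    segment = fp.read(size)
    if len(segment) < size:
        # A short read means the file is truncated or the index is wrong.
        raise PartialReadError(
            "partial read of %s: wanted %d bytes at offset %d, got %d"
            % (name, size, offset, len(segment))
        )
    return segment


# Usage: read 4 bytes starting at offset 2 of a 10-byte buffer.
fp = io.BytesIO(b"0123456789")
print(read_segment(fp, 2, 4))  # b'2345'
```

Because each call performs its own seek and read, repeated access to neighbouring segments pays the I/O cost every time, which is the gap a later caching layer would close.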

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D10785

AFFECTED FILES
  mercurial/configitems.py
  mercurial/revlog.py

CHANGE DETAILS

To: marmoute, indygreg, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -803,6 +803,10 @@ 
             with func() as fp:
                 yield fp
 
+    def _sidedatareadfp(self):
+        """file object suitable to read sidedata"""
+        return self._datareadfp()
+
     def tiprev(self):
         return len(self.index) - 1
 
@@ -2068,7 +2072,19 @@ 
         if sidedata_size == 0:
             return {}
 
-        comp_segment = self._getsegment(sidedata_offset, sidedata_size)
+        # XXX this need caching, as we do for data
+        with self._sidedatareadfp() as sdf:
+            sdf.seek(sidedata_offset)
+            comp_segment = sdf.read(sidedata_size)
+
+            if len(comp_segment) < sidedata_size:
+                filename = self._datafile
+                length = sidedata_size
+                offset = sidedata_offset
+                got = len(comp_segment)
+                m = PARTIAL_READ_MSG % (filename, length, offset, got)
+                raise error.RevlogError(m)
+
         comp = self.index[rev][11]
         if comp == COMP_MODE_PLAIN:
             segment = comp_segment
diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -1161,6 +1161,7 @@ 
 #   keeping references to the affected revlogs, especially memory-wise when
 #   rewriting sidedata.
 # * introduce a proper solution to reduce the number of filelog related files.
+# * use caching for reading sidedata (similar to what we do for data).
 # * Improvement to consider
 #   - avoid compression header in chunk using the default compression?
 #   - forbid "inline" compression mode entirely?