Patchwork D11096: dirstate-v2: Reuse existing paths when appending to a data file

Submitter: phabricator
Date: July 15, 2021, 3:28 p.m.
Message ID: <differential-rev-PHID-DREV-hibumparc5qgp5cl3qu4-req@mercurial-scm.org>
Permalink: /patch/49409/
State: Superseded

Comments

phabricator - July 15, 2021, 3:28 p.m.
SimonSapin created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  When writing a dirstate in v2 format by appending to an existing data file,
  filenames / paths that are borrowed from the previous on-disk representation
  can be reused by pointing at their existing offset instead of being written
  again.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D11096

AFFECTED FILES
  rust/hg-core/src/dirstate_tree/on_disk.rs

To: SimonSapin, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/rust/hg-core/src/dirstate_tree/on_disk.rs b/rust/hg-core/src/dirstate_tree/on_disk.rs
--- a/rust/hg-core/src/dirstate_tree/on_disk.rs
+++ b/rust/hg-core/src/dirstate_tree/on_disk.rs
@@ -590,9 +590,12 @@ 
         &mut self,
         nodes: dirstate_map::ChildNodesRef,
     ) -> Result<ChildNodes, DirstateError> {
+        // Reuse already-written nodes if possible
         if self.append {
             if let dirstate_map::ChildNodesRef::OnDisk(nodes_slice) = nodes {
-                let start = self.offset_of(nodes_slice);
+                let start = self.on_disk_offset_of(nodes_slice).expect(
+                    "dirstate-v2 OnDisk nodes not found within on_disk",
+                );
                 let len = child_nodes_len_from_usize(nodes_slice.len());
                 return Ok(ChildNodes { start, len });
             }
@@ -678,11 +681,9 @@ 
         Ok(ChildNodes { start, len })
     }
 
-    /// Takes a slice of items within `on_disk` and returns its offset for the
-    /// start of `on_disk`.
-    ///
-    /// Panics if the given slice is not within `on_disk`.
-    fn offset_of<T>(&self, slice: &[T]) -> Offset
+    /// If the given slice of items is within `on_disk`, returns its offset
+    /// from the start of `on_disk`.
+    fn on_disk_offset_of<T>(&self, slice: &[T]) -> Option<Offset>
     where
         T: BytesCast,
     {
@@ -693,10 +694,14 @@ 
         }
         let slice_addresses = address_range(slice.as_bytes());
         let on_disk_addresses = address_range(self.dirstate_map.on_disk);
-        assert!(on_disk_addresses.contains(slice_addresses.start()));
-        assert!(on_disk_addresses.contains(slice_addresses.end()));
-        let offset = slice_addresses.start() - on_disk_addresses.start();
-        offset_from_usize(offset)
+        if on_disk_addresses.contains(slice_addresses.start())
+            && on_disk_addresses.contains(slice_addresses.end())
+        {
+            let offset = slice_addresses.start() - on_disk_addresses.start();
+            Some(offset_from_usize(offset))
+        } else {
+            None
+        }
     }
 
     fn curent_offset(&mut self) -> Offset {
@@ -708,8 +713,14 @@ 
     }
 
     fn write_path(&mut self, slice: &[u8]) -> PathSlice {
+        let len = path_len_from_usize(slice.len());
+        // Reuse an already-written path if possible
+        if self.append {
+            if let Some(start) = self.on_disk_offset_of(slice) {
+                return PathSlice { start, len };
+            }
+        }
         let start = self.curent_offset();
-        let len = path_len_from_usize(slice.len());
         self.out.extend(slice.as_bytes());
         PathSlice { start, len }
     }
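
The reuse check in the patch above can be illustrated outside of Mercurial
with a minimal, standalone sketch. Everything in it is hypothetical and only
mirrors the shape of the change: `offset_within` stands in for
`on_disk_offset_of`, `Writer` stands in for the real writer type, offsets are
plain `usize` values rather than `Offset`, and the assumption that newly
appended bytes start at `on_disk.len()` is mine, not something the patch
states.

```rust
/// Hypothetical stand-in for `on_disk_offset_of`: if `slice` lies entirely
/// inside `buffer`, return its byte offset from the start of `buffer`.
fn offset_within(buffer: &[u8], slice: &[u8]) -> Option<usize> {
    // Compare raw addresses only; nothing is dereferenced here.
    let buf_start = buffer.as_ptr() as usize;
    let buf_end = buf_start + buffer.len();
    let start = slice.as_ptr() as usize;
    let end = start + slice.len();
    if start >= buf_start && end <= buf_end {
        Some(start - buf_start)
    } else {
        None
    }
}

/// Hypothetical writer: `on_disk` is the existing data file contents and
/// `out` collects the bytes that will be appended after it.
struct Writer<'a> {
    on_disk: &'a [u8],
    out: Vec<u8>,
    append: bool,
}

impl<'a> Writer<'a> {
    /// Returns `(start, len)` for the path, reusing the bytes already in
    /// `on_disk` when `path` is borrowed from it and we are appending.
    fn write_path(&mut self, path: &[u8]) -> (usize, usize) {
        let len = path.len();
        if self.append {
            if let Some(start) = offset_within(self.on_disk, path) {
                return (start, len);
            }
        }
        // Assumption: in append mode, new bytes land after the old contents.
        let base = if self.append { self.on_disk.len() } else { 0 };
        let start = base + self.out.len();
        self.out.extend_from_slice(path);
        (start, len)
    }
}
```

With this sketch, a path sliced out of `on_disk` yields its existing offset
and leaves `out` untouched, while a freshly allocated path is copied into
`out` as before. Returning `Option` rather than panicking is what lets the
same helper serve both as a hard requirement (via `expect`, as in the first
hunk) and as an opportunistic check (as in `write_path`).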