Patchwork D7793: rust-nodemap: mutable NodeTree data structure

login
register
mail settings
Submitter phabricator
Date Jan. 6, 2020, 7:26 p.m.
Message ID <differential-rev-PHID-DREV-azo53awudx7nqngm3nzi-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/44152/
State Superseded
Headers show

Comments

phabricator - Jan. 6, 2020, 7:26 p.m.
gracinet created this revision.
Herald added subscribers: mercurial-devel, kevincox, durin42.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Thanks to the previously indexing abstraction,
  the only difference in the lookup algorithm is that we
  don't need the special case for an empty NodeTree any more.
  
  We've considered making the mutable root an `Option<Block>`,
  but that leads to unpleasant checks and `unwrap()` unless we
  abstract it as typestate patterns (`NodeTree<Immutable>` and
  `NodeTree<Mutated>`) which seem exaggerated in that
  case.
  
  The initial copy of the root block is a very minor
  performance penalty, given that it typically occurs just once
  per transaction.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS




To: gracinet, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel
phabricator - Jan. 16, 2020, 10:15 a.m.
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:158
>      readonly: Box<dyn Deref<Target = [Block]> + Send>,
> +    growable: Vec<Block>,
> +    root: Block,

This strikes me as a weird name. The fact that it is an adjective rather than a noun is a hint. Can you rename to answer "Growable what?"

> nodemap.rs:249
> +            readonly, self.growable, self.root
> +        )
>      }

I would use https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_struct for consistency unless you really want to avoid printing the struct name.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7793/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
phabricator - Jan. 27, 2020, 3:54 p.m.
gracinet added inline comments.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:158
> This strikes me as a weird name. The fact that it is an adjective rather than a noun is a hint. Can you rename to answer "Growable what?"

In a previous version, I was calling it `mutable` (also an adjective, btw), but that became just wrong with the introduction of the separate root block. I don't have many ideas:

- `added_inner_blocks` ? actually, these are added on top of the readonly part and often mask some of the readonly blocks
- `added_non_root_blocks` ? Eww
- `mutable_inner_blocks` ?

I'm no native speaker, but the adjective thing doesn't deter me, I would expect it to be rather clear that it refers to blocks, given the type.
Open to suggestions, leaving as is for the time being

> kevincox wrote in nodemap.rs:249
> I would use https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_struct for consistency unless you really want to avoid printing the struct name.

sadly, `Formatter.debug_struct` doesn't tale `?Sized` arguments (at least in our target 1.34 version):

     --> hg-core/src/revlog/nodemap.rs:298:32
      |
  298 |             .field("readonly", readonly)
      |                                ^^^^^^^^ doesn't have a size known at compile-time
      |
      = help: the trait `std::marker::Sized` is not implemented for `[revlog::nodemap::Block]`

Kept it as it was for the time being, just dropped the unneeded  anonymous lifetime

Omitting the struct name was indeed on purpose, to make it more manageable in test data. That was useful in early versions. I would have been willing to drop it for the final form. Anyone using this for real data wouldn't care about just a few bytes at the beginning.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7793/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel

Patch

diff --git a/rust/hg-core/src/revlog/nodemap.rs b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -148,16 +148,31 @@ 
     }
 }
 
-/// A 16-radix tree with the root block at the end
+/// A mutable 16-radix tree with the root block logically at the end
+///
+/// Because of the append only nature of our node trees, we need to
+/// keep the original untouched and store new blocks separately.
+///
+/// The mutable root `Block` is kept apart so that we don't have to rebump
+/// it on each insertion.
 pub struct NodeTree {
     readonly: Box<dyn Deref<Target = [Block]> + Send>,
+    growable: Vec<Block>,
+    root: Block,
 }
 
 impl Index<usize> for NodeTree {
     type Output = Block;
 
     fn index(&self, i: usize) -> &Block {
-        &self.readonly[i]
+        let ro_len = self.readonly.len();
+        if i < ro_len {
+            &self.readonly[i]
+        } else if i == ro_len + self.growable.len() {
+            &self.root
+        } else {
+            &self.growable[i - ro_len]
+        }
     }
 }
 
@@ -179,8 +194,24 @@ 
 }
 
 impl NodeTree {
+    /// Initiate a NodeTree from an immutable slice-like of `Block`
+    ///
+    /// We keep `readonly` and clone its root block if it isn't empty.
+    fn new(readonly: Box<dyn Deref<Target = [Block]> + Send>) -> Self {
+        let root = readonly
+            .last()
+            .map(|b| b.clone())
+            .unwrap_or_else(|| Block::new());
+        NodeTree {
+            readonly: readonly,
+            growable: Vec::new(),
+            root: root,
+        }
+    }
+
+    /// Total number of blocks
     fn len(&self) -> usize {
-        self.readonly.len()
+        self.readonly.len() + self.growable.len() + 1
     }
 
     /// Main working method for `NodeTree` searches
@@ -191,11 +222,7 @@ 
         &self,
         prefix: NodePrefixRef<'p>,
     ) -> Result<Option<Revision>, NodeMapError> {
-        let len = self.len();
-        if len == 0 {
-            return Ok(None);
-        }
-        let mut visit = len - 1;
+        let mut visit = self.len() - 1;
         for i in 0..prefix.len() {
             let nybble = prefix.get_nybble(i);
             match self[visit].read(nybble) {
@@ -210,16 +237,18 @@ 
 
 impl From<Vec<Block>> for NodeTree {
     fn from(vec: Vec<Block>) -> Self {
-        NodeTree {
-            readonly: Box::new(vec),
-        }
+        Self::new(Box::new(vec))
     }
 }
 
 impl fmt::Debug for NodeTree {
     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
-        let blocks: &[Block] = &*self.readonly;
-        write!(f, "readonly: {:?}", blocks)
+        let readonly: &[Block] = &*self.readonly;
+        write!(
+            f,
+            "readonly: {:?}, growable: {:?}, root: {:?}",
+            readonly, self.growable, self.root
+        )
     }
 }
 
@@ -320,7 +349,9 @@ 
         assert_eq!(
             format!("{:?}", nt),
             "readonly: \
-             [[0: Rev(9)], [0: Rev(0), 1: Rev(9)], [0: Block(1), 1: Rev(1)]]"
+             [[0: Rev(9)], [0: Rev(0), 1: Rev(9)], [0: Block(1), 1: Rev(1)]], \
+             growable: [], \
+             root: [0: Block(1), 1: Rev(1)]",
         );
     }
 
@@ -352,4 +383,24 @@ 
         assert_eq!(nt.find_hex(&idx, "00a"), Ok(Some(0)));
     }
 
+    #[test]
+    fn test_mutated_find() -> Result<(), NodeMapError> {
+        let mut idx = TestIndex::new();
+        pad_insert(&mut idx, 9, "012");
+        pad_insert(&mut idx, 0, "00a");
+        pad_insert(&mut idx, 2, "cafe");
+        pad_insert(&mut idx, 3, "15");
+        pad_insert(&mut idx, 1, "10");
+
+        let nt = NodeTree {
+            readonly: sample_nodetree().readonly,
+            growable: vec![block![0: Rev(1), 5: Rev(3)]],
+            root: block![0: Block(1), 1:Block(3), 12: Rev(2)],
+        };
+        assert_eq!(nt.find_hex(&idx, "10")?, Some(1));
+        assert_eq!(nt.find_hex(&idx, "c")?, Some(2));
+        assert_eq!(nt.find_hex(&idx, "00")?, Some(0));
+        assert_eq!(nt.find_hex(&idx, "01")?, Some(9));
+        Ok(())
+    }
 }