Patchwork [4,of,5,RFC] revlog: bound based on the length of the compressed deltas

Submitter Siddharth Agarwal
Date Nov. 12, 2014, 11:09 p.m.
Message ID <ec432ebb569fd4537093.1415833751@devbig136.prn2.facebook.com>
Permalink /patch/6693/
State Accepted
Commit 426d7f901789f6df2087e36a04ccd40ea45a20ae

Comments

Siddharth Agarwal - Nov. 12, 2014, 11:09 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1415764879 28800
#      Tue Nov 11 20:01:19 2014 -0800
# Node ID ec432ebb569fd4537093a3c446ccaefce5298636
# Parent  69954f49716d0bdd3094be4b4b7b06da155a7704
revlog: bound based on the length of the compressed deltas

This is only relevant for generaldelta clones.

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1225,8 +1225,10 @@ 
                 base = rev
             else:
                 base = chainbase
-            chainlen = self.chainlen(rev) + 1
-            return dist, l, data, base, chainbase, chainlen
+            chainlen, compresseddeltalen = self._chaininfo(rev)
+            chainlen += 1
+            compresseddeltalen += l
+            return dist, l, data, base, chainbase, chainlen, compresseddeltalen
 
         curr = len(self)
         prev = curr - 1
@@ -1251,7 +1253,7 @@ 
                     d = builddelta(prev)
             else:
                 d = builddelta(prev)
-            dist, l, data, base, chainbase, chainlen = d
+            dist, l, data, base, chainbase, chainlen, compresseddeltalen = d
 
         # full versions are inserted when the needed deltas
         # become comparable to the uncompressed text
@@ -1260,7 +1262,13 @@ 
                                         cachedelta[1])
         else:
             textlen = len(text)
+
+        # - 'dist' is the distance from the base revision -- bounding it limits
+        #   the amount of I/O we need to do.
+        # - 'compresseddeltalen' is the sum of the total size of deltas we need
+        #   to apply -- bounding it limits the amount of CPU we consume.
         if (d is None or dist > textlen * 2 or l > textlen or
+            compresseddeltalen > textlen * 2 or
             (self._maxchainlen and chainlen > self._maxchainlen)):
             text = buildtext()
             data = self.compress(text)
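
To illustrate the bound this patch adds, here is a minimal standalone sketch, not the actual revlog code. It assumes a toy index of (compressed_len, delta_base) tuples in which a revision whose delta_base equals its own number is a full snapshot. The names chain_info and needs_full_snapshot are hypothetical, and chain_info is only a guess at the kind of walk _chaininfo performs: it counts the deltas in the chain and sums their compressed sizes.

```python
def chain_info(entries, rev):
    """Toy stand-in for a chain walk (hypothetical, not revlog._chaininfo).

    entries[i] is (compressed_len, delta_base); a revision whose delta_base
    is its own index is treated as a full snapshot that ends the chain.
    Returns (number of deltas, sum of their compressed lengths).
    """
    chainlen = 0
    compresseddeltalen = 0
    while entries[rev][1] != rev:
        chainlen += 1
        compresseddeltalen += entries[rev][0]
        rev = entries[rev][1]
    return chainlen, compresseddeltalen


def needs_full_snapshot(dist, l, compresseddeltalen, chainlen, textlen,
                        maxchainlen=None):
    """Mirror the condition in the hunk above for a delta that was built.

    dist bounds I/O, compresseddeltalen bounds CPU spent applying deltas,
    l is the compressed size of the new delta itself.
    """
    return (dist > textlen * 2 or
            l > textlen or
            compresseddeltalen > textlen * 2 or
            bool(maxchainlen and chainlen > maxchainlen))


# Toy index: rev 0 is a full snapshot, revs 1..3 each delta against the
# previous revision.
entries = [(100, 0), (10, 0), (12, 1), (15, 2)]

# Extend the chain ending at rev 3 with a new delta of compressed size 20,
# following the same bookkeeping as builddelta in the patch.
new_delta_len = 20
chainlen, compresseddeltalen = chain_info(entries, 3)
chainlen += 1
compresseddeltalen += new_delta_len
```

In this toy setup, with a text of 25 bytes and a dist of 40, neither the dist bound nor the per-delta bound trips, but the summed compressed delta size exceeds twice the text length, so a full version would be stored instead of another delta.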