Patchwork [2,of,2] revlog: add Mercurial config variable for limiting delta-chain length

login
register
mail settings
Submitter Mateusz Kwapich
Date Nov. 10, 2014, 7:27 p.m.
Message ID <0c2718661ea13e8054ab.1415647661@devrs014.prn2.facebook.com>
Download mbox | patch
Permalink /patch/6669/
State Accepted
Headers show

Comments

Mateusz Kwapich - Nov. 10, 2014, 7:27 p.m.
# HG changeset patch
# User Mateusz Kwapich <mitrandir@fb.com>
# Date 1415312405 28800
#      Thu Nov 06 14:20:05 2014 -0800
# Node ID 0c2718661ea13e8054ab0d336cb5784b85991a5a
# Parent  79ae6c4132b5c582ea7dbd1aa4af8e2bcd2f5973
revlog: add Mercurial config variable for limiting delta-chain length

The current heuristic for deciding between storing delta and full texts
is based on ratio of (sizeofdeltas)/(sizeoffulltext).

In some cases (for example for mercurial Manifest for huge repo) this approach
can result in extremely long delta chains (~30,000) which are very slow to
read. (In case of Manifest ~500ms are added to every hg command because of that).

This commit introduces "revlog.maxchainlength" configuration variable that will
limit delta chain length.
Augie Fackler - Nov. 11, 2014, 2:22 p.m.
On Mon, Nov 10, 2014 at 11:27:41AM -0800, Mateusz Kwapich wrote:
> # HG changeset patch
> # User Mateusz Kwapich <mitrandir@fb.com>
> # Date 1415312405 28800
> #      Thu Nov 06 14:20:05 2014 -0800
> # Node ID 0c2718661ea13e8054ab0d336cb5784b85991a5a
> # Parent  79ae6c4132b5c582ea7dbd1aa4af8e2bcd2f5973
> revlog: add Mercurial config variable for limiting delta-chain length
>

Queued these, thanks.

Congratulations on your first patches to Mercurial.

>
> The current heuristic for deciding between storing delta and full texts
> is based on ratio of (sizeofdeltas)/(sizeoffulltext).
>
> In some cases (for example for mercurial Manifest for huge repo) this approach
> can result in extremely long delta chains (~30,000) which are very slow to
> read. (In case of Manifest ~500ms are added to every hg command because of that).
>
> This commit introduces "revlog.maxchainlength" configuration variable that will
> limit delta chain length.
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -316,6 +316,9 @@
>          chunkcachesize = self.ui.configint('format', 'chunkcachesize')
>          if chunkcachesize is not None:
>              self.sopener.options['chunkcachesize'] = chunkcachesize
> +        maxchainlen = self.ui.configint('revlog', 'maxchainlen')
> +        if maxchainlen is not None:
> +            self.sopener.options['maxchainlen'] = maxchainlen
>
>      def _writerequirements(self):
>          reqfile = self.opener("requires", "w")
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -204,6 +204,7 @@
>          self._basecache = None
>          self._chunkcache = (0, '')
>          self._chunkcachesize = 65536
> +        self._maxchainlen = None
>          self.index = []
>          self._pcache = {}
>          self._nodecache = {nullid: nullrev}
> @@ -219,6 +220,8 @@
>                  v = 0
>              if 'chunkcachesize' in opts:
>                  self._chunkcachesize = opts['chunkcachesize']
> +            if 'maxchainlen' in opts:
> +                self._maxchainlen = opts['maxchainlen']
>
>          if self._chunkcachesize <= 0:
>              raise RevlogError(_('revlog chunk cache size %r is not greater '
> @@ -1216,11 +1219,13 @@
>                  base = rev
>              else:
>                  base = chainbase
> -            return dist, l, data, base, chainbase
> +            chainlen = self.chainlen(rev) + 1
> +            return dist, l, data, base, chainbase, chainlen
>
>          curr = len(self)
>          prev = curr - 1
>          base = chainbase = curr
> +        chainlen = None
>          offset = self.end(prev)
>          flags = 0
>          d = None
> @@ -1240,7 +1245,7 @@
>                      d = builddelta(prev)
>              else:
>                  d = builddelta(prev)
> -            dist, l, data, base, chainbase = d
> +            dist, l, data, base, chainbase, chainlen = d
>
>          # full versions are inserted when the needed deltas
>          # become comparable to the uncompressed text
> @@ -1249,7 +1254,8 @@
>                                          cachedelta[1])
>          else:
>              textlen = len(text)
> -        if d is None or dist > textlen * 2:
> +        if (d is None or dist > textlen * 2 or
> +            self._maxchainlen and chainlen > self._maxchainlen):
>              text = buildtext()
>              data = self.compress(text)
>              l = len(data[1]) + len(data[0])
> diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
> --- a/tests/test-debugcommands.t
> +++ b/tests/test-debugcommands.t
> @@ -24,6 +24,40 @@
>    full revision size (min/max/avg)     : 44 / 44 / 44
>    delta size (min/max/avg)             : 0 / 0 / 0
>
> +Test max chain len
> +  $ cat >> $HGRCPATH << EOF
> +  > [revlog]
> +  > maxchainlen=4
> +  > EOF
> +
> +  $ echo "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a
> +  $ hg ci -m a
> +  $ echo "b\n" >> a
> +  $ hg ci -m a
> +  $ echo "c\n" >> a
> +  $ hg ci -m a
> +  $ echo "d\n" >> a
> +  $ hg ci -m a
> +  $ echo "e\n" >> a
> +  $ hg ci -m a
> +  $ echo "f\n" >> a
> +  $ hg ci -m a
> +  $ echo 'g\n' >> a
> +  $ hg ci -m a
> +  $ echo 'h\n' >> a
> +  $ hg ci -m a
> +  $ hg debugrevlog -d a
> +  # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
> +      0    -1    -1     0   ???          0    0    0    0     ???      ????           ?     1        0 (glob)
> +      1     0    -1   ???   ???          0    0    0    0     ???      ????           ?     1        1 (glob)
> +      2     1    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
> +      3     2    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
> +      4     3    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        4 (glob)
> +      5     4    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        0 (glob)
> +      6     5    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        1 (glob)
> +      7     6    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
> +      8     7    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
> +  $ cd ..
>
>  Test internal debugstacktrace command
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Augie Fackler - Nov. 11, 2014, 2:49 p.m.
On Mon, Nov 10, 2014 at 11:27:41AM -0800, Mateusz Kwapich wrote:
> # HG changeset patch
> # User Mateusz Kwapich <mitrandir@fb.com>
> # Date 1415312405 28800
> #      Thu Nov 06 14:20:05 2014 -0800
> # Node ID 0c2718661ea13e8054ab0d336cb5784b85991a5a
> # Parent  79ae6c4132b5c582ea7dbd1aa4af8e2bcd2f5973
> revlog: add Mercurial config variable for limiting delta-chain length
>
> The current heuristic for deciding between storing delta and full texts
> is based on ratio of (sizeofdeltas)/(sizeoffulltext).
>
> In some cases (for example for mercurial Manifest for huge repo) this approach
> can result in extremely long delta chains (~30,000) which are very slow to
> read. (In case of Manifest ~500ms are added to every hg command because of that).
>
> This commit introduces "revlog.maxchainlength" configuration variable that will
> limit delta chain length.
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -316,6 +316,9 @@
>          chunkcachesize = self.ui.configint('format', 'chunkcachesize')
>          if chunkcachesize is not None:
>              self.sopener.options['chunkcachesize'] = chunkcachesize
> +        maxchainlen = self.ui.configint('revlog', 'maxchainlen')
> +        if maxchainlen is not None:
> +            self.sopener.options['maxchainlen'] = maxchainlen
>
>      def _writerequirements(self):
>          reqfile = self.opener("requires", "w")
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -204,6 +204,7 @@
>          self._basecache = None
>          self._chunkcache = (0, '')
>          self._chunkcachesize = 65536
> +        self._maxchainlen = None
>          self.index = []
>          self._pcache = {}
>          self._nodecache = {nullid: nullrev}
> @@ -219,6 +220,8 @@
>                  v = 0
>              if 'chunkcachesize' in opts:
>                  self._chunkcachesize = opts['chunkcachesize']
> +            if 'maxchainlen' in opts:
> +                self._maxchainlen = opts['maxchainlen']
>
>          if self._chunkcachesize <= 0:
>              raise RevlogError(_('revlog chunk cache size %r is not greater '
> @@ -1216,11 +1219,13 @@
>                  base = rev
>              else:
>                  base = chainbase
> -            return dist, l, data, base, chainbase
> +            chainlen = self.chainlen(rev) + 1
> +            return dist, l, data, base, chainbase, chainlen
>
>          curr = len(self)
>          prev = curr - 1
>          base = chainbase = curr
> +        chainlen = None
>          offset = self.end(prev)
>          flags = 0
>          d = None
> @@ -1240,7 +1245,7 @@
>                      d = builddelta(prev)
>              else:
>                  d = builddelta(prev)
> -            dist, l, data, base, chainbase = d
> +            dist, l, data, base, chainbase, chainlen = d
>
>          # full versions are inserted when the needed deltas
>          # become comparable to the uncompressed text
> @@ -1249,7 +1254,8 @@
>                                          cachedelta[1])
>          else:
>              textlen = len(text)
> -        if d is None or dist > textlen * 2:
> +        if (d is None or dist > textlen * 2 or
> +            self._maxchainlen and chainlen > self._maxchainlen):
>              text = buildtext()
>              data = self.compress(text)
>              l = len(data[1]) + len(data[0])
> diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
> --- a/tests/test-debugcommands.t
> +++ b/tests/test-debugcommands.t
> @@ -24,6 +24,40 @@
>    full revision size (min/max/avg)     : 44 / 44 / 44
>    delta size (min/max/avg)             : 0 / 0 / 0
>
> +Test max chain len
> +  $ cat >> $HGRCPATH << EOF
> +  > [revlog]
> +  > maxchainlen=4
> +  > EOF
> +
> +  $ echo "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a
> +  $ hg ci -m a
> +  $ echo "b\n" >> a
> +  $ hg ci -m a
> +  $ echo "c\n" >> a
> +  $ hg ci -m a
> +  $ echo "d\n" >> a
> +  $ hg ci -m a
> +  $ echo "e\n" >> a
> +  $ hg ci -m a
> +  $ echo "f\n" >> a
> +  $ hg ci -m a
> +  $ echo 'g\n' >> a
> +  $ hg ci -m a
> +  $ echo 'h\n' >> a

check-code sends its regards here, though I think it's an overzealous check. Investigating.

> +  $ hg ci -m a
> +  $ hg debugrevlog -d a
> +  # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
> +      0    -1    -1     0   ???          0    0    0    0     ???      ????           ?     1        0 (glob)
> +      1     0    -1   ???   ???          0    0    0    0     ???      ????           ?     1        1 (glob)
> +      2     1    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
> +      3     2    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
> +      4     3    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        4 (glob)
> +      5     4    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        0 (glob)
> +      6     5    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        1 (glob)
> +      7     6    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
> +      8     7    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
> +  $ cd ..
>
>  Test internal debugstacktrace command
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Augie Fackler - Nov. 11, 2014, 2:51 p.m.
On Mon, Nov 10, 2014 at 11:27:41AM -0800, Mateusz Kwapich wrote:
> # HG changeset patch
> # User Mateusz Kwapich <mitrandir@fb.com>
> # Date 1415312405 28800
> #      Thu Nov 06 14:20:05 2014 -0800
> # Node ID 0c2718661ea13e8054ab0d336cb5784b85991a5a
> # Parent  79ae6c4132b5c582ea7dbd1aa4af8e2bcd2f5973
> revlog: add Mercurial config variable for limiting delta-chain length
>
> The current heuristic for deciding between storing delta and full texts
> is based on ratio of (sizeofdeltas)/(sizeoffulltext).
>
> In some cases (for example for mercurial Manifest for huge repo) this approach
> can result in extremely long delta chains (~30,000) which are very slow to
> read. (In case of Manifest ~500ms are added to every hg command because of that).
>
> This commit introduces "revlog.maxchainlength" configuration variable that will
> limit delta chain length.
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -316,6 +316,9 @@
>          chunkcachesize = self.ui.configint('format', 'chunkcachesize')
>          if chunkcachesize is not None:
>              self.sopener.options['chunkcachesize'] = chunkcachesize
> +        maxchainlen = self.ui.configint('revlog', 'maxchainlen')
> +        if maxchainlen is not None:
> +            self.sopener.options['maxchainlen'] = maxchainlen
>
>      def _writerequirements(self):
>          reqfile = self.opener("requires", "w")
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -204,6 +204,7 @@
>          self._basecache = None
>          self._chunkcache = (0, '')
>          self._chunkcachesize = 65536
> +        self._maxchainlen = None
>          self.index = []
>          self._pcache = {}
>          self._nodecache = {nullid: nullrev}
> @@ -219,6 +220,8 @@
>                  v = 0
>              if 'chunkcachesize' in opts:
>                  self._chunkcachesize = opts['chunkcachesize']
> +            if 'maxchainlen' in opts:
> +                self._maxchainlen = opts['maxchainlen']
>
>          if self._chunkcachesize <= 0:
>              raise RevlogError(_('revlog chunk cache size %r is not greater '
> @@ -1216,11 +1219,13 @@
>                  base = rev
>              else:
>                  base = chainbase
> -            return dist, l, data, base, chainbase
> +            chainlen = self.chainlen(rev) + 1
> +            return dist, l, data, base, chainbase, chainlen
>
>          curr = len(self)
>          prev = curr - 1
>          base = chainbase = curr
> +        chainlen = None
>          offset = self.end(prev)
>          flags = 0
>          d = None
> @@ -1240,7 +1245,7 @@
>                      d = builddelta(prev)
>              else:
>                  d = builddelta(prev)
> -            dist, l, data, base, chainbase = d
> +            dist, l, data, base, chainbase, chainlen = d
>
>          # full versions are inserted when the needed deltas
>          # become comparable to the uncompressed text
> @@ -1249,7 +1254,8 @@
>                                          cachedelta[1])
>          else:
>              textlen = len(text)
> -        if d is None or dist > textlen * 2:
> +        if (d is None or dist > textlen * 2 or
> +            self._maxchainlen and chainlen > self._maxchainlen):
>              text = buildtext()
>              data = self.compress(text)
>              l = len(data[1]) + len(data[0])
> diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
> --- a/tests/test-debugcommands.t
> +++ b/tests/test-debugcommands.t
> @@ -24,6 +24,40 @@
>    full revision size (min/max/avg)     : 44 / 44 / 44
>    delta size (min/max/avg)             : 0 / 0 / 0
>
> +Test max chain len
> +  $ cat >> $HGRCPATH << EOF
> +  > [revlog]
> +  > maxchainlen=4
> +  > EOF
> +
> +  $ echo "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a

Oh, I see waht check-code meant: you wanted printf instead of echo
since you have a \n for sh portability reasons. Sigh. I'll fix this in
flight.

> +  $ hg ci -m a
> +  $ echo "b\n" >> a
> +  $ hg ci -m a
> +  $ echo "c\n" >> a
> +  $ hg ci -m a
> +  $ echo "d\n" >> a
> +  $ hg ci -m a
> +  $ echo "e\n" >> a
> +  $ hg ci -m a
> +  $ echo "f\n" >> a
> +  $ hg ci -m a
> +  $ echo 'g\n' >> a
> +  $ hg ci -m a
> +  $ echo 'h\n' >> a
> +  $ hg ci -m a
> +  $ hg debugrevlog -d a
> +  # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
> +      0    -1    -1     0   ???          0    0    0    0     ???      ????           ?     1        0 (glob)
> +      1     0    -1   ???   ???          0    0    0    0     ???      ????           ?     1        1 (glob)
> +      2     1    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
> +      3     2    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
> +      4     3    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        4 (glob)
> +      5     4    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        0 (glob)
> +      6     5    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        1 (glob)
> +      7     6    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
> +      8     7    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
> +  $ cd ..
>
>  Test internal debugstacktrace command
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Pierre-Yves David - Nov. 11, 2014, 2:54 p.m.
On 11/11/2014 02:49 PM, Augie Fackler wrote:
> On Mon, Nov 10, 2014 at 11:27:41AM -0800, Mateusz Kwapich wrote:
>> # HG changeset patch
>> # User Mateusz Kwapich <mitrandir@fb.com>
>> # Date 1415312405 28800
>> #      Thu Nov 06 14:20:05 2014 -0800
>> # Node ID 0c2718661ea13e8054ab0d336cb5784b85991a5a
>> # Parent  79ae6c4132b5c582ea7dbd1aa4af8e2bcd2f5973
>> revlog: add Mercurial config variable for limiting delta-chain length
>>
>> The current heuristic for deciding between storing delta and full texts
>> is based on ratio of (sizeofdeltas)/(sizeoffulltext).
>>
>> In some cases (for example for mercurial Manifest for huge repo) this approach
>> can result in extremely long delta chains (~30,000) which are very slow to
>> read. (In case of Manifest ~500ms are added to every hg command because of that).
>>
>> This commit introduces "revlog.maxchainlength" configuration variable that will
>> limit delta chain length.

We do not have a 'revlog' section yet. I would use the 'format' one for 
this as this is kind of related to the on disk format.

>>
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -316,6 +316,9 @@
>>           chunkcachesize = self.ui.configint('format', 'chunkcachesize')
>>           if chunkcachesize is not None:
>>               self.sopener.options['chunkcachesize'] = chunkcachesize
>> +        maxchainlen = self.ui.configint('revlog', 'maxchainlen')
>> +        if maxchainlen is not None:
>> +            self.sopener.options['maxchainlen'] = maxchainlen
>>
>>       def _writerequirements(self):
>>           reqfile = self.opener("requires", "w")
>> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
>> --- a/mercurial/revlog.py
>> +++ b/mercurial/revlog.py
>> @@ -204,6 +204,7 @@
>>           self._basecache = None
>>           self._chunkcache = (0, '')
>>           self._chunkcachesize = 65536
>> +        self._maxchainlen = None
>>           self.index = []
>>           self._pcache = {}
>>           self._nodecache = {nullid: nullrev}
>> @@ -219,6 +220,8 @@
>>                   v = 0
>>               if 'chunkcachesize' in opts:
>>                   self._chunkcachesize = opts['chunkcachesize']
>> +            if 'maxchainlen' in opts:
>> +                self._maxchainlen = opts['maxchainlen']
>>
>>           if self._chunkcachesize <= 0:
>>               raise RevlogError(_('revlog chunk cache size %r is not greater '
>> @@ -1216,11 +1219,13 @@
>>                   base = rev
>>               else:
>>                   base = chainbase
>> -            return dist, l, data, base, chainbase
>> +            chainlen = self.chainlen(rev) + 1
>> +            return dist, l, data, base, chainbase, chainlen
>>
>>           curr = len(self)
>>           prev = curr - 1
>>           base = chainbase = curr
>> +        chainlen = None
>>           offset = self.end(prev)
>>           flags = 0
>>           d = None
>> @@ -1240,7 +1245,7 @@
>>                       d = builddelta(prev)
>>               else:
>>                   d = builddelta(prev)
>> -            dist, l, data, base, chainbase = d
>> +            dist, l, data, base, chainbase, chainlen = d
>>
>>           # full versions are inserted when the needed deltas
>>           # become comparable to the uncompressed text
>> @@ -1249,7 +1254,8 @@
>>                                           cachedelta[1])
>>           else:
>>               textlen = len(text)
>> -        if d is None or dist > textlen * 2:
>> +        if (d is None or dist > textlen * 2 or
>> +            self._maxchainlen and chainlen > self._maxchainlen):

The conditional becomes fairly hairy. Can we tighten it a bit with 
temporary variable usage?)

>>               text = buildtext()
>>               data = self.compress(text)
>>               l = len(data[1]) + len(data[0])
>> diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
>> --- a/tests/test-debugcommands.t
>> +++ b/tests/test-debugcommands.t
>> @@ -24,6 +24,40 @@
>>     full revision size (min/max/avg)     : 44 / 44 / 44
>>     delta size (min/max/avg)             : 0 / 0 / 0
>>
>> +Test max chain len
>> +  $ cat >> $HGRCPATH << EOF
>> +  > [revlog]
>> +  > maxchainlen=4
>> +  > EOF
>> +
>> +  $ echo "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a
>> +  $ hg ci -m a
>> +  $ echo "b\n" >> a
>> +  $ hg ci -m a
>> +  $ echo "c\n" >> a
>> +  $ hg ci -m a
>> +  $ echo "d\n" >> a
>> +  $ hg ci -m a
>> +  $ echo "e\n" >> a
>> +  $ hg ci -m a
>> +  $ echo "f\n" >> a
>> +  $ hg ci -m a
>> +  $ echo 'g\n' >> a
>> +  $ hg ci -m a
>> +  $ echo 'h\n' >> a
>
> check-code sends its regards here, though I think it's an overzealous check. Investigating.

Also, not sure why you add the \n to the echo.


The rest looks good to me.
Augie Fackler - Nov. 11, 2014, 2:56 p.m.
On Tue, Nov 11, 2014 at 9:54 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>>> This commit introduces "revlog.maxchainlength" configuration variable
>>> that will
>>> limit delta chain length.
>
>
> We do not have a 'revlog' section yet. I would use the 'format' one for this
> as this is kind of related to the on disk format.


format sounds good. Can I get you to send me a followup once I push?
Pierre-Yves David - Nov. 11, 2014, 2:57 p.m.
On 11/11/2014 02:56 PM, Augie Fackler wrote:
> On Tue, Nov 11, 2014 at 9:54 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>>>> This commit introduces "revlog.maxchainlength" configuration variable
>>>> that will
>>>> limit delta chain length.
>>
>>
>> We do not have a 'revlog' section yet. I would use the 'format' one for this
>> as this is kind of related to the on disk format.
>
>
> format sounds good. Can I get you to send me a followup once I push?

Meh, I do not see any value for using a followup there.
Augie Fackler - Nov. 11, 2014, 3:03 p.m.
On Tue, Nov 11, 2014 at 9:57 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>>>
>>> We do not have a 'revlog' section yet. I would use the 'format' one for
>>> this
>>> as this is kind of related to the on disk format.
>>
>>
>>
>> format sounds good. Can I get you to send me a followup once I push?
>
>
> Meh, I do not see any value for using a followup there.

But then the patch I'm pushing doesn't match the behavioral intent of
what Mateusz sent, which starts making me a little weird about
"signing his name" on the modified patch. I think a followup here is
more appropriate.

(I'm fine fixing up wording things for you in-flight because you've
asked me to, and I'm fine fixing minor test failure problems for
anyone because the intent is always clear. This, to me, seems over the
line of what I'm willing to just ninja-fix for someone.)
Pierre-Yves David - Nov. 11, 2014, 3:20 p.m.
On 11/11/2014 03:03 PM, Augie Fackler wrote:
> On Tue, Nov 11, 2014 at 9:57 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>>>>
>>>> We do not have a 'revlog' section yet. I would use the 'format' one for
>>>> this
>>>> as this is kind of related to the on disk format.
>>>
>>>
>>>
>>> format sounds good. Can I get you to send me a followup once I push?
>>
>>
>> Meh, I do not see any value for using a followup there.
>
> But then the patch I'm pushing doesn't match the behavioral intent of
> what Mateusz sent, which starts making me a little weird about
> "signing his name" on the modified patch. I think a followup here is
> more appropriate.
>
> (I'm fine fixing up wording things for you in-flight because you've
> asked me to, and I'm fine fixing minor test failure problems for
> anyone because the intent is always clear. This, to me, seems over the
> line of what I'm willing to just ninja-fix for someone.)

+1 I was meaning dropping the patch and asking for a followup. Behavior 
change or significant refactor should not be fixed in flight.
Augie Fackler - Nov. 11, 2014, 5:31 p.m.
On Tue, Nov 11, 2014 at 10:20 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 11/11/2014 03:03 PM, Augie Fackler wrote:
>>
>> On Tue, Nov 11, 2014 at 9:57 AM, Pierre-Yves David
>> <pierre-yves.david@ens-lyon.org> wrote:
>>>>>
>>>>>
>>>>> We do not have a 'revlog' section yet. I would use the 'format' one for
>>>>> this
>>>>> as this is kind of related to the on disk format.
>>>>
>>>>
>>>>
>>>>
>>>> format sounds good. Can I get you to send me a followup once I push?
>>>
>>>
>>>
>>> Meh, I do not see any value for using a followup there.
>>
>>
>> But then the patch I'm pushing doesn't match the behavioral intent of
>> what Mateusz sent, which starts making me a little weird about
>> "signing his name" on the modified patch. I think a followup here is
>> more appropriate.
>>
>> (I'm fine fixing up wording things for you in-flight because you've
>> asked me to, and I'm fine fixing minor test failure problems for
>> anyone because the intent is always clear. This, to me, seems over the
>> line of what I'm willing to just ninja-fix for someone.)
>
>
> +1 I was meaning dropping the patch and asking for a followup. Behavior
> change or significant refactor should not be fixed in flight.


FYI, I pushed a cleanup to crew for this:
http://hg.intevation.org/mercurial/crew/rev/8d1c508181e6
Mateusz Kwapich - Nov. 11, 2014, 6:57 p.m.
I totally fine with this cleanup. Thanks for fast review! :)

On 11/11/14, 9:31 AM, "Augie Fackler" <raf@durin42.com> wrote:

>On Tue, Nov 11, 2014 at 10:20 AM, Pierre-Yves David
><pierre-yves.david@ens-lyon.org> wrote:
>>
>>
>> On 11/11/2014 03:03 PM, Augie Fackler wrote:
>>>
>>> On Tue, Nov 11, 2014 at 9:57 AM, Pierre-Yves David
>>> <pierre-yves.david@ens-lyon.org> wrote:
>>>>>>
>>>>>>
>>>>>> We do not have a 'revlog' section yet. I would use the 'format' one
>>>>>>for
>>>>>> this
>>>>>> as this is kind of related to the on disk format.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> format sounds good. Can I get you to send me a followup once I push?
>>>>
>>>>
>>>>
>>>> Meh, I do not see any value for using a followup there.
>>>
>>>
>>> But then the patch I'm pushing doesn't match the behavioral intent of
>>> what Mateusz sent, which starts making me a little weird about
>>> "signing his name" on the modified patch. I think a followup here is
>>> more appropriate.
>>>
>>> (I'm fine fixing up wording things for you in-flight because you've
>>> asked me to, and I'm fine fixing minor test failure problems for
>>> anyone because the intent is always clear. This, to me, seems over the
>>> line of what I'm willing to just ninja-fix for someone.)
>>
>>
>> +1 I was meaning dropping the patch and asking for a followup. Behavior
>> change or significant refactor should not be fixed in flight.
>
>
>FYI, I pushed a cleanup to crew for this:
>https://urldefense.proofpoint.com/v1/url?u=http://hg.intevation.org/mercur
>ial/crew/rev/8d1c508181e6&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=bEOykPJz1n0%
>2F%2B%2BVnnlfR9Q%3D%3D%0A&m=EzUAL2%2Bgyf5mvi4y5srzo3Ou7%2Fagq%2F8h5gPfQ0%2
>BqoDo%3D%0A&s=04e340ef490873e5b24b4968045fffd5db944be2eb3ba632304da713d71c
>8d52

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -316,6 +316,9 @@ 
         chunkcachesize = self.ui.configint('format', 'chunkcachesize')
         if chunkcachesize is not None:
             self.sopener.options['chunkcachesize'] = chunkcachesize
+        maxchainlen = self.ui.configint('revlog', 'maxchainlen')
+        if maxchainlen is not None:
+            self.sopener.options['maxchainlen'] = maxchainlen
 
     def _writerequirements(self):
         reqfile = self.opener("requires", "w")
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -204,6 +204,7 @@ 
         self._basecache = None
         self._chunkcache = (0, '')
         self._chunkcachesize = 65536
+        self._maxchainlen = None
         self.index = []
         self._pcache = {}
         self._nodecache = {nullid: nullrev}
@@ -219,6 +220,8 @@ 
                 v = 0
             if 'chunkcachesize' in opts:
                 self._chunkcachesize = opts['chunkcachesize']
+            if 'maxchainlen' in opts:
+                self._maxchainlen = opts['maxchainlen']
 
         if self._chunkcachesize <= 0:
             raise RevlogError(_('revlog chunk cache size %r is not greater '
@@ -1216,11 +1219,13 @@ 
                 base = rev
             else:
                 base = chainbase
-            return dist, l, data, base, chainbase
+            chainlen = self.chainlen(rev) + 1
+            return dist, l, data, base, chainbase, chainlen
 
         curr = len(self)
         prev = curr - 1
         base = chainbase = curr
+        chainlen = None
         offset = self.end(prev)
         flags = 0
         d = None
@@ -1240,7 +1245,7 @@ 
                     d = builddelta(prev)
             else:
                 d = builddelta(prev)
-            dist, l, data, base, chainbase = d
+            dist, l, data, base, chainbase, chainlen = d
 
         # full versions are inserted when the needed deltas
         # become comparable to the uncompressed text
@@ -1249,7 +1254,8 @@ 
                                         cachedelta[1])
         else:
             textlen = len(text)
-        if d is None or dist > textlen * 2:
+        if (d is None or dist > textlen * 2 or
+            self._maxchainlen and chainlen > self._maxchainlen):
             text = buildtext()
             data = self.compress(text)
             l = len(data[1]) + len(data[0])
diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
--- a/tests/test-debugcommands.t
+++ b/tests/test-debugcommands.t
@@ -24,6 +24,40 @@ 
   full revision size (min/max/avg)     : 44 / 44 / 44
   delta size (min/max/avg)             : 0 / 0 / 0
 
+Test max chain len
+  $ cat >> $HGRCPATH << EOF
+  > [revlog]
+  > maxchainlen=4
+  > EOF
+
+  $ echo "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a
+  $ hg ci -m a
+  $ echo "b\n" >> a
+  $ hg ci -m a
+  $ echo "c\n" >> a
+  $ hg ci -m a
+  $ echo "d\n" >> a
+  $ hg ci -m a
+  $ echo "e\n" >> a
+  $ hg ci -m a
+  $ echo "f\n" >> a
+  $ hg ci -m a
+  $ echo 'g\n' >> a
+  $ hg ci -m a
+  $ echo 'h\n' >> a
+  $ hg ci -m a
+  $ hg debugrevlog -d a
+  # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
+      0    -1    -1     0   ???          0    0    0    0     ???      ????           ?     1        0 (glob)
+      1     0    -1   ???   ???          0    0    0    0     ???      ????           ?     1        1 (glob)
+      2     1    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
+      3     2    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
+      4     3    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        4 (glob)
+      5     4    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        0 (glob)
+      6     5    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        1 (glob)
+      7     6    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        2 (glob)
+      8     7    -1   ???   ???        ???  ???  ???    0     ???      ????           ?     1        3 (glob)
+  $ cd ..
 
 Test internal debugstacktrace command