Patchwork [6,of,8] compression: introduce an official `format.revlog-compression` option

Submitter Pierre-Yves David
Date March 31, 2019, 3:36 p.m.
Message ID <108e26fa0a97fe5342a1.1554046582@nodosa.octopoid.net>
Permalink /patch/39428/
State Accepted

Comments

Pierre-Yves David - March 31, 2019, 3:36 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1553707614 -3600
#      Wed Mar 27 18:26:54 2019 +0100
# Node ID 108e26fa0a97fe5342a1ce246cc4e4c185803454
# Parent  28701199a78bdbab36aa422be0b4681941433823
# EXP-Topic zstd-revlog
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 108e26fa0a97
compression: introduce an official `format.revlog-compression` option

This option superseed the `experiment.format.compression` option. The value
currently supported are zlib (default) and zstd (if Mercurial was compiled with
zstd support).

The option gained an explicite reference to `revlog` since this is the target
usage here. Different storage methods might requires different compression
strategies.

In our tests, using zstd give a significant CPU usage improvement (both
compression and decompressing) while keeping similar repository size.

Zstd as other interresting mode (dictionnaly, pre-text, etc…) that are probably
worth exploring. However, just play switching from zlib to zstd provide a large
benefit.
Josef 'Jeff' Sipek - April 2, 2019, 7:42 a.m.
On Sun, Mar 31, 2019 at 17:36:22 +0200, Pierre-Yves David wrote:
...
> compression: introduce an official `format.revlog-compression` option
> 
> This option superseed the `experiment.format.compression` option. The value

s/superseed/supersedes/ :)

> currently supported are zlib (default) and zstd (if Mercurial was compiled with
> zstd support).
> 
> The option gained an explicite reference to `revlog` since this is the target

s/explicite/explicit/

> usage here. Different storage methods might requires different compression
> strategies.

s/requires/require/

> 
> In our tests, using zstd give a significant CPU usage improvement (both
> compression and decompressing) while keeping similar repository size.
> 
> Zstd as other interresting mode (dictionnaly, pre-text, etc…) that are probably

I'm guessing here: s/dictionnaly/dictionary/ ?

> worth exploring. However, just play switching from zlib to zstd provide a large
> benefit.

s/play/plain/

...
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -866,6 +866,13 @@ https://www.mercurial-scm.org/wiki/Missi
>      Repositories with this on-disk format require Mercurial version 4.7
>  
>      Enabled by default.
> +``revlog-compression``
> +    Compression algorithm used by revlog. Supported value are `zlib` and `zstd`.
> +    The `zlib` engine is the historical default of Mercurial. `zstd` is a newer
> +    format that is usually a net win over `zlib` operating faster at better
> +    compression rate. Use `zstd` to reduce CPU usage.
> +
> +    On some system, Mercurial installation may lack `zstd` supports. Default is `zlib`.

This says that 'zlib' is the default - twice.  Should it repeat itself like
this?

Jeff.
Pierre-Yves David - April 2, 2019, 1:53 p.m.
On 4/2/19 9:42 AM, Josef 'Jeff' Sipek wrote:
> On Sun, Mar 31, 2019 at 17:36:22 +0200, Pierre-Yves David wrote:
> ...
>> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
>> --- a/mercurial/help/config.txt
>> +++ b/mercurial/help/config.txt
>> @@ -866,6 +866,13 @@ https://www.mercurial-scm.org/wiki/Missi
>>       Repositories with this on-disk format require Mercurial version 4.7
>>   
>>       Enabled by default.
>> +``revlog-compression``
>> +    Compression algorithm used by revlog. Supported value are `zlib` and `zstd`.
>> +    The `zlib` engine is the historical default of Mercurial. `zstd` is a newer
>> +    format that is usually a net win over `zlib` operating faster at better
>> +    compression rate. Use `zstd` to reduce CPU usage.
>> +
>> +    On some system, Mercurial installation may lack `zstd` supports. Default is `zlib`.
> 
> This says that 'zlib' is the default - twice.  Should it repeat itself like
> this?

The first occurrence carries the information that zlib came before zstd
(immutable fact). The second occurrence says that the -current- default
is zlib (mutable fact).
Gregory Szorc - April 2, 2019, 6:19 p.m.
On Sun, Mar 31, 2019 at 8:39 AM Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:

> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1553707614 -3600
> #      Wed Mar 27 18:26:54 2019 +0100
> # Node ID 108e26fa0a97fe5342a1ce246cc4e4c185803454
> # Parent  28701199a78bdbab36aa422be0b4681941433823
> # EXP-Topic zstd-revlog
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 108e26fa0a97
> compression: introduce an official `format.revlog-compression` option
>

Queued parts 1-6.



Patch

diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -553,9 +553,6 @@  coreconfigitem('experimental', 'extended
 coreconfigitem('experimental', 'extendedheader.similarity',
     default=False,
 )
-coreconfigitem('experimental', 'format.compression',
-    default='zlib',
-)
 coreconfigitem('experimental', 'graphshorten',
     default=False,
 )
@@ -684,6 +681,10 @@  coreconfigitem('format', 'obsstore-versi
 coreconfigitem('format', 'sparse-revlog',
     default=True,
 )
+coreconfigitem('format', 'revlog-compression',
+    default='zlib',
+    alias=[('experimental', 'format.compression')]
+)
 coreconfigitem('format', 'usefncache',
     default=True,
 )
diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
--- a/mercurial/help/config.txt
+++ b/mercurial/help/config.txt
@@ -866,6 +866,13 @@  https://www.mercurial-scm.org/wiki/Missi
     Repositories with this on-disk format require Mercurial version 4.7
 
     Enabled by default.
+``revlog-compression``
+    Compression algorithm used by revlog. Supported value are `zlib` and `zstd`.
+    The `zlib` engine is the historical default of Mercurial. `zstd` is a newer
+    format that is usually a net win over `zlib` operating faster at better
+    compression rate. Use `zstd` to reduce CPU usage.
+
+    On some system, Mercurial installation may lack `zstd` supports. Default is `zlib`.
 
 ``graph``
 ---------
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -2920,10 +2920,10 @@  def newreporequirements(ui, createopts):
             if ui.configbool('format', 'dotencode'):
                 requirements.add('dotencode')
 
-    compengine = ui.config('experimental', 'format.compression')
+    compengine = ui.config('format', 'revlog-compression')
     if compengine not in util.compengines:
         raise error.Abort(_('compression engine %s defined by '
-                            'experimental.format.compression not available') %
+                            'format.revlog-compression not available') %
                           compengine,
                           hint=_('run "hg debuginstall" to list available '
                                  'compression engines'))
diff --git a/mercurial/upgrade.py b/mercurial/upgrade.py
--- a/mercurial/upgrade.py
+++ b/mercurial/upgrade.py
@@ -332,7 +332,7 @@  class compressionengine(formatvariant):
 
     @classmethod
     def fromconfig(cls, repo):
-        return repo.ui.config('experimental', 'format.compression')
+        return repo.ui.config('format', 'revlog-compression')
 
 @registerformatvariant
 class compressionlevel(formatvariant):
diff --git a/tests/test-repo-compengines.t b/tests/test-repo-compengines.t
--- a/tests/test-repo-compengines.t
+++ b/tests/test-repo-compengines.t
@@ -21,8 +21,8 @@  A new repository uses zlib storage, whic
 
 Unknown compression engine to format.compression aborts
 
-  $ hg --config experimental.format.compression=unknown init unknown
-  abort: compression engine unknown defined by experimental.format.compression not available
+  $ hg --config format.revlog-compression=unknown init unknown
+  abort: compression engine unknown defined by format.revlog-compression not available
   (run "hg debuginstall" to list available compression engines)
   [255]
 
@@ -40,7 +40,7 @@  A requirement specifying an unknown comp
 
 #if zstd
 
-  $ hg --config experimental.format.compression=zstd init zstd
+  $ hg --config format.revlog-compression=zstd init zstd
   $ cd zstd
   $ cat .hg/requires
   dotencode
@@ -66,7 +66,7 @@  with that engine or a requirement
 
   $ cd default
   $ touch bar
-  $ hg --config experimental.format.compression=zstd -q commit -A -m 'add bar with a lot of repeated repeated repeated text'
+  $ hg --config format.revlog-compression=zstd -q commit -A -m 'add bar with a lot of repeated repeated repeated text'
 
   $ cat .hg/requires
   dotencode
@@ -141,13 +141,13 @@  Test error cases
 checking zstd options
 =====================
 
-  $ hg init zstd-level-default --config experimental.format.compression=zstd
-  $ hg init zstd-level-1 --config experimental.format.compression=zstd
+  $ hg init zstd-level-default --config format.revlog-compression=zstd
+  $ hg init zstd-level-1 --config format.revlog-compression=zstd
   $ cat << EOF >> zstd-level-1/.hg/hgrc
   > [storage]
   > revlog.zstd.level=1
   > EOF
-  $ hg init zstd-level-22 --config experimental.format.compression=zstd
+  $ hg init zstd-level-22 --config format.revlog-compression=zstd
   $ cat << EOF >> zstd-level-22/.hg/hgrc
   > [storage]
   > revlog.zstd.level=22
@@ -172,7 +172,7 @@  checking zstd options
 
 Test error cases
 
-  $ hg init zstd-level-invalid --config experimental.format.compression=zstd
+  $ hg init zstd-level-invalid --config format.revlog-compression=zstd
   $ cat << EOF >> zstd-level-invalid/.hg/hgrc
   > [storage]
   > revlog.zstd.level=foobar
@@ -182,7 +182,7 @@  Test error cases
   abort: storage.revlog.zstd.level is not a valid integer ('foobar')
   [255]
 
-  $ hg init zstd-level-out-of-range --config experimental.format.compression=zstd
+  $ hg init zstd-level-out-of-range --config format.revlog-compression=zstd
   $ cat << EOF >> zstd-level-out-of-range/.hg/hgrc
   > [storage]
   > revlog.zstd.level=42
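
Editorial sketch (not part of the patch): the diff above registers `format.revlog-compression` with the old `experimental.format.compression` name as an alias, and `newreporequirements` aborts when the chosen engine is not available. The standalone Python model below illustrates that behaviour under stated assumptions. It does not use Mercurial's API: `CONFIG_ITEMS`, `lookup`, `resolve_engine` and `ConfigError` are hypothetical names, and the lookup order (new name first, then aliases, then the default) is an assumption about how such an alias would typically be resolved.

```python
# Hypothetical, simplified model of alias-aware config lookup.
# This is NOT Mercurial's implementation; every name here is illustrative.


class ConfigError(Exception):
    """Raised when the configured compression engine is unavailable."""


# (section, name) -> (default value, list of legacy (section, name) aliases)
CONFIG_ITEMS = {
    ("format", "revlog-compression"): (
        "zlib",
        [("experimental", "format.compression")],
    ),
}


def lookup(user_config, section, name):
    """Return the user's value, trying the new name, then aliases, then the default."""
    default, aliases = CONFIG_ITEMS[(section, name)]
    for key in [(section, name)] + aliases:
        if key in user_config:
            return user_config[key]
    return default


def resolve_engine(user_config, available_engines):
    """Pick the revlog compression engine, failing if it is not available."""
    engine = lookup(user_config, "format", "revlog-compression")
    if engine not in available_engines:
        raise ConfigError(
            "compression engine %s defined by "
            "format.revlog-compression not available" % engine
        )
    return engine


if __name__ == "__main__":
    engines = {"zlib", "zstd"}
    # No setting at all: the registered default applies.
    print(resolve_engine({}, engines))
    # Only the legacy experimental name is set: the alias is honoured.
    legacy = {("experimental", "format.compression"): "zstd"}
    print(resolve_engine(legacy, engines))
```

In this model a configuration that still sets only the experimental name keeps working, which is the point of registering the alias rather than dropping the old item outright. An engine that is not in the available set raises an error whose message names the new option, as the updated abort message in the patch does.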