Submitter | Gregory Szorc |
---|---|
Date | Aug. 8, 2016, 1:12 a.m. |
Message ID | <d5e9dedbc7f912ff2dc2.1470618757@ubuntu-vm-main> |
Download | mbox | patch |
Permalink | /patch/16188/ |
State | Accepted |
Headers | show |
Comments
08.08.2016, 09:12, "Gregory Szorc" <gregory.szorc@gmail.com>: > # HG changeset patch > # User Gregory Szorc <gregory.szorc@gmail.com> > # Date 1470618598 25200 > # Sun Aug 07 18:09:58 2016 -0700 > # Node ID d5e9dedbc7f912ff2dc2e92586347250aee11ac2 > # Parent 3ef9aa7ad1fc4c43b92d48e4bb1f4e3de68b6910 > hgweb: config option to control zlib compression level > > Before this patch, the HTTP transport protocol would always zlib > compress certain responses (notably "getbundle" wire protocol commands) > at zlib compression level 6. > > zlib can be a massive CPU resource sink for servers. Some server > operators may wish to reduce server-side CPU requirements while > requiring more bandwidth. This is common on corporate intranets, for > example. Others may wish to use more CPU but reduce bandwidth. > > This patch introduces a config option to allow server operators > to control the zlib compression level. > > On the "mozilla-unified" generaldelta repository, setting this > value to "0" (disable compression) results in server-side CPU > utilization for a `hg clone` going from ~180s to ~124s CPU time on > my i7-6700K. A level of "1" (which increases the transfer size from > ~1,074 MB at level 6 to ~1,222 MB) utilizes ~132s CPU time. Looks good to me. Does memlevel argument to zlib.compressobj look interesting? The docs say "higher values use more memory, but are faster and produce smaller output", maybe bumping it to 9 from the default 8 would also lower CPU time.
On Sun, Aug 7, 2016 at 6:31 PM, Anton Shestakov <engored@ya.ru> wrote: > 08.08.2016, 09:12, "Gregory Szorc" <gregory.szorc@gmail.com>: > > # HG changeset patch > > # User Gregory Szorc <gregory.szorc@gmail.com> > > # Date 1470618598 25200 > > # Sun Aug 07 18:09:58 2016 -0700 > > # Node ID d5e9dedbc7f912ff2dc2e92586347250aee11ac2 > > # Parent 3ef9aa7ad1fc4c43b92d48e4bb1f4e3de68b6910 > > hgweb: config option to control zlib compression level > > > > Before this patch, the HTTP transport protocol would always zlib > > compress certain responses (notably "getbundle" wire protocol commands) > > at zlib compression level 6. > > > > zlib can be a massive CPU resource sink for servers. Some server > > operators may wish to reduce server-side CPU requirements while > > requiring more bandwidth. This is common on corporate intranets, for > > example. Others may wish to use more CPU but reduce bandwidth. > > > > This patch introduces a config option to allow server operators > > to control the zlib compression level. > > > > On the "mozilla-unified" generaldelta repository, setting this > > value to "0" (disable compression) results in server-side CPU > > utilization for a `hg clone` going from ~180s to ~124s CPU time on > > my i7-6700K. A level of "1" (which increases the transfer size from > > ~1,074 MB at level 6 to ~1,222 MB) utilizes ~132s CPU time. > > Looks good to me. > > Does memlevel argument to zlib.compressobj look interesting? The docs say > "higher values use more memory, but are faster and produce smaller output", > maybe bumping it to 9 from the default 8 would also lower CPU time. > From zconf.h: /* The memory requirements for deflate are (in bytes): (1 << (windowBits+2)) + (1 << (memLevel+9)) windowBits is 15 by default. So the memory requirements for 8 vs 9 are 256k vs 384k. There is no impact on memory requirements for inflate. That memory use is insignificant compared to what Python is doing, so increasing shouldn't be a problem. However, on an uncompressed mozilla-unified generaldelta bundle, memlevel 9 is surprisingly both slower and yields worse compression than memlevel 8! On my machine, memlevel 9 uses ~5s more CPU time and produces an archive ~1.6 MB larger (from 1,072 MB). memlevel 6 is ~25s slower and produces an archive ~2.7 MB larger. That's definitely unexpected given zlib's docs. YMMV, of course, but it's not looking good for changing memlevel.
On Sun, 07 Aug 2016 18:12:37 -0700, Gregory Szorc wrote: > # HG changeset patch > # User Gregory Szorc <gregory.szorc@gmail.com> > # Date 1470618598 25200 > # Sun Aug 07 18:09:58 2016 -0700 > # Node ID d5e9dedbc7f912ff2dc2e92586347250aee11ac2 > # Parent 3ef9aa7ad1fc4c43b92d48e4bb1f4e3de68b6910 > hgweb: config option to control zlib compression level > > Before this patch, the HTTP transport protocol would always zlib > compress certain responses (notably "getbundle" wire protocol commands) > at zlib compression level 6. > > zlib can be a massive CPU resource sink for servers. Some server > operators may wish to reduce server-side CPU requirements while > requiring more bandwidth. This is common on corporate intranets, for > example. Others may wish to use more CPU but reduce bandwidth. > > This patch introduces a config option to allow server operators > to control the zlib compression level. Sounds good, queued, thanks. > --- a/mercurial/hgweb/protocol.py > +++ b/mercurial/hgweb/protocol.py > @@ -69,17 +69,17 @@ class webproto(wireproto.abstractserverp > def redirect(self): > self.oldio = self.ui.fout, self.ui.ferr > self.ui.ferr = self.ui.fout = stringio() > def restore(self): > val = self.ui.fout.getvalue() > self.ui.ferr, self.ui.fout = self.oldio > return val > def groupchunks(self, cg): > - z = zlib.compressobj() > + z = zlib.compressobj(self.ui.configint('server', 'zliblevel', -1)) Any idea if untrusted hgrc should be allowed?
On Mon, Aug 8, 2016 at 6:45 AM, Yuya Nishihara <yuya@tcha.org> wrote: > On Sun, 07 Aug 2016 18:12:37 -0700, Gregory Szorc wrote: > > # HG changeset patch > > # User Gregory Szorc <gregory.szorc@gmail.com> > > # Date 1470618598 25200 > > # Sun Aug 07 18:09:58 2016 -0700 > > # Node ID d5e9dedbc7f912ff2dc2e92586347250aee11ac2 > > # Parent 3ef9aa7ad1fc4c43b92d48e4bb1f4e3de68b6910 > > hgweb: config option to control zlib compression level > > > > Before this patch, the HTTP transport protocol would always zlib > > compress certain responses (notably "getbundle" wire protocol commands) > > at zlib compression level 6. > > > > zlib can be a massive CPU resource sink for servers. Some server > > operators may wish to reduce server-side CPU requirements while > > requiring more bandwidth. This is common on corporate intranets, for > > example. Others may wish to use more CPU but reduce bandwidth. > > > > This patch introduces a config option to allow server operators > > to control the zlib compression level. > > Sounds good, queued, thanks. > > > --- a/mercurial/hgweb/protocol.py > > +++ b/mercurial/hgweb/protocol.py > > @@ -69,17 +69,17 @@ class webproto(wireproto.abstractserverp > > def redirect(self): > > self.oldio = self.ui.fout, self.ui.ferr > > self.ui.ferr = self.ui.fout = stringio() > > def restore(self): > > val = self.ui.fout.getvalue() > > self.ui.ferr, self.ui.fout = self.oldio > > return val > > def groupchunks(self, cg): > > - z = zlib.compressobj() > > + z = zlib.compressobj(self.ui.configint('server', 'zliblevel', > -1)) > > Any idea if untrusted hgrc should be allowed? > I can go both ways on this. However, I'm inclined to say "no" because setting the compression level very high or disabling it could make a CPU or network usage increase significantly. I think server operators would want to retain control over this setting for that reason.
On Sun, 14 Aug 2016 14:36:03 -0700, Gregory Szorc wrote: > On Mon, Aug 8, 2016 at 6:45 AM, Yuya Nishihara <yuya@tcha.org> wrote: > > On Sun, 07 Aug 2016 18:12:37 -0700, Gregory Szorc wrote: > > > --- a/mercurial/hgweb/protocol.py > > > +++ b/mercurial/hgweb/protocol.py > > > @@ -69,17 +69,17 @@ class webproto(wireproto.abstractserverp > > > def redirect(self): > > > self.oldio = self.ui.fout, self.ui.ferr > > > self.ui.ferr = self.ui.fout = stringio() > > > def restore(self): > > > val = self.ui.fout.getvalue() > > > self.ui.ferr, self.ui.fout = self.oldio > > > return val > > > def groupchunks(self, cg): > > > - z = zlib.compressobj() > > > + z = zlib.compressobj(self.ui.configint('server', 'zliblevel', > > -1)) > > > > Any idea if untrusted hgrc should be allowed? > > I can go both ways on this. However, I'm inclined to say "no" because > setting the compression level very high or disabling it could make a CPU or > network usage increase significantly. I think server operators would want > to retain control over this setting for that reason. Okay, let's add a comment because hgweb generally reads config with untrusted=True.
Patch
diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt --- a/mercurial/help/config.txt +++ b/mercurial/help/config.txt @@ -1552,16 +1552,31 @@ Controls generic server settings. Like ``bundle1.pull`` but only used if the repository is using the *generaldelta* storage format. (default: True) Large repositories using the *generaldelta* storage format should consider setting this option because converting *generaldelta* repositories to the exchange format required by the bundle1 data format can consume a lot of CPU. +``zliblevel`` + Integer between ``-1`` and ``9`` that controls the zlib compression level + for wire protocol commands that send zlib compressed output (notably the + commands that send repository history data). + + The default (``-1``) uses the default zlib compression level, which is + likely equivalent to ``6``. ``0`` means no compression. ``9`` means + maximum compression. + + Setting this option allows server operators to make trade-offs between + bandwidth and CPU used. Lowering the compression lowers CPU utilization + but sends more bytes to clients. + + This option only impacts the HTTP server. + ``smtp`` -------- Configuration for extensions that need to send email messages. ``host`` Host name of mail server, e.g. "mail.example.com". diff --git a/mercurial/hgweb/protocol.py b/mercurial/hgweb/protocol.py --- a/mercurial/hgweb/protocol.py +++ b/mercurial/hgweb/protocol.py @@ -69,17 +69,17 @@ class webproto(wireproto.abstractserverp def redirect(self): self.oldio = self.ui.fout, self.ui.ferr self.ui.ferr = self.ui.fout = stringio() def restore(self): val = self.ui.fout.getvalue() self.ui.ferr, self.ui.fout = self.oldio return val def groupchunks(self, cg): - z = zlib.compressobj() + z = zlib.compressobj(self.ui.configint('server', 'zliblevel', -1)) while True: chunk = cg.read(4096) if not chunk: break yield z.compress(chunk) yield z.flush() def _client(self): return 'remote:%s:%s:%s' % (