Patchwork [10,of,11] hgweb: use compression engine API for zlib compression

login
register
mail settings
Submitter Gregory Szorc
Date Nov. 2, 2016, 12:08 a.m.
Message ID <fc426af4f25c3403703e.1478045323@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/17253/
State Superseded
Headers show

Comments

Gregory Szorc - Nov. 2, 2016, 12:08 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1477160356 25200
#      Sat Oct 22 11:19:16 2016 -0700
# Node ID fc426af4f25c3403703e913ccb4a6865865fcb02
# Parent  03555032b7e3bc7192fd8bebf6af3f05b1e70516
hgweb: use compression engine API for zlib compression

More low-level compression code elimination because we now have nice
APIs.
Pierre-Yves David - Nov. 7, 2016, 2:25 p.m.
On 11/02/2016 01:08 AM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1477160356 25200
> #      Sat Oct 22 11:19:16 2016 -0700
> # Node ID fc426af4f25c3403703e913ccb4a6865865fcb02
> # Parent  03555032b7e3bc7192fd8bebf6af3f05b1e70516
> hgweb: use compression engine API for zlib compression
>
> More low-level compression code elimination because we now have nice
> APIs.
>
> diff --git a/mercurial/hgweb/protocol.py b/mercurial/hgweb/protocol.py
> --- a/mercurial/hgweb/protocol.py
> +++ b/mercurial/hgweb/protocol.py
> @@ -83,24 +83,18 @@ class webproto(wireproto.abstractserverp
>                  yield chunk
>
>          return self.compresschunks(getchunks())
>
>      def compresschunks(self, chunks):
>          # Don't allow untrusted settings because disabling compression or
>          # setting a very high compression level could lead to flooding
>          # the server's network or CPU.
> -        z = zlib.compressobj(self.ui.configint('server', 'zliblevel', -1))
> -        for chunk in chunks:
> -            data = z.compress(chunk)
> -            # Not all calls to compress() emit data. It is cheaper to inspect
> -            # that here than to send it via the generator.
> -            if data:
> -                yield data
> -        yield z.flush()
> +        opts = {'level': self.ui.configint('server', 'zliblevel', -1)}
> +        return util.compressionengines['zlib'].compressstream(chunks, opts)

Out of curiosity, what is the long term plan for this zliblevel option here?
* Having some special case for each compressors in the code,
* Having a generic callback to set this up,
* Pass ui to the compressors for auto configuration,
* something else?

>      def _client(self):
>          return 'remote:%s:%s:%s' % (
>              self.req.env.get('wsgi.url_scheme') or 'http',
>              urlreq.quote(self.req.env.get('REMOTE_HOST', '')),
>              urlreq.quote(self.req.env.get('REMOTE_USER', '')))
>
>  def iscmd(cmd):

Cheers,
Gregory Szorc - Nov. 7, 2016, 5:46 p.m.
On Mon, Nov 7, 2016 at 6:25 AM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 11/02/2016 01:08 AM, Gregory Szorc wrote:
>
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1477160356 25200
>> #      Sat Oct 22 11:19:16 2016 -0700
>> # Node ID fc426af4f25c3403703e913ccb4a6865865fcb02
>> # Parent  03555032b7e3bc7192fd8bebf6af3f05b1e70516
>> hgweb: use compression engine API for zlib compression
>>
>> More low-level compression code elimination because we now have nice
>> APIs.
>>
>> diff --git a/mercurial/hgweb/protocol.py b/mercurial/hgweb/protocol.py
>> --- a/mercurial/hgweb/protocol.py
>> +++ b/mercurial/hgweb/protocol.py
>> @@ -83,24 +83,18 @@ class webproto(wireproto.abstractserverp
>>                  yield chunk
>>
>>          return self.compresschunks(getchunks())
>>
>>      def compresschunks(self, chunks):
>>          # Don't allow untrusted settings because disabling compression or
>>          # setting a very high compression level could lead to flooding
>>          # the server's network or CPU.
>> -        z = zlib.compressobj(self.ui.configint('server', 'zliblevel',
>> -1))
>> -        for chunk in chunks:
>> -            data = z.compress(chunk)
>> -            # Not all calls to compress() emit data. It is cheaper to
>> inspect
>> -            # that here than to send it via the generator.
>> -            if data:
>> -                yield data
>> -        yield z.flush()
>> +        opts = {'level': self.ui.configint('server', 'zliblevel', -1)}
>> +        return util.compressionengines['zlib'].compressstream(chunks,
>> opts)
>>
>
> Out of curiosity, what is the long term plan for this zliblevel option
> here?
> * Having some special case for each compressors in the code,
> * Having a generic callback to set this up,
> * Pass ui to the compressors for auto configuration,
> * something else?
>

I haven't fully solved this problem for all cases.

For bundles, I plan on extending the "bundle spec" mechanism to allow
defining compression parameters. See
https://hg.mozilla.org/users/gszorc_mozilla.com/hg/rev/04f0144c9142.

For the wire protocol, I was tentatively planning on reusing [server]. For
revlogs, we could reuse [format]. In many cases, yes, we'd need to pass a
ui or have the caller pass in options read from a ui.

The project survived for years without having any configuration knobs for
zlib. So I think we make sane default choices for new engines and add the
knobs later.


>
>      def _client(self):
>>          return 'remote:%s:%s:%s' % (
>>              self.req.env.get('wsgi.url_scheme') or 'http',
>>              urlreq.quote(self.req.env.get('REMOTE_HOST', '')),
>>              urlreq.quote(self.req.env.get('REMOTE_USER', '')))
>>
>>  def iscmd(cmd):
>>
>
> Cheers,
>
> --
> Pierre-Yves David
>

Patch

diff --git a/mercurial/hgweb/protocol.py b/mercurial/hgweb/protocol.py
--- a/mercurial/hgweb/protocol.py
+++ b/mercurial/hgweb/protocol.py
@@ -83,24 +83,18 @@  class webproto(wireproto.abstractserverp
                 yield chunk
 
         return self.compresschunks(getchunks())
 
     def compresschunks(self, chunks):
         # Don't allow untrusted settings because disabling compression or
         # setting a very high compression level could lead to flooding
         # the server's network or CPU.
-        z = zlib.compressobj(self.ui.configint('server', 'zliblevel', -1))
-        for chunk in chunks:
-            data = z.compress(chunk)
-            # Not all calls to compress() emit data. It is cheaper to inspect
-            # that here than to send it via the generator.
-            if data:
-                yield data
-        yield z.flush()
+        opts = {'level': self.ui.configint('server', 'zliblevel', -1)}
+        return util.compressionengines['zlib'].compressstream(chunks, opts)
 
     def _client(self):
         return 'remote:%s:%s:%s' % (
             self.req.env.get('wsgi.url_scheme') or 'http',
             urlreq.quote(self.req.env.get('REMOTE_HOST', '')),
             urlreq.quote(self.req.env.get('REMOTE_USER', '')))
 
 def iscmd(cmd):