Patchwork [1,of,1,v2] serve: use chunked encoding in hgweb responses

Submitter Mads Kiilerich
Date Jan. 13, 2013, 11:03 p.m.
Message ID <cc9c9033808f0fef919a.1358118219@localhost6.localdomain6>
Permalink /patch/591/
State Accepted
Commit cf5c76017e11ed711d02824117baf770e45c2655

Comments

Mads Kiilerich - Jan. 13, 2013, 11:03 p.m.
# HG changeset patch
# User Mads Kiilerich <mads@kiilerich.com>
# Date 1358116000 -3600
# Node ID cc9c9033808f0fef919a2326be26e0db9592c469
# Parent  b37663b0f4a8106f60c13d229a34b442cf81ba11
serve: use chunked encoding in hgweb responses

'hg serve' used to close connections when sending a response with unknown
length ... such as a bundle or archive.

Now chunked encoding will be used for responses with unknown length, and the
connection therefore does not have to be closed to indicate the end of the
response.

Chunked encoding is only used if the length is unknown, if the connection
wouldn't be closed for other reasons, and if it is an HTTP/1.1 request.

This will not benefit other users of hgweb ... but it can serve as an example
that it can be done.
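
For readers unfamiliar with the wire format, the framing described above can be sketched as follows. This is a minimal Python 3 illustration, not the code from the patch (which targets Python 2 and writes directly to the socket); the names encode_chunk, encode_body and LAST_CHUNK are hypothetical and chosen only for this example.

```python
# Illustrative sketch of HTTP/1.1 chunked framing; the names here are
# hypothetical and do not appear in mercurial/hgweb/server.py.

# Zero-length chunk that terminates a chunked body (no trailers).
LAST_CHUNK = b"0\r\n\r\n"


def encode_chunk(data: bytes) -> bytes:
    """Frame one non-empty piece of the body as a single chunk."""
    # Size in hexadecimal, CRLF, payload, CRLF.
    return b"%x\r\n%b\r\n" % (len(data), data)


def encode_body(pieces) -> bytes:
    """Frame an iterable of byte strings as a complete chunked body.

    Empty pieces are skipped: an empty piece would otherwise be framed
    as a zero-length chunk, which ends the body early.
    """
    framed = [encode_chunk(piece) for piece in pieces if piece]
    return b"".join(framed) + LAST_CHUNK


if __name__ == "__main__":
    print(encode_body([b"hello", b"", b"world!"]))
```

Skipping empty pieces matters because a zero-length chunk is the end-of-body marker, so emitting one in the middle of a response would truncate it.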
Bryan O'Sullivan - Jan. 14, 2013, 6:15 p.m.
On Sun, Jan 13, 2013 at 3:03 PM, Mads Kiilerich <mads@kiilerich.com> wrote:

> serve: use chunked encoding in hgweb responses
>

Nice patch - please apply.
Thomas Arendsen Hein - Jan. 15, 2013, 3:44 p.m.
* Mads Kiilerich <mads@kiilerich.com> [20130114 00:07]:
> # HG changeset patch
> # User Mads Kiilerich <mads@kiilerich.com>
> # Date 1358116000 -3600
> # Node ID cc9c9033808f0fef919a2326be26e0db9592c469
> # Parent  b37663b0f4a8106f60c13d229a34b442cf81ba11
> serve: use chunked encoding in hgweb responses

After this patch (cf5c76017e11 in crew) I get:

$ ./hg serve
192.168.11.35 - - [15/Jan/2013 16:39:02] "GET /static/hglogo.png HTTP/1.1" 304 -
----------------------------------------
Exception happened during processing of request from ('192.168.11.35', 46429)
Traceback (most recent call last):
  File "/usr/lib/python2.6/SocketServer.py", line 560, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/thomas/hg/repos/tah/mercurial/hgweb/server.py", line 48, in __init__
    BaseHTTPServer.BaseHTTPRequestHandler.__init__(self, *args, **kargs)
  File "/usr/lib/python2.6/SocketServer.py", line 617, in __init__
    self.handle()
  File "/usr/lib/python2.6/BaseHTTPServer.py", line 331, in handle
    self.handle_one_request()
  File "/usr/lib/python2.6/BaseHTTPServer.py", line 312, in handle_one_request
    self.raw_requestline = self.rfile.readline()
  File "/usr/lib/python2.6/socket.py", line 444, in readline
    data = self._sock.recv(self._rbufsize)
error: [Errno 104] Connection reset by peer
----------------------------------------

when using Firefox 18 and doing a simple reload. When doing a full
reload (shift-reload) the error does not appear.

http request with simple reload:

GET /static/hglogo.png HTTP/1.1
Host: host.example.com:port
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:18.0) Gecko/20130110 Firefox/18.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en,de;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://host.example.com:port/
Connection: keep-alive
If-None-Match: 1358263473.31
Cache-Control: max-age=0

http reply with simple reload:

HTTP/1.1 304 Not Modified
Server: BaseHTTP/0.3 Python/2.6.6
Date: Tue, 15 Jan 2013 15:31:55 GMT
Transfer-Encoding: chunked

http request with forced reload:

GET /static/hglogo.png HTTP/1.1
Host: host.example.com:port
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:18.0) Gecko/20130110 Firefox/18.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en,de;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://host.example.com:port/
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

http reply with forced reload:

HTTP/1.1 200 Script output follows
Server: BaseHTTP/0.3 Python/2.6.6
Date: Tue, 15 Jan 2013 15:32:00 GMT
ETag: 1358263473.31
Content-Type: image/png
Content-Length: 4123

<89>PNG
...


I haven't tried this with a real webserver.

Regards,
Thomas
Mads Kiilerich - Jan. 15, 2013, 4:51 p.m.
On 01/15/2013 04:44 PM, Thomas Arendsen Hein wrote:
> * Mads Kiilerich <mads@kiilerich.com> [20130114 00:07]:
>> # HG changeset patch
>> # User Mads Kiilerich <mads@kiilerich.com>
>> # Date 1358116000 -3600
>> # Node ID cc9c9033808f0fef919a2326be26e0db9592c469
>> # Parent  b37663b0f4a8106f60c13d229a34b442cf81ba11
>> serve: use chunked encoding in hgweb responses
> After this patch (cf5c76017e11 in crew) I get:
...
> http reply with simple reload:
>
> HTTP/1.1 304 Not Modified
> Server: BaseHTTP/0.3 Python/2.6.6
> Date: Tue, 15 Jan 2013 15:31:55 GMT
> Transfer-Encoding: chunked
>

Right, thanks. I'm sorry about that. The error was introduced when moving
code around late in the development process.

I have bombed a patch. I will push it later unless someone spots other
errors there.

/Mads
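
The follow-up patch is not part of this page. As a rough illustration of the problem reported above: HTTP/1.1 defines 1xx, 204 and 304 responses (and replies to HEAD) as having no body, so they need no Transfer-Encoding header and no terminating chunk. The Python 3 sketch below shows one way to express the decision with that guard added. It is an assumption for illustration, not necessarily the fix that was sent, and response_has_body and framing_header are hypothetical names.

```python
# Illustrative sketch only; not the follow-up fix. The function names
# are hypothetical and do not appear in mercurial/hgweb/server.py.


def response_has_body(status: int, method: str = "GET") -> bool:
    """Return False for responses that HTTP/1.1 defines as bodiless."""
    if method == "HEAD":
        return False
    # 1xx informational, 204 No Content and 304 Not Modified have no body.
    return not (100 <= status < 200 or status in (204, 304))


def framing_header(status, length, close_connection, request_version,
                   method="GET"):
    """Pick the extra (name, value) header that delimits the body, or None."""
    if not response_has_body(status, method):
        return None  # the end of the headers is the end of the response
    if length is not None:
        return None  # Content-Length already delimits the body
    if not close_connection and request_version == "HTTP/1.1":
        return ("Transfer-Encoding", "chunked")
    return ("Connection", "close")


if __name__ == "__main__":
    # The case from the report: a 304 on a keep-alive HTTP/1.1 connection.
    print(framing_header(304, None, False, "HTTP/1.1"))
    # An unknown-length 200 on the same kind of connection.
    print(framing_header(200, None, False, "HTTP/1.1"))
```

With such a guard the 304 reply would carry neither header, and the connection could stay open for the next request.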

Patch

diff --git a/mercurial/hgweb/server.py b/mercurial/hgweb/server.py
--- a/mercurial/hgweb/server.py
+++ b/mercurial/hgweb/server.py
@@ -133,10 +133,12 @@  class _httprequesthandler(BaseHTTPServer
         self.saved_headers = []
         self.sent_headers = False
         self.length = None
+        self._chunked = None
         for chunk in self.server.application(env, self._start_response):
             self._write(chunk)
         if not self.sent_headers:
             self.send_headers()
+        self._done()
 
     def send_headers(self):
         if not self.saved_status:
@@ -145,16 +147,19 @@  class _httprequesthandler(BaseHTTPServer
         saved_status = self.saved_status.split(None, 1)
         saved_status[0] = int(saved_status[0])
         self.send_response(*saved_status)
-        should_close = True
+        self.length = None
+        self._chunked = False
         for h in self.saved_headers:
             self.send_header(*h)
             if h[0].lower() == 'content-length':
-                should_close = False
                 self.length = int(h[1])
-        # The value of the Connection header is a list of case-insensitive
-        # tokens separated by commas and optional whitespace.
-        if should_close:
-            self.send_header('Connection', 'close')
+        if self.length is None:
+            self._chunked = (not self.close_connection and
+                             self.request_version == "HTTP/1.1")
+            if self._chunked:
+                self.send_header('Transfer-Encoding', 'chunked')
+            else:
+                self.send_header('Connection', 'close')
         self.end_headers()
         self.sent_headers = True
 
@@ -177,9 +182,16 @@  class _httprequesthandler(BaseHTTPServer
                 raise AssertionError("Content-length header sent, but more "
                                      "bytes than specified are being written.")
             self.length = self.length - len(data)
+        elif self._chunked and data:
+            data = '%x\r\n%s\r\n' % (len(data), data)
         self.wfile.write(data)
         self.wfile.flush()
 
+    def _done(self):
+        if self._chunked:
+            self.wfile.write('0\r\n\r\n')
+            self.wfile.flush()
+
 class _httprequesthandleropenssl(_httprequesthandler):
     """HTTPS handler based on pyOpenSSL"""
 
diff --git a/tests/test-https.t b/tests/test-https.t
--- a/tests/test-https.t
+++ b/tests/test-https.t
@@ -124,7 +124,6 @@  clone via pull
   adding manifests
   adding file changes
   added 1 changesets with 4 changes to 4 files
-  warning: localhost certificate with fingerprint 91:4f:1a:ff:87:24:9c:09:b6:85:9b:88:b1:90:6d:30:75:64:91:ca not verified (check hostfingerprints or web.cacerts config setting)
   updating to branch default
   4 files updated, 0 files merged, 0 files removed, 0 files unresolved
   $ hg verify -R copy-pull
@@ -152,7 +151,6 @@  pull without cacert
   adding manifests
   adding file changes
   added 1 changesets with 1 changes to 1 files
-  warning: localhost certificate with fingerprint 91:4f:1a:ff:87:24:9c:09:b6:85:9b:88:b1:90:6d:30:75:64:91:ca not verified (check hostfingerprints or web.cacerts config setting)
   changegroup hook: HG_NODE=5fed3813f7f5e1824344fdc9cf8f63bb662c292d HG_SOURCE=pull HG_URL=https://localhost:$HGPORT/
   (run 'hg update' to get a working copy)
   $ cd ..