Patchwork [2,of,2,fix-default-regression] bufferedinputpipe: remove N^2 computation of buffer length (issue4735)

login
register
mail settings
Submitter Pierre-Yves David
Date June 26, 2015, 6:42 p.m.
Message ID <cff5b737b1debd3bc0e2.1435344157@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/9790/
State Superseded
Headers show

Comments

Pierre-Yves David - June 26, 2015, 6:42 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1435343390 25200
#      Fri Jun 26 11:29:50 2015 -0700
# Node ID cff5b737b1debd3bc0e23a4e469c09e8f4d726d4
# Parent  995f1afb89fac10d4759d6c3524693ecb926b9f4
bufferedinputpipe: remove N^2 computation of buffer length (issue4735)

The assumption that dynamically computing the length of the buffer was N^2, but
negligible because fast was False. So we drop the dynamic computation and
manually keep track of the buffer length.

He: Enter commit message.  Lines beginning with 'HG:' are removed.
Pierre-Yves David - June 27, 2015, 2 a.m.
On 06/26/2015 11:42 AM, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1435343390 25200
> #      Fri Jun 26 11:29:50 2015 -0700
> # Node ID cff5b737b1debd3bc0e23a4e469c09e8f4d726d4
> # Parent  995f1afb89fac10d4759d6c3524693ecb926b9f4
> bufferedinputpipe: remove N^2 computation of buffer length (issue4735)
>
> The assumption that dynamically computing the length of the buffer was N^2, but
> negligible because fast was False. So we drop the dynamic computation and
> manually keep track of the buffer length.
>
> He: Enter commit message.  Lines beginning with 'HG:' are removed.
>
> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -252,10 +252,11 @@ class bufferedinputpipe(object):
>
>       def __init__(self, input):
>           self._input = input
>           self._buffer = []
>           self._eof = False
> +        self._lenbuf = 0
>
>       @property
>       def hasbuffer(self):
>           """True is any data is currently buffered
>
> @@ -281,10 +282,11 @@ class bufferedinputpipe(object):
>       def readline(self, *args, **kwargs):
>           if 1 < len(self._buffer):
>               # this should not happen because both read and readline end with a
>               # _frombuffer call that collapse it.
>               self._buffer = [''.join(self._buffer)]
> +            self._lenbuf = len(self._buffer[0])
>           lfi = -1
>           if self._buffer:
>               lfi = self._buffer[-1].find('\n')
>           while (not self._eof) and lfi < 0:
>               self._fillbuffer()
> @@ -296,15 +298,10 @@ class bufferedinputpipe(object):
>           elif 1 < len(self._buffer):
>               # we need to take previous chunks into account
>               size += self._lenbuf - len(self._buffer[-1])
>           return self._frombuffer(size)
>
> -    @property
> -    def _lenbuf(self):
> -        """return the current lengh of buffered data"""
> -        return sum(len(d) for d in self._buffer)
> -
>       def _frombuffer(self, size):
>           """return at most 'size' data from the buffer
>
>           The data are removed from the buffer."""
>           if size == 0 or not self._buffer:
> @@ -315,20 +312,23 @@ class bufferedinputpipe(object):
>
>           data = buf[:size]
>           buf = buf[len(data):]
>           if buf:
>               self._buffer = [buf]
> +            self._lenbuf = buf[0]

as spotted by Durham, this should be:

   self._lenbuf = len(buf)

This did not explode too much, because "foo" < 4 is always true, so, 
some conditional was misbehaving, sending the whole process back on track.

I'll send a V2.

Patch

diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -252,10 +252,11 @@  class bufferedinputpipe(object):
 
     def __init__(self, input):
         self._input = input
         self._buffer = []
         self._eof = False
+        self._lenbuf = 0
 
     @property
     def hasbuffer(self):
         """True is any data is currently buffered
 
@@ -281,10 +282,11 @@  class bufferedinputpipe(object):
     def readline(self, *args, **kwargs):
         if 1 < len(self._buffer):
             # this should not happen because both read and readline end with a
             # _frombuffer call that collapse it.
             self._buffer = [''.join(self._buffer)]
+            self._lenbuf = len(self._buffer[0])
         lfi = -1
         if self._buffer:
             lfi = self._buffer[-1].find('\n')
         while (not self._eof) and lfi < 0:
             self._fillbuffer()
@@ -296,15 +298,10 @@  class bufferedinputpipe(object):
         elif 1 < len(self._buffer):
             # we need to take previous chunks into account
             size += self._lenbuf - len(self._buffer[-1])
         return self._frombuffer(size)
 
-    @property
-    def _lenbuf(self):
-        """return the current lengh of buffered data"""
-        return sum(len(d) for d in self._buffer)
-
     def _frombuffer(self, size):
         """return at most 'size' data from the buffer
 
         The data are removed from the buffer."""
         if size == 0 or not self._buffer:
@@ -315,20 +312,23 @@  class bufferedinputpipe(object):
 
         data = buf[:size]
         buf = buf[len(data):]
         if buf:
             self._buffer = [buf]
+            self._lenbuf = buf[0]
         else:
             self._buffer = []
+            self._lenbuf = 0
         return data
 
     def _fillbuffer(self):
         """read data to the buffer"""
         data = os.read(self._input.fileno(), _chunksize)
         if not data:
             self._eof = True
         else:
+            self._lenbuf += len(data)
             self._buffer.append(data)
 
 def popen2(cmd, env=None, newlines=False):
     # Setting bufsize to -1 lets the system decide the buffer size.
     # The default for bufsize is 0, meaning unbuffered. This leads to