Patchwork convert: when getting file from Perforce concatenate data at the end

Submitter Eugene Baranov
Date July 29, 2015, 11:58 p.m.
Message ID <ced462474db13d175e76.1438214335@ADNADTX6400256.eng.citrite.net>
Permalink /patch/10072/
State Accepted

Comments

Eugene Baranov - July 29, 2015, 11:58 p.m.
# HG changeset patch
# User Eugene Baranov <eug.baranov@gmail.com>
# Date 1438214285 -3600
#      Thu Jul 30 00:58:05 2015 +0100
# Branch stable
# Node ID ced462474db13d175e76168014e94ea7be40b764
# Parent  7d2b42b3a15039ac01c218ce8e9d04854f8a7e02
convert: when getting file from Perforce concatenate data at the end

As it turned out, even when getting relatively small files, concatenating
string data every time a new chunk is received is very inefficient.
Maintaining a list of data chunks and concatenating everything in one go
at the end seems much more efficient - in my testing it made getting a 40 MB
file 7 times faster, whilst converting a particularly big changelist with some
big files went down from 20 hours to 3 hours.
Eugene Baranov - July 30, 2015, 6:12 p.m.
Apologies - I forgot to flag it for stable!

On 30 July 2015 at 00:58, Eugene Baranov <eug.baranov@gmail.com> wrote:
> # HG changeset patch
> # User Eugene Baranov <eug.baranov@gmail.com>
> # Date 1438214285 -3600
> #      Thu Jul 30 00:58:05 2015 +0100
> # Branch stable
> # Node ID ced462474db13d175e76168014e94ea7be40b764
> # Parent  7d2b42b3a15039ac01c218ce8e9d04854f8a7e02
> convert: when getting file from Perforce concatenate data at the end
>
> As it turned out, even when getting relatively small files, concatenating
> string data every time a new chunk is received is very inefficient.
> Maintaining a list of data chunks and concatenating everything in one go
> at the end seems much more efficient - in my testing it made getting a 40 MB
> file 7 times faster, whilst converting a particularly big changelist with some
> big files went down from 20 hours to 3 hours.
>
> diff -r 7d2b42b3a150 -r ced462474db1 hgext/convert/p4.py
> --- a/hgext/convert/p4.py       Fri Jul 24 15:10:18 2015 +0100
> +++ b/hgext/convert/p4.py       Thu Jul 30 00:58:05 2015 +0100
> @@ -304,7 +304,7 @@
>              stdout = self.p4.runcommand(cmd)
>
>              mode = None
> -            contents = ""
> +            contents = []
>              keywords = None
>
>              for d in loaditer(stdout):
> @@ -340,7 +340,7 @@
>                              keywords = self.re_keywords
>
>                  elif code == "text" or code == "binary":
> -                    contents += data
> +                    contents.append(data)
>
>                  lasterror = None
>
> @@ -350,6 +350,8 @@
>          if mode is None:
>              return None, None
>
> +        contents = ''.join(contents)
> +
>          if keywords:
>              contents = keywords.sub("$\\1$", contents)
>          if mode == "l" and contents.endswith("\n"):
Matt Mackall - July 30, 2015, 10:49 p.m.
On Thu, 2015-07-30 at 00:58 +0100, Eugene Baranov wrote:
> # HG changeset patch
> # User Eugene Baranov <eug.baranov@gmail.com>
> # Date 1438214285 -3600
> #      Thu Jul 30 00:58:05 2015 +0100
> # Branch stable
> # Node ID ced462474db13d175e76168014e94ea7be40b764
> # Parent  7d2b42b3a15039ac01c218ce8e9d04854f8a7e02
> convert: when getting file from Perforce concatenate data at the end

Queued for stable, thanks.

Patch

diff -r 7d2b42b3a150 -r ced462474db1 hgext/convert/p4.py
--- a/hgext/convert/p4.py	Fri Jul 24 15:10:18 2015 +0100
+++ b/hgext/convert/p4.py	Thu Jul 30 00:58:05 2015 +0100
@@ -304,7 +304,7 @@ 
             stdout = self.p4.runcommand(cmd)
 
             mode = None
-            contents = ""
+            contents = []
             keywords = None
 
             for d in loaditer(stdout):
@@ -340,7 +340,7 @@ 
                             keywords = self.re_keywords
 
                 elif code == "text" or code == "binary":
-                    contents += data
+                    contents.append(data)
 
                 lasterror = None
 
@@ -350,6 +350,8 @@ 
         if mode is None:
             return None, None
 
+        contents = ''.join(contents)
+
         if keywords:
             contents = keywords.sub("$\\1$", contents)
         if mode == "l" and contents.endswith("\n"):
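
For readers who want to see the pattern outside of p4.py, here is a minimal
standalone sketch of the two approaches the patch swaps between. It is not
part of the patch: the function names and the fake_chunks helper are
hypothetical, and it is written for Python 3 with bytes, whereas the patched
code is Python 2 and works on str. Repeated concatenation may copy the
accumulated data every time a chunk arrives, while appending to a list and
joining once copies each chunk a single time. How large the difference is
depends on the interpreter and platform, so no timings are claimed here.

```python
def read_by_concatenation(chunks):
    """Old approach: grow one bytes object as each chunk arrives."""
    contents = b""
    for data in chunks:
        # May reallocate and copy everything accumulated so far.
        contents += data
    return contents


def read_by_join(chunks):
    """Patched approach: collect chunks, then join once at the end."""
    parts = []
    for data in chunks:
        parts.append(data)
    # A single pass that copies each chunk exactly once.
    return b"".join(parts)


def fake_chunks(count, size):
    """Hypothetical stand-in for the data records yielded by the p4 output."""
    for i in range(count):
        yield bytes([i % 256]) * size


if __name__ == "__main__":
    import timeit

    # Both functions must produce identical contents.
    assert read_by_concatenation(fake_chunks(100, 64)) == read_by_join(
        fake_chunks(100, 64)
    )

    for fn in (read_by_concatenation, read_by_join):
        seconds = timeit.timeit(lambda: fn(fake_chunks(2000, 4096)), number=3)
        print(fn.__name__, round(seconds, 3))
```

Running the script prints one timing per function, which gives a rough way to
check whether the effect described in the commit message shows up on a given
machine.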