Patchwork [4,of,4] changegroup: increase write buffer size to 128k

Submitter Gregory Szorc
Date Oct. 16, 2016, 8:35 p.m.
Message ID <4582f12754622ae049af.1476650137@ubuntu-vm-main>
Permalink /patch/17149/
State Accepted

Comments

Gregory Szorc - Oct. 16, 2016, 8:35 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1476650123 25200
#      Sun Oct 16 13:35:23 2016 -0700
# Node ID 4582f12754622ae049afefee05176ef107d99a7e
# Parent  9e2f957b05ac5c76595280a6084ba01d7b369a05
changegroup: increase write buffer size to 128k

By default, Python defers to the operating system for choosing the
default buffer size on opened files. On my Linux machine, the default
is 4k, which is really small for 2016.
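To see what default applies on a given machine, a minimal sketch along these lines can help. It assumes Python 3, where `open()` picks the buffer size from the file's reported block size and falls back on `io.DEFAULT_BUFFER_SIZE`; the patch itself was written against Python 2, whose file objects buffer through C stdio, so the numbers printed here are only indicative. The helper name `default_buffer_sizes` is hypothetical.

```python
import io
import os
import tempfile


def default_buffer_sizes(path):
    """Return (io.DEFAULT_BUFFER_SIZE, st_blksize or None) for path.

    st_blksize is not available on every platform, hence the getattr.
    """
    blksize = getattr(os.stat(path), "st_blksize", None)
    return io.DEFAULT_BUFFER_SIZE, blksize


# Inspect a scratch file; values vary by platform and file system.
with tempfile.NamedTemporaryFile() as tmp:
    fallback, blksize = default_buffer_sizes(tmp.name)
    print("io.DEFAULT_BUFFER_SIZE:", fallback)
    print("st_blksize:", blksize)
```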

This patch bumps the write buffer size when writing
changegroups/bundles to 128k. This matches the 128k read buffer
we already use on revlogs.

It's worth noting that this only affects writes to an explicit
file (such as during `hg bundle`). Buffers used when writing bundle
files via the repo vfs or to a temporary file are not affected.

When producing a none-v2 bundle file of the mozilla-unified repository,
this change caused the number of write() system calls to drop from
952,449 to 29,788. After this change, the most frequent system
calls are fstat(), read(), lseek(), and open(). There were
2,523,672 system calls after this patch (so the net decrease of
~920k write() calls is substantial).
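The relationship between buffer size and the number of writes reaching the operating system can be illustrated without tracing system calls. The sketch below assumes Python 3's `io.BufferedWriter` and uses a hypothetical in-memory sink, `CountingRaw`, that counts how often the buffered layer hands data down to the raw layer. The chunk sizes are made up for illustration and are not taken from the mozilla-unified measurement.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory sink counting write() calls at the raw layer."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, b):
        # Each call here stands in for one write() system call.
        n = memoryview(b).nbytes
        self.calls += 1
        self.nbytes += n
        return n


def count_raw_writes(chunks, buffer_size):
    """Write chunks through a BufferedWriter; return (raw calls, bytes)."""
    raw = CountingRaw()
    writer = io.BufferedWriter(raw, buffer_size=buffer_size)
    for c in chunks:
        writer.write(c)
    writer.flush()
    return raw.calls, raw.nbytes


# 1000 chunks of 1000 bytes each: an arbitrary illustrative workload.
chunks = [b"x" * 1000] * 1000
print("4k buffer:  ", count_raw_writes(chunks, 4096))
print("128k buffer:", count_raw_writes(chunks, 131072))
```

The exact counts depend on the interpreter's buffering implementation, but the larger buffer should reach the raw layer far less often for the same number of bytes.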

This change has no measurable performance impact on my system. But I have a
high-end system with a fast SSD. It is quite possible this change
will have a significant impact on network file systems, where
extra network round trips due to excessive I/O system calls could
introduce significant latency.
Augie Fackler - Oct. 17, 2016, 11:56 p.m.
On Sun, Oct 16, 2016 at 01:35:37PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1476650123 25200
> #      Sun Oct 16 13:35:23 2016 -0700
> # Node ID 4582f12754622ae049afefee05176ef107d99a7e
> # Parent  9e2f957b05ac5c76595280a6084ba01d7b369a05
> changegroup: increase write buffer size to 128k

This patch seems very reasonable. The others make me a tad nervous for
this late in the cycle - I don't /think/ we'll find weird concurrency
problems (past series from you on revlog handling likely have sniffed
out any potential bad actors around concurrency), but in the name of
paranoia let's land those right after we release 4.0. They look fine
to me as-is, for what it's worth, so please remail them on November 2
with me on the CC line.

>
> By default, Python defers to the operating system for choosing the
> default buffer size on opened files. On my Linux machine, the default
> is 4k, which is really small for 2016.
>
> This patch bumps the write buffer size when writing
> changegroups/bundles to 128k. This matches the 128k read buffer
> we already use on revlogs.
>
> It's worth noting that this only affects writes to an explicit
> file (such as during `hg bundle`). Buffers used when writing bundle
> files via the repo vfs or to a temporary file are not affected.
>
> When producing a none-v2 bundle file of the mozilla-unified repository,
> this change caused the number of write() system calls to drop from
> 952,449 to 29,788. After this change, the most frequent system
> calls are fstat(), read(), lseek(), and open(). There were
> 2,523,672 system calls after this patch (so the net decrease of
> ~920k write() calls is substantial).
>
> This change has no measurable performance impact on my system. But I have a
> high-end system with a fast SSD. It is quite possible this change
> will have a significant impact on network file systems, where
> extra network round trips due to excessive I/O system calls could
> introduce significant latency.
>
> diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
> --- a/mercurial/changegroup.py
> +++ b/mercurial/changegroup.py
> @@ -88,17 +88,19 @@ def writechunks(ui, chunks, filename, vf
>      """
>      fh = None
>      cleanup = None
>      try:
>          if filename:
>              if vfs:
>                  fh = vfs.open(filename, "wb")
>              else:
> -                fh = open(filename, "wb")
> +                # Increase default buffer size because default is usually
> +                # small (4k is common on Linux).
> +                fh = open(filename, "wb", 131072)
>          else:
>              fd, filename = tempfile.mkstemp(prefix="hg-bundle-", suffix=".hg")
>              fh = os.fdopen(fd, "wb")
>          cleanup = filename
>          for c in chunks:
>              fh.write(c)
>          cleanup = None
>          return filename

Patch

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -88,17 +88,19 @@  def writechunks(ui, chunks, filename, vf
     """
     fh = None
     cleanup = None
     try:
         if filename:
             if vfs:
                 fh = vfs.open(filename, "wb")
             else:
-                fh = open(filename, "wb")
+                # Increase default buffer size because default is usually
+                # small (4k is common on Linux).
+                fh = open(filename, "wb", 131072)
         else:
             fd, filename = tempfile.mkstemp(prefix="hg-bundle-", suffix=".hg")
             fh = os.fdopen(fd, "wb")
         cleanup = filename
         for c in chunks:
             fh.write(c)
         cleanup = None
         return filename
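
For readers who want to try the changed code path outside Mercurial, here is a standalone sketch modeled on the hunk above. It is not the actual `writechunks` function: the `ui` and `vfs` arguments are dropped, the name `write_chunks` and the `bufsize` parameter are hypothetical, and the `finally` block is an assumption, since the hunk does not show how the function ends.

```python
import os
import tempfile

WRITE_BUFFER_SIZE = 131072  # 128k, the value used in the patch


def write_chunks(chunks, filename=None, bufsize=WRITE_BUFFER_SIZE):
    """Write an iterable of byte chunks to a file and return its name.

    With an explicit filename the file is opened with a large write
    buffer; otherwise a temporary file with default buffering is used.
    """
    fh = None
    cleanup = None
    try:
        if filename:
            # Third positional argument of open() is the buffer size.
            fh = open(filename, "wb", bufsize)
        else:
            fd, filename = tempfile.mkstemp(prefix="hg-bundle-", suffix=".hg")
            fh = os.fdopen(fd, "wb")
        cleanup = filename
        for c in chunks:
            fh.write(c)
        cleanup = None
        return filename
    finally:
        # Assumed cleanup: close the handle, and delete the partial
        # file if writing failed before completion.
        if fh is not None:
            fh.close()
        if cleanup is not None:
            os.unlink(cleanup)
```

Calling `write_chunks([b"ab", b"cd"], "out.hg")` writes both chunks through the 128k buffer and returns `"out.hg"`; calling it without a filename exercises the temporary-file path, which the patch leaves at the default buffer size.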