Patchwork [4,of,4] merge: perform background file closing in batchget

login
register
mail settings
Submitter Gregory Szorc
Date Feb. 20, 2016, 11:56 p.m.
Message ID <ce2715270bdd0e74157e.1456012560@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/13278/
State Accepted
Headers show

Comments

Gregory Szorc - Feb. 20, 2016, 11:56 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1456012449 28800
#      Sat Feb 20 15:54:09 2016 -0800
# Node ID ce2715270bdd0e74157e45a782cdc14ecc19de2b
# Parent  623ae98a2f3a933999ce9143611da0f733c7297d
merge: perform background file closing in batchget

As 2fdbf22a1b63 demonstrated with stream clones, closing files on
background threads on Windows can yield a significant speedup
because closing files that have been created/appended to is slow
on Windows/NTFS.

Working directory updates can write thousands of files. Therefore it
is susceptible to excessive slowness on Windows due to slow file
closes.

This patch enables background file closing when performing working
directory file writes. The impact when performing an `hg up tip` on
mozilla-central (136,357 files) from an empty working directory is
significant:

Before: 535s (8:55)
After:  133s (2:13)
Delta: -402s (6:42)

That's a 4x speedup!

By comparison, that same machine can perform the same operation
in ~15s on Linux. So Windows went from ~35x to ~9x slower. Not bad
but there's still work to do.

As a reminder, background file closing is only activated on Windows
because it is only beneficial on that platform. So this patch
shouldn't change non-Windows behavior at all.

It's worth noting that non-Windows systems perform working directory
updates with multiple processes. Unfortunately, worker.py doesn't
yet support Windows. So, there is still plenty of room for making
working directory updates faster on Windows. Even if multiple
processes are used on Windows, I believe background file closing
will still provide a benefit, as individual processes will still
be slowed down by the file close bottleneck (assuming the I/O system
isn't saturated).
David Soria Parra - Feb. 22, 2016, 6:01 p.m.
On Sat, Feb 20, 2016 at 03:56:00PM -0800, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1456012449 28800
> #      Sat Feb 20 15:54:09 2016 -0800
> # Node ID ce2715270bdd0e74157e45a782cdc14ecc19de2b
> # Parent  623ae98a2f3a933999ce9143611da0f733c7297d
> merge: perform background file closing in batchget
> 

series looks good to me. however i wont queue it as i haven't reviewed in a while,
so pyd&co needs to do that.
Augie Fackler - Feb. 23, 2016, 7:24 p.m.
On Mon, Feb 22, 2016 at 10:01:27AM -0800, David Soria Parra wrote:
> On Sat, Feb 20, 2016 at 03:56:00PM -0800, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc@gmail.com>
> > # Date 1456012449 28800
> > #      Sat Feb 20 15:54:09 2016 -0800
> > # Node ID ce2715270bdd0e74157e45a782cdc14ecc19de2b
> > # Parent  623ae98a2f3a933999ce9143611da0f733c7297d
> > merge: perform background file closing in batchget
> >
>
> series looks good to me. however i wont queue it as i haven't reviewed in a while,
> so pyd&co needs to do that.

Agreed, this looks good. Queued. Thanks!

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -1053,17 +1053,17 @@  def batchget(repo, mctx, actions):
 
     yields tuples for progress updates
     """
     verbose = repo.ui.verbose
     fctx = mctx.filectx
     wwrite = repo.wwrite
     ui = repo.ui
     i = 0
-    if True:
+    with repo.wvfs.backgroundclosing(ui, expectedcount=len(actions)):
         for f, (flags, backup), msg in actions:
             repo.ui.debug(" %s: %s -> g\n" % (f, msg))
             if verbose:
                 repo.ui.note(_("getting %s\n") % f)
 
             if backup:
                 absf = repo.wjoin(f)
                 orig = scmutil.origpath(ui, repo, absf)
@@ -1072,17 +1072,17 @@  def batchget(repo, mctx, actions):
                     # directory is replaced by a tracked file, or generally
                     # with file/directory merges. This needs to be sorted out.
                     if repo.wvfs.isfileorlink(f):
                         util.rename(absf, orig)
                 except OSError as e:
                     if e.errno != errno.ENOENT:
                         raise
 
-            wwrite(f, fctx(f).data(), flags)
+            wwrite(f, fctx(f).data(), flags, backgroundclose=True)
             if i == 100:
                 yield i, f
                 i = 0
             i += 1
     if i > 0:
         yield i, f
 
 def applyupdates(repo, actions, wctx, mctx, overwrite, labels=None):