Patchwork [14,of,14] sparse-revlog: put the native implementation of slicechunktodensity to use

Submitter Boris Feld
Date Nov. 12, 2018, 9:55 a.m.
Message ID <2d6f7e64249ddfce01ad.1542016549@localhost.localdomain>
Permalink /patch/36522/
State Accepted

Comments

Boris Feld - Nov. 12, 2018, 9:55 a.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1541980065 -3600
#      Mon Nov 12 00:47:45 2018 +0100
# Node ID 2d6f7e64249ddfce01ad5bea9b7ae409c752801f
# Parent  0d337528d627f35f8337fc68ea18245db0a608e1
# EXP-Topic sparse-perf
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 2d6f7e64249d
sparse-revlog: put the native implementation of slicechunktodensity to use

When possible, the C implementation of delta chain slicing will be used,
providing a large boost in performance for this operation.

As a practical example, consider restoring manifest revision '59547c40bc4c' in
a reference NetBeans repository (using sparse-revlog). The median time of the
`slice-sparse-chain` step of `perfrevlogrevision` improves from 0.660 ms to
0.098 ms.

The full series moves delta chain slicing from 1.120 ms to 0.098 ms.

Implementing _slicechunktosize in C would yield further improvements.
However, the performance seems good enough for now.
Augie Fackler - Nov. 12, 2018, 7:15 p.m.
> On Nov 12, 2018, at 04:55, Boris Feld <boris.feld@octobus.net> wrote:
> 
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1541980065 -3600
> #      Mon Nov 12 00:47:45 2018 +0100
> # Node ID 2d6f7e64249ddfce01ad5bea9b7ae409c752801f
> # Parent  0d337528d627f35f8337fc68ea18245db0a608e1
> # EXP-Topic sparse-perf
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 2d6f7e64249d
> sparse-revlog: put the native implementation of slicechunktodensity to use

I didn't review closely, but if we're expanding the functionality of revlog.c it might make sense to try and fuzz it. parsers.so is already fuzzed for the manifest fuzzer, so in theory that could be a guide for how to fuzz other parts of parsers.so.

(Not a requirement, but definitely something worth exploring.)

> 
> When possible, the C implementation of delta chain slicing will be used,
> providing a large boost in performance for this operation.
> 
> As a practical example, consider restoring manifest revision '59547c40bc4c' in
> a reference NetBeans repository (using sparse-revlog). The median time of the
> `slice-sparse-chain` step of `perfrevlogrevision` improves from 0.660 ms to
> 0.098 ms.
> 
> The full series moves delta chain slicing from 1.120 ms to 0.098 ms.
> 
> Implementing _slicechunktosize in C would yield further improvements.
> However, the performance seems good enough for now.
> 
> diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
> --- a/mercurial/revlogutils/deltas.py
> +++ b/mercurial/revlogutils/deltas.py
> @@ -115,9 +115,12 @@ def slicechunk(revlog, revs, targetsize=
>         targetsize = max(targetsize, revlog._srmingapsize)
>     # targetsize should not be specified when evaluating delta candidates:
>     # * targetsize is used to ensure we stay within specification when reading,
> -    for chunk in _slicechunktodensity(revlog, revs,
> -                                      revlog._srdensitythreshold,
> -                                      revlog._srmingapsize):
> +    densityslicing = getattr(revlog.index, 'slicechunktodensity', None)
> +    if densityslicing is None:
> +        densityslicing = lambda x, y, z: _slicechunktodensity(revlog, x, y, z)
> +    for chunk in densityslicing(revs,
> +                                revlog._srdensitythreshold,
> +                                revlog._srmingapsize):
>         for subchunk in _slicechunktosize(revlog, chunk, targetsize):
>             yield subchunk
> 

Patch

diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
--- a/mercurial/revlogutils/deltas.py
+++ b/mercurial/revlogutils/deltas.py
@@ -115,9 +115,12 @@  def slicechunk(revlog, revs, targetsize=
         targetsize = max(targetsize, revlog._srmingapsize)
     # targetsize should not be specified when evaluating delta candidates:
     # * targetsize is used to ensure we stay within specification when reading,
-    for chunk in _slicechunktodensity(revlog, revs,
-                                      revlog._srdensitythreshold,
-                                      revlog._srmingapsize):
+    densityslicing = getattr(revlog.index, 'slicechunktodensity', None)
+    if densityslicing is None:
+        densityslicing = lambda x, y, z: _slicechunktodensity(revlog, x, y, z)
+    for chunk in densityslicing(revs,
+                                revlog._srdensitythreshold,
+                                revlog._srmingapsize):
         for subchunk in _slicechunktosize(revlog, chunk, targetsize):
             yield subchunk
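
The dispatch pattern used in this hunk can be illustrated in isolation. The
sketch below is not Mercurial code: `PureIndex`, `NativeIndex`, `pick_slicer`
and `_slice_pure` are hypothetical stand-ins, and the "slicing" they perform is
deliberately trivial so that the two code paths are easy to tell apart. It only
shows the idea of probing an object for an optional fast method with `getattr`
and falling back to a pure-Python function that needs one extra argument. A
nested `def` is used here where the patch uses a `lambda`; the behavior is the
same.

```python
def _slice_pure(owner, revs, density, mingap):
    """Stand-in for a pure-Python slicer: yield all revisions as one chunk."""
    yield list(revs)


class PureIndex(object):
    """Hypothetical index that offers no native slicing method."""


class NativeIndex(object):
    """Hypothetical index that exposes a native-style slicing method."""

    def slicechunktodensity(self, revs, density, mingap):
        # One chunk per revision, purely to distinguish this path from the
        # fallback in the usage example.
        return [[r] for r in revs]


def pick_slicer(owner, index):
    """Return the index's own slicer if present, else the pure fallback."""
    slicer = getattr(index, 'slicechunktodensity', None)
    if slicer is None:
        # The fallback takes an extra leading argument, so wrap it to give
        # both paths the same three-argument call signature.
        def slicer(revs, density, mingap):
            return _slice_pure(owner, revs, density, mingap)
    return slicer


if __name__ == '__main__':
    for index in (PureIndex(), NativeIndex()):
        slicer = pick_slicer(None, index)
        print(type(index).__name__, list(slicer([1, 2, 3], 0.5, 10)))
```

Because both paths end up with the same call signature, the loop that consumes
the chunks does not need to know which implementation was selected.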