Patchwork [01 of 19] revlog: drop duplicated code

Submitter Boris Feld
Date Sept. 8, 2018, 10:56 a.m.
Message ID <bdcfd96d87254a508e85.1536404215@localhost.localdomain>
Permalink /patch/34425/
State Accepted

Comments

Boris Feld - Sept. 8, 2018, 10:56 a.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1536087921 -7200
#      Tue Sep 04 21:05:21 2018 +0200
# Node ID bdcfd96d87254a508e85a6eb502ffe4c57845bc8
# Parent  6268fed317d04dd8f0430468818ea5d0529b6503
# EXP-Topic sparse-snapshot
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r bdcfd96d8725
revlog: drop duplicated code

This code probably got duplicated by a rebase/evolve conflict. We drop the
extra copy; the other copy is right below.

This had no real effect since other logic ensures that we never test the
same revision twice.
Gregory Szorc - Sept. 10, 2018, 4:52 p.m.
On Sat, Sep 8, 2018 at 3:57 AM Boris Feld <boris.feld@octobus.net> wrote:

> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1536087921 -7200
> #      Tue Sep 04 21:05:21 2018 +0200
> # Node ID bdcfd96d87254a508e85a6eb502ffe4c57845bc8
> # Parent  6268fed317d04dd8f0430468818ea5d0529b6503
> # EXP-Topic sparse-snapshot
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r
> bdcfd96d8725
> revlog: drop duplicated code
>

Queued this series.

Thank you for the detailed explanations in the commit messages and the
inline comments. It really helps to grok how all of this works.

I find this work around sparse revlogs and more intelligent delta chains
*very* interesting. And the results obviously speak for themselves.

It's worth noting that when we have alternate storage backends that may not
be based on append-only storage and linear access models, more complex
delta strategies like this make a *lot* more sense and, assuming the
performance impact is reasonable, should obviously be the preferred
mechanism.

I'm also a fan of limiting the max chain length: setting an upper bound on
the chain length starts to set an upper bound on the maximum work that
needs to be performed at read/write time. While not addressed by this
series, another potential area for improvement is setting a bounded upper
limit on the total size of stored deltas: since decompression and patch
application is somewhat linearly proportional to the input data size,
bounding the total size of data that needs to be decoded in order to
resolve a revision fulltext would potentially be a *better* bounding
mechanism than chain length itself. I'll be honest, I'm not sure what the
heuristic would be to determine what that bounded ceiling should be. IMO it
is worth investigating though.
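
To make that concrete, here is a minimal sketch (hypothetical helper, not
revlog code) of bounding a chain by the total amount of stored delta data
rather than by its link count:

def chain_within_read_budget(delta_sizes, max_read_bytes):
    """Return True if the on-disk deltas needed to rebuild a revision
    stay under a fixed read budget.

    delta_sizes: stored size in bytes of each delta in the chain, base
    first.  max_read_bytes: ceiling on the data that may be decompressed
    and patched to resolve one fulltext.
    """
    total = 0
    for size in delta_sizes:
        total += size
        if total > max_read_bytes:
            return False
    return True

# A long chain of tiny deltas can be cheaper to resolve than a short chain
# of huge ones, which is why a size budget may bound read/write work better
# than chain length alone:
assert chain_within_read_budget([10000] * 1000, 32 * 1024 * 1024)
assert not chain_within_read_budget([4000000] * 50, 32 * 1024 * 1024)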

Regarding performance, there are a handful of quadratic algorithms in this
code. We have several calls to deltaparent() both directly and indirectly.
I feel like this will add significant overhead when adding multiple
revisions to large revlogs (>100,000 revisions). What I'm not sure about is
how this overhead compares to the overhead of resolving parent revisions
and computing deltas against them. I expect revision resolution and delta
computation to take most of the CPU. But at the same time, quadratic
algorithms coupled with Python overhead can make things very slow. I look
forward to the follow-up series addressing performance.
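
As a rough illustration of the quadratic concern (assumed shapes only, not
the actual revlog code): walking deltaparent() to compute a chain length is
O(chain length) per call, so doing it for every newly appended revision is
quadratic overall, while memoizing the walk keeps it near linear:

NULLREV = -1

def chainlen_naive(deltaparent, rev):
    # O(chain length) per call; calling this for each of n appended
    # revisions on a long linear chain costs O(n**2) overall.
    length = 0
    while rev != NULLREV:
        length += 1
        rev = deltaparent(rev)
    return length

def chainlen_cached(deltaparent, rev, cache):
    # Memoized, iterative variant: each revision's chain length is
    # computed once, so n insertions cost O(n) amortized.
    stack = []
    while rev != NULLREV and rev not in cache:
        stack.append(rev)
        rev = deltaparent(rev)
    length = 0 if rev == NULLREV else cache[rev]
    while stack:
        length += 1
        cache[stack.pop()] = length
    return length

# Toy usage with a linear chain 0 <- 1 <- ... <- 9:
parent = dict((r, r - 1) for r in range(10))
parent[0] = NULLREV
assert chainlen_naive(parent.get, 9) == 10
assert chainlen_cached(parent.get, 9, {}) == 10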


>
> This code probably got duplicated by a rebase/evolve conflict. We drop the
> extra copy, the other copy is right below.
>
> This had no real effects since other logic ensure that we never test the
> same
> revision twice.
>
> diff --git a/mercurial/revlogutils/deltas.py
> b/mercurial/revlogutils/deltas.py
> --- a/mercurial/revlogutils/deltas.py
> +++ b/mercurial/revlogutils/deltas.py
> @@ -621,19 +621,6 @@ def _rawgroups(revlog, p1, p2, cachedelt
>      curr = len(revlog)
>      prev = curr - 1
>
> -    # should we try to build a delta?
> -    if prev != nullrev and revlog._storedeltachains:
> -        tested = set()
> -        # This condition is true most of the time when processing
> -        # changegroup data into a generaldelta repo. The only time it
> -        # isn't true is if this is the first revision in a delta chain
> -        # or if ``format.generaldelta=true`` disabled ``lazydeltabase``.
> -        if cachedelta and gdelta and revlog._lazydeltabase:
> -            # Assume what we received from the server is a good choice
> -            # build delta will reuse the cache
> -            yield (cachedelta[0],)
> -            tested.add(cachedelta[0])
> -
>      # This condition is true most of the time when processing
>      # changegroup data into a generaldelta repo. The only time it
>      # isn't true is if this is the first revision in a delta chain
>
Boris Feld - Sept. 13, 2018, 2:20 p.m.
On 10/09/2018 18:52, Gregory Szorc wrote:
> On Sat, Sep 8, 2018 at 3:57 AM Boris Feld <boris.feld@octobus.net> wrote:
>
>     # HG changeset patch
>     # User Boris Feld <boris.feld@octobus.net>
>     # Date 1536087921 -7200
>     #      Tue Sep 04 21:05:21 2018 +0200
>     # Node ID bdcfd96d87254a508e85a6eb502ffe4c57845bc8
>     # Parent  6268fed317d04dd8f0430468818ea5d0529b6503
>     # EXP-Topic sparse-snapshot
>     # Available At https://bitbucket.org/octobus/mercurial-devel/
>     #              hg pull
>     https://bitbucket.org/octobus/mercurial-devel/ -r bdcfd96d8725
>     revlog: drop duplicated code
>
>
> Queued this series.
>
> Thank you for the detailed explanations in the commit messages and the 
> inline comments. It really helps to grok how all of this works.
>
> I find this work around sparse revlogs and more intelligent delta 
> chains *very* interesting. And the results obviously speak for themselves.

So… we discovered an embarrassing mistake introduced in 37957e07138c. 
Two lines got swapped and the delta algorithm was silently dropping its 
investigation when an empty delta was encountered. This produced 
significantly worse revlogs.

All the numbers in the last commit of this series were affected.
This includes the numbers for the new strategy, though to a lesser 
extent, as it was able to work around the issue reasonably well.

We just gathered new numbers, and the intermediate-snapshot strategy 
still lives up to its promise. In particular, the new approach resists 
delta chain constraints well and remains a choice worth considering in 
all cases. However, it is no longer a pure win on all fronts, so we might 
have to decide on a trade-off between the speed boost from constrained 
delta chains and repository size (maybe a slightly longer chain; we'll 
have to gather performance numbers).

Full detailed numbers are at the end of this email.
>
> It's worth noting that when we have alternate storage backends that 
> may not be based on append-only storage and linear access models, more 
> complex delta strategies like this make a *lot* more sense and 
> assuming the performance impact is reasonable, should obviously be the 
> preferred mechanism.
>
> I'm also a fan of limiting the max chain length: setting an upper 
> bound on the chain length starts to set an upper bound on the maximum 
> work that needs to be performed at read/write time. While not 
> addressed by this series, another potential area for improvement is 
> setting a bounded upper limit on the total size of stored deltas: 
> since decompression and patch application is somewhat linearly 
> proportional to the input data size, bounding the total size of data 
> that needs to be decoded in order to resolve a revision fulltext would 
> potentially be a *better* bounding mechanism than chain length itself. 
> I'll be honest, I'm not sure what the heuristic would be to determine 
> what that bounded ceiling should be. IMO it is worth investigating though.
I'm not sure what you mean here. Bounding the size of the stored deltas 
needed to restore a revision is something we already do; it was the first 
constraint placed on delta chains when revlog was added. Do you mean 
something else?
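
Roughly, the existing constraint is that a delta is only kept if the data
that must be read to rebuild the revision stays within a fixed multiple of
the revision's fulltext size; a simplified sketch (illustrative names, not
the actual revlog check):

def delta_is_acceptable(chain_read_size, new_delta_size, fulltext_size,
                        max_ratio=2):
    # chain_read_size: bytes already needed to rebuild the prospective
    # base; new_delta_size: stored size of the candidate delta;
    # fulltext_size: size of the revision being added.  If the total read
    # cost exceeds max_ratio times the fulltext, a full snapshot is stored
    # instead of chaining another delta.
    return chain_read_size + new_delta_size <= max_ratio * fulltext_size

# e.g. a 10 KB revision whose chain would already require reading 25 KB
# gets a fresh snapshot rather than one more delta:
assert not delta_is_acceptable(25000, 500, 10000)
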
>
> Regarding performance, there are a handful of quadratic algorithms in 
> this code. We have several calls to deltaparent() both directly and 
> indirectly. I feel like this will add significant overhead when adding 
> multiple revisions to large revlogs (>100,000 revisions). What I'm not 
> sure about is how this overhead compares to the overhead of resolving 
> parent revisions and computing deltas against them. I expect revision 
> resolution and delta computation to take most of the CPU. But at the 
> same time, quadratic algorithms coupled with Python overhead can make 
> things very slow. I look forward to the follow-up series addressing 
> performance.

For the full conversion, we seem to have a CPU increase that varies but 
stays roughly under 50%; it needs better analysis. An important point is 
that the search for an appropriate intermediate snapshot base only 
happens when the parents were unsuitable bases. Since the main trigger 
for a new base is now the delta chain length, this happens for about 1 in 
1000 revisions. In addition, the large read performance boost we get from 
the delta chain limit has a large impact on the time spent restoring 
candidate bases for diffing, speeding up parts of the process.
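
A hedged sketch of the search order just described (hypothetical names, not
the real _rawgroups generator): cheap candidates are tried first, and the
intermediate-snapshot search only runs when none of them yields an
acceptable delta, which is why its extra cost is only paid for roughly one
revision in a thousand.

def candidate_groups(cached_base, parents, snapshot_candidates):
    # Groups are yielded from cheapest to most expensive; the caller stops
    # as soon as one group produces an acceptable delta, so the later
    # groups are usually never evaluated.
    if cached_base is not None:
        yield (cached_base,)              # delta received over the wire
    yield tuple(parents)                  # p1/p2, the common case
    yield tuple(snapshot_candidates)      # intermediate snapshots, rare

def first_acceptable(groups, is_good):
    for group in groups:
        good = [rev for rev in group if is_good(rev)]
        if good:
            return good[0]
    return None

# Toy usage: pretend only revision 7 yields an acceptable delta.
base = first_acceptable(candidate_groups(None, [5, 6], [7, 3]),
                        lambda rev: rev == 7)
assert base == 7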


Full statistics of the conversions after fixing the bug from 37957e07138c:

mercurial

     Manifest Size:
         limit       | none        |        1000
         ------------|-------------|------------
         no-sparse   |   6 143 044 |  6 269 496
         no-snapshot |   6 078 442 |  6 174 110
         snapshot    |   5 798 796 |  5 827 025

     Manifest Chain length data
         limit       ||       none        ||       1000
         value       || average |   max   || average |   max
         ------------||---------|---------||---------|---------
         no-sparse   ||     429 |   1 397 ||     397 |    1 000
         no-snapshot ||     430 |   1 397 ||     398 |    1 000
         snapshot    ||     326 |   1 290 ||     313 |    1 000

     Full Store Size
         limit       | none        |        1000
         ------------|-------------|------------
         no-sparse   |  46 944 775 | 47 166 129
         no-snapshot |  46 881 164 | 47 071 734
         snapshot    |  46 622 445 | 46 723 774

pypy

     Manifest Size:
         limit       | none        |        1000
         ------------|-------------|------------
         no-sparse   |  52 941 760 |  56 200 970
         no-snapshot |  28 934 814 |  34 410 548
         snapshot    |  26 348 229 |  27 384 133

     Manifest Chain length data
         limit       ||       none        ||       1000
         value       || average |   max   || average |   max
         ------------||---------|---------||---------|---------
         no-sparse   ||     769 |   3 889 ||     390 |    1 000
         no-snapshot ||   1 379 |   3 889 ||     501 |    1 000
         snapshot    ||   1 223 |   3 846 ||     495 |    1 000

     Full Store Size
         limit       | none        |        1000
         ------------|-------------|------------
         no-sparse   | 336 050 203 | 339 309 413
         no-snapshot | 312 193 772 | 317 669 506
         snapshot    | 338 673 985 | 339 709 889

Mozilla

     Manifest Size:
         limit       | none           |        1000
         ------------|----------------|---------------
         no-sparse   |    215 096 339 |  1 708 853 525
         no-snapshot |    199 492 686 |  1 590 880 707
         snapshot    |    188 947 271 |    278 894 170

     Manifest Chain length data
         limit       ||       none        ||       1000
         value       || average |   max   || average |   max
         ------------||---------|---------||---------|---------
         no-sparse   ||  20 454 |  59 562 ||     491 |   1 000
         no-snapshot ||  22 422 |  63 137 ||     489 |   1 000
         snapshot    ||  23 509 |  69 891 ||     489 |   1 000

     Full Store Size
         limit       | none           |        1000
         ------------|----------------|---------------
         no-sparse   |  2 377 578 715 |  3 876 258 798
         no-snapshot |  2 364 043 550 |  3 760 354 468
         snapshot    |  2 441 677 137 |  2 535 997 381

Netbeans

     Manifest Size:
         limit       | none           |        1000
         ------------|----------------|---------------
         no-sparse   |    130 088 982 |    741 590 565
         no-snapshot |    119 337 636 |    695 443 777
         snapshot    |    118 836 887 |    159 161 207

     Manifest Chain length data
         limit       ||       none        ||       1000
         value       || average |   max   || average |   max
         ------------||---------|---------||---------|---------
         no-sparse   ||  19 321 |  61 397 ||     510 |    1 000
         no-snapshot ||  21 572 |  61 397 ||     510 |    1 000
         snapshot    ||  21 240 |  61 583 ||     503 |    1 000

     Full Store Size
         limit       | none           |        1000
         ------------|----------------|---------------
         no-sparse   |  1 160 013 008 |  1 771 514 591
         no-snapshot |  1 149 360 447 |  1 725 466 588
         snapshot    |  1 164 959 988 |  1 205 284 308

Private repo #1

     Manifest Size:
         limit       | none            |        1000
         ------------|-----------------|---------------
         no-sparse   | 33 725 285 081  | 33 724 834 190
         no-snapshot |    390 302 545  |  1 891 428 641
         snapshot    |    350 542 420  |    423 470 579

     Manifest Chain length data
         limit       ||       none        ||       1000
         value       || average |   max   || average |   max
         ------------||---------|---------||---------|---------
         no-sparse   ||         |         ||         |    1 000
         no-snapshot ||         |         ||         |    1 000
         snapshot    ||         |         ||         |    1 000

     Full Store Size
         limit       | none           |        1000
         ------------|----------------|---------------
         no-sparse   | 41 544 149 652 | 41 543 698 761
         no-snapshot |  8 152 000 440 |  9 653 126 536
         snapshot    |  8 448 037 300 |  8 520 965 459
Boris Feld - Sept. 13, 2018, 3:49 p.m.
Some numbers were missing near the end; here they are:


On 13/09/2018 16:20, Boris FELD wrote:
> [...]
>
> Private repo #1
>
>     Manifest Size:
>         limit       | none            |        1000
>         ------------|-----------------|---------------
>         no-sparse   | 33 725 285 081  | 33 724 834 190
>         no-snapshot |    390 302 545  |  1 891 428 641
>         snapshot    |    350 542 420  |    423 470 579
>
      Manifest Chain length data
          limit       ||       none        ||       1000
          value       || average |   max   || average |   max
          ------------||---------|---------||---------|---------
          no-sparse   ||     282 |   8 885 ||     113 |    1 000
          no-snapshot ||   3 475 |   8 885 ||     533 |    1 000
          snapshot    ||   3 655 |   8 951 ||     530 |    1 000
>
>     Full Store Size
>         limit       | none           |        1000
>         ------------|----------------|---------------
>         no-sparse   | 41 544 149 652 | 41 543 698 761
>         no-snapshot |  8 152 000 440 |  9 653 126 536
>         snapshot    |  8 448 037 300 |  8 520 965 459
>
>

Patch

diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
--- a/mercurial/revlogutils/deltas.py
+++ b/mercurial/revlogutils/deltas.py
@@ -621,19 +621,6 @@  def _rawgroups(revlog, p1, p2, cachedelt
     curr = len(revlog)
     prev = curr - 1
 
-    # should we try to build a delta?
-    if prev != nullrev and revlog._storedeltachains:
-        tested = set()
-        # This condition is true most of the time when processing
-        # changegroup data into a generaldelta repo. The only time it
-        # isn't true is if this is the first revision in a delta chain
-        # or if ``format.generaldelta=true`` disabled ``lazydeltabase``.
-        if cachedelta and gdelta and revlog._lazydeltabase:
-            # Assume what we received from the server is a good choice
-            # build delta will reuse the cache
-            yield (cachedelta[0],)
-            tested.add(cachedelta[0])
-
     # This condition is true most of the time when processing
     # changegroup data into a generaldelta repo. The only time it
     # isn't true is if this is the first revision in a delta chain