Patchwork [2,of,6,V3] hidden: use _domainancestors to compute revs revealed by dynamic blocker

login
register
mail settings
Submitter Pierre-Yves David
Date May 26, 2017, 6:34 p.m.
Message ID <c95d2c1ff6128ab5580d.1495823686@nodosa.octopoid.net>
Download mbox | patch
Permalink /patch/20932/
State Accepted
Headers show

Comments

Pierre-Yves David - May 26, 2017, 6:34 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1495373721 -7200
#      Sun May 21 15:35:21 2017 +0200
# Node ID c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
# Parent  95b7279b3b2ec46607dbc22ebe4e8a27381c5ace
# EXP-Topic fast-compute-hidden
# Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
#              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r c95d2c1ff612
hidden: use _domainancestors to compute revs revealed by dynamic blocker

The complexity of computing the revealed changesets is now 'O(revealed)'.
This massively speeds up the computation on large repository. Moving it to the
millisecond range.

Below are timing from two Mozilla repositories with different contents:

1) mozilla repository with:
 * 400667 changesets
 * 35 hidden changesets (first rev-268334)
 * 288 visible drafts
 * obsolete working copy (dynamicblockers),

Before:
! visible
! wall 0.030247 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)

After:
! visible
! wall 0.000585 comb 0.000000 user 0.000000 sys 0.000000 (best of 4221)

The timing above include the computation of obsolete changeset:
! obsolete
! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)

So adjusted time give 30ms before versus 0.2ms after. A 150x speedup.

2) mozilla repository with:
 * 405645 changesets
 * 4312 hidden changesets (first rev-326004)
 * 264 visible drafts
 * obsolete working copy (dynamicblockers),

Before:
! visible
! wall 0.168658 comb 0.170000 user 0.170000 sys 0.000000 (best of 48)

After
! visible
! wall 0.008612 comb 0.010000 user 0.010000 sys 0.000000 (best of 325)

The timing above include the computation of obsolete changeset:
! obsolete
! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)

So adjusted time give 160ms before versus 2ms after. A 75x speedup.
via Mercurial-devel - May 26, 2017, 8:43 p.m.
+mercurial-devel again (I clicked just "Reply" by mistake yet again)

On Fri, May 26, 2017 at 1:19 PM, Martin von Zweigbergk
<martinvonz@google.com> wrote:
> On Fri, May 26, 2017 at 11:34 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>> # Date 1495373721 -7200
>> #      Sun May 21 15:35:21 2017 +0200
>> # Node ID c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
>> # Parent  95b7279b3b2ec46607dbc22ebe4e8a27381c5ace
>> # EXP-Topic fast-compute-hidden
>> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r c95d2c1ff612
>> hidden: use _domainancestors to compute revs revealed by dynamic blocker
>>
>> The complexity of computing the revealed changesets is now 'O(revealed)'.
>> This massively speeds up the computation on large repository. Moving it to the
>> millisecond range.
>>
>> Below are timing from two Mozilla repositories with different contents:
>>
>> 1) mozilla repository with:
>>  * 400667 changesets
>>  * 35 hidden changesets (first rev-268334)
>>  * 288 visible drafts
>>  * obsolete working copy (dynamicblockers),
>>
>> Before:
>> ! visible
>> ! wall 0.030247 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
>>
>> After:
>> ! visible
>> ! wall 0.000585 comb 0.000000 user 0.000000 sys 0.000000 (best of 4221)
>>
>> The timing above include the computation of obsolete changeset:
>> ! obsolete
>> ! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)
>>
>> So adjusted time give 30ms before versus 0.2ms after. A 150x speedup.
>>
>> 2) mozilla repository with:
>>  * 405645 changesets
>>  * 4312 hidden changesets (first rev-326004)
>>  * 264 visible drafts
>>  * obsolete working copy (dynamicblockers),
>>
>> Before:
>> ! visible
>> ! wall 0.168658 comb 0.170000 user 0.170000 sys 0.000000 (best of 48)
>>
>> After
>> ! visible
>> ! wall 0.008612 comb 0.010000 user 0.010000 sys 0.000000 (best of 325)
>>
>> The timing above include the computation of obsolete changeset:
>> ! obsolete
>> ! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)
>>
>> So adjusted time give 160ms before versus 2ms after. A 75x speedup.
>
> I've said before that I'm more interested in effect on real-life
> commands (which is not to say that everyone has to agree and share
> timings for real-life commands). 160ms is a pretty good win, but not
> so relevant if the command still takes 10s. Roughly how long does a
> "hg log -r ." (?) take with the setup above?
>
>>
>> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
>> --- a/mercurial/repoview.py
>> +++ b/mercurial/repoview.py
>> @@ -224,8 +224,10 @@ def computehidden(repo):
>>          # changesets and remove those.
>>          dynamic = hidden & revealedrevs(repo)
>>          if dynamic:
>> -            blocked = cl.ancestors(dynamic, inclusive=True)
>> -            hidden = frozenset(r for r in hidden if r not in blocked)
>> +            pfunc = cl.parentrevs
>> +            mutablephases = (phases.draft, phases.secret)
>> +            mutable = repo._phasecache.getrevset(repo, mutablephases)
>> +            hidden = hidden - _domainancestors(pfunc, dynamic, mutable)
>
> It looks like we're adding more code that assumes public commits can
> not be hidden. That seems fine to me, but I wanted to mention it to
> see if anyone else has objections.
>
>>      return hidden
>>
>>  def computeunserved(repo):
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - May 27, 2017, 12:09 a.m.
On 05/26/2017 10:43 PM, Martin von Zweigbergk wrote:
> +mercurial-devel again (I clicked just "Reply" by mistake yet again)
>
> On Fri, May 26, 2017 at 1:19 PM, Martin von Zweigbergk
> <martinvonz@google.com> wrote:
>> On Fri, May 26, 2017 at 11:34 AM, Pierre-Yves David
>> <pierre-yves.david@ens-lyon.org> wrote:
>>> # HG changeset patch
>>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>>> # Date 1495373721 -7200
>>> #      Sun May 21 15:35:21 2017 +0200
>>> # Node ID c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
>>> # Parent  95b7279b3b2ec46607dbc22ebe4e8a27381c5ace
>>> # EXP-Topic fast-compute-hidden
>>> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>>> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r c95d2c1ff612
>>> hidden: use _domainancestors to compute revs revealed by dynamic blocker
>>>
>>> The complexity of computing the revealed changesets is now 'O(revealed)'.
>>> This massively speeds up the computation on large repository. Moving it to the
>>> millisecond range.
>>>
>>> Below are timing from two Mozilla repositories with different contents:
>>>
>>> 1) mozilla repository with:
>>>  * 400667 changesets
>>>  * 35 hidden changesets (first rev-268334)
>>>  * 288 visible drafts
>>>  * obsolete working copy (dynamicblockers),
>>>
>>> Before:
>>> ! visible
>>> ! wall 0.030247 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
>>>
>>> After:
>>> ! visible
>>> ! wall 0.000585 comb 0.000000 user 0.000000 sys 0.000000 (best of 4221)
>>>
>>> The timing above include the computation of obsolete changeset:
>>> ! obsolete
>>> ! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)
>>>
>>> So adjusted time give 30ms before versus 0.2ms after. A 150x speedup.
>>>
>>> 2) mozilla repository with:
>>>  * 405645 changesets
>>>  * 4312 hidden changesets (first rev-326004)
>>>  * 264 visible drafts
>>>  * obsolete working copy (dynamicblockers),
>>>
>>> Before:
>>> ! visible
>>> ! wall 0.168658 comb 0.170000 user 0.170000 sys 0.000000 (best of 48)
>>>
>>> After
>>> ! visible
>>> ! wall 0.008612 comb 0.010000 user 0.010000 sys 0.000000 (best of 325)
>>>
>>> The timing above include the computation of obsolete changeset:
>>> ! obsolete
>>> ! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)
>>>
>>> So adjusted time give 160ms before versus 2ms after. A 75x speedup.
>>
>> I've said before that I'm more interested in effect on real-life
>> commands (which is not to say that everyone has to agree and share
>> timings for real-life commands). 160ms is a pretty good win, but not
>> so relevant if the command still takes 10s. Roughly how long does a
>> "hg log -r ." (?) take with the setup above?

The hidden code is run for every single command since it is on the path 
to retrieve a "visible" filtered changelog. Any win here is a win for 
all commands. (and the win is is asymptotic).

Example timing below.

Setup (1)
   hg id -r .
     before: 0.205 total
     after:  0.165 total

     full series combined with obscache: 0.103 total

Setup (2)
   hg id -r .
     before: 0.328 total
     after:  0.192 total

     full series combined with obscache: 0.098 total

This works in the same area as my obscache series: 
https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/098136.html

>>> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
>>> --- a/mercurial/repoview.py
>>> +++ b/mercurial/repoview.py
>>> @@ -224,8 +224,10 @@ def computehidden(repo):
>>>          # changesets and remove those.
>>>          dynamic = hidden & revealedrevs(repo)
>>>          if dynamic:
>>> -            blocked = cl.ancestors(dynamic, inclusive=True)
>>> -            hidden = frozenset(r for r in hidden if r not in blocked)
>>> +            pfunc = cl.parentrevs
>>> +            mutablephases = (phases.draft, phases.secret)
>>> +            mutable = repo._phasecache.getrevset(repo, mutablephases)
>>> +            hidden = hidden - _domainancestors(pfunc, dynamic, mutable)
>>
>> It looks like we're adding more code that assumes public commits can
>> not be hidden. That seems fine to me, but I wanted to mention it to
>> see if anyone else has objections.

The old code already assume this[1]. Before this series we have two 
functions (one assuming non-public) and one on-disk caching layer. After 
this series we have one (simpler) function (assuming non-public). So we 
do not adds new assumption.

(note: the new code have simpler code that would make it simpler to 
change that assumption (by extending the "domain" set))

[1] https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/repoview.py#l68

Cheers,
via Mercurial-devel - May 27, 2017, 6:40 a.m.
On Fri, May 26, 2017 at 5:09 PM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 05/26/2017 10:43 PM, Martin von Zweigbergk wrote:
>>
>> +mercurial-devel again (I clicked just "Reply" by mistake yet again)
>>
>> On Fri, May 26, 2017 at 1:19 PM, Martin von Zweigbergk
>> <martinvonz@google.com> wrote:
>>>
>>> On Fri, May 26, 2017 at 11:34 AM, Pierre-Yves David
>>> <pierre-yves.david@ens-lyon.org> wrote:
>>>>
>>>> # HG changeset patch
>>>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>>>> # Date 1495373721 -7200
>>>> #      Sun May 21 15:35:21 2017 +0200
>>>> # Node ID c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
>>>> # Parent  95b7279b3b2ec46607dbc22ebe4e8a27381c5ace
>>>> # EXP-Topic fast-compute-hidden
>>>> # Available At
>>>> https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>>>> #              hg pull
>>>> https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r c95d2c1ff612
>>>> hidden: use _domainancestors to compute revs revealed by dynamic blocker
>>>>
>>>> The complexity of computing the revealed changesets is now
>>>> 'O(revealed)'.
>>>> This massively speeds up the computation on large repository. Moving it
>>>> to the
>>>> millisecond range.
>>>>
>>>> Below are timing from two Mozilla repositories with different contents:
>>>>
>>>> 1) mozilla repository with:
>>>>  * 400667 changesets
>>>>  * 35 hidden changesets (first rev-268334)
>>>>  * 288 visible drafts
>>>>  * obsolete working copy (dynamicblockers),
>>>>
>>>> Before:
>>>> ! visible
>>>> ! wall 0.030247 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
>>>>
>>>> After:
>>>> ! visible
>>>> ! wall 0.000585 comb 0.000000 user 0.000000 sys 0.000000 (best of 4221)
>>>>
>>>> The timing above include the computation of obsolete changeset:
>>>> ! obsolete
>>>> ! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)
>>>>
>>>> So adjusted time give 30ms before versus 0.2ms after. A 150x speedup.
>>>>
>>>> 2) mozilla repository with:
>>>>  * 405645 changesets
>>>>  * 4312 hidden changesets (first rev-326004)
>>>>  * 264 visible drafts
>>>>  * obsolete working copy (dynamicblockers),
>>>>
>>>> Before:
>>>> ! visible
>>>> ! wall 0.168658 comb 0.170000 user 0.170000 sys 0.000000 (best of 48)
>>>>
>>>> After
>>>> ! visible
>>>> ! wall 0.008612 comb 0.010000 user 0.010000 sys 0.000000 (best of 325)
>>>>
>>>> The timing above include the computation of obsolete changeset:
>>>> ! obsolete
>>>> ! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)
>>>>
>>>> So adjusted time give 160ms before versus 2ms after. A 75x speedup.
>>>
>>>
>>> I've said before that I'm more interested in effect on real-life
>>> commands (which is not to say that everyone has to agree and share
>>> timings for real-life commands). 160ms is a pretty good win, but not
>>> so relevant if the command still takes 10s. Roughly how long does a
>>> "hg log -r ." (?) take with the setup above?
>
>
> The hidden code is run for every single command since it is on the path to
> retrieve a "visible" filtered changelog. Any win here is a win for all
> commands. (and the win is is asymptotic).
>
> Example timing below.
>
> Setup (1)
>   hg id -r .
>     before: 0.205 total
>     after:  0.165 total
>
>     full series combined with obscache: 0.103 total
>
> Setup (2)
>   hg id -r .
>     before: 0.328 total
>     after:  0.192 total
>
>     full series combined with obscache: 0.098 total

That's pretty good. I wasn't able to measure any difference at all on
my hg core repo, but I guess the effect is stronger when history is
longer (as it is in the Mozilla repo). I think any speedup is drown by
obsstore load time in the hg core repo (53k revisions and 142k
obsmarkers).

>
> This works in the same area as my obscache series:
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/098136.html
>
>>>> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
>>>> --- a/mercurial/repoview.py
>>>> +++ b/mercurial/repoview.py
>>>> @@ -224,8 +224,10 @@ def computehidden(repo):
>>>>          # changesets and remove those.
>>>>          dynamic = hidden & revealedrevs(repo)
>>>>          if dynamic:
>>>> -            blocked = cl.ancestors(dynamic, inclusive=True)
>>>> -            hidden = frozenset(r for r in hidden if r not in blocked)
>>>> +            pfunc = cl.parentrevs
>>>> +            mutablephases = (phases.draft, phases.secret)
>>>> +            mutable = repo._phasecache.getrevset(repo, mutablephases)
>>>> +            hidden = hidden - _domainancestors(pfunc, dynamic, mutable)
>>>
>>>
>>> It looks like we're adding more code that assumes public commits can
>>> not be hidden. That seems fine to me, but I wanted to mention it to
>>> see if anyone else has objections.
>
>
> The old code already assume this[1]. Before this series we have two
> functions (one assuming non-public) and one on-disk caching layer. After
> this series we have one (simpler) function (assuming non-public). So we do
> not adds new assumption.
>
> (note: the new code have simpler code that would make it simpler to change
> that assumption (by extending the "domain" set))
>
> [1] https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/repoview.py#l68
>
> Cheers,
>
> --
> Pierre-Yves David
Pierre-Yves David - May 27, 2017, 11:17 a.m.
On 05/27/2017 08:40 AM, Martin von Zweigbergk wrote:
> On Fri, May 26, 2017 at 5:09 PM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>>
>>
>> On 05/26/2017 10:43 PM, Martin von Zweigbergk wrote:
>>>
>>> +mercurial-devel again (I clicked just "Reply" by mistake yet again)
>>>
>>> On Fri, May 26, 2017 at 1:19 PM, Martin von Zweigbergk
>>> <martinvonz@google.com> wrote:
>>>>
>>>> On Fri, May 26, 2017 at 11:34 AM, Pierre-Yves David
>>>> <pierre-yves.david@ens-lyon.org> wrote:
>>>>>
>>>>> # HG changeset patch
>>>>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>>>>> # Date 1495373721 -7200
>>>>> #      Sun May 21 15:35:21 2017 +0200
>>>>> # Node ID c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
>>>>> # Parent  95b7279b3b2ec46607dbc22ebe4e8a27381c5ace
>>>>> # EXP-Topic fast-compute-hidden
>>>>> # Available At
>>>>> https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>>>>> #              hg pull
>>>>> https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r c95d2c1ff612
>>>>> hidden: use _domainancestors to compute revs revealed by dynamic blocker
>>>>>
>>>>> The complexity of computing the revealed changesets is now
>>>>> 'O(revealed)'.
>>>>> This massively speeds up the computation on large repository. Moving it
>>>>> to the
>>>>> millisecond range.
>>>>>
>>>>> Below are timing from two Mozilla repositories with different contents:
>>>>>
>>>>> 1) mozilla repository with:
>>>>>  * 400667 changesets
>>>>>  * 35 hidden changesets (first rev-268334)
>>>>>  * 288 visible drafts
>>>>>  * obsolete working copy (dynamicblockers),
>>>>>
>>>>> Before:
>>>>> ! visible
>>>>> ! wall 0.030247 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
>>>>>
>>>>> After:
>>>>> ! visible
>>>>> ! wall 0.000585 comb 0.000000 user 0.000000 sys 0.000000 (best of 4221)
>>>>>
>>>>> The timing above include the computation of obsolete changeset:
>>>>> ! obsolete
>>>>> ! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)
>>>>>
>>>>> So adjusted time give 30ms before versus 0.2ms after. A 150x speedup.
>>>>>
>>>>> 2) mozilla repository with:
>>>>>  * 405645 changesets
>>>>>  * 4312 hidden changesets (first rev-326004)
>>>>>  * 264 visible drafts
>>>>>  * obsolete working copy (dynamicblockers),
>>>>>
>>>>> Before:
>>>>> ! visible
>>>>> ! wall 0.168658 comb 0.170000 user 0.170000 sys 0.000000 (best of 48)
>>>>>
>>>>> After
>>>>> ! visible
>>>>> ! wall 0.008612 comb 0.010000 user 0.010000 sys 0.000000 (best of 325)
>>>>>
>>>>> The timing above include the computation of obsolete changeset:
>>>>> ! obsolete
>>>>> ! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)
>>>>>
>>>>> So adjusted time give 160ms before versus 2ms after. A 75x speedup.
>>>>
>>>>
>>>> I've said before that I'm more interested in effect on real-life
>>>> commands (which is not to say that everyone has to agree and share
>>>> timings for real-life commands). 160ms is a pretty good win, but not
>>>> so relevant if the command still takes 10s. Roughly how long does a
>>>> "hg log -r ." (?) take with the setup above?
>>
>>
>> The hidden code is run for every single command since it is on the path to
>> retrieve a "visible" filtered changelog. Any win here is a win for all
>> commands. (and the win is is asymptotic).
>>
>> Example timing below.
>>
>> Setup (1)
>>   hg id -r .
>>     before: 0.205 total
>>     after:  0.165 total
>>
>>     full series combined with obscache: 0.103 total
>>
>> Setup (2)
>>   hg id -r .
>>     before: 0.328 total
>>     after:  0.192 total
>>
>>     full series combined with obscache: 0.098 total
>
> That's pretty good. I wasn't able to measure any difference at all on
> my hg core repo, but I guess the effect is stronger when history is
> longer (as it is in the Mozilla repo). I think any speedup is drown by
> obsstore load time in the hg core repo (53k revisions and 142k
> obsmarkers).

Yes, the mercurial repository has a shorter history (so traversing the 
changelog hurt us less) and the time loading markers dominates. My 
benchmark reply to jun radix tree[1] has some number in this area.

In my mercurial-dev repository:
   visible  (obsstore loading included):
     obscache: 0.010617
     both:     0.002851

   full execution time on 'hg id -r .' (no extension)
     obscache: 0.099
     both:     0.084
     16% of the full time shaved off.
https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/098348.html

Patch

diff --git a/mercurial/repoview.py b/mercurial/repoview.py
--- a/mercurial/repoview.py
+++ b/mercurial/repoview.py
@@ -224,8 +224,10 @@  def computehidden(repo):
         # changesets and remove those.
         dynamic = hidden & revealedrevs(repo)
         if dynamic:
-            blocked = cl.ancestors(dynamic, inclusive=True)
-            hidden = frozenset(r for r in hidden if r not in blocked)
+            pfunc = cl.parentrevs
+            mutablephases = (phases.draft, phases.secret)
+            mutable = repo._phasecache.getrevset(repo, mutablephases)
+            hidden = hidden - _domainancestors(pfunc, dynamic, mutable)
     return hidden
 
 def computeunserved(repo):