Patchwork [3,of,6,V3] hidden: simplify the computation of consistency blocker

login
register
mail settings
Submitter Pierre-Yves David
Date May 26, 2017, 6:34 p.m.
Message ID <cfd0735170e8ef4d5913.1495823687@nodosa.octopoid.net>
Download mbox | patch
Permalink /patch/20936/
State Accepted
Headers show

Comments

Pierre-Yves David - May 26, 2017, 6:34 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1495375280 -7200
#      Sun May 21 16:01:20 2017 +0200
# Node ID cfd0735170e8ef4d5913f8dd0a9864f9c13bfd28
# Parent  c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
# EXP-Topic fast-compute-hidden
# Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
#              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r cfd0735170e8
hidden: simplify the computation of consistency blocker

For a couple of years, we now have precomputed set for all mutable phases. We
can use this set restrict our search and quickly detect non-hideable children of
hideable changesets.  This speeds up the hidden computation. See docstring of
the new function for details.

This new version reuses the '_domainancestors' function to keep the computation
of revealed changeset in O(len(visible))

Below are timing from two Mozilla repositories with different contents.
hidden cache is disabled while obtaining them.

1) Mozilla repository with:
 * 400667 changesets
 * 35 hidden changesets (first rev-268334)
 * 288 visible drafts
 * 1 unstable changeset

Before:
! visible
! wall 0.001744 comb 0.000000 user 0.000000 sys 0.000000 (best of 1563)

After:
! visible
! wall 0.000742 comb 0.000000 user 0.000000 sys 0.000000 (best of 3755)


The timing above include the computation of obsolete changeset:
! obsolete
! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)

So adjusted time give 1.3ms before versus 0.3ms after. A 4x speedup.

2) Mozilla repository with:
 * 405645 changesets
 * 4312 hidden changesets (first rev-326004)
 * 264 visible drafts
 * 1 unstable changeset

Before:
! visible
! wall 0.025476 comb 0.030000 user 0.030000 sys 0.000000 (best of 111)


After
! visible
! wall 0.007703 comb 0.010000 user 0.010000 sys 0.000000 (best of 358)


The timing above include the computation of obsolete changeset:
! obsolete
! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)

So adjusted time give 19ms before versus 1.3ms after. A 17x speedup.
via Mercurial-devel - May 27, 2017, 9:18 p.m.
On Fri, May 26, 2017 at 11:34 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1495375280 -7200
> #      Sun May 21 16:01:20 2017 +0200
> # Node ID cfd0735170e8ef4d5913f8dd0a9864f9c13bfd28
> # Parent  c95d2c1ff6128ab5580d8e28ef8632ae87ecccc5
> # EXP-Topic fast-compute-hidden
> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r cfd0735170e8
> hidden: simplify the computation of consistency blocker
>
> For a couple of years, we now have precomputed set for all mutable phases. We
> can use this set restrict our search and quickly detect non-hideable children of
> hideable changesets.  This speeds up the hidden computation. See docstring of
> the new function for details.
>
> This new version reuses the '_domainancestors' function to keep the computation
> of revealed changeset in O(len(visible))
>
> Below are timing from two Mozilla repositories with different contents.

I'll insert "perfvolatilesets" before timing in flight.

> hidden cache is disabled while obtaining them.
>
> 1) Mozilla repository with:
>  * 400667 changesets
>  * 35 hidden changesets (first rev-268334)
>  * 288 visible drafts
>  * 1 unstable changeset
>
> Before:
> ! visible
> ! wall 0.001744 comb 0.000000 user 0.000000 sys 0.000000 (best of 1563)
>
> After:
> ! visible
> ! wall 0.000742 comb 0.000000 user 0.000000 sys 0.000000 (best of 3755)
>
>
> The timing above include the computation of obsolete changeset:
> ! obsolete
> ! wall 0.000396 comb 0.000000 user 0.000000 sys 0.000000 (best of 6816)
>
> So adjusted time give 1.3ms before versus 0.3ms after. A 4x speedup.
>
> 2) Mozilla repository with:
>  * 405645 changesets
>  * 4312 hidden changesets (first rev-326004)
>  * 264 visible drafts
>  * 1 unstable changeset
>
> Before:
> ! visible
> ! wall 0.025476 comb 0.030000 user 0.030000 sys 0.000000 (best of 111)
>
>
> After
> ! visible
> ! wall 0.007703 comb 0.010000 user 0.010000 sys 0.000000 (best of 358)
>
>
> The timing above include the computation of obsolete changeset:
> ! obsolete
> ! wall 0.006408 comb 0.010000 user 0.010000 sys 0.000000 (best of 404)
>
> So adjusted time give 19ms before versus 1.3ms after. A 17x speedup.
>
> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
> --- a/mercurial/repoview.py
> +++ b/mercurial/repoview.py
> @@ -10,7 +10,6 @@ from __future__ import absolute_import
>
>  import copy
>  import hashlib
> -import heapq
>  import struct
>
>  from .node import nullrev
> @@ -63,35 +62,33 @@ def _getstatichidden(repo):
>
>      """
>      assert not repo.changelog.filteredrevs
> -    hidden = set(hideablerevs(repo))
> +    hidden = hideablerevs(repo)
>      if hidden:
> -        getphase = repo._phasecache.phase
> -        getparentrevs = repo.changelog.parentrevs
> -        # Skip heads which are public (guaranteed to not be hidden)
> -        heap = [-r for r in repo.changelog.headrevs() if getphase(repo, r)]
> -        heapq.heapify(heap)
> -        heappop = heapq.heappop
> -        heappush = heapq.heappush
> -        seen = set() # no need to init it with heads, they have no children
> -        while heap:
> -            rev = -heappop(heap)
> -            # All children have been processed so at that point, if no children
> -            # removed 'rev' from the 'hidden' set, 'rev' is going to be hidden.
> -            blocker = rev not in hidden
> -            for parent in getparentrevs(rev):
> -                if parent == nullrev:
> -                    continue
> -                if blocker:
> -                    # If visible, ensure parent will be visible too
> -                    hidden.discard(parent)
> -                # - Avoid adding the same revision twice
> -                # - Skip nodes which are public (guaranteed to not be hidden)
> -                pre = len(seen)
> -                seen.add(parent)
> -                if pre < len(seen) and getphase(repo, rev):
> -                    heappush(heap, -parent)
> +        pfunc = repo.changelog.parentrevs
> +
> +        mutablephases = (phases.draft, phases.secret)
> +        mutable = repo._phasecache.getrevset(repo, mutablephases)
> +        blockers = _consistencyblocker(pfunc, hidden, mutable)
> +
> +        if blockers:
> +            hidden = hidden - _domainancestors(pfunc, blockers, mutable)
>      return hidden
>
> +def _consistencyblocker(pfunc, hideable, domain):

I still think "consistency blocker" is not a good name. How about
renaming "blocker" to "anchor" (suggested by Augie at some point)? I
think that's a much better name. But we can do that in a followup.


> +    """return non-hideable changeset blocking hideable one
> +
> +    For consistency, we cannot actually hide a changeset if one of it children
> +    are visible, this function find such children.
> +    """
> +    others = domain - hideable
> +    blockers = set()
> +    for r in others:
> +        for p in pfunc(r):
> +            if p != nullrev and p in hideable:
> +                blockers.add(r)
> +                break # no little profit

I'll drop the "no little profit" in flight.

> +    return blockers
> +
>  def _domainancestors(pfunc, revs, domain):
>      """return ancestors of 'revs' within 'domain'
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - May 28, 2017, 6:18 a.m.
On Sat, May 27, 2017 at 2:18 PM, Martin von Zweigbergk
<martinvonz@google.com> wrote:
> On Fri, May 26, 2017 at 11:34 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>> +def _consistencyblocker(pfunc, hideable, domain):
>
> I still think "consistency blocker" is not a good name. How about
> renaming "blocker" to "anchor" (suggested by Augie at some point)? I
> think that's a much better name. But we can do that in a followup.

I just sent a series that does this and more. Since I couldn't really
measure any performance impact, please let me know if that series has
any negative impact.

Patch

diff --git a/mercurial/repoview.py b/mercurial/repoview.py
--- a/mercurial/repoview.py
+++ b/mercurial/repoview.py
@@ -10,7 +10,6 @@  from __future__ import absolute_import
 
 import copy
 import hashlib
-import heapq
 import struct
 
 from .node import nullrev
@@ -63,35 +62,33 @@  def _getstatichidden(repo):
 
     """
     assert not repo.changelog.filteredrevs
-    hidden = set(hideablerevs(repo))
+    hidden = hideablerevs(repo)
     if hidden:
-        getphase = repo._phasecache.phase
-        getparentrevs = repo.changelog.parentrevs
-        # Skip heads which are public (guaranteed to not be hidden)
-        heap = [-r for r in repo.changelog.headrevs() if getphase(repo, r)]
-        heapq.heapify(heap)
-        heappop = heapq.heappop
-        heappush = heapq.heappush
-        seen = set() # no need to init it with heads, they have no children
-        while heap:
-            rev = -heappop(heap)
-            # All children have been processed so at that point, if no children
-            # removed 'rev' from the 'hidden' set, 'rev' is going to be hidden.
-            blocker = rev not in hidden
-            for parent in getparentrevs(rev):
-                if parent == nullrev:
-                    continue
-                if blocker:
-                    # If visible, ensure parent will be visible too
-                    hidden.discard(parent)
-                # - Avoid adding the same revision twice
-                # - Skip nodes which are public (guaranteed to not be hidden)
-                pre = len(seen)
-                seen.add(parent)
-                if pre < len(seen) and getphase(repo, rev):
-                    heappush(heap, -parent)
+        pfunc = repo.changelog.parentrevs
+
+        mutablephases = (phases.draft, phases.secret)
+        mutable = repo._phasecache.getrevset(repo, mutablephases)
+        blockers = _consistencyblocker(pfunc, hidden, mutable)
+
+        if blockers:
+            hidden = hidden - _domainancestors(pfunc, blockers, mutable)
     return hidden
 
+def _consistencyblocker(pfunc, hideable, domain):
+    """return non-hideable changeset blocking hideable one
+
+    For consistency, we cannot actually hide a changeset if one of it children
+    are visible, this function find such children.
+    """
+    others = domain - hideable
+    blockers = set()
+    for r in others:
+        for p in pfunc(r):
+            if p != nullrev and p in hideable:
+                blockers.add(r)
+                break # no little profit
+    return blockers
+
 def _domainancestors(pfunc, revs, domain):
     """return ancestors of 'revs' within 'domain'