Patchwork [4,of,8] hidden: simplify the computation of consistency blocker

login
register
mail settings
Submitter Pierre-Yves David
Date May 21, 2017, 3:20 p.m.
Message ID <e4a404417397b4bc8bb7.1495380039@nodosa.octopoid.net>
Download mbox | patch
Permalink /patch/20790/
State Accepted
Headers show

Comments

Pierre-Yves David - May 21, 2017, 3:20 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1495375280 -7200
#      Sun May 21 16:01:20 2017 +0200
# Node ID e4a404417397b4bc8bb7f6fe2ec96c9aed1c4a3c
# Parent  bde4b1ab7b0111988a521c4c4c28b3963010eeff
# EXP-Topic dynamicblocker
# Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
#              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r e4a404417397
hidden: simplify the computation of consistency blocker

For a couple of years, we now have precomputed set for all mutable phase. We can
use this set to quickly detect non-hideable children of hideable changesets.
This massively speeds up the hidden computation.

Timing is a bit annoying to get because of the hidden cache. See the next
changeset for similar usage and speedup.
Augie Fackler - May 22, 2017, 10:03 p.m.
On Sun, May 21, 2017 at 05:20:39PM +0200, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1495375280 -7200
> #      Sun May 21 16:01:20 2017 +0200
> # Node ID e4a404417397b4bc8bb7f6fe2ec96c9aed1c4a3c
> # Parent  bde4b1ab7b0111988a521c4c4c28b3963010eeff
> # EXP-Topic dynamicblocker
> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r e4a404417397
> hidden: simplify the computation of consistency blocker

I have no idea what a consistency blocker is.

>
> For a couple of years, we now have precomputed set for all mutable phase. We can
> use this set to quickly detect non-hideable children of hideable changesets.
> This massively speeds up the hidden computation.
>
> Timing is a bit annoying to get because of the hidden cache. See the next
> changeset for similar usage and speedup.
>
> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
> --- a/mercurial/repoview.py
> +++ b/mercurial/repoview.py
> @@ -10,7 +10,6 @@ from __future__ import absolute_import
>
>  import copy
>  import hashlib
> -import heapq
>  import struct
>
>  from .node import nullrev
> @@ -63,35 +62,33 @@ def _getstatichidden(repo):
>
>      """
>      assert not repo.changelog.filteredrevs
> -    hidden = set(hideablerevs(repo))
> +    hidden = hideablerevs(repo)
>      if hidden:
> -        getphase = repo._phasecache.phase
> -        getparentrevs = repo.changelog.parentrevs
> -        # Skip heads which are public (guaranteed to not be hidden)
> -        heap = [-r for r in repo.changelog.headrevs() if getphase(repo, r)]
> -        heapq.heapify(heap)
> -        heappop = heapq.heappop
> -        heappush = heapq.heappush
> -        seen = set() # no need to init it with heads, they have no children
> -        while heap:
> -            rev = -heappop(heap)
> -            # All children have been processed so at that point, if no children
> -            # removed 'rev' from the 'hidden' set, 'rev' is going to be hidden.
> -            blocker = rev not in hidden
> -            for parent in getparentrevs(rev):
> -                if parent == nullrev:
> -                    continue
> -                if blocker:
> -                    # If visible, ensure parent will be visible too
> -                    hidden.discard(parent)
> -                # - Avoid adding the same revision twice
> -                # - Skip nodes which are public (guaranteed to not be hidden)
> -                pre = len(seen)
> -                seen.add(parent)
> -                if pre < len(seen) and getphase(repo, rev):
> -                    heappush(heap, -parent)
> +        pfunc = repo.changelog.parentrevs
> +
> +        mutablephases = (phases.draft, phases.secret)
> +        mutable = repo._phasecache.getrevset(repo, mutablephases)
> +        blockers = _consistencyblocker(pfunc, hidden, mutable)
> +
> +        if blockers:
> +            hidden = hidden - _domainancestors(pfunc, blockers, mutable)
>      return hidden
>
> +def _consistencyblocker(pfunc, hideable, domain):
> +    """return non-hideable changeset blocking hideable one
> +
> +    For consistency, we cannot actually hide a changeset if one of it children
> +    are visible, this function find such children.
> +    """
> +    others = domain - hideable
> +    blockers = set()
> +    for r in others:
> +        for p in pfunc(r):
> +            if p != nullrev and p in hideable:
> +                blockers.add(r)
> +                break # no little profit
> +    return blockers
> +
>  def _domainancestors(pfunc, revs, domain):
>      """return ancestors of 'revs' within 'domain'
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - May 23, 2017, 9:57 p.m.
On Mon, May 22, 2017 at 3:03 PM, Augie Fackler <raf@durin42.com> wrote:
> On Sun, May 21, 2017 at 05:20:39PM +0200, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>> # Date 1495375280 -7200
>> #      Sun May 21 16:01:20 2017 +0200
>> # Node ID e4a404417397b4bc8bb7f6fe2ec96c9aed1c4a3c
>> # Parent  bde4b1ab7b0111988a521c4c4c28b3963010eeff
>> # EXP-Topic dynamicblocker
>> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r e4a404417397
>> hidden: simplify the computation of consistency blocker
>
> I have no idea what a consistency blocker is.
>
>>
>> For a couple of years, we now have precomputed set for all mutable phase. We can
>> use this set to quickly detect non-hideable children of hideable changesets.
>> This massively speeds up the hidden computation.

I also have trouble following this. One thing that confuses me is the
use of "hideable" instead of "hidden". Any non-public commit is
hideable, no? The sentence then becomes "We can use this set to
quickly detect public children of non-public changesets.", which
doesn't make sense.

As for the term "consistency blocker", I agree with Augie that it's
optimistic to expect the reader to understand what it means. It sounds
very much like it's blocking consistency, but looking at the function,
it seems to be about creating consistency. However, I don't think
"consistency" is the right word either. It seems to be about finding
unstable commits on top of suspended commits (I think they're not
known to be suspended before calling this function). Why not call it
findunstable()? I'm sure you have a reason not to call it that, but
please consider if may help readability by being less pedantic (I
assume the reason you didn't call it findunstable() is that there's
some case where not-unstable revisions can be returned).

>>
>> Timing is a bit annoying to get because of the hidden cache. See the next
>> changeset for similar usage and speedup.
>>
>> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
>> --- a/mercurial/repoview.py
>> +++ b/mercurial/repoview.py
>> @@ -10,7 +10,6 @@ from __future__ import absolute_import
>>
>>  import copy
>>  import hashlib
>> -import heapq
>>  import struct
>>
>>  from .node import nullrev
>> @@ -63,35 +62,33 @@ def _getstatichidden(repo):
>>
>>      """
>>      assert not repo.changelog.filteredrevs
>> -    hidden = set(hideablerevs(repo))
>> +    hidden = hideablerevs(repo)
>>      if hidden:
>> -        getphase = repo._phasecache.phase
>> -        getparentrevs = repo.changelog.parentrevs
>> -        # Skip heads which are public (guaranteed to not be hidden)
>> -        heap = [-r for r in repo.changelog.headrevs() if getphase(repo, r)]
>> -        heapq.heapify(heap)
>> -        heappop = heapq.heappop
>> -        heappush = heapq.heappush
>> -        seen = set() # no need to init it with heads, they have no children
>> -        while heap:
>> -            rev = -heappop(heap)
>> -            # All children have been processed so at that point, if no children
>> -            # removed 'rev' from the 'hidden' set, 'rev' is going to be hidden.
>> -            blocker = rev not in hidden
>> -            for parent in getparentrevs(rev):
>> -                if parent == nullrev:
>> -                    continue
>> -                if blocker:
>> -                    # If visible, ensure parent will be visible too
>> -                    hidden.discard(parent)
>> -                # - Avoid adding the same revision twice
>> -                # - Skip nodes which are public (guaranteed to not be hidden)
>> -                pre = len(seen)
>> -                seen.add(parent)
>> -                if pre < len(seen) and getphase(repo, rev):
>> -                    heappush(heap, -parent)
>> +        pfunc = repo.changelog.parentrevs
>> +
>> +        mutablephases = (phases.draft, phases.secret)
>> +        mutable = repo._phasecache.getrevset(repo, mutablephases)
>> +        blockers = _consistencyblocker(pfunc, hidden, mutable)
>> +
>> +        if blockers:
>> +            hidden = hidden - _domainancestors(pfunc, blockers, mutable)
>>      return hidden
>>
>> +def _consistencyblocker(pfunc, hideable, domain):

I couldn't figure out what pfunc was from reading the function. The
call site tells me it's about parents. Please rename to e.g.
"parentsfunc".

>> +    """return non-hideable changeset blocking hideable one
>> +
>> +    For consistency, we cannot actually hide a changeset if one of it children
>> +    are visible, this function find such children.
>> +    """
>> +    others = domain - hideable
>> +    blockers = set()
>> +    for r in others:
>> +        for p in pfunc(r):
>> +            if p != nullrev and p in hideable:
>> +                blockers.add(r)
>> +                break # no little profit

Did you mean to drop either "no" or "little" here?

>> +    return blockers
>> +
>>  def _domainancestors(pfunc, revs, domain):
>>      """return ancestors of 'revs' within 'domain'
>>
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - May 24, 2017, 1:39 p.m.
On 05/23/2017 11:57 PM, Martin von Zweigbergk wrote:
> On Mon, May 22, 2017 at 3:03 PM, Augie Fackler <raf@durin42.com> wrote:
>> On Sun, May 21, 2017 at 05:20:39PM +0200, Pierre-Yves David wrote:
>>> # HG changeset patch
>>> # User Pierre-Yves David <pierre-yves.david@octobus.net>
>>> # Date 1495375280 -7200
>>> #      Sun May 21 16:01:20 2017 +0200
>>> # Node ID e4a404417397b4bc8bb7f6fe2ec96c9aed1c4a3c
>>> # Parent  bde4b1ab7b0111988a521c4c4c28b3963010eeff
>>> # EXP-Topic dynamicblocker
>>> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
>>> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r e4a404417397
>>> hidden: simplify the computation of consistency blocker
>>
>> I have no idea what a consistency blocker is.
>>
>>>
>>> For a couple of years, we now have precomputed set for all mutable phase. We can
>>> use this set to quickly detect non-hideable children of hideable changesets.
>>> This massively speeds up the hidden computation.
>
> I also have trouble following this. One thing that confuses me is the
> use of "hideable" instead of "hidden". Any non-public commit is
> hideable, no?

The file use "hideable" for all revisions candidate for hiding (eg: 
obsolete changeset). They might end not being hidden (eg: working copy 
ancestors). You can get more details from the docstring of the 
'hideablerevs' method in that function.

If that helps you, replace "hideable" with "obsolete" in your head. (but 
we won't use "obsolete" in the code for reasons explained further down).

(here, the set of non-public changeset is usually refered as: "mutable")

[1] 
https://www.mercurial-scm.org/repo/hg/file/bdc4861ffe59/mercurial/repoview.py#l24

> As for the term "consistency blocker", I agree with Augie that it's
> optimistic to expect the reader to understand what it means. It sounds
> very much like it's blocking consistency, but looking at the function,
> it seems to be about creating consistency.

The terms the "blockers" mirror the dynamic/static-blockers used in the 
code[1]. "consistency blocker" is sort for "hide-blocker from enforcing 
the consistency".

[1] 
https://www.mercurial-scm.org/repo/hg/file/bdc4861ffe59/mercurial/repoview.py#l76
     (the function just got renamed, but the "blockers" terminology is
      also clarified in the last patch of this series).

> However, I don't think
> "consistency" is the right word either. It seems to be about finding
> unstable commits on top of suspended commits (I think they're not
> known to be suspended before calling this function). Why not call it
> findunstable()?

The hiding logic is independent from the obsolescence logic. It just 
hide stuff that other logic request to see hidden.
It just happens "obsolescence" is the only source feeding the 
'hideablerevs' in core right now. No obsolescence specific vocabulary is 
appropriate here.

> I'm sure you have a reason not to call it that, but
> please consider if may help readability by being less pedantic (I
> assume the reason you didn't call it findunstable() is that there's
> some case where not-unstable revisions can be returned).

So 'findunstable' is ruled-out since "unstable" in evolve specific. I 
can change the name if you find a better idea. I recommended checking 
the final state of computehidden[1] when thinking about that.

[1] 
https://www.mercurial-scm.org/repo/users/marmoute/mercurial/file/66479a8be5d5/mercurial/repoview.py#l90

>>> Timing is a bit annoying to get because of the hidden cache. See the next
>>> changeset for similar usage and speedup.
>>>
>>> diff --git a/mercurial/repoview.py b/mercurial/repoview.py
>>> --- a/mercurial/repoview.py
>>> +++ b/mercurial/repoview.py
>>> @@ -10,7 +10,6 @@ from __future__ import absolute_import
>>>
>>>  import copy
>>>  import hashlib
>>> -import heapq
>>>  import struct
>>>
>>>  from .node import nullrev
>>> @@ -63,35 +62,33 @@ def _getstatichidden(repo):
>>>
>>>      """
>>>      assert not repo.changelog.filteredrevs
>>> -    hidden = set(hideablerevs(repo))
>>> +    hidden = hideablerevs(repo)
>>>      if hidden:
>>> -        getphase = repo._phasecache.phase
>>> -        getparentrevs = repo.changelog.parentrevs
>>> -        # Skip heads which are public (guaranteed to not be hidden)
>>> -        heap = [-r for r in repo.changelog.headrevs() if getphase(repo, r)]
>>> -        heapq.heapify(heap)
>>> -        heappop = heapq.heappop
>>> -        heappush = heapq.heappush
>>> -        seen = set() # no need to init it with heads, they have no children
>>> -        while heap:
>>> -            rev = -heappop(heap)
>>> -            # All children have been processed so at that point, if no children
>>> -            # removed 'rev' from the 'hidden' set, 'rev' is going to be hidden.
>>> -            blocker = rev not in hidden
>>> -            for parent in getparentrevs(rev):
>>> -                if parent == nullrev:
>>> -                    continue
>>> -                if blocker:
>>> -                    # If visible, ensure parent will be visible too
>>> -                    hidden.discard(parent)
>>> -                # - Avoid adding the same revision twice
>>> -                # - Skip nodes which are public (guaranteed to not be hidden)
>>> -                pre = len(seen)
>>> -                seen.add(parent)
>>> -                if pre < len(seen) and getphase(repo, rev):
>>> -                    heappush(heap, -parent)
>>> +        pfunc = repo.changelog.parentrevs
>>> +
>>> +        mutablephases = (phases.draft, phases.secret)
>>> +        mutable = repo._phasecache.getrevset(repo, mutablephases)
>>> +        blockers = _consistencyblocker(pfunc, hidden, mutable)
>>> +
>>> +        if blockers:
>>> +            hidden = hidden - _domainancestors(pfunc, blockers, mutable)
>>>      return hidden
>>>
>>> +def _consistencyblocker(pfunc, hideable, domain):
>
> I couldn't figure out what pfunc was from reading the function. The
> call site tells me it's about parents. Please rename to e.g.
> "parentsfunc".

The name "pfunc" in standard for that kind of function in "ancestors.py" 
(see link to example below). So I would prefer to keep that name.

  * 
https://www.mercurial-scm.org/repo/hg/file/bdc4861ffe59/mercurial/ancestor.py#l15
  * 
https://www.mercurial-scm.org/repo/hg/file/bdc4861ffe59/mercurial/ancestor.py#l71
  * 
https://www.mercurial-scm.org/repo/hg/file/bdc4861ffe59/mercurial/ancestor.py#l140
  * 
https://www.mercurial-scm.org/repo/hg/file/bdc4861ffe59/mercurial/ancestor.py#l260



>
>>> +    """return non-hideable changeset blocking hideable one
>>> +
>>> +    For consistency, we cannot actually hide a changeset if one of it children
>>> +    are visible, this function find such children.
>>> +    """
>>> +    others = domain - hideable
>>> +    blockers = set()
>>> +    for r in others:
>>> +        for p in pfunc(r):
>>> +            if p != nullrev and p in hideable:
>>> +                blockers.add(r)
>>> +                break # no little profit
>
> Did you mean to drop either "no" or "little" here?

This is supposed to mean "There are no such thing as too small profit". 
That is quite useless and apparently confusing so I'll just drop the 
comment.

Cheers,

Patch

diff --git a/mercurial/repoview.py b/mercurial/repoview.py
--- a/mercurial/repoview.py
+++ b/mercurial/repoview.py
@@ -10,7 +10,6 @@  from __future__ import absolute_import
 
 import copy
 import hashlib
-import heapq
 import struct
 
 from .node import nullrev
@@ -63,35 +62,33 @@  def _getstatichidden(repo):
 
     """
     assert not repo.changelog.filteredrevs
-    hidden = set(hideablerevs(repo))
+    hidden = hideablerevs(repo)
     if hidden:
-        getphase = repo._phasecache.phase
-        getparentrevs = repo.changelog.parentrevs
-        # Skip heads which are public (guaranteed to not be hidden)
-        heap = [-r for r in repo.changelog.headrevs() if getphase(repo, r)]
-        heapq.heapify(heap)
-        heappop = heapq.heappop
-        heappush = heapq.heappush
-        seen = set() # no need to init it with heads, they have no children
-        while heap:
-            rev = -heappop(heap)
-            # All children have been processed so at that point, if no children
-            # removed 'rev' from the 'hidden' set, 'rev' is going to be hidden.
-            blocker = rev not in hidden
-            for parent in getparentrevs(rev):
-                if parent == nullrev:
-                    continue
-                if blocker:
-                    # If visible, ensure parent will be visible too
-                    hidden.discard(parent)
-                # - Avoid adding the same revision twice
-                # - Skip nodes which are public (guaranteed to not be hidden)
-                pre = len(seen)
-                seen.add(parent)
-                if pre < len(seen) and getphase(repo, rev):
-                    heappush(heap, -parent)
+        pfunc = repo.changelog.parentrevs
+
+        mutablephases = (phases.draft, phases.secret)
+        mutable = repo._phasecache.getrevset(repo, mutablephases)
+        blockers = _consistencyblocker(pfunc, hidden, mutable)
+
+        if blockers:
+            hidden = hidden - _domainancestors(pfunc, blockers, mutable)
     return hidden
 
+def _consistencyblocker(pfunc, hideable, domain):
+    """return non-hideable changeset blocking hideable one
+
+    For consistency, we cannot actually hide a changeset if one of it children
+    are visible, this function find such children.
+    """
+    others = domain - hideable
+    blockers = set()
+    for r in others:
+        for p in pfunc(r):
+            if p != nullrev and p in hideable:
+                blockers.add(r)
+                break # no little profit
+    return blockers
+
 def _domainancestors(pfunc, revs, domain):
     """return ancestors of 'revs' within 'domain'