Patchwork [4,of,4] templater: add subsetparents(rev, revset) function

login
register
mail settings
Submitter Yuya Nishihara
Date March 24, 2020, 2:42 p.m.
Message ID <700af5e971d4715b2557.1585060956@mimosa>
Download mbox | patch
Permalink /patch/45873/
State Accepted
Headers show

Comments

Yuya Nishihara - March 24, 2020, 2:42 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1584256318 -32400
#      Sun Mar 15 16:11:58 2020 +0900
# Node ID 700af5e971d4715b255782007f3ee967ed44fd56
# Parent  d686edd2bf8868661260854dfbe37292fec79b4c
templater: add subsetparents(rev, revset) function

Naming suggestions are welcome. And this could be flagged as an (ADVANCED)
function since the primary use case is to draw a graph.

This provides all data needed for drawing revisions graph filtered by revset,
and allows us to implement a GUI graph viewer in some languages better than
Python. A frontend grapher will be quite similar to our graphmod since
subsetparents() just returns parent-child relations in the filtered sub graph.

Frontend example:

  https://hg.sr.ht/~yuja/hgv/browse/default/core/hgchangesetgrapher.cpp

However, the resulting graph will be simpler than the one "hg log -G" would
generate because redundant edges are eliminated. This should be the same graph
rendering strategy as TortoiseHg.

This function could be implemented as a revset predicate, but that would mean
the scanning state couldn't be cached and thus slow.

Test cases are split to new file since test-template-functions.t is quite
big and we'll need a new branchy repository anyway.
Augie Fackler - March 24, 2020, 10:16 p.m.
queued with enthusiasm

> On Mar 24, 2020, at 10:42, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1584256318 -32400
> #      Sun Mar 15 16:11:58 2020 +0900
> # Node ID 700af5e971d4715b255782007f3ee967ed44fd56
> # Parent  d686edd2bf8868661260854dfbe37292fec79b4c
> templater: add subsetparents(rev, revset) function
> 
> Naming suggestions are welcome. And this could be flagged as an (ADVANCED)
> function since the primary use case is to draw a graph.
> 
> This provides all data needed for drawing revisions graph filtered by revset,
> and allows us to implement a GUI graph viewer in some languages better than
> Python. A frontend grapher will be quite similar to our graphmod since
> subsetparents() just returns parent-child relations in the filtered sub graph.
> 
> Frontend example:
> 
>  https://hg.sr.ht/~yuja/hgv/browse/default/core/hgchangesetgrapher.cpp
> 
> However, the resulting graph will be simpler than the one "hg log -G" would
> generate because redundant edges are eliminated. This should be the same graph
> rendering strategy as TortoiseHg.
> 
> This function could be implemented as a revset predicate, but that would mean
> the scanning state couldn't be cached and thus slow.
> 
> Test cases are split to new file since test-template-functions.t is quite
> big and we'll need a new branchy repository anyway.
> 
> diff --git a/mercurial/dagop.py b/mercurial/dagop.py
> --- a/mercurial/dagop.py
> +++ b/mercurial/dagop.py
> @@ -274,6 +274,238 @@ def descendantrevs(revs, revsfn, parentr
>                 break
> 
> 
> +class subsetparentswalker(object):
> +    r"""Scan adjacent ancestors in the graph given by the subset
> +
> +    This computes parent-child relations in the sub graph filtered by
> +    a revset. Primary use case is to draw a revisions graph.
> +
> +    In the following example, we consider that the node 'f' has edges to all
> +    ancestor nodes, but redundant paths are eliminated. The edge 'f'->'b'
> +    is eliminated because there is a path 'f'->'c'->'b' for example.
> +
> +          - d - e -
> +         /         \
> +        a - b - c - f
> +
> +    If the node 'c' is filtered out, the edge 'f'->'b' is activated.
> +
> +          - d - e -
> +         /         \
> +        a - b -(c)- f
> +
> +    Likewise, if 'd' and 'e' are filtered out, this edge is fully eliminated
> +    since there is a path 'f'->'c'->'b'->'a' for 'f'->'a'.
> +
> +           (d) (e)
> +
> +        a - b - c - f
> +
> +    Implementation-wise, 'f' is passed down to 'a' as unresolved through the
> +    'f'->'e'->'d'->'a' path, whereas we do also remember that 'f' has already
> +    been resolved while walking down the 'f'->'c'->'b'->'a' path. When
> +    processing the node 'a', the unresolved 'f'->'a' path is eliminated as
> +    the 'f' end is marked as resolved.
> +
> +    Ancestors are searched from the tipmost revision in the subset so the
> +    results can be cached. You should specify startrev to narrow the search
> +    space to ':startrev'.
> +    """
> +
> +    def __init__(self, repo, subset, startrev=None):
> +        if startrev is not None:
> +            subset = repo.revs(b'%d:null', startrev) & subset
> +
> +        # equivalent to 'subset = subset.sorted(reverse=True)', but there's
> +        # no such function.
> +        fastdesc = subset.fastdesc
> +        if fastdesc:
> +            desciter = fastdesc()
> +        else:
> +            if not subset.isdescending() and not subset.istopo():
> +                subset = smartset.baseset(subset)
> +                subset.sort(reverse=True)
> +            desciter = iter(subset)
> +
> +        self._repo = repo
> +        self._changelog = repo.changelog
> +        self._subset = subset
> +
> +        # scanning state (see _scanparents):
> +        self._tovisit = []
> +        self._pendingcnt = {}
> +        self._pointers = {}
> +        self._parents = {}
> +        self._inputhead = nullrev  # reassigned by self._advanceinput()
> +        self._inputtail = desciter
> +        self._bottomrev = nullrev
> +        self._advanceinput()
> +
> +    def parentsset(self, rev):
> +        """Look up parents of the given revision in the subset, and returns
> +        as a smartset"""
> +        return smartset.baseset(self.parents(rev))
> +
> +    def parents(self, rev):
> +        """Look up parents of the given revision in the subset
> +
> +        The returned revisions are sorted by parent index (p1/p2).
> +        """
> +        self._scanparents(rev)
> +        return [r for _c, r in sorted(self._parents.get(rev, []))]
> +
> +    def _parentrevs(self, rev):
> +        try:
> +            revs = self._changelog.parentrevs(rev)
> +            if revs[-1] == nullrev:
> +                return revs[:-1]
> +            return revs
> +        except error.WdirUnsupported:
> +            return tuple(pctx.rev() for pctx in self._repo[None].parents())
> +
> +    def _advanceinput(self):
> +        """Advance the input iterator and set the next revision to _inputhead"""
> +        if self._inputhead < nullrev:
> +            return
> +        try:
> +            self._inputhead = next(self._inputtail)
> +        except StopIteration:
> +            self._bottomrev = self._inputhead
> +            self._inputhead = nullrev - 1
> +
> +    def _scanparents(self, stoprev):
> +        """Scan ancestors until the parents of the specified stoprev are
> +        resolved"""
> +
> +        # 'tovisit' is the queue of the input revisions and their ancestors.
> +        # It will be populated incrementally to minimize the initial cost
> +        # of computing the given subset.
> +        #
> +        # For to-visit revisions, we keep track of
> +        # - the number of the unresolved paths: pendingcnt[rev],
> +        # - dict of the unresolved descendants and chains: pointers[rev][0],
> +        # - set of the already resolved descendants: pointers[rev][1].
> +        #
> +        # When a revision is visited, 'pointers[rev]' should be popped and
> +        # propagated to its parents accordingly.
> +        #
> +        # Once all pending paths have been resolved, 'pendingcnt[rev]' becomes
> +        # 0 and 'parents[rev]' contains the unsorted list of parent revisions
> +        # and p1/p2 chains (excluding linear paths.)
> +
> +        subset = self._subset
> +        tovisit = self._tovisit  # heap queue of [-rev]
> +        pendingcnt = self._pendingcnt  # {rev: count} for visited revisions
> +        pointers = self._pointers  # {rev: [{unresolved_rev: chain}, resolved]}
> +        parents = self._parents  # {rev: [(chain, rev)]}
> +
> +        while tovisit or self._inputhead >= nullrev:
> +            if pendingcnt.get(stoprev) == 0:
> +                return
> +
> +            # feed greater revisions from input set to queue
> +            if not tovisit:
> +                heapq.heappush(tovisit, -self._inputhead)
> +                self._advanceinput()
> +            while self._inputhead >= -tovisit[0]:
> +                heapq.heappush(tovisit, -self._inputhead)
> +                self._advanceinput()
> +
> +            rev = -heapq.heappop(tovisit)
> +            if rev < self._bottomrev:
> +                return
> +            if rev in pendingcnt and rev not in pointers:
> +                continue  # already visited
> +
> +            curactive = rev in subset
> +            pendingcnt.setdefault(rev, 0)  # mark as visited
> +            if curactive:
> +                assert rev not in parents
> +                parents[rev] = []
> +            unresolved, resolved = pointers.pop(rev, ({}, set()))
> +
> +            if curactive:
> +                # reached to active rev, resolve pending descendants' parents
> +                for r, c in unresolved.items():
> +                    pendingcnt[r] -= 1
> +                    assert pendingcnt[r] >= 0
> +                    if r in resolved:
> +                        continue  # eliminate redundant path
> +                    parents[r].append((c, rev))
> +                    # mark the descendant 'r' as resolved through this path if
> +                    # there are still pending pointers. the 'resolved' set may
> +                    # be concatenated later at a fork revision.
> +                    if pendingcnt[r] > 0:
> +                        resolved.add(r)
> +                unresolved.clear()
> +                # occasionally clean resolved markers. otherwise the set
> +                # would grow indefinitely.
> +                resolved = {r for r in resolved if pendingcnt[r] > 0}
> +
> +            parentrevs = self._parentrevs(rev)
> +            bothparentsactive = all(p in subset for p in parentrevs)
> +
> +            # set up or propagate tracking pointers if
> +            # - one of the parents is not active,
> +            # - or descendants' parents are unresolved.
> +            if not bothparentsactive or unresolved or resolved:
> +                if len(parentrevs) > 1:
> +                    # 'rev' is a merge revision. increment the pending count
> +                    # as the 'unresolved' dict will be duplicated.
> +                    for r in unresolved:
> +                        pendingcnt[r] += 1
> +                reusable = True  # can we avoid copying the tracking pointer?
> +                for i, p in enumerate(parentrevs):
> +                    assert p < rev
> +                    heapq.heappush(tovisit, -p)
> +                    if p in pointers:
> +                        # 'p' is a fork revision. concatenate tracking pointers
> +                        # and decrement the pending count accordingly.
> +                        knownunresolved, knownresolved = pointers[p]
> +                        for r, c in unresolved.items():
> +                            c += [b'1', b'2'][i]
> +                            if r in knownunresolved:
> +                                # unresolved at both paths
> +                                pendingcnt[r] -= 1
> +                                assert pendingcnt[r] > 0
> +                                # take shorter chain
> +                                knownunresolved[r] = min(c, knownunresolved[r])
> +                            else:
> +                                knownunresolved[r] = c
> +                        # simply propagate the 'resolved' set as deduplicating
> +                        # 'unresolved' here would be slightly complicated.
> +                        knownresolved.update(resolved)
> +                    elif reusable:
> +                        pointers[p] = (unresolved, resolved)
> +                        reusable = False
> +                    else:
> +                        pointers[p] = (unresolved.copy(), resolved.copy())
> +
> +            # then, populate the active parents directly and add the current
> +            # 'rev' to the tracking pointers of the inactive parents.
> +            # 'pointers[p]' may be optimized out if both parents are active.
> +            if curactive and bothparentsactive:
> +                for i, p in enumerate(parentrevs):
> +                    c = [b'1', b'2'][i]
> +                    parents[rev].append((c, p))
> +                    # no need to mark 'rev' as resolved since the 'rev' should
> +                    # be fully resolved (i.e. pendingcnt[rev] == 0)
> +                assert pendingcnt[rev] == 0
> +            elif curactive:
> +                for i, p in enumerate(parentrevs):
> +                    unresolved, resolved = pointers[p]
> +                    assert rev not in unresolved
> +                    c = [b'1', b'2'][i]
> +                    if p in subset:
> +                        parents[rev].append((c, p))
> +                        # mark 'rev' as resolved through this path
> +                        resolved.add(rev)
> +                    else:
> +                        pendingcnt[rev] += 1
> +                        unresolved[rev] = c
> +                assert 0 < pendingcnt[rev] <= 2
> +
> +
> def _reachablerootspure(pfunc, minroot, roots, heads, includepath):
>     """See revlog.reachableroots"""
>     if not roots:
> diff --git a/mercurial/templatefuncs.py b/mercurial/templatefuncs.py
> --- a/mercurial/templatefuncs.py
> +++ b/mercurial/templatefuncs.py
> @@ -16,6 +16,7 @@ from .node import (
> )
> from . import (
>     color,
> +    dagop,
>     diffutil,
>     encoding,
>     error,
> @@ -842,6 +843,45 @@ def startswith(context, mapping, args):
>     return b''
> 
> 
> +@templatefunc(
> +    b'subsetparents(rev, revset)',
> +    argspec=b'rev revset',
> +    requires={b'repo', b'cache'},
> +)
> +def subsetparents(context, mapping, args):
> +    """Look up parents of the rev in the sub graph given by the revset."""
> +    if b'rev' not in args or b'revset' not in args:
> +        # i18n: "subsetparents" is a keyword
> +        raise error.ParseError(_(b"subsetparents expects two arguments"))
> +
> +    repo = context.resource(mapping, b'repo')
> +
> +    rev = templateutil.evalinteger(context, mapping, args[b'rev'])
> +
> +    # TODO: maybe subsetparents(rev) should be allowed. the default revset
> +    # will be the revisions specified by -rREV argument.
> +    q = templateutil.evalwrapped(context, mapping, args[b'revset'])
> +    if not isinstance(q, templateutil.revslist):
> +        # i18n: "subsetparents" is a keyword
> +        raise error.ParseError(_(b"subsetparents expects a queried revset"))
> +    subset = q.tovalue(context, mapping)
> +    key = q.cachekey
> +
> +    if key:
> +        # cache only if revset query isn't dynamic
> +        cache = context.resource(mapping, b'cache')
> +        walkercache = cache.setdefault(b'subsetparentswalker', {})
> +        if key in walkercache:
> +            walker = walkercache[key]
> +        else:
> +            walker = dagop.subsetparentswalker(repo, subset)
> +            walkercache[key] = walker
> +    else:
> +        # for one-shot use, specify startrev to limit the search space
> +        walker = dagop.subsetparentswalker(repo, subset, startrev=rev)
> +    return templateutil.revslist(repo, walker.parentsset(rev))
> +
> +
> @templatefunc(b'word(number, text[, separator])')
> def word(context, mapping, args):
>     """Return the nth word from a string."""
> diff --git a/tests/test-template-graph.t b/tests/test-template-graph.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-template-graph.t
> @@ -0,0 +1,337 @@
> +Test graph-related template functions
> +=====================================
> +
> +  $ cat <<'EOF' >> $HGRCPATH
> +  > [extensions]
> +  > drawdag = $RUNTESTDIR/drawdag.py
> +  > EOF
> +
> +  $ hg init a
> +  $ cd a
> +
> +  $ hg debugdrawdag <<'EOF'
> +  >   l
> +  >  / \
> +  > |   k
> +  > |   |\
> +  > |   | j
> +  > |   | |
> +  > i   | |
> +  > |\  | |
> +  > h | | |
> +  > | | | |
> +  > | g | |
> +  > | | | |
> +  > f | | |
> +  > | |/ /
> +  > | e |
> +  > | |/
> +  > | d
> +  > |/|
> +  > c |
> +  > | |
> +  > b |
> +  >   |
> +  >   a
> +  > EOF
> +
> +  $ hg log -Gq -T'{rev} {tags}\n'
> +  o    11 l tip
> +  |\
> +  | o    10 i
> +  | |\
> +  o \ \    9 k
> +  |\ \ \
> +  +-----o  8 g
> +  | | |
> +  | o |  7 j
> +  | | |
> +  | | o  6 h
> +  | | |
> +  o | |  5 e
> +  |/ /
> +  | o  4 f
> +  | |
> +  o |  3 d
> +  |\|
> +  | o  2 c
> +  | |
> +  | o  1 b
> +  |
> +  o  0 a
> +  
> +
> +  $ cd ..
> +
> +subsetparents
> +-------------
> +
> +  $ cd a
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+i"))}\n' -r 'c+i'
> +  o  10 i: 2
> +  :
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+i"))}\n' -r 'c+h+i'
> +  o    10 i: 6
> +  |\
> +  o :  6 h: 2
> +  :/
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+l"))}\n' -r 'c+h+l'
> +  o    11 l tip: 6
> +  :\
> +  : o  6 h: 2
> +  :/
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+f+l"))}\n' -r 'c+f+l'
> +  o    11 l tip: 4
> +  :\
> +  : o  4 f: 2
> +  :/
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+i+k"))}\n' -r 'c+h+i+k'
> +  o    10 i: 6
> +  |\
> +  | : o  9 k: 2
> +  | :/
> +  o :  6 h: 2
> +  :/
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+d+h+i+k"))}\n' -r 'c+d+h+i+k'
> +  o    10 i: 6 3
> +  |\
> +  | : o  9 k: 3
> +  | :/
> +  o :  6 h: 2
> +  : :
> +  : o  3 d: 2
> +  :/|
> +  : ~
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+j+k+i"))}\n' -r 'c+j+k+i'
> +  o  10 i: 2
> +  :
> +  : o  9 k: 7
> +  :/|
> +  : o  7 j: 2
> +  :/
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+e+f+j"))}\n' -r 'c+e+f+j'
> +  o  7 j: 2
> +  :
> +  : o  5 e: 2
> +  :/
> +  : o  4 f: 2
> +  :/
> +  o  2 c:
> +  |
> +  ~
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+e+f+j"))}\n' -r 'b+e+f+j'
> +  o  7 j: 1
> +  :
> +  : o  5 e: 1
> +  :/
> +  : o  4 f: 1
> +  :/
> +  o  1 b:
> +  
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("a+c+f+g+j+l"))}\n' -r 'a+c+f+g+j+l'
> +  o    11 l tip: 4 8 7
> +  :\
> +  : \
> +  : :\
> +  : : \
> +  : : :\
> +  : : : \
> +  : : : :\
> +  : o---+ :  8 g: 0 2
> +  : :/ / /
> +  : +---o  7 j: 0 2
> +  : : :/
> +  o---+  4 f: 2
> +   / /
> +  : o  2 c:
> +  : |
> +  : ~
> +  o  0 a:
> +  
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+i+l"))}\n' -r 'b+i+l'
> +  o    11 l tip: 10
> +  |\
> +  o :  10 i: 1
> +  :/
> +  o  1 b:
> +  
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+i+j+l"))}\n' -r 'b+i+j+l'
> +  o    11 l tip: 10 7
> +  |\
> +  | \
> +  | :\
> +  o : :  10 i: 1
> +  :/ /
> +  : o  7 j: 1
> +  :/
> +  o  1 b:
> +  
> +
> +null in subset:
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("null+a+c+f"))}\n' -r 'null+a+c+f'
> +  o  4 f: 2
> +  |
> +  o  2 c: -1
> +  :
> +  : o  0 a: -1
> +  :/
> +  @  -1 : -1
> +  
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("null+a+b+c+f"))}\n' -r 'null+a+b+c+f'
> +  o  4 f: 2
> +  |
> +  o  2 c: 1
> +  |
> +  o  1 b: -1
> +  |
> +  | o  0 a: -1
> +  |/
> +  @  -1 : -1
> +  
> +
> +wdir in subset:
> +
> +  $ hg update -qC i
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("f+k+wdir()"))}\n' -r 'f+k+wdir()'
> +  o  2147483647 : 4
> +  :
> +  : o    9 k:
> +  : |\
> +  : ~ ~
> +  o  4 f:
> +  |
> +  ~
> +
> +  $ hg update -qC null
> +
> +Revisions not in subset:
> +
> +  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("a+c+f+g+j+l"))}\n'
> +  11 l tip: 4 8 7
> +  10 i: 
> +  9 k: 
> +  8 g: 0 2
> +  7 j: 0 2
> +  6 h: 
> +  5 e: 
> +  4 f: 2
> +  3 d: 
> +  2 c: 
> +  1 b: 
> +  0 a: 
> +
> +  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("b+c"))}\n'
> +  11 l tip: 
> +  10 i: 
> +  9 k: 
> +  8 g: 
> +  7 j: 
> +  6 h: 
> +  5 e: 
> +  4 f: 
> +  3 d: 
> +  2 c: 1
> +  1 b: 
> +  0 a: 
> +
> +  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("b+c"))}\n' -r'reverse(null:2)'
> +  2 c: 1
> +  1 b: 
> +  0 a: 
> +  -1 : 
> +
> +Nothing excluded:
> +
> +  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("null:wdir()"))}\n' -r'reverse(null:wdir())'
> +  2147483647 : -1
> +  11 l tip: 10 9
> +  10 i: 6 8
> +  9 k: 5 7
> +  8 g: 5
> +  7 j: 3
> +  6 h: 4
> +  5 e: 3
> +  4 f: 2
> +  3 d: 0 2
> +  2 c: 1
> +  1 b: -1
> +  0 a: -1
> +  -1 : -1
> +
> +Uncachable query:
> +
> +  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("%d:%d", rev, rev - 1))}\n'
> +  o    11 l tip: 10
> +  |\
> +  | o    10 i:
> +  | |\
> +  o \ \    9 k:
> +  |\ \ \
> +  +-----o  8 g:
> +  | | |
> +  | o |  7 j:
> +  | | |
> +  | | o  6 h:
> +  | | |
> +  o | |  5 e:
> +  |/ /
> +  | o  4 f:
> +  | |
> +  o |  3 d: 2
> +  |\|
> +  | o  2 c: 1
> +  | |
> +  | o  1 b:
> +  |
> +  o  0 a: -1
> +  
> +
> +Invalid arguments:
> +
> +  $ hg log -T '{subsetparents()}\n'
> +  hg: parse error: subsetparents expects two arguments
> +  [255]
> +  $ hg log -T '{subsetparents("a")}\n'
> +  hg: parse error: subsetparents expects two arguments
> +  [255]
> +  $ hg log -T '{subsetparents(rev, extras)}\n'
> +  hg: parse error: subsetparents expects a queried revset
> +  [255]
> +
> +  $ cd ..
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - March 25, 2020, 3:54 p.m.
On Tue, 24 Mar 2020 18:16:45 -0400, Augie Fackler wrote:
> > +            # set up or propagate tracking pointers if
> > +            # - one of the parents is not active,
> > +            # - or descendants' parents are unresolved.
> > +            if not bothparentsactive or unresolved or resolved:
> > +                if len(parentrevs) > 1:
> > +                    # 'rev' is a merge revision. increment the pending count
> > +                    # as the 'unresolved' dict will be duplicated.
> > +                    for r in unresolved:
> > +                        pendingcnt[r] += 1
> > +                reusable = True  # can we avoid copying the tracking pointer?
> > +                for i, p in enumerate(parentrevs):
> > +                    assert p < rev
> > +                    heapq.heappush(tovisit, -p)
> > +                    if p in pointers:
> > +                        # 'p' is a fork revision. concatenate tracking pointers
> > +                        # and decrement the pending count accordingly.
> > +                        knownunresolved, knownresolved = pointers[p]
> > +                        for r, c in unresolved.items():
> > +                            c += [b'1', b'2'][i]

I noticed tracking p1/p2 chain here isn't correct since the current rev
is not a merge point but a fork point. I'll send a follow up patch later.

Anyway, thanks for the review!

Patch

diff --git a/mercurial/dagop.py b/mercurial/dagop.py
--- a/mercurial/dagop.py
+++ b/mercurial/dagop.py
@@ -274,6 +274,238 @@  def descendantrevs(revs, revsfn, parentr
                 break
 
 
+class subsetparentswalker(object):
+    r"""Scan adjacent ancestors in the graph given by the subset
+
+    This computes parent-child relations in the sub graph filtered by
+    a revset. Primary use case is to draw a revisions graph.
+
+    In the following example, we consider that the node 'f' has edges to all
+    ancestor nodes, but redundant paths are eliminated. The edge 'f'->'b'
+    is eliminated because there is a path 'f'->'c'->'b' for example.
+
+          - d - e -
+         /         \
+        a - b - c - f
+
+    If the node 'c' is filtered out, the edge 'f'->'b' is activated.
+
+          - d - e -
+         /         \
+        a - b -(c)- f
+
+    Likewise, if 'd' and 'e' are filtered out, this edge is fully eliminated
+    since there is a path 'f'->'c'->'b'->'a' for 'f'->'a'.
+
+           (d) (e)
+
+        a - b - c - f
+
+    Implementation-wise, 'f' is passed down to 'a' as unresolved through the
+    'f'->'e'->'d'->'a' path, whereas we do also remember that 'f' has already
+    been resolved while walking down the 'f'->'c'->'b'->'a' path. When
+    processing the node 'a', the unresolved 'f'->'a' path is eliminated as
+    the 'f' end is marked as resolved.
+
+    Ancestors are searched from the tipmost revision in the subset so the
+    results can be cached. You should specify startrev to narrow the search
+    space to ':startrev'.
+    """
+
+    def __init__(self, repo, subset, startrev=None):
+        if startrev is not None:
+            subset = repo.revs(b'%d:null', startrev) & subset
+
+        # equivalent to 'subset = subset.sorted(reverse=True)', but there's
+        # no such function.
+        fastdesc = subset.fastdesc
+        if fastdesc:
+            desciter = fastdesc()
+        else:
+            if not subset.isdescending() and not subset.istopo():
+                subset = smartset.baseset(subset)
+                subset.sort(reverse=True)
+            desciter = iter(subset)
+
+        self._repo = repo
+        self._changelog = repo.changelog
+        self._subset = subset
+
+        # scanning state (see _scanparents):
+        self._tovisit = []
+        self._pendingcnt = {}
+        self._pointers = {}
+        self._parents = {}
+        self._inputhead = nullrev  # reassigned by self._advanceinput()
+        self._inputtail = desciter
+        self._bottomrev = nullrev
+        self._advanceinput()
+
+    def parentsset(self, rev):
+        """Look up parents of the given revision in the subset, and returns
+        as a smartset"""
+        return smartset.baseset(self.parents(rev))
+
+    def parents(self, rev):
+        """Look up parents of the given revision in the subset
+
+        The returned revisions are sorted by parent index (p1/p2).
+        """
+        self._scanparents(rev)
+        return [r for _c, r in sorted(self._parents.get(rev, []))]
+
+    def _parentrevs(self, rev):
+        try:
+            revs = self._changelog.parentrevs(rev)
+            if revs[-1] == nullrev:
+                return revs[:-1]
+            return revs
+        except error.WdirUnsupported:
+            return tuple(pctx.rev() for pctx in self._repo[None].parents())
+
+    def _advanceinput(self):
+        """Advance the input iterator and set the next revision to _inputhead"""
+        if self._inputhead < nullrev:
+            return
+        try:
+            self._inputhead = next(self._inputtail)
+        except StopIteration:
+            self._bottomrev = self._inputhead
+            self._inputhead = nullrev - 1
+
+    def _scanparents(self, stoprev):
+        """Scan ancestors until the parents of the specified stoprev are
+        resolved"""
+
+        # 'tovisit' is the queue of the input revisions and their ancestors.
+        # It will be populated incrementally to minimize the initial cost
+        # of computing the given subset.
+        #
+        # For to-visit revisions, we keep track of
+        # - the number of the unresolved paths: pendingcnt[rev],
+        # - dict of the unresolved descendants and chains: pointers[rev][0],
+        # - set of the already resolved descendants: pointers[rev][1].
+        #
+        # When a revision is visited, 'pointers[rev]' should be popped and
+        # propagated to its parents accordingly.
+        #
+        # Once all pending paths have been resolved, 'pendingcnt[rev]' becomes
+        # 0 and 'parents[rev]' contains the unsorted list of parent revisions
+        # and p1/p2 chains (excluding linear paths.)
+
+        subset = self._subset
+        tovisit = self._tovisit  # heap queue of [-rev]
+        pendingcnt = self._pendingcnt  # {rev: count} for visited revisions
+        pointers = self._pointers  # {rev: [{unresolved_rev: chain}, resolved]}
+        parents = self._parents  # {rev: [(chain, rev)]}
+
+        while tovisit or self._inputhead >= nullrev:
+            if pendingcnt.get(stoprev) == 0:
+                return
+
+            # feed greater revisions from input set to queue
+            if not tovisit:
+                heapq.heappush(tovisit, -self._inputhead)
+                self._advanceinput()
+            while self._inputhead >= -tovisit[0]:
+                heapq.heappush(tovisit, -self._inputhead)
+                self._advanceinput()
+
+            rev = -heapq.heappop(tovisit)
+            if rev < self._bottomrev:
+                return
+            if rev in pendingcnt and rev not in pointers:
+                continue  # already visited
+
+            curactive = rev in subset
+            pendingcnt.setdefault(rev, 0)  # mark as visited
+            if curactive:
+                assert rev not in parents
+                parents[rev] = []
+            unresolved, resolved = pointers.pop(rev, ({}, set()))
+
+            if curactive:
+                # reached to active rev, resolve pending descendants' parents
+                for r, c in unresolved.items():
+                    pendingcnt[r] -= 1
+                    assert pendingcnt[r] >= 0
+                    if r in resolved:
+                        continue  # eliminate redundant path
+                    parents[r].append((c, rev))
+                    # mark the descendant 'r' as resolved through this path if
+                    # there are still pending pointers. the 'resolved' set may
+                    # be concatenated later at a fork revision.
+                    if pendingcnt[r] > 0:
+                        resolved.add(r)
+                unresolved.clear()
+                # occasionally clean resolved markers. otherwise the set
+                # would grow indefinitely.
+                resolved = {r for r in resolved if pendingcnt[r] > 0}
+
+            parentrevs = self._parentrevs(rev)
+            bothparentsactive = all(p in subset for p in parentrevs)
+
+            # set up or propagate tracking pointers if
+            # - one of the parents is not active,
+            # - or descendants' parents are unresolved.
+            if not bothparentsactive or unresolved or resolved:
+                if len(parentrevs) > 1:
+                    # 'rev' is a merge revision. increment the pending count
+                    # as the 'unresolved' dict will be duplicated.
+                    for r in unresolved:
+                        pendingcnt[r] += 1
+                reusable = True  # can we avoid copying the tracking pointer?
+                for i, p in enumerate(parentrevs):
+                    assert p < rev
+                    heapq.heappush(tovisit, -p)
+                    if p in pointers:
+                        # 'p' is a fork revision. concatenate tracking pointers
+                        # and decrement the pending count accordingly.
+                        knownunresolved, knownresolved = pointers[p]
+                        for r, c in unresolved.items():
+                            c += [b'1', b'2'][i]
+                            if r in knownunresolved:
+                                # unresolved at both paths
+                                pendingcnt[r] -= 1
+                                assert pendingcnt[r] > 0
+                                # take shorter chain
+                                knownunresolved[r] = min(c, knownunresolved[r])
+                            else:
+                                knownunresolved[r] = c
+                        # simply propagate the 'resolved' set as deduplicating
+                        # 'unresolved' here would be slightly complicated.
+                        knownresolved.update(resolved)
+                    elif reusable:
+                        pointers[p] = (unresolved, resolved)
+                        reusable = False
+                    else:
+                        pointers[p] = (unresolved.copy(), resolved.copy())
+
+            # then, populate the active parents directly and add the current
+            # 'rev' to the tracking pointers of the inactive parents.
+            # 'pointers[p]' may be optimized out if both parents are active.
+            if curactive and bothparentsactive:
+                for i, p in enumerate(parentrevs):
+                    c = [b'1', b'2'][i]
+                    parents[rev].append((c, p))
+                    # no need to mark 'rev' as resolved since the 'rev' should
+                    # be fully resolved (i.e. pendingcnt[rev] == 0)
+                assert pendingcnt[rev] == 0
+            elif curactive:
+                for i, p in enumerate(parentrevs):
+                    unresolved, resolved = pointers[p]
+                    assert rev not in unresolved
+                    c = [b'1', b'2'][i]
+                    if p in subset:
+                        parents[rev].append((c, p))
+                        # mark 'rev' as resolved through this path
+                        resolved.add(rev)
+                    else:
+                        pendingcnt[rev] += 1
+                        unresolved[rev] = c
+                assert 0 < pendingcnt[rev] <= 2
+
+
 def _reachablerootspure(pfunc, minroot, roots, heads, includepath):
     """See revlog.reachableroots"""
     if not roots:
diff --git a/mercurial/templatefuncs.py b/mercurial/templatefuncs.py
--- a/mercurial/templatefuncs.py
+++ b/mercurial/templatefuncs.py
@@ -16,6 +16,7 @@  from .node import (
 )
 from . import (
     color,
+    dagop,
     diffutil,
     encoding,
     error,
@@ -842,6 +843,45 @@  def startswith(context, mapping, args):
     return b''
 
 
+@templatefunc(
+    b'subsetparents(rev, revset)',
+    argspec=b'rev revset',
+    requires={b'repo', b'cache'},
+)
+def subsetparents(context, mapping, args):
+    """Look up parents of the rev in the sub graph given by the revset."""
+    if b'rev' not in args or b'revset' not in args:
+        # i18n: "subsetparents" is a keyword
+        raise error.ParseError(_(b"subsetparents expects two arguments"))
+
+    repo = context.resource(mapping, b'repo')
+
+    rev = templateutil.evalinteger(context, mapping, args[b'rev'])
+
+    # TODO: maybe subsetparents(rev) should be allowed. the default revset
+    # will be the revisions specified by -rREV argument.
+    q = templateutil.evalwrapped(context, mapping, args[b'revset'])
+    if not isinstance(q, templateutil.revslist):
+        # i18n: "subsetparents" is a keyword
+        raise error.ParseError(_(b"subsetparents expects a queried revset"))
+    subset = q.tovalue(context, mapping)
+    key = q.cachekey
+
+    if key:
+        # cache only if revset query isn't dynamic
+        cache = context.resource(mapping, b'cache')
+        walkercache = cache.setdefault(b'subsetparentswalker', {})
+        if key in walkercache:
+            walker = walkercache[key]
+        else:
+            walker = dagop.subsetparentswalker(repo, subset)
+            walkercache[key] = walker
+    else:
+        # for one-shot use, specify startrev to limit the search space
+        walker = dagop.subsetparentswalker(repo, subset, startrev=rev)
+    return templateutil.revslist(repo, walker.parentsset(rev))
+
+
 @templatefunc(b'word(number, text[, separator])')
 def word(context, mapping, args):
     """Return the nth word from a string."""
diff --git a/tests/test-template-graph.t b/tests/test-template-graph.t
new file mode 100644
--- /dev/null
+++ b/tests/test-template-graph.t
@@ -0,0 +1,337 @@ 
+Test graph-related template functions
+=====================================
+
+  $ cat <<'EOF' >> $HGRCPATH
+  > [extensions]
+  > drawdag = $RUNTESTDIR/drawdag.py
+  > EOF
+
+  $ hg init a
+  $ cd a
+
+  $ hg debugdrawdag <<'EOF'
+  >   l
+  >  / \
+  > |   k
+  > |   |\
+  > |   | j
+  > |   | |
+  > i   | |
+  > |\  | |
+  > h | | |
+  > | | | |
+  > | g | |
+  > | | | |
+  > f | | |
+  > | |/ /
+  > | e |
+  > | |/
+  > | d
+  > |/|
+  > c |
+  > | |
+  > b |
+  >   |
+  >   a
+  > EOF
+
+  $ hg log -Gq -T'{rev} {tags}\n'
+  o    11 l tip
+  |\
+  | o    10 i
+  | |\
+  o \ \    9 k
+  |\ \ \
+  +-----o  8 g
+  | | |
+  | o |  7 j
+  | | |
+  | | o  6 h
+  | | |
+  o | |  5 e
+  |/ /
+  | o  4 f
+  | |
+  o |  3 d
+  |\|
+  | o  2 c
+  | |
+  | o  1 b
+  |
+  o  0 a
+  
+
+  $ cd ..
+
+subsetparents
+-------------
+
+  $ cd a
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+i"))}\n' -r 'c+i'
+  o  10 i: 2
+  :
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+i"))}\n' -r 'c+h+i'
+  o    10 i: 6
+  |\
+  o :  6 h: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+l"))}\n' -r 'c+h+l'
+  o    11 l tip: 6
+  :\
+  : o  6 h: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+f+l"))}\n' -r 'c+f+l'
+  o    11 l tip: 4
+  :\
+  : o  4 f: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+i+k"))}\n' -r 'c+h+i+k'
+  o    10 i: 6
+  |\
+  | : o  9 k: 2
+  | :/
+  o :  6 h: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+d+h+i+k"))}\n' -r 'c+d+h+i+k'
+  o    10 i: 6 3
+  |\
+  | : o  9 k: 3
+  | :/
+  o :  6 h: 2
+  : :
+  : o  3 d: 2
+  :/|
+  : ~
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+j+k+i"))}\n' -r 'c+j+k+i'
+  o  10 i: 2
+  :
+  : o  9 k: 7
+  :/|
+  : o  7 j: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+e+f+j"))}\n' -r 'c+e+f+j'
+  o  7 j: 2
+  :
+  : o  5 e: 2
+  :/
+  : o  4 f: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+e+f+j"))}\n' -r 'b+e+f+j'
+  o  7 j: 1
+  :
+  : o  5 e: 1
+  :/
+  : o  4 f: 1
+  :/
+  o  1 b:
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("a+c+f+g+j+l"))}\n' -r 'a+c+f+g+j+l'
+  o    11 l tip: 4 8 7
+  :\
+  : \
+  : :\
+  : : \
+  : : :\
+  : : : \
+  : : : :\
+  : o---+ :  8 g: 0 2
+  : :/ / /
+  : +---o  7 j: 0 2
+  : : :/
+  o---+  4 f: 2
+   / /
+  : o  2 c:
+  : |
+  : ~
+  o  0 a:
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+i+l"))}\n' -r 'b+i+l'
+  o    11 l tip: 10
+  |\
+  o :  10 i: 1
+  :/
+  o  1 b:
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+i+j+l"))}\n' -r 'b+i+j+l'
+  o    11 l tip: 10 7
+  |\
+  | \
+  | :\
+  o : :  10 i: 1
+  :/ /
+  : o  7 j: 1
+  :/
+  o  1 b:
+  
+
+null in subset:
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("null+a+c+f"))}\n' -r 'null+a+c+f'
+  o  4 f: 2
+  |
+  o  2 c: -1
+  :
+  : o  0 a: -1
+  :/
+  @  -1 : -1
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("null+a+b+c+f"))}\n' -r 'null+a+b+c+f'
+  o  4 f: 2
+  |
+  o  2 c: 1
+  |
+  o  1 b: -1
+  |
+  | o  0 a: -1
+  |/
+  @  -1 : -1
+  
+
+wdir in subset:
+
+  $ hg update -qC i
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("f+k+wdir()"))}\n' -r 'f+k+wdir()'
+  o  2147483647 : 4
+  :
+  : o    9 k:
+  : |\
+  : ~ ~
+  o  4 f:
+  |
+  ~
+
+  $ hg update -qC null
+
+Revisions not in subset:
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("a+c+f+g+j+l"))}\n'
+  11 l tip: 4 8 7
+  10 i: 
+  9 k: 
+  8 g: 0 2
+  7 j: 0 2
+  6 h: 
+  5 e: 
+  4 f: 2
+  3 d: 
+  2 c: 
+  1 b: 
+  0 a: 
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("b+c"))}\n'
+  11 l tip: 
+  10 i: 
+  9 k: 
+  8 g: 
+  7 j: 
+  6 h: 
+  5 e: 
+  4 f: 
+  3 d: 
+  2 c: 1
+  1 b: 
+  0 a: 
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("b+c"))}\n' -r'reverse(null:2)'
+  2 c: 1
+  1 b: 
+  0 a: 
+  -1 : 
+
+Nothing excluded:
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("null:wdir()"))}\n' -r'reverse(null:wdir())'
+  2147483647 : -1
+  11 l tip: 10 9
+  10 i: 6 8
+  9 k: 5 7
+  8 g: 5
+  7 j: 3
+  6 h: 4
+  5 e: 3
+  4 f: 2
+  3 d: 0 2
+  2 c: 1
+  1 b: -1
+  0 a: -1
+  -1 : -1
+
+Uncachable query:
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("%d:%d", rev, rev - 1))}\n'
+  o    11 l tip: 10
+  |\
+  | o    10 i:
+  | |\
+  o \ \    9 k:
+  |\ \ \
+  +-----o  8 g:
+  | | |
+  | o |  7 j:
+  | | |
+  | | o  6 h:
+  | | |
+  o | |  5 e:
+  |/ /
+  | o  4 f:
+  | |
+  o |  3 d: 2
+  |\|
+  | o  2 c: 1
+  | |
+  | o  1 b:
+  |
+  o  0 a: -1
+  
+
+Invalid arguments:
+
+  $ hg log -T '{subsetparents()}\n'
+  hg: parse error: subsetparents expects two arguments
+  [255]
+  $ hg log -T '{subsetparents("a")}\n'
+  hg: parse error: subsetparents expects two arguments
+  [255]
+  $ hg log -T '{subsetparents(rev, extras)}\n'
+  hg: parse error: subsetparents expects a queried revset
+  [255]
+
+  $ cd ..