Submitter | Boris Feld |
---|---|
Date | Sept. 7, 2018, 3:04 p.m. |
Message ID | <eb80c721aea9715e23dc.1536332645@localhost.localdomain> |
Download | mbox | patch |
Permalink | /patch/34393/ |
State | Accepted |
Headers | show |
Comments
On Fri, Sep 7, 2018 at 8:08 AM Boris Feld <boris.feld@octobus.net> wrote: > # HG changeset patch > # User Boris Feld <boris.feld@octobus.net> > # Date 1536267628 14400 > # Thu Sep 06 17:00:28 2018 -0400 > # Node ID eb80c721aea9715e23dc35cdd119428aa120ea93 > # Parent ab452995eafffa69c34e863e4d8c03e163d8f3ad > # EXP-Topic issue5979 > # Available At https://bitbucket.org/octobus/mercurial-devel/ > # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r > eb80c721aea9 > ancestors: actually iterate over ancestors in topological order (issue5979) > > This code previously used a dequeue logic, the first ancestors seen were > the > first ancestors to be emitted. In branching/merging situations, it can > result > in ancestors being yielded before their descendants, breaking the object > contract. > > We got affected by this issue while working on the copy tracing code. At > about > the same time, Axel Hecht <axel@mozilla.com> reported the issue and > provided > the test case used in this changeset. Thanks Axel. > > Running `hg perfancestors` does not show a significant difference between > the > old and the new version. > In my hg repo, it went from 0.047230 to 0.093331, which is clearly significant. Maybe you made a last-minute change that made it slower? Or is my repo just different from yours?
On Sun, 9 Sep 2018 23:36:15 -0700, Martin von Zweigbergk via Mercurial-devel wrote: > On Fri, Sep 7, 2018 at 8:08 AM Boris Feld <boris.feld@octobus.net> wrote: > > # HG changeset patch > > # User Boris Feld <boris.feld@octobus.net> > > # Date 1536267628 14400 > > # Thu Sep 06 17:00:28 2018 -0400 > > # Node ID eb80c721aea9715e23dc35cdd119428aa120ea93 > > # Parent ab452995eafffa69c34e863e4d8c03e163d8f3ad > > # EXP-Topic issue5979 > > # Available At https://bitbucket.org/octobus/mercurial-devel/ > > # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r > > eb80c721aea9 > > ancestors: actually iterate over ancestors in topological order (issue5979) > > > > This code previously used a dequeue logic, the first ancestors seen were > > the > > first ancestors to be emitted. In branching/merging situations, it can > > result > > in ancestors being yielded before their descendants, breaking the object > > contract. > > > > We got affected by this issue while working on the copy tracing code. At > > about > > the same time, Axel Hecht <axel@mozilla.com> reported the issue and > > provided > > the test case used in this changeset. Thanks Axel. > > > > Running `hg perfancestors` does not show a significant difference between > > the > > old and the new version. > > > > In my hg repo, it went from 0.047230 to 0.093331, which is clearly > significant. Maybe you made a last-minute change that made it slower? Or is > my repo just different from yours? I got a similar result. The heapq version is ~2x slower than the original buggy implementation. It seems I made it to 1-1.5x slowness by optimizing out heappush/heappop for contiguous cases (i.e. overwrite visit[0] if p1 is known to be the next largest revision.) I'll send the patches if I can be sure that's correct.
Patch
diff --git a/mercurial/ancestor.py b/mercurial/ancestor.py --- a/mercurial/ancestor.py +++ b/mercurial/ancestor.py @@ -7,7 +7,6 @@ from __future__ import absolute_import -import collections import heapq from .node import nullrev @@ -305,14 +304,15 @@ class lazyancestors(object): """Generate the ancestors of _initrevs in reverse topological order. If inclusive is False, yield a sequence of revision numbers starting - with the parents of each revision in revs, i.e., each revision is *not* - considered an ancestor of itself. Results are in breadth-first order: - parents of each rev in revs, then parents of those, etc. + with the parents of each revision in revs, i.e., each revision is + *not* considered an ancestor of itself. Results are emitted in reverse + revision number order. That order is also topological: a child is + always emitted before its parent. If inclusive is True, yield all the revs first (ignoring stoprev), - then yield all the ancestors of revs as when inclusive is False. - If an element in revs is an ancestor of a different rev it is not - yielded again.""" + then yield all the ancestors of revs as when inclusive is False. If an + element in revs is an ancestor of a different rev it is not yielded + again.""" seen = set() revs = self._initrevs if self._inclusive: @@ -322,17 +322,27 @@ class lazyancestors(object): parentrevs = self._parentrevs stoprev = self._stoprev - visit = collections.deque(revs) + visit = [] + heapq.heapify(visit) + schedule = heapq.heappush + nextitem = heapq.heappop + see = seen.add + see(nullrev) - see = seen.add - schedule = visit.append + for r in revs: + for parent in parentrevs(r): + if parent not in seen: + schedule(visit, -parent) + see(parent) while visit: - for parent in parentrevs(visit.popleft()): - if parent >= stoprev and parent not in seen: - schedule(parent) - see(parent) - yield parent + current = -nextitem(visit) + if current >= stoprev: + yield current + for parent in parentrevs(current): + if parent not in seen: + schedule(visit, -parent) + see(parent) def __contains__(self, target): """Test whether target is an ancestor of self._initrevs.""" diff --git a/tests/test-ancestor.py.out b/tests/test-ancestor.py.out --- a/tests/test-ancestor.py.out +++ b/tests/test-ancestor.py.out @@ -3,16 +3,16 @@ membership: [] iteration: [] % lazy ancestor set for [11, 13], stoprev = 0, inclusive = False membership: [7, 8, 3, 4, 1, 0] -iteration: [3, 7, 8, 1, 4, 0, 2] +iteration: [8, 7, 4, 3, 2, 1, 0] % lazy ancestor set for [1, 3], stoprev = 0, inclusive = False membership: [1, 0] -iteration: [0, 1] +iteration: [1, 0] % lazy ancestor set for [11, 13], stoprev = 0, inclusive = True membership: [11, 13, 7, 8, 3, 4, 1, 0] -iteration: [11, 13, 3, 7, 8, 1, 4, 0, 2] +iteration: [11, 13, 8, 7, 4, 3, 2, 1, 0] % lazy ancestor set for [11, 13], stoprev = 6, inclusive = False membership: [7, 8] -iteration: [7, 8] +iteration: [8, 7] % lazy ancestor set for [11, 13], stoprev = 6, inclusive = True membership: [11, 13, 7, 8] -iteration: [11, 13, 7, 8] +iteration: [11, 13, 8, 7] diff --git a/tests/test-issue5979.t b/tests/test-issue5979.t new file mode 100644 --- /dev/null +++ b/tests/test-issue5979.t @@ -0,0 +1,34 @@ + $ hg init r1 + $ cd r1 + $ hg ci --config ui.allowemptycommit=true -m c0 + $ hg ci --config ui.allowemptycommit=true -m c1 + $ hg ci --config ui.allowemptycommit=true -m c2 + $ hg co -q 0 + $ hg ci --config ui.allowemptycommit=true -m c3 + created new head + $ hg co -q 3 + $ hg merge --quiet + $ hg ci --config ui.allowemptycommit=true -m c4 + + $ hg log -G -T'{desc}' + @ c4 + |\ + | o c3 + | | + o | c2 + | | + o | c1 + |/ + o c0 + + + >>> from mercurial.ui import ui + >>> from mercurial.hg import repository + >>> repo = repository(ui()) + >>> for anc in repo.changelog.ancestors([4], inclusive=True): + ... print(anc) + 4 + 3 + 2 + 1 + 0 diff --git a/tests/test-revlog-ancestry.py.out b/tests/test-revlog-ancestry.py.out --- a/tests/test-revlog-ancestry.py.out +++ b/tests/test-revlog-ancestry.py.out @@ -1,13 +1,13 @@ Ancestors of 5 4 2 0 Ancestors of 6 and 5 -3 4 2 1 0 +4 3 2 1 0 Ancestors of 5 and 4 4 2 0 Ancestors of 7, stop at 6 6 Ancestors of 7, including revs -7 6 5 3 4 2 1 0 +7 6 5 4 3 2 1 0 Ancestors of 7, 5 and 3, including revs 7 5 3 6 4 2 1 0