Patchwork D8125: transactions: convert changes['phases'] to list of ranges

login
register
mail settings
Submitter phabricator
Date Feb. 15, 2020, 9:32 p.m.
Message ID <differential-rev-PHID-DREV-izjut6qw7736aj53n5qk-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/45263/
State Superseded
Headers show

Comments

phabricator - Feb. 15, 2020, 9:32 p.m.
joerg.sonnenberger created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Consecutive revisions are often in the same phase, especially public
  revisions. This means that a dictionary keyed by the revision for the
  phase transitions is highly redundant. Build a list of (range, (old,
  new)) entries instead and aggressively merge ranges with the same
  transition. For the test case in issue5691, this reduces memory use by
  ~20MB.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8125

AFFECTED FILES
  mercurial/localrepo.py
  mercurial/phases.py
  mercurial/scmutil.py
  tests/testlib/ext-phase-report.py

CHANGE DETAILS




To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-devel
phabricator - Feb. 17, 2020, 10:49 p.m.
marmoute added a comment.


  I've a couple of high level question:
  
  - This reduce the memory usage of 20MB compare to what total usage ?
  - Did you asses the performance impact of this? and if so, what is it?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8125/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8125

To: joerg.sonnenberger, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Feb. 17, 2020, 11:28 p.m.
joerg.sonnenberger added a comment.


  I haven't measured run-time impact. The sorting should ensure that the lists are normally kept small, but when many updates apply to very fragmented repositories, it could be worse. 20MB is relative to the 600MB peak RSS from the referenced issue.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8125/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8125

To: joerg.sonnenberger, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - March 3, 2020, 6:30 p.m.
This revision now requires changes to proceed.
durin42 added a comment.
durin42 requested changes to this revision.


  Overall seems fine, some structural nits on the code.

INLINE COMMENTS

> phases.py:232
> +
> +    def insert(data, idx, rev, t):
> +        merge_before = False

It doesn't look like there's a good reason for this function to be nested like this. Can we move it to module-level with an underscore prefix?

> phases.py:252
> +
> +    def split_range(data, idx, rev, t):
> +        r1, t1 = data[idx]

here too

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8125/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8125

To: joerg.sonnenberger, #hg-reviewers, durin42
Cc: durin42, marmoute, mercurial-devel

Patch

diff --git a/tests/testlib/ext-phase-report.py b/tests/testlib/ext-phase-report.py
--- a/tests/testlib/ext-phase-report.py
+++ b/tests/testlib/ext-phase-report.py
@@ -5,21 +5,22 @@ 
 
 def reposetup(ui, repo):
     def reportphasemove(tr):
-        for rev, move in sorted(tr.changes[b'phases'].items()):
-            if move[0] is None:
-                ui.write(
-                    (
-                        b'test-debug-phase: new rev %d:  x -> %d\n'
-                        % (rev, move[1])
+        for revs, move in sorted(tr.changes[b"phases"], key=lambda r: r[0][0]):
+            for rev in revs:
+                if move[0] is None:
+                    ui.write(
+                        (
+                            b'test-debug-phase: new rev %d:  x -> %d\n'
+                            % (rev, move[1])
+                        )
                     )
-                )
-            else:
-                ui.write(
-                    (
-                        b'test-debug-phase: move rev %d: %d -> %d\n'
-                        % (rev, move[0], move[1])
+                else:
+                    ui.write(
+                        (
+                            b'test-debug-phase: move rev %d: %d -> %d\n'
+                            % (rev, move[0], move[1])
+                        )
                     )
-                )
 
     class reportphaserepo(repo.__class__):
         def transaction(self, *args, **kwargs):
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -2047,14 +2047,11 @@ 
             pull/unbundle.
             """
             origrepolen = tr.changes.get(b'origrepolen', len(repo))
-            phasetracking = tr.changes.get(b'phases', {})
-            if not phasetracking:
-                return
-            published = [
-                rev
-                for rev, (old, new) in pycompat.iteritems(phasetracking)
-                if new == phases.public and rev < origrepolen
-            ]
+            published = []
+            for revs, (old, new) in tr.changes.get(b'phases', []):
+                if new != phases.public:
+                    continue
+                published.extend(rev for rev in revs if rev < origrepolen)
             if not published:
                 return
             repo.ui.status(
diff --git a/mercurial/phases.py b/mercurial/phases.py
--- a/mercurial/phases.py
+++ b/mercurial/phases.py
@@ -217,16 +217,98 @@ 
 
 
 def _trackphasechange(data, rev, old, new):
-    """add a phase move the <data> dictionnary
+    """add a phase move to the <data> list of ranges
 
     If data is None, nothing happens.
     """
     if data is None:
         return
-    existing = data.get(rev)
-    if existing is not None:
-        old = existing[0]
-    data[rev] = (old, new)
+
+    # If data is empty, create a one-revision range and done
+    if not data:
+        data.insert(0, (pycompat.xrange(rev, rev + 1), (old, new)))
+        return
+
+    def insert(data, idx, rev, t):
+        merge_before = False
+        if idx:
+            r1, t1 = data[idx - 1]
+            merge_before = r1[-1] + 1 == rev and t1 == t
+        merge_after = False
+        if idx < len(data):
+            r2, t2 = data[idx]
+            merge_after = r2[0] == rev + 1 and t2 == t
+
+        if merge_before and merge_after:
+            data[idx - 1] = (pycompat.xrange(r1[0], r2[-1] + 1), t)
+            data.pop(idx)
+        elif merge_before:
+            data[idx - 1] = (pycompat.xrange(r1[0], rev + 1), t)
+        elif merge_after:
+            data[idx] = (pycompat.xrange(rev, r2[-1] + 1), t)
+        else:
+            data.insert(idx, (pycompat.xrange(rev, rev + 1), t))
+
+    def split_range(data, idx, rev, t):
+        r1, t1 = data[idx]
+        if t == t1:
+            return
+        t = (t1[0], t[1])
+        if len(r1) == 1:
+            data.pop(idx)
+            insert(data, idx, rev, t)
+        elif r1[0] == rev:
+            data[idx] = (pycompat.xrange(rev + 1, r1[-1] + 1), t1)
+            insert(data, idx, rev, t)
+        elif r1[-1] == rev:
+            data[idx] = (pycompat.xrange(r1[0], rev), t1)
+            insert(data, idx + 1, rev, t)
+        else:
+            data[idx : idx + 1] = [
+                (pycompat.xrange(r1[0], rev), t1),
+                (pycompat.xrange(rev, rev + 1), t),
+                (pycompat.xrange(rev + 1, r1[-1] + 1), t1),
+            ]
+
+    low = 0
+    high = len(data)
+    t = (old, new)
+    while low < high:
+        mid = (low + high) // 2
+        revs = data[mid][0]
+
+        if rev in revs:
+            split_range(data, mid, rev, t)
+            return
+
+        if revs[0] == rev + 1:
+            if mid and data[mid - 1][0][-1] == rev:
+                split_range(data, mid - 1, rev, t)
+            else:
+                insert(data, mid, rev, t)
+            return
+
+        if revs[-1] == rev - 1:
+            if mid + 1 < len(data) and data[mid + 1][0][0] == rev:
+                split_range(data, mid + 1, rev, t)
+            else:
+                insert(data, mid + 1, rev, t)
+            return
+
+        if revs[0] > rev:
+            high = mid
+        else:
+            low = mid + 1
+
+    if low == len(data):
+        data.append((pycompat.xrange(rev, rev + 1), t))
+        return
+
+    r1, t1 = data[low]
+    if r1[0] > rev:
+        data.insert(low, (pycompat.xrange(rev, rev + 1), t))
+    else:
+        data.insert(low + 1, (pycompat.xrange(rev, rev + 1), t))
 
 
 class phasecache(object):
@@ -400,8 +482,9 @@ 
             phasetracking = tr.changes[b'phases']
             torev = repo.changelog.rev
             phase = self.phase
-            for n in nodes:
-                rev = torev(n)
+            revs = [torev(node) for node in nodes]
+            revs.sort()
+            for rev in revs:
                 revphase = phase(repo, rev)
                 _trackphasechange(phasetracking, rev, None, revphase)
         repo.invalidatevolatilesets()
@@ -485,7 +568,7 @@ 
                     affected -= revs
                 else:  # public phase
                     revs = affected
-                for r in revs:
+                for r in sorted(revs):
                     _trackphasechange(phasetracking, r, phase, targetphase)
         repo.invalidatevolatilesets()
 
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -2179,15 +2179,16 @@ 
                     )
             if hook.hashook(repo.ui, b'pretxnclose-phase'):
                 cl = repo.unfiltered().changelog
-                for rev, (old, new) in tr.changes[b'phases'].items():
-                    args = tr.hookargs.copy()
-                    node = hex(cl.node(rev))
-                    args.update(phases.preparehookargs(node, old, new))
-                    repo.hook(
-                        b'pretxnclose-phase',
-                        throw=True,
-                        **pycompat.strkwargs(args)
-                    )
+                for revs, (old, new) in tr.changes[b'phases']:
+                    for rev in revs:
+                        args = tr.hookargs.copy()
+                        node = hex(cl.node(rev))
+                        args.update(phases.preparehookargs(node, old, new))
+                        repo.hook(
+                            b'pretxnclose-phase',
+                            throw=True,
+                            **pycompat.strkwargs(args)
+                        )
 
             repo.hook(
                 b'pretxnclose', throw=True, **pycompat.strkwargs(tr.hookargs)
@@ -2232,7 +2233,7 @@ 
         )
         tr.changes[b'origrepolen'] = len(self)
         tr.changes[b'obsmarkers'] = set()
-        tr.changes[b'phases'] = {}
+        tr.changes[b'phases'] = []
         tr.changes[b'bookmarks'] = {}
 
         tr.hookargs[b'txnid'] = txnid
@@ -2266,16 +2267,19 @@ 
 
                 if hook.hashook(repo.ui, b'txnclose-phase'):
                     cl = repo.unfiltered().changelog
-                    phasemv = sorted(tr.changes[b'phases'].items())
-                    for rev, (old, new) in phasemv:
-                        args = tr.hookargs.copy()
-                        node = hex(cl.node(rev))
-                        args.update(phases.preparehookargs(node, old, new))
-                        repo.hook(
-                            b'txnclose-phase',
-                            throw=False,
-                            **pycompat.strkwargs(args)
-                        )
+                    phasemv = sorted(
+                        tr.changes[b'phases'], key=lambda r: r[0][0]
+                    )
+                    for revs, (old, new) in phasemv:
+                        for rev in revs:
+                            args = tr.hookargs.copy()
+                            node = hex(cl.node(rev))
+                            args.update(phases.preparehookargs(node, old, new))
+                            repo.hook(
+                                b'txnclose-phase',
+                                throw=False,
+                                **pycompat.strkwargs(args)
+                            )
 
                 repo.hook(
                     b'txnclose', throw=False, **pycompat.strkwargs(hookargs)