Patchwork [4,of,4] rebase: sort rebaseset using destination information

login
register
mail settings
Submitter Jun Wu
Date June 24, 2017, 5:04 a.m.
Message ID <2f4532b4dd510d358a38.1498280683@x1c>
Download mbox | patch
Permalink /patch/21661/
State Changes Requested
Headers show

Comments

Jun Wu - June 24, 2017, 5:04 a.m.
# HG changeset patch
# User Jun Wu <quark@fb.com>
# Date 1498280490 25200
#      Fri Jun 23 22:01:30 2017 -0700
# Node ID 2f4532b4dd510d358a3803743ff04ae4390adf30
# Parent  707de31cfae074a767511bd51a197afc7c599323
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r 2f4532b4dd51
rebase: sort rebaseset using destination information

Previously rebase only uses changelog topological order and rebase source
and destination could not overlap. With mult-destination support, source and
destination could reasonably partially overlap. That requires another
topological sort on the {sourcerev: destrev} graph. This patch implements
that.

A "Resolve instability" test was added to demonstrate how to use
"uniquesuccessor" revset and rebase to resolve instability. It is similar to
a killer feature provided by "hg evolve" implemented externally. But the
revset is more flexible, using the same test case, "hg evolve" as of today,
will move C on top of B2, while the demonstrated revset moves C on top of J,
which could be more expected in some work flows.

Note: the "uniquesuccessor" revset is for demonstrating purpose. To
implement a formal revset for solving instability, we need to take more
things into consideration, like split. So there is still some work to define
good revsets to get "hg evolve"-like user experience. But the most important
change is now done in rebase.py.

Some error message has changed from "source is ancestor of destination" to
"source and destination form a cycle". The new error message is also correct
and detecting "cycle" is cheaper and easier than "ancestor of destination"
because the later requires changelog and may require runtime effort to
figure out what the changelog DAG looks like. Note: if we have some cheap
in-memory-only mutable changelog, we can dry-run the rebase and do the
"ancestor" check, but it's still more expensive than the "cycle" check.

The experimental "rebasemultidest" flag is now turned on by default.

Patch

diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -361,5 +361,5 @@  class rebaseruntime(object):
 
     def _performrebase(self, tr):
-        repo, ui, opts = self.repo, self.ui, self.opts
+        repo, ui = self.repo, self.ui
         if self.keepbranchesf:
             # insert _savebranch at the start of extrafns so if
@@ -385,8 +385,15 @@  class rebaseruntime(object):
         self.storestatus()
 
-        sortedrevs = repo.revs('sort(%ld, -topo)', self.state)
         cands = [k for k, v in self.state.iteritems() if v == revtodo]
         total = len(cands)
         pos = 0
+        for subset in sortsource(self.destmap):
+            pos = self._performrebasesubset(tr, subset, pos, total)
+        ui.progress(_('rebasing'), None)
+        ui.note(_('rebase merging completed\n'))
+
+    def _performrebasesubset(self, tr, subset, pos, total):
+        repo, ui, opts = self.repo, self.ui, self.opts
+        sortedrevs = repo.revs('sort(%ld, -topo)', subset)
         for rev in sortedrevs:
             dest = self.destmap[rev]
@@ -463,7 +470,5 @@  class rebaseruntime(object):
                 ui.status(_('already rebased %s as %s\n') %
                           (desc, repo[self.state[rev]]))
-
-        ui.progress(_('rebasing'), None)
-        ui.note(_('rebase merging completed\n'))
+        return pos
 
     def _finishrebase(self):
@@ -853,5 +858,5 @@  def _definesets(ui, repo, destf=None, sr
             # might have used "SRC" or "ALLSRC", raise if multidest feature is
             # not enabled
-            if not ui.configbool('experimental', 'rebasemultidest', False):
+            if not ui.configbool('experimental', 'rebasemultidest', True):
                 raise
 
@@ -988,13 +993,32 @@  def adjusteddest(repo, rev, prev, destma
 
     The destination will be adjusted from F to B'.
+
+    Besides, adjust dest according to existing rebase information. For example,
+
+      B C D    B wants to be rebased on top of C, C wants to be rebased on top
+       \|/     of D.
+        A
+
+          C'   After rebasing C, when considering B's dest, replace it with C'.
+          |
+      B   D
+       \ /
+        A
     """
     dest = destmap[rev]
-    if prev == nullrev:
-        return dest
-    # interesting rebase source - having a same dest and is rebased
-    source = [s for s, d in state.items() if d > 0 and destmap[s] == dest]
-    candidate = repo.revs('max(%ld and (::%d))', source, prev).first()
-    if candidate is not None:
-        return state[candidate]
+    if prev != nullrev:
+        # interesting rebase source - having a same dest and is rebased
+        source = [s for s, d in state.items() if d > 0 and destmap[s] == dest]
+        candidate = repo.revs('max(%ld and (::%d))', source, prev).first()
+        if candidate is not None:
+            return state[candidate]
+    if dest in state:
+        revstate = state[dest]
+        if revstate == revtodo:
+            # sortsource should produce an order that makes this impossible
+            raise error.ProgrammingError(
+                'rev %d should be rebased already at this time' % dest)
+        elif revstate > revtodo:
+            return revstate
     return dest
 
@@ -1090,4 +1114,10 @@  def defineparents(repo, rev, destmap, st
                             'result would have 3 parents') % rev)
 
+    # Source should not be ancestor of dest. The check here guarantees it's
+    # impossible. Since the check before running actual rebase does not cover
+    # complex cases.
+    if any(p != nullrev and ancestor(p, rev) == rev for p in newps):
+        raise error.Abort(_('source is ancestor of destination'))
+
     repo.ui.debug(" future parents are %d and %d\n" % tuple(newps))
 
@@ -1281,4 +1311,30 @@  def abort(repo, originalwd, destmap, sta
     return 0
 
+def sortsource(destmap):
+    """yield source revisions in an order that we only rebase things once
+
+    destmap is {srcrev: destrev}
+
+    If source and destination overlaps, we should filter out revisions
+    depending on other revisions which hasn't been rebased yet.
+
+    Yield a sorted list of revisions each time.
+
+    Raise if there is a cycle so we cannot rebase everything.
+    """
+    srcset = set(destmap)
+    while srcset:
+        srclist = sorted(srcset)
+        toremove = set()
+        result = []
+        for r in srclist:
+            if destmap[r] not in srcset:
+                toremove.add(r)
+                result.append(r)
+        srcset -= toremove
+        if not result:
+            raise error.Abort(_('source and destination form a cycle'))
+        yield result
+
 def buildstate(repo, destmap, rebaseset, collapse, obsoletenotrebased):
     '''Define which revisions are going to be rebased and where
@@ -1299,10 +1355,15 @@  def buildstate(repo, destmap, rebaseset,
             raise error.Abort(_('cannot rebase onto an applied mq patch'))
 
-    roots = list(repo.set('roots(%ld)', rebaseset))
-    if not roots:
+    # Expand the generator to list so we get "cycle" error early.
+    sortedsrc = list(sortsource(destmap)) # a list of list of revs
+    if not sortedsrc:
         raise error.Abort(_('no matching revisions'))
+
+    # Only check a subset of rebaseset which do not depend on other rebaseset.
+    # That's first "source batch" returned by sortsource.
+    roots = list(repo.set('roots(%ld)', sortedsrc[0]))
     roots.sort()
     state = dict.fromkeys(rebaseset, revtodo)
-    emptyrebase = True
+    emptyrebase = (len(sortedsrc) == 1)
     for root in roots:
         dest = repo[destmap[root.rev()]]
diff --git a/tests/drawdag.py b/tests/drawdag.py
--- a/tests/drawdag.py
+++ b/tests/drawdag.py
@@ -73,4 +73,9 @@  as expected - the result would be an iso
 This is because 'o' is specially handled in the input: instead of using 'o' as
 the node name, the word to the right will be used.
+
+Some special comments could have side effects:
+
+    - Create an obsmarker with precursor A, successors B and C
+      # obsmarker: A -> B, C
 """
 from __future__ import absolute_import, print_function
@@ -84,4 +89,5 @@  from mercurial import (
     error,
     node,
+    obsolete,
     registrar,
     scmutil,
@@ -315,2 +321,16 @@  def debugdrawdag(ui, repo, **opts):
         tagsmod.tag(repo, name, n, message=None, user=None, date=None,
                     local=True)
+
+    # handle special comments
+    with repo.lock():
+        for line in text.splitlines():
+            if ' # ' not in line:
+                continue
+            comment = line.split(' # ', 1)[1].split(' # ')[0].strip()
+            if comment.startswith('obsmarker: '):
+                markerstr = comment[len('obsmarker: '):]
+                precstr, succstr = [s.strip() for s in markerstr.split('->')]
+                prec = repo[committed[precstr]]
+                succs = [repo[committed[s.strip()]] for s in succstr.split(',')]
+                rels = [(prec, succs)]
+                obsolete.createmarkers(repo, rels, date=(0, 0))
diff --git a/tests/test-rebase-dest.t b/tests/test-rebase-dest.t
--- a/tests/test-rebase-dest.t
+++ b/tests/test-rebase-dest.t
@@ -201,5 +201,5 @@  Destination is an ancestor of source:
   > Z
   > EOS
-  abort: source is ancestor of destination
+  abort: source and destination form a cycle
   [255]
 
@@ -288,5 +288,5 @@  Move to a previous parent:
   o  0: A
   
-Source overlaps with destination (not handled well currently):
+Source overlaps with destination:
 
   $ rebasewithdag -s 'B+C+D' -d 'map(SRC, "B:C,C:D")' <<'EOS'
@@ -295,14 +295,191 @@  Source overlaps with destination (not ha
   >   A
   > EOS
+  rebasing 2:dc0947a82db8 "C" (C)
   rebasing 1:112478962961 "B" (B)
-  rebasing 2:dc0947a82db8 "C" (C)
-  o  5: C
+  o  5: B
+  |
+  o  4: C
+  |
+  o  3: D
+  |
+  o  0: A
+  
+Detect cycles early:
+
+  $ rebasewithdag -r 'all()-Z' -d 'map(SRC, "A:B,B:C,C:D,D:B")' <<'EOS'
+  > A B C
+  >  \|/
+  >   | D
+  >   |/
+  >   Z
+  > EOS
+  abort: source and destination form a cycle
+  [255]
+
+Detect source is ancestor of dest in runtime:
+
+  $ rebasewithdag -r 'C+B' -d 'map(SRC, "C:B,B:D")' -q <<'EOS'
+  >   D
+  >   |
+  > B C
+  >  \|
+  >   A
+  > EOS
+  transaction abort!
+  rollback completed
+  abort: source is ancestor of destination
+  [255]
+
+"Already rebase" fast path still works:
+
+  $ rebasewithdag -r 'all()' -d 'SRC^' <<'EOS'
+  >   E F
+  >  /| |
+  > B C D
+  >  \|/
+  >   A
+  > EOS
+  already rebased 1:112478962961 "B" (B)
+  already rebased 2:dc0947a82db8 "C" (C)
+  already rebased 3:b18e25de2cf5 "D" (D)
+  already rebased 4:48e1736f2484 "E" (E)
+  already rebased 5:ad6717a6a58e "F" (F tip)
+  o  5: F
   |
   o  3: D
   |
-  | o  4: B unstable
+  | o    4: E
+  | |\
+  +---o  2: C
   | |
-  | x  2: C
+  | o  1: B
   |/
   o  0: A
   
+Massively rewrite the DAG:
+
+  $ rebasewithdag -r 'all()' -d 'map(SRC, "A:I,I:null,H:A,B:J,J:C,C:H,D:E,F:G,G:K,K:D,E:B")' <<'EOS'
+  > D G K
+  > | | |
+  > C F J
+  > | | |
+  > B E I
+  >  \| |
+  >   A H
+  > EOS
+  rebasing 4:701514e1408d "I" (I)
+  rebasing 0:426bada5c675 "A" (A)
+  rebasing 1:e7050b6e5048 "H" (H)
+  rebasing 5:26805aba1e60 "C" (C)
+  rebasing 7:cf89f86b485b "J" (J)
+  rebasing 2:112478962961 "B" (B)
+  rebasing 3:7fb047a69f22 "E" (E)
+  rebasing 8:f585351a92f8 "D" (D)
+  rebasing 10:ae41898d7875 "K" (K tip)
+  rebasing 9:711f53bbef0b "G" (G)
+  rebasing 6:64a8289d2492 "F" (F)
+  o  21: F
+  |
+  o  20: G
+  |
+  o  19: K
+  |
+  o  18: D
+  |
+  o  17: E
+  |
+  o  16: B
+  |
+  o  15: J
+  |
+  o  14: C
+  |
+  o  13: H
+  |
+  o  12: A
+  |
+  o  11: I
+  
+Resolve instability:
+
+  $ cat > $TESTTMP/uniquesuccessor.py <<EOF
+  > from mercurial import obsolete, registrar, revset, smartset
+  > revsetpredicate = registrar.revsetpredicate()
+  > cache = {}
+  > @revsetpredicate('uniquesuccessor')
+  > def uniquesuccessor(repo, subset, x):
+  >     """unique visible successor or empty set"""
+  >     s = revset.getset(repo, smartset.fullreposet(repo), x)
+  >     if not s:
+  >         return smartset.baseset()
+  >     succs = obsolete.successorssets(repo, repo[s.first()].node(),
+  >                                     cache=cache)
+  >     if len(succs) != 1:
+  >         return smartset.baseset()
+  >     nodes = list(succs)[0]
+  >     if len(nodes) != 1:
+  >         return smartset.baseset()
+  >     node = list(nodes)[0]
+  >     rev = repo.changelog.nodemap.get(node)
+  >     if not rev:
+  >         return smartset.baseset()
+  >     return smartset.baseset([rev])
+  > EOF
+
+  $ cat >> $HGRCPATH <<EOF
+  > [extensions]
+  > uniquesuccessor=$TESTTMP/uniquesuccessor.py
+  > [revsetalias]
+  > s = unstable()-obsolete()
+  > d = max(uniquesuccessor(max(roots(ALLSRC) & ::SRC)^)::)
+  > EOF
+
+  $ hg init r1
+  $ cd r1
+  $ rebasewithdag <<'EOF' -r s -d d
+  >      F2
+  >      |
+  >    J E E2
+  >    | |/
+  > I2 I | E3
+  >   \| |/
+  >    H | G
+  >    | | |
+  >   B2 D F
+  >    | |/         # obsmarker: B -> B2  # rebase
+  >    N C          # obsmarker: E -> E2  # amend
+  >    | |          # obsmarker: E2 -> E3 # amend
+  >    M B          # obsmarker: F -> F2  # rebase
+  >     \|          # obsmarker: I -> I2  # amend
+  >      A
+  > EOF
+  rebasing 16:5c432343bf59 "J" (J tip)
+  rebasing 3:26805aba1e60 "C" (C)
+  rebasing 6:f585351a92f8 "D" (D)
+  rebasing 10:ffebc37c5d0b "E3" (E3)
+  rebasing 13:fb184bcfeee8 "F2" (F2)
+  rebasing 11:dc838ab4c0da "G" (G)
+  o  22: G
+  |
+  o  21: F2
+  |
+  o  20: E3
+  |
+  o  19: D
+  |
+  o  18: C
+  |
+  o  17: J
+  |
+  o  15: I2
+  |
+  o  12: H
+  |
+  o  5: B2
+  |
+  o  4: N
+  |
+  o  2: M
+  |
+  o  0: A
+  
diff --git a/tests/test-rebase-named-branches.t b/tests/test-rebase-named-branches.t
--- a/tests/test-rebase-named-branches.t
+++ b/tests/test-rebase-named-branches.t
@@ -246,5 +246,5 @@  Rebasing descendant onto ancestor across
   
   $ hg rebase -s 5 -d 6
-  abort: source is ancestor of destination
+  abort: source and destination form a cycle
   [255]
 
diff --git a/tests/test-rebase-scenario-global.t b/tests/test-rebase-scenario-global.t
--- a/tests/test-rebase-scenario-global.t
+++ b/tests/test-rebase-scenario-global.t
@@ -264,5 +264,5 @@  F onto G - rebase onto a descendant:
 
   $ hg rebase -s 5 -d 6
-  abort: source is ancestor of destination
+  abort: source and destination form a cycle
   [255]