Patchwork [3,of,3] histedit: revive commits on demand

login
register
mail settings
Submitter Jun Wu
Date March 26, 2017, 6:41 p.m.
Message ID <b6766d75404fb8c5d26a.1490553701@localhost.localdomain>
Download mbox | patch
Permalink /patch/19696/
State Deferred
Headers show

Comments

Jun Wu - March 26, 2017, 6:41 p.m.
# HG changeset patch
# User Jun Wu <quark@fb.com>
# Date 1490552007 25200
#      Sun Mar 26 11:13:27 2017 -0700
# Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
# Parent  336512ee2f947f07149e399a84927f9d820d2b62
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r b6766d75404f
histedit: revive commits on demand

This is to address the "histedit --abort" issue mentioned in [1].

This solution is safer than what [1] proposed because it does not use unsafe
history writing ("strip"). The "strip" has caused several repo corruptions
in production.

It's similar to what "hg touch" in mutable-history does [3], without using
random number. So it could be more reliably tested.

Note that if we have obsolete marker versions, and obsoleted revisions are
revived automatically, as I proposed in [2], the "revive" hack will be no
longer necessary.

[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095572.html
[2]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/094586.html
[3]: https://bitbucket.org/marmoute/mutable-history/src/c7da63d48/hgext3rd/evolve/__init__.py?at=default&fileviewer=file-view-default#__init__.py-2541
Jun Wu - March 26, 2017, 6:48 p.m.
Excerpts from Jun Wu's message of 2017-03-26 11:41:41 -0700:
> # HG changeset patch
> # User Jun Wu <quark@fb.com>
> # Date 1490552007 25200
> #      Sun Mar 26 11:13:27 2017 -0700
> # Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
> # Parent  336512ee2f947f07149e399a84927f9d820d2b62
> # Available At https://bitbucket.org/quark-zju/hg-draft 
> #              hg pull https://bitbucket.org/quark-zju/hg-draft  -r b6766d75404f
> histedit: revive commits on demand
> 
> This is to address the "histedit --abort" issue mentioned in [1].
> 
> This solution is safer than what [1] proposed because it does not use unsafe
> history writing ("strip"). The "strip" has caused several repo corruptions
> in production.
> 
> It's similar to what "hg touch" in mutable-history does [3], without using
> random number. So it could be more reliably tested.
> 
> Note that if we have obsolete marker versions, and obsoleted revisions are
> revived automatically, as I proposed in [2], the "revive" hack will be no
> longer necessary.
> 
> [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095572.html 
> [2]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/094586.html 
> [3]: https://bitbucket.org/marmoute/mutable-history/src/c7da63d48/hgext3rd/evolve/__init__.py?at=default&fileviewer=file-view-default#__init__.py-2541 
> 
> diff --git a/hgext/histedit.py b/hgext/histedit.py
> --- a/hgext/histedit.py
> +++ b/hgext/histedit.py
> @@ -507,4 +507,30 @@ class histeditaction(object):
>          return ctx, [(self.node, (ctx.node(),))]
>  
> +def _revive(repo, n, updatedirstate=True):
> +    """given a node, make sure it's not obsoleted by adding noises
> +
> +    If updatedirstate is True and new node is created, set dirstate parent to
> +    the new node.
> +
> +    Return new node that is not obsoleted.
> +    """
> +    unfi = repo.unfiltered()
> +    origctx = ctx = unfi[n]
> +    while ctx.obsolete():
> +        extra = dict(ctx.extra())
> +        revive = int(extra.get('__revive__', '0')) + 1
> +        extra['__revive__'] = str(revive)
> +        newctx = context.metadataonlyctx(repo, ctx, date=ctx.date(),
> +                                         user=ctx.user(), extra=extra)
> +        newnode = newctx.commit()
> +        repo.ui.debug('node %s was revived as %s (revive=%d)\n'
> +                      % (node.short(ctx.node()), node.short(newnode), revive))
> +        ctx = unfi[newnode]
> +    if origctx != ctx and updatedirstate:
> +        repo.dirstate.beginparentchange()
> +        repo.dirstate.setparents(ctx.node())
> +        repo.dirstate.endparentchange()
> +    return ctx.node()
> +

Maybe this should be in core commit function as it should also applies to
cases like "graft" etc.

>  def commitfuncfor(repo, src):
>      """Build a commit function for the replacement of <src>
> @@ -524,5 +550,6 @@ def commitfuncfor(repo, src):
>              extra['histedit_source'] = src.hex()
>              kwargs['extra'] = extra
> -            return repo.commit(**kwargs)
> +            newnode = repo.commit(**kwargs)
> +            return _revive(repo, newnode)
>      return commitfunc
>  
> diff --git a/tests/test-histedit-obsolete.t b/tests/test-histedit-obsolete.t
> --- a/tests/test-histedit-obsolete.t
> +++ b/tests/test-histedit-obsolete.t
> @@ -552,5 +552,5 @@ Abort
>  Re-run a similar histedit plan
>  
> -  $ hg histedit -r 'b449568bf7fc' --commands - << EOF
> +  $ hg histedit --debug -r 'b449568bf7fc' --commands - << EOF | grep revive
>    > pick b449568bf7fc 13 f
>    > pick 7395e1ff83bd 15 h
> @@ -560,4 +560,21 @@ Re-run a similar histedit plan
>    > pick ee118ab9fa44 18 k
>    > EOF
> -  abort: 00changelog.i@4dc06258baa667440b8af3c849ddd9c10a4b0cb8: filtered node!
> -  [255]
> +  node 4dc06258baa6 was revived as 7bd549b16383 (revive=1)
> +
> +  $ hg log -G
> +  @  27:563ef4a77b72 (secret) k
> +  |
> +  o  26:f371aa2c4158 (secret) j
> +  |
> +  o  25:5bfeafa73282 (secret) i
> +  |
> +  o  24:c53c6695c868 (draft) g
> +  |
> +  o  23:7bd549b16383 (draft) h
> +  |
> +  o  13:b449568bf7fc (draft) f
> +  |
> +  o  12:40db8afa467b (public) c
> +  |
> +  o  0:cb9a9f314b8b (public) a
> +
Pierre-Yves David - March 27, 2017, 7:17 a.m.
On 03/26/2017 08:41 PM, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark@fb.com>
> # Date 1490552007 25200
> #      Sun Mar 26 11:13:27 2017 -0700
> # Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
> # Parent  336512ee2f947f07149e399a84927f9d820d2b62
> # Available At https://bitbucket.org/quark-zju/hg-draft
> #              hg pull https://bitbucket.org/quark-zju/hg-draft -r b6766d75404f
> histedit: revive commits on demand
>
> This is to address the "histedit --abort" issue mentioned in [1].

There are still issues with that approach. For example in this 
distributed case:

1) Alice send changesets to Bob,
2) Bob try to histedit something in the stack, this fails
3) Bob --abort the histedit
4) Bob tell Alice to perform the operation herself since she know the 
code better and can resolve the conflict
5) Alice histedit with success

If step (3) created markers for the histedit partial result, these 
markers will apply to the result of (5) and create confusion/change 
"loss" when the two states are gathered.

This is a non-trivial question and there is good reason why rebase and 
histedit do not use obsolescence marker to restore the repository to its 
former state.

note: this is a case were a local-light-weight "archiving" mechanism 
could be handy.

> This solution is safer than what [1] proposed because it does not use unsafe
> history writing ("strip"). The "strip" has caused several repo corruptions
> in production.

I'm sure upstream would be happy to have some bug report about this and 
maybe some investigation about this corruption. Strip corrupting 
repository would happily rank a "critical".

> It's similar to what "hg touch" in mutable-history does [3], without using
> random number. So it could be more reliably tested.

The reason why 'touch' use random number is to ensure operation in one 
repository will not -unexpectedly- affect the one in another repo (kind 
of like here). We could use sometimes else than random but it cannot be 
a simple sequential number.

Note that using random will not solve the issue here since (5) does not 
even know yet that the markers in (3) exists.

> Note that if we have obsolete marker versions, and obsoleted revisions are
> revived automatically, as I proposed in [2], the "revive" hack will be
> longer be necessary.

Please see my (fresh) reply to that other series.
Jun Wu - March 27, 2017, 10:29 a.m.
Excerpts from Pierre-Yves David's message of 2017-03-27 09:17:59 +0200:
> 
> On 03/26/2017 08:41 PM, Jun Wu wrote:
> > # HG changeset patch
> > # User Jun Wu <quark@fb.com>
> > # Date 1490552007 25200
> > #      Sun Mar 26 11:13:27 2017 -0700
> > # Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
> > # Parent  336512ee2f947f07149e399a84927f9d820d2b62
> > # Available At https://bitbucket.org/quark-zju/hg-draft 
> > #              hg pull https://bitbucket.org/quark-zju/hg-draft  -r b6766d75404f
> > histedit: revive commits on demand
> >
> > This is to address the "histedit --abort" issue mentioned in [1].
> 
> There are still issues with that approach. For example in this 
> distributed case:
> 
> 1) Alice send changesets to Bob,
> 2) Bob try to histedit something in the stack, this fails
> 3) Bob --abort the histedit
> 4) Bob tell Alice to perform the operation herself since she know the 
> code better and can resolve the conflict
> 5) Alice histedit with success

Good example. But that's easily solvable by not writing the "parent"
information when creating the markers, so the markers won't get exchanged.

> If step (3) created markers for the histedit partial result, these 
> markers will apply to the result of (5) and create confusion/change 
> "loss" when the two states are gathered.
> 
> This is a non-trivial question and there is good reason why rebase and 
> histedit do not use obsolescence marker to restore the repository to its 
> former state.
> 
> note: this is a case were a local-light-weight "archiving" mechanism 
> could be handy.
> 
> > This solution is safer than what [1] proposed because it does not use unsafe
> > history writing ("strip"). The "strip" has caused several repo corruptions
> > in production.
> 
> I'm sure upstream would be happy to have some bug report about this and 
> maybe some investigation about this corruption. Strip corrupting 
> repository would happily rank a "critical".
> 
> > It's similar to what "hg touch" in mutable-history does [3], without using
> > random number. So it could be more reliably tested.
> 
> The reason why 'touch' use random number is to ensure operation in one 
> repository will not -unexpectedly- affect the one in another repo (kind 
> of like here). We could use sometimes else than random but it cannot be 
> a simple sequential number.
> 
> Note that using random will not solve the issue here since (5) does not 
> even know yet that the markers in (3) exists.
> 
> > Note that if we have obsolete marker versions, and obsoleted revisions are
> > revived automatically, as I proposed in [2], the "revive" hack will be
> > longer be necessary.
> 
> Please see my (fresh) reply to that other series.
>
Pierre-Yves David - March 27, 2017, 11:16 a.m.
On 03/27/2017 12:29 PM, Jun Wu wrote:
> Excerpts from Pierre-Yves David's message of 2017-03-27 09:17:59 +0200:
>>
>> On 03/26/2017 08:41 PM, Jun Wu wrote:
>>> # HG changeset patch
>>> # User Jun Wu <quark@fb.com>
>>> # Date 1490552007 25200
>>> #      Sun Mar 26 11:13:27 2017 -0700
>>> # Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
>>> # Parent  336512ee2f947f07149e399a84927f9d820d2b62
>>> # Available At https://bitbucket.org/quark-zju/hg-draft
>>> #              hg pull https://bitbucket.org/quark-zju/hg-draft  -r b6766d75404f
>>> histedit: revive commits on demand
>>>
>>> This is to address the "histedit --abort" issue mentioned in [1].
>>
>> There are still issues with that approach. For example in this
>> distributed case:
>>
>> 1) Alice send changesets to Bob,
>> 2) Bob try to histedit something in the stack, this fails
>> 3) Bob --abort the histedit
>> 4) Bob tell Alice to perform the operation herself since she know the
>> code better and can resolve the conflict
>> 5) Alice histedit with success
>
> Good example. But that's easily solvable by not writing the "parent"
> information when creating the markers, so the markers won't get exchanged.

This does not solve the issue, When Bob will pull from Alice, the issue 
will appears in Bob repository.

Please slow down.
Jun Wu - March 27, 2017, 4:24 p.m.
Excerpts from Pierre-Yves David's message of 2017-03-27 13:16:15 +0200:
> 
> On 03/27/2017 12:29 PM, Jun Wu wrote:
> > Excerpts from Pierre-Yves David's message of 2017-03-27 09:17:59 +0200:
> >>
> >> On 03/26/2017 08:41 PM, Jun Wu wrote:
> >>> # HG changeset patch
> >>> # User Jun Wu <quark@fb.com>
> >>> # Date 1490552007 25200
> >>> #      Sun Mar 26 11:13:27 2017 -0700
> >>> # Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
> >>> # Parent  336512ee2f947f07149e399a84927f9d820d2b62
> >>> # Available At https://bitbucket.org/quark-zju/hg-draft 
> >>> #              hg pull https://bitbucket.org/quark-zju/hg-draft   -r b6766d75404f
> >>> histedit: revive commits on demand
> >>>
> >>> This is to address the "histedit --abort" issue mentioned in [1].
> >>
> >> There are still issues with that approach. For example in this
> >> distributed case:
> >>
> >> 1) Alice send changesets to Bob,
> >> 2) Bob try to histedit something in the stack, this fails
> >> 3) Bob --abort the histedit
> >> 4) Bob tell Alice to perform the operation herself since she know the
> >> code better and can resolve the conflict
> >> 5) Alice histedit with success
> >
> > Good example. But that's easily solvable by not writing the "parent"
> > information when creating the markers, so the markers won't get exchanged.
> 
> This does not solve the issue, When Bob will pull from Alice, the issue 
> will appears in Bob repository.
> 
> Please slow down.

Why? Bob won't get the marker from Alice.
Jun Wu - March 27, 2017, 4:29 p.m.
Excerpts from Jun Wu's message of 2017-03-27 09:24:00 -0700:
> Excerpts from Pierre-Yves David's message of 2017-03-27 13:16:15 +0200:
> > 
> > On 03/27/2017 12:29 PM, Jun Wu wrote:
> > > Excerpts from Pierre-Yves David's message of 2017-03-27 09:17:59 +0200:
> > >>
> > >> On 03/26/2017 08:41 PM, Jun Wu wrote:
> > >>> # HG changeset patch
> > >>> # User Jun Wu <quark@fb.com>
> > >>> # Date 1490552007 25200
> > >>> #      Sun Mar 26 11:13:27 2017 -0700
> > >>> # Node ID b6766d75404fb8c5d26af016caa76f44b47ce156
> > >>> # Parent  336512ee2f947f07149e399a84927f9d820d2b62
> > >>> # Available At https://bitbucket.org/quark-zju/hg-draft  
> > >>> #              hg pull https://bitbucket.org/quark-zju/hg-draft    -r b6766d75404f
> > >>> histedit: revive commits on demand
> > >>>
> > >>> This is to address the "histedit --abort" issue mentioned in [1].
> > >>
> > >> There are still issues with that approach. For example in this
> > >> distributed case:
> > >>
> > >> 1) Alice send changesets to Bob,
> > >> 2) Bob try to histedit something in the stack, this fails
> > >> 3) Bob --abort the histedit
> > >> 4) Bob tell Alice to perform the operation herself since she know the
> > >> code better and can resolve the conflict
> > >> 5) Alice histedit with success
> > >
> > > Good example. But that's easily solvable by not writing the "parent"
> > > information when creating the markers, so the markers won't get exchanged.
> > 
> > This does not solve the issue, When Bob will pull from Alice, the issue 
> > will appears in Bob repository.
> > 
> > Please slow down.
> 
> Why? Bob won't get the marker from Alice.

It seems to be another example of what "node version" could solve.

Patch

diff --git a/hgext/histedit.py b/hgext/histedit.py
--- a/hgext/histedit.py
+++ b/hgext/histedit.py
@@ -507,4 +507,30 @@  class histeditaction(object):
         return ctx, [(self.node, (ctx.node(),))]
 
+def _revive(repo, n, updatedirstate=True):
+    """given a node, make sure it's not obsoleted by adding noises
+
+    If updatedirstate is True and new node is created, set dirstate parent to
+    the new node.
+
+    Return new node that is not obsoleted.
+    """
+    unfi = repo.unfiltered()
+    origctx = ctx = unfi[n]
+    while ctx.obsolete():
+        extra = dict(ctx.extra())
+        revive = int(extra.get('__revive__', '0')) + 1
+        extra['__revive__'] = str(revive)
+        newctx = context.metadataonlyctx(repo, ctx, date=ctx.date(),
+                                         user=ctx.user(), extra=extra)
+        newnode = newctx.commit()
+        repo.ui.debug('node %s was revived as %s (revive=%d)\n'
+                      % (node.short(ctx.node()), node.short(newnode), revive))
+        ctx = unfi[newnode]
+    if origctx != ctx and updatedirstate:
+        repo.dirstate.beginparentchange()
+        repo.dirstate.setparents(ctx.node())
+        repo.dirstate.endparentchange()
+    return ctx.node()
+
 def commitfuncfor(repo, src):
     """Build a commit function for the replacement of <src>
@@ -524,5 +550,6 @@  def commitfuncfor(repo, src):
             extra['histedit_source'] = src.hex()
             kwargs['extra'] = extra
-            return repo.commit(**kwargs)
+            newnode = repo.commit(**kwargs)
+            return _revive(repo, newnode)
     return commitfunc
 
diff --git a/tests/test-histedit-obsolete.t b/tests/test-histedit-obsolete.t
--- a/tests/test-histedit-obsolete.t
+++ b/tests/test-histedit-obsolete.t
@@ -552,5 +552,5 @@  Abort
 Re-run a similar histedit plan
 
-  $ hg histedit -r 'b449568bf7fc' --commands - << EOF
+  $ hg histedit --debug -r 'b449568bf7fc' --commands - << EOF | grep revive
   > pick b449568bf7fc 13 f
   > pick 7395e1ff83bd 15 h
@@ -560,4 +560,21 @@  Re-run a similar histedit plan
   > pick ee118ab9fa44 18 k
   > EOF
-  abort: 00changelog.i@4dc06258baa667440b8af3c849ddd9c10a4b0cb8: filtered node!
-  [255]
+  node 4dc06258baa6 was revived as 7bd549b16383 (revive=1)
+
+  $ hg log -G
+  @  27:563ef4a77b72 (secret) k
+  |
+  o  26:f371aa2c4158 (secret) j
+  |
+  o  25:5bfeafa73282 (secret) i
+  |
+  o  24:c53c6695c868 (draft) g
+  |
+  o  23:7bd549b16383 (draft) h
+  |
+  o  13:b449568bf7fc (draft) f
+  |
+  o  12:40db8afa467b (public) c
+  |
+  o  0:cb9a9f314b8b (public) a
+