Patchwork [v4] histedit: make histedit aware of obsoletions not stored in state (issue4800)

login
register
mail settings
Submitter Kostia Balytskyi
Date Feb. 23, 2016, 11:45 a.m.
Message ID <18e547e7742a2e54fc10.1456227925@android-802ec2ba4da5933b.dhcp.thefacebook.com>
Download mbox | patch
Permalink /patch/13306/
State Superseded
Headers show

Comments

Kostia Balytskyi - Feb. 23, 2016, 11:45 a.m.
# HG changeset patch
# User Kostia Balytskyi <ikostia@fb.com>
# Date 1456227840 0
#      Tue Feb 23 11:44:00 2016 +0000
# Node ID 18e547e7742a2e54fc108748736d463262aabd38
# Parent  28e1694ca60056d609ae2c8e0ad5cb2891416ea3
histedit: make histedit aware of obsoletions not stored in state (issue4800)

Before this change, when histedit exited to interactive session (during edit
command for example), user could introduce obsolescence markers that would not
be known to histedit. For example, user could've amended one of the commits
(introducing an obsoletion change). The fact of this amendment would not be
stored in histedit's state file and later, when histedit would try to process
all the replacements that happened, one of the final successors (in its opinion)
would turn out to be hidden. This behavior is described in issue4800. This
commit fixes it.
Pierre-Yves David - Feb. 23, 2016, 8:38 p.m.
On 02/23/2016 12:45 PM, Kostia Balytskyi wrote:
> # HG changeset patch
> # User Kostia Balytskyi <ikostia@fb.com>
> # Date 1456227840 0
> #      Tue Feb 23 11:44:00 2016 +0000
> # Node ID 18e547e7742a2e54fc108748736d463262aabd38
> # Parent  28e1694ca60056d609ae2c8e0ad5cb2891416ea3
> histedit: make histedit aware of obsoletions not stored in state (issue4800)

obsoletions is not a thing. we use obsolescence.

> Before this change, when histedit exited to interactive session (during edit
> command for example), user could introduce obsolescence markers that would not
> be known to histedit. For example, user could've amended one of the commits
> (introducing an obsoletion change). The fact of this amendment would not be

I'm not sure what you mean with obsoletion change here.

> stored in histedit's state file and later, when histedit would try to process
> all the replacements that happened, one of the final successors (in its opinion)
> would turn out to be hidden. This behavior is described in issue4800. This
> commit fixes it.

Actually we only fixes it in the evolution case. it is still broken in 
the non-evolve case. In the non-evolve case, we should use the timeless 
approach of forbiding it (to be resend by timeless once this is in?)

> diff --git a/hg b/hg
> --- a/hg
> +++ b/hg
> @@ -39,5 +39,4 @@
>
>   for fp in (sys.stdin, sys.stdout, sys.stderr):
>       mercurial.util.setbinary(fp)
> -
>   mercurial.dispatch.run()

Unrelated change (probably unintentional too).

> diff --git a/hgext/histedit.py b/hgext/histedit.py
> --- a/hgext/histedit.py
> +++ b/hgext/histedit.py
> @@ -173,6 +173,7 @@
>   import errno
>   import os
>   import sys
> +import itertools

We barely use itertools around Mercurial in general.

>   from mercurial import bundle2
>   from mercurial import cmdutil
> @@ -1386,13 +1387,48 @@
>                   hint=_('use "drop %s" to discard, see also: '
>                          '"hg help -e histedit.config"') % missing[0][:12])
>
> +def adjustreplacementsfrommarkers(repo, oldreplacements):
> +    """Adjust replacements from obsolescense markers
> +
> +    Replacements structure is originally generated based on
> +    histedit's state and does not account for changes that are
> +    not recorded there. This function fixes that by adding
> +    data read from obsolescense markers"""
> +    if not obsolete.isenabled(repo, obsolete.createmarkersopt):
> +        return oldreplacements
> +
> +    unfi = repo.unfiltered()
> +    newreplacements = list(oldreplacements)
> +    oldsuccs = itertools.imap(lambda r: r[1], oldreplacements)
> +    # successors that have already been added to succstocheck once
> +    seensuccs = set().union(*oldsuccs)

Why not just "set(oldsuccs)"?

> +    succstocheck = list(seensuccs)
> +    while succstocheck:
> +        n = succstocheck.pop()
> +        try:
> +            ctx = unfi[n]
> +        except error.RepoError:

building ctx are expensive and the error handing here is painful. 
Consider using "nodemap.get(n)" instead, it return None if the changeset 
is unknown, the rev number otherwise.

You get nodemap from repo.changelog.nodemap

> +            # node unknown locally, should be treated as replaced
> +            newreplacements.append((n, ()))
> +            continue

Can you elaborate on the logic here. I'm not sure what are the planned 
case and why that move to replaced make sense.

What I would expect when the node is unknown is to follow markers until 
we find a node we know.

I'm not we have any command that actually strip anything on such path 
one evolve is enabled so that might be "fine" to handle that.

> +        for marker in obsolete.successormarkers(ctx):

To handle gap, you can't use ctx (because we have no data for the node)) 
and you have to use lower level function to access raw markers.

   obsstore.successors.get(node, ())

> +            nsuccs = tuple(marker.succnodes())

Why are you upgrading to tuple here? I think it is already one.

> +            newreplacements.append((n, nsuccs))

If some of nsuccs is no longer in the repository. This will get caught 
in a later loop by the try except I was point at earlier.

This will likely gives the wrong result for the replacement tracking 
(but at least won't crash.

> +            for nsucc in nsuccs:
> +                if nsucc not in seensuccs:
> +                    seensuccs.add(nsucc)
> +                    succstocheck.append(nsucc)
> +
> +    return newreplacements
> +

Conclusion: the logic is sound except for the handling of "missing" 
changeset. Using lower function would let us track this properly.

The new code seems to give bad result in some corner case, but at least 
dont crash. I'm okay with moving forward as is if we get a follow up. 
Would be also happy to have a V5 with this fixed.
Kostia Balytskyi - Feb. 23, 2016, 9:40 p.m.
On 2/23/16, 8:38 PM, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:



>

>

>On 02/23/2016 12:45 PM, Kostia Balytskyi wrote:

>> # HG changeset patch

>> # User Kostia Balytskyi <ikostia@fb.com>

>> # Date 1456227840 0

>> #      Tue Feb 23 11:44:00 2016 +0000

>> # Node ID 18e547e7742a2e54fc108748736d463262aabd38

>> # Parent  28e1694ca60056d609ae2c8e0ad5cb2891416ea3

>> histedit: make histedit aware of obsoletions not stored in state (issue4800)

>

>obsoletions is not a thing. we use obsolescence.


Will fix.
>

>> Before this change, when histedit exited to interactive session (during edit

>> command for example), user could introduce obsolescence markers that would not

>> be known to histedit. For example, user could've amended one of the commits

>> (introducing an obsoletion change). The fact of this amendment would not be

>

>I'm not sure what you mean with obsoletion change here.


Removed.
>

>> stored in histedit's state file and later, when histedit would try to process

>> all the replacements that happened, one of the final successors (in its opinion)

>> would turn out to be hidden. This behavior is described in issue4800. This

>> commit fixes it.

>

>Actually we only fixes it in the evolution case. it is still broken in 

>the non-evolve case. In the non-evolve case, we should use the timeless 

>approach of forbiding it (to be resend by timeless once this is in?)

>

>> diff --git a/hg b/hg

>> --- a/hg

>> +++ b/hg

>> @@ -39,5 +39,4 @@

>>

>>   for fp in (sys.stdin, sys.stdout, sys.stderr):

>>       mercurial.util.setbinary(fp)

>> -

>>   mercurial.dispatch.run()

>

>Unrelated change (probably unintentional too).


Will fix.
>

>> diff --git a/hgext/histedit.py b/hgext/histedit.py

>> --- a/hgext/histedit.py

>> +++ b/hgext/histedit.py

>> @@ -173,6 +173,7 @@

>>   import errno

>>   import os

>>   import sys

>> +import itertools

>

>We barely use itertools around Mercurial in general.


I presume it's not a reason to not use it, right? Not un-using it at the moment.
>

>>   from mercurial import bundle2

>>   from mercurial import cmdutil

>> @@ -1386,13 +1387,48 @@

>>                   hint=_('use "drop %s" to discard, see also: '

>>                          '"hg help -e histedit.config"') % missing[0][:12])

>>

>> +def adjustreplacementsfrommarkers(repo, oldreplacements):

>> +    """Adjust replacements from obsolescense markers

>> +

>> +    Replacements structure is originally generated based on

>> +    histedit's state and does not account for changes that are

>> +    not recorded there. This function fixes that by adding

>> +    data read from obsolescense markers"""

>> +    if not obsolete.isenabled(repo, obsolete.createmarkersopt):

>> +        return oldreplacements

>> +

>> +    unfi = repo.unfiltered()

>> +    newreplacements = list(oldreplacements)

>> +    oldsuccs = itertools.imap(lambda r: r[1], oldreplacements)

>> +    # successors that have already been added to succstocheck once

>> +    seensuccs = set().union(*oldsuccs)

>

>Why not just "set(oldsuccs)"?


Because oldsuccs is a generator that returns a sequence of tuples which we want to merge in a set.
>

>> +    succstocheck = list(seensuccs)

>> +    while succstocheck:

>> +        n = succstocheck.pop()

>> +        try:

>> +            ctx = unfi[n]

>> +        except error.RepoError:

>

>building ctx are expensive and the error handing here is painful. 

>Consider using "nodemap.get(n)" instead, it return None if the changeset 

>is unknown, the rev number otherwise.

>

>You get nodemap from repo.changelog.nodemap

>

>> +            # node unknown locally, should be treated as replaced

>> +            newreplacements.append((n, ()))

>> +            continue

>

>Can you elaborate on the logic here. I'm not sure what are the planned 

>case and why that move to replaced make sense.

>

>What I would expect when the node is unknown is to follow markers until 

>we find a node we know.

>

>I'm not we have any command that actually strip anything on such path 

>one evolve is enabled so that might be "fine" to handle that.

My assumption was kinda "unknown locally should be treated as replaced with nothing".
Not even sure how we can introduce locally unknown commits while in histedit.

>

>> +        for marker in obsolete.successormarkers(ctx):

>

>To handle gap, you can't use ctx (because we have no data for the node)) 

>and you have to use lower level function to access raw markers.

>

>   obsstore.successors.get(node, ())

>

>> +            nsuccs = tuple(marker.succnodes())

>

>Why are you upgrading to tuple here? I think it is already one.

My bad, fixed.

>

>> +            newreplacements.append((n, nsuccs))

>

>If some of nsuccs is no longer in the repository. This will get caught 

>in a later loop by the try except I was point at earlier.

>

>This will likely gives the wrong result for the replacement tracking 

>(but at least won't crash.e

>

>> +            for nsucc in nsuccs:

>> +                if nsucc not in seensuccs:

>> +                    seensuccs.add(nsucc)

>> +                    succstocheck.append(nsucc)

>> +

>> +    return newreplacements

>> +

>

>Conclusion: the logic is sound except for the handling of "missing" 

>changeset. Using lower function would let us track this properly.

>

>The new code seems to give bad result in some corner case, but at least 

>dont crash. I'm okay with moving forward as is if we get a follow up. 

>Would be also happy to have a V5 with this fixed.

>

>-- 

>Pierre-Yves David


I feel like replacing ctx-related functions with lower-level stuff is related
to a rather obscure case and I feel like you're happy to accept these changes
without that fixed. I've sent patch v5 that does not fix ctx but touches the rest.
Feel free to accept it if you're happy :)
Pierre-Yves David - Feb. 23, 2016, 10:48 p.m.
On 02/23/2016 10:40 PM, Kostia Balytskyi wrote:
> On 2/23/16, 8:38 PM, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org> wrote:
>
>
>
>>
>>
>> On 02/23/2016 12:45 PM, Kostia Balytskyi wrote:
>>> # HG changeset patch
>>> # User Kostia Balytskyi <ikostia@fb.com>
>>> # Date 1456227840 0
>>> #      Tue Feb 23 11:44:00 2016 +0000
>>> # Node ID 18e547e7742a2e54fc108748736d463262aabd38
>>> # Parent  28e1694ca60056d609ae2c8e0ad5cb2891416ea3
>>> histedit: make histedit aware of obsoletions not stored in state (issue4800)
>>
>> obsoletions is not a thing. we use obsolescence.
>
> Will fix.
>>
>>> Before this change, when histedit exited to interactive session (during edit
>>> command for example), user could introduce obsolescence markers that would not
>>> be known to histedit. For example, user could've amended one of the commits
>>> (introducing an obsoletion change). The fact of this amendment would not be
>>
>> I'm not sure what you mean with obsoletion change here.
>
> Removed.
>>
>>> stored in histedit's state file and later, when histedit would try to process
>>> all the replacements that happened, one of the final successors (in its opinion)
>>> would turn out to be hidden. This behavior is described in issue4800. This
>>> commit fixes it.
>>
>> Actually we only fixes it in the evolution case. it is still broken in
>> the non-evolve case. In the non-evolve case, we should use the timeless
>> approach of forbiding it (to be resend by timeless once this is in?)
>>
>>> diff --git a/hg b/hg
>>> --- a/hg
>>> +++ b/hg
>>> @@ -39,5 +39,4 @@
>>>
>>>    for fp in (sys.stdin, sys.stdout, sys.stderr):
>>>        mercurial.util.setbinary(fp)
>>> -
>>>    mercurial.dispatch.run()
>>
>> Unrelated change (probably unintentional too).
>
> Will fix.
>>
>>> diff --git a/hgext/histedit.py b/hgext/histedit.py
>>> --- a/hgext/histedit.py
>>> +++ b/hgext/histedit.py
>>> @@ -173,6 +173,7 @@
>>>    import errno
>>>    import os
>>>    import sys
>>> +import itertools
>>
>> We barely use itertools around Mercurial in general.
>
> I presume it's not a reason to not use it, right? Not un-using it at the moment.

In your case it does not seems to be a reason to prefer itertools over 
list comprehension/generator. Cf my comment on your V5.

>>
>>>    from mercurial import bundle2
>>>    from mercurial import cmdutil
>>> @@ -1386,13 +1387,48 @@
>>>                    hint=_('use "drop %s" to discard, see also: '
>>>                           '"hg help -e histedit.config"') % missing[0][:12])
>>>
>>> +def adjustreplacementsfrommarkers(repo, oldreplacements):
>>> +    """Adjust replacements from obsolescense markers
>>> +
>>> +    Replacements structure is originally generated based on
>>> +    histedit's state and does not account for changes that are
>>> +    not recorded there. This function fixes that by adding
>>> +    data read from obsolescense markers"""
>>> +    if not obsolete.isenabled(repo, obsolete.createmarkersopt):
>>> +        return oldreplacements
>>> +
>>> +    unfi = repo.unfiltered()
>>> +    newreplacements = list(oldreplacements)
>>> +    oldsuccs = itertools.imap(lambda r: r[1], oldreplacements)
>>> +    # successors that have already been added to succstocheck once
>>> +    seensuccs = set().union(*oldsuccs)
>>
>> Why not just "set(oldsuccs)"?
>
> Because oldsuccs is a generator that returns a sequence of tuples which we want to merge in a set.
>>
>>> +    succstocheck = list(seensuccs)
>>> +    while succstocheck:
>>> +        n = succstocheck.pop()
>>> +        try:
>>> +            ctx = unfi[n]
>>> +        except error.RepoError:
>>
>> building ctx are expensive and the error handing here is painful.
>> Consider using "nodemap.get(n)" instead, it return None if the changeset
>> is unknown, the rev number otherwise.
>>
>> You get nodemap from repo.changelog.nodemap
>>
>>> +            # node unknown locally, should be treated as replaced
>>> +            newreplacements.append((n, ()))
>>> +            continue
>>
>> Can you elaborate on the logic here. I'm not sure what are the planned
>> case and why that move to replaced make sense.
>>
>> What I would expect when the node is unknown is to follow markers until
>> we find a node we know.
>>
>> I'm not we have any command that actually strip anything on such path
>> one evolve is enabled so that might be "fine" to handle that.
> My assumption was kinda "unknown locally should be treated as replaced with nothing".
> Not even sure how we can introduce locally unknown commits while in histedit.

The way the rest of Mercurial behave is to keep following the 
obsolescence marker when a changeset is missing locally. This is a very 
common case with evolution and well defined.

If you asked me to produce a test case, I would make a two amend in a 
row and explicitly strip the intermediate changeset (after first amend) 
out. Some strange extensions command or some convoluted user could be 
doing for some reason.

If you start to add exchange, it get fairly easy, if for some reason you 
pull during and histedit, you could receive changeset affecting your 
working set.

>>> +        for marker in obsolete.successormarkers(ctx):
>>
>> To handle gap, you can't use ctx (because we have no data for the node))
>> and you have to use lower level function to access raw markers.
>>
>>    obsstore.successors.get(node, ())
>>
>>> +            nsuccs = tuple(marker.succnodes())
>>
>> Why are you upgrading to tuple here? I think it is already one.
> My bad, fixed.
>
>>
>>> +            newreplacements.append((n, nsuccs))
>>
>> If some of nsuccs is no longer in the repository. This will get caught
>> in a later loop by the try except I was point at earlier.
>>
>> This will likely gives the wrong result for the replacement tracking
>> (but at least won't crash.e
>>
>>> +            for nsucc in nsuccs:
>>> +                if nsucc not in seensuccs:
>>> +                    seensuccs.add(nsucc)
>>> +                    succstocheck.append(nsucc)
>>> +
>>> +    return newreplacements
>>> +
>>
>> Conclusion: the logic is sound except for the handling of "missing"
>> changeset. Using lower function would let us track this properly.
>>
>> The new code seems to give bad result in some corner case, but at least
>> dont crash. I'm okay with moving forward as is if we get a follow up.
>> Would be also happy to have a V5 with this fixed.
>>
>> --
>> Pierre-Yves David
>
> I feel like replacing ctx-related functions with lower-level stuff is related
> to a rather obscure case and I feel like you're happy to accept these changes
> without that fixed. I've sent patch v5 that does not fix ctx but touches the rest.
> Feel free to accept it if you're happy :)

The "locally missing" case is not obscure it is well charted territory 
for evolution. We want to handle it correctly. Given that this patch is 
already an improvement and did many round trip already, I'm okay with 
taking it as is. But we must follow up with a proper fix in that case.

Cheers,

Patch

diff --git a/hg b/hg
--- a/hg
+++ b/hg
@@ -39,5 +39,4 @@ 
 
 for fp in (sys.stdin, sys.stdout, sys.stderr):
     mercurial.util.setbinary(fp)
-
 mercurial.dispatch.run()
diff --git a/hgext/histedit.py b/hgext/histedit.py
--- a/hgext/histedit.py
+++ b/hgext/histedit.py
@@ -173,6 +173,7 @@ 
 import errno
 import os
 import sys
+import itertools
 
 from mercurial import bundle2
 from mercurial import cmdutil
@@ -1386,13 +1387,48 @@ 
                 hint=_('use "drop %s" to discard, see also: '
                        '"hg help -e histedit.config"') % missing[0][:12])
 
+def adjustreplacementsfrommarkers(repo, oldreplacements):
+    """Adjust replacements from obsolescense markers
+
+    Replacements structure is originally generated based on
+    histedit's state and does not account for changes that are
+    not recorded there. This function fixes that by adding
+    data read from obsolescense markers"""
+    if not obsolete.isenabled(repo, obsolete.createmarkersopt):
+        return oldreplacements
+
+    unfi = repo.unfiltered()
+    newreplacements = list(oldreplacements)
+    oldsuccs = itertools.imap(lambda r: r[1], oldreplacements)
+    # successors that have already been added to succstocheck once
+    seensuccs = set().union(*oldsuccs)
+    succstocheck = list(seensuccs)
+    while succstocheck:
+        n = succstocheck.pop()
+        try:
+            ctx = unfi[n]
+        except error.RepoError:
+            # node unknown locally, should be treated as replaced
+            newreplacements.append((n, ()))
+            continue
+
+        for marker in obsolete.successormarkers(ctx):
+            nsuccs = tuple(marker.succnodes())
+            newreplacements.append((n, nsuccs))
+            for nsucc in nsuccs:
+                if nsucc not in seensuccs:
+                    seensuccs.add(nsucc)
+                    succstocheck.append(nsucc)
+
+    return newreplacements
+
 def processreplacement(state):
     """process the list of replacements to return
 
     1) the final mapping between original and created nodes
     2) the list of temporary node created by histedit
     3) the list of new commit created by histedit"""
-    replacements = state.replacements
+    replacements = adjustreplacementsfrommarkers(state.repo, state.replacements)
     allsuccs = set()
     replaced = set()
     fullmapping = {}
diff --git a/tests/test-histedit-obsolete.t b/tests/test-histedit-obsolete.t
--- a/tests/test-histedit-obsolete.t
+++ b/tests/test-histedit-obsolete.t
@@ -14,6 +14,36 @@ 
   > rebase=
   > EOF
 
+Test that histedit learns about obsoletions not stored in histedit state
+  $ hg init boo
+  $ cd boo
+  $ echo a > a
+  $ hg ci -Am a
+  adding a
+  $ echo a > b
+  $ echo a > c
+  $ echo a > c
+  $ hg ci -Am b
+  adding b
+  adding c
+  $ echo a > d
+  $ hg ci -Am c
+  adding d
+  $ echo "pick `hg log -r 0 -T '{node|short}'`" > plan
+  $ echo "pick `hg log -r 2 -T '{node|short}'`" >> plan
+  $ echo "edit `hg log -r 1 -T '{node|short}'`" >> plan
+  $ hg histedit -r 'all()' --commands plan
+  Editing (1b2d564fad96), you may commit or record as needed now.
+  (hg histedit --continue to resume)
+  [1]
+  $ hg st
+  A b
+  A c
+  ? plan
+  $ hg commit --amend b
+  $ hg histedit --continue
+  $ cd ..
+
   $ hg init base
   $ cd base