Patchwork convert: replace old sha1s in the description

Submitter Sean Farley
Date March 27, 2013, 10:42 p.m.
Message ID <d7639b2cb14271ab88ed.1364424123@laptop.local>
Permalink /patch/1202/
State Superseded, archived
Commit 45562379ce4e663260ee4bd66605d2843da0ae76

Comments

Sean Farley - March 27, 2013, 10:42 p.m.
# HG changeset patch
# User Sean Farley <sean.michael.farley@gmail.com>
# Date 1363449738 18000
#      Sat Mar 16 11:02:18 2013 -0500
# Node ID d7639b2cb14271ab88ed71ecb3ce0a7595d41d25
# Parent  3839baf52f2f24c289487111a95e9e835d1e1c4d
convert: replace old sha1s in the description

This is a simple find-and-replace strategy for matching anything in the
old description of a converted commit and, if that matched sha1 exists
in the mapping, replacing it with the new sha1.

In particular, this is helpful for descriptions that contain tags with
messages such as, "Added tag 1.0 for commit abcde1234567" which will now
be automatically converted.

Tests have been updated accordingly.
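
The strategy described above can be sketched outside of Mercurial as follows. This is only an illustration under assumptions: `rewrite_description`, the toy `lookuprev` and the dict-based `revmap` are hypothetical stand-ins for the converter's real objects, the hashes are made up, and the sketch uses `re.sub` with a callback rather than the patch's `findall` plus `str.replace`.

```python
import re

# Same pattern as the patch: 12 to 40 lowercase hex digits on word boundaries.
SHA1_RE = re.compile(r'\b[0-9a-f]{12,40}\b')

def rewrite_description(text, lookuprev, revmap):
    """Replace old hashes in text with their converted counterparts.

    lookuprev and revmap are hypothetical stand-ins for the source's
    lookuprev() and the converter's revision map.
    """
    def substitute(match):
        short = match.group(0)
        oldrev = lookuprev(short)      # full old hash, or None if unknown
        newrev = revmap.get(oldrev)    # full new hash, or None if unmapped
        if newrev is None:
            return short               # leave unknown hashes untouched
        return newrev[:len(short)]     # keep the abbreviation length
    return SHA1_RE.sub(substitute, text)

# Toy data (made up): one old changeset mapped to one new changeset.
OLD = 'abcde1234567' + '0' * 28
NEW = 'fedcb7654321' + '1' * 28
revmap = {OLD: NEW}

def lookuprev(prefix):
    # Resolve a prefix against the known old hashes (toy implementation).
    matches = [h for h in revmap if h.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None

print(rewrite_description('Added tag 1.0 for commit abcde1234567',
                          lookuprev, revmap))
# prints: Added tag 1.0 for commit fedcb7654321
```

Doing the substitution per match means each hash is looked up once at the position where it was found, and anything that merely looks like a hash but is not in the mapping is left as written.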
Angel Ezquerra - March 27, 2013, 11:35 p.m.
On Wed, Mar 27, 2013 at 11:42 PM, Sean Farley
<sean.michael.farley@gmail.com> wrote:

+1 to this. This would have come in very, very handy this week, when
I had to manually redo a few converted repos because the "tag
commits" had the wrong message.

What about actually "retagging" the repo during conversion? That is,
if a commit that only modifies the .hgtags file is found, try to
replicate it by running "hg tag" on the target repo...

Note that I know next to nothing about the internals of hgext/convert
so this may not be possible...

Cheers,

Angel
Sean Farley - March 27, 2013, 11:45 p.m.
Angel Ezquerra writes:

> +1 to this. This would have come in very, very handy this week, when
> I had to manually redo a few converted repos because the "tag
> commits" had the wrong message.

Thanks :-)

> What about actually "retagging" the repo during conversion? That is,
> if a commit that only modifies the .hgtags file is found, try to
> replicate it by running "hg tag" on the target repo...
>
> Note that I know next to nothing about the internals of hgext/convert
> so this may not be possible...

What exactly are you trying to accomplish? Renaming the tags? If so,
Martin wrote a good starting point here:

http://stackoverflow.com/questions/7866379/renaming-tags-while-converting-a-mercurial-repository
Angel Ezquerra - March 27, 2013, 11:55 p.m.
On Thu, Mar 28, 2013 at 12:45 AM, Sean Farley
<sean.michael.farley@gmail.com> wrote:
>
> Angel Ezquerra writes:
>
>> +1 to this. This would have come in very, very handy this week, when
>> I had to manually redo a few converted repos because the "tag
>> commits" had the wrong message.
>
> Thanks :-)
>
>> What about actually "retagging" the repo during conversion? That is,
>> if a commit that only modifies the .hgtags file is found, try to
>> replicate it by running "hg tag" on the target repo...
>>
>> Note that I know next to nothing about the internals of hgext/convert
>> so this may not be possible...
>
> What exactly are you trying to accomplish? Renaming the tags? If so,
> Martin wrote a good starting point here:
>
> http://stackoverflow.com/questions/7866379/renaming-tags-while-converting-a-mercurial-repository

That's a pretty complete answer. It is a pity that Martin never got to
send it to the list, or did he?

Angel
Sean Farley - March 28, 2013, 12:19 a.m.
Angel Ezquerra writes:

[snip]
>>> What about actually "retagging" the repo during conversion? That is,
>>> if a commit that only modifies the .hgtags file is found, try to
>>> replicate it by running "hg tag" on the target repo...
>>>
>>> Note that I know next to nothing about the internals of hgext/convert
>>> so this may not be possible...
>>
>> What exactly are you trying to accomplish? Renaming the tags? If so,
>> Martin wrote a good starting point here:
>>
>> http://stackoverflow.com/questions/7866379/renaming-tags-while-converting-a-mercurial-repository
>
> That's a pretty complete answer. It is a pity that Martin never got to
> send it to the list, or did he?

I don't recall him sending that to the list but he could have sent it
before I joined ~1.5 years ago.
Matt Harbison - March 29, 2013, 2:18 a.m.
On Wed, 27 Mar 2013 17:42:03 -0500, Sean Farley wrote:

One nit and one random thought.  Shouldn't this test do a 'log -r' to 
print the (changed) message, and maybe the hash of the tagged rev too, in 
order to demonstrate that they are in sync?

The random thought is, would it be useful to write something to stdout if 
oldrev is not None but newrev is?  IOW you have a known source hash 
reference, but you haven't converted that cset yet.  The --rev option 
would then let you incrementally convert the required cset first.  You 
might be able to trigger this with the various sort options if you've 
grafted between branches, or commit msgs in branch A mention commits in 
branch B and vice-versa.  Incremental convert probably isn't what you want 
to do with sort options, but at least it warns you off.

When I was fiddling with something similar to this patch, I remember 
finding the warning useful.  OTOH, the code that updates the tags file 
doesn't warn for this case either.

--Matt
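
The warning idea above could look roughly like the sketch below. It is an assumption-laden illustration, not code from the patch: `rewrite_with_warnings`, the `warn` callback, the toy `lookuprev` and the made-up hashes are all hypothetical, and a real implementation would report through whatever mechanism the converter already uses.

```python
import re

SHA1_RE = re.compile(r'\b[0-9a-f]{12,40}\b')

def rewrite_with_warnings(text, lookuprev, revmap, warn):
    """Rewrite known hashes; call warn() for known but unconverted ones."""
    for short in sorted(set(SHA1_RE.findall(text))):
        oldrev = lookuprev(short)
        if oldrev is None:
            continue                      # not a hash from the source repo
        newrev = revmap.get(oldrev)
        if newrev is None:
            # Known in the source, but that changeset is not converted yet.
            warn('%s is referenced but has not been converted yet' % short)
            continue
        text = text.replace(short, newrev[:len(short)])
    return text

# Toy data (made up): two source changesets, only one converted so far.
OLD_A, OLD_B, NEW_A = 'a' * 40, 'b' * 40, 'c' * 40
known_source = [OLD_A, OLD_B]
revmap = {OLD_A: NEW_A}

def lookuprev(prefix):
    # Resolve a prefix against the known source hashes (toy implementation).
    matches = [h for h in known_source if h.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None

messages = []
result = rewrite_with_warnings('graft of aaaaaaaaaaaa onto bbbbbbbbbbbb',
                               lookuprev, revmap, messages.append)
print(result)    # graft of cccccccccccc onto bbbbbbbbbbbb
print(messages)  # ['bbbbbbbbbbbb is referenced but has not been converted yet']
```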
Matt Mackall - April 17, 2013, 12:26 a.m.
On Fri, 2013-03-29 at 02:18 +0000, Matt Harbison wrote:
> On Wed, 27 Mar 2013 17:42:03 -0500, Sean Farley wrote:
> 
> One nit and one random thought.  Shouldn't this test do a 'log -r' to 
> print the (changed) message, and maybe the hash of the tagged rev too, in 
> order to demonstrate that they are in sync?
> 
> The random thought is, would it be useful to write something to stdout if 
> oldrev is not None but newrev is?  IOW you have a known source hash 
> reference, but you haven't converted that cset yet.  The --rev option 
> would then let you incrementally convert the required cset first.  You 
> might be able to trigger this with the various sort options if you've 
> grafted between branches, or commit msgs in branch A mention commits in 
> branch B and vice-versa.  Incremental convert probably isn't what you want 
> to do with sort options, but at least it warns you off.
> 
> When I was fiddling with something similar to this patch, I remember 
> finding the warning useful.  OTOH, the code that updates the tags file 
> doesn't warn for this case either.

There seem to be a few unanswered questions in this thread, so I'm going
to drop this for now.

Patch

diff --git a/hgext/convert/hg.py b/hgext/convert/hg.py
--- a/hgext/convert/hg.py
+++ b/hgext/convert/hg.py
@@ -23,10 +23,13 @@ 
 from mercurial.node import bin, hex, nullid
 from mercurial import hg, util, context, bookmarks, error
 
 from common import NoRepo, commit, converter_source, converter_sink
 
+import re
+sha1re = re.compile(r'\b[0-9a-f]{12,40}\b')
+
 class mercurial_sink(converter_sink):
     def __init__(self, ui, path):
         converter_sink.__init__(self, ui, path)
         self.branchnames = ui.configbool('convert', 'hg.usebranchnames', True)
         self.clonebranches = ui.configbool('convert', 'hg.clonebranches', False)
@@ -155,10 +158,18 @@ 
         if len(parents) < 2:
             parents.append(nullid)
         p2 = parents.pop(0)
 
         text = commit.desc
+
+        sha1s = re.findall(sha1re, text)
+        for sha1 in sha1s:
+            oldrev = source.lookuprev(sha1)
+            newrev = revmap.get(oldrev)
+            if newrev is not None:
+                text = text.replace(sha1, newrev[:len(sha1)])
+
         extra = commit.extra.copy()
         if self.branchnames and commit.branch:
             extra['branch'] = commit.branch
         if commit.rev:
             extra['convert_revision'] = commit.rev
diff --git a/tests/test-convert-hg-sink.t b/tests/test-convert-hg-sink.t
--- a/tests/test-convert-hg-sink.t
+++ b/tests/test-convert-hg-sink.t
@@ -117,8 +117,8 @@ 
   2 add foo/file
   1 Added tag some-tag for changeset ad681a868e44
   0 add baz
   $ cd new-filemap
   $ hg tags
-  tip                                2:6f4fd1df87fb
+  tip                                2:3c74706b1ff8
   some-tag                           0:ba8636729451
   $ cd ..