Patchwork cvsps: use commitids (when present) to detect changesets

login
register
mail settings
Submitter Frank Kingswood
Date Jan. 8, 2013, 8:18 p.m.
Message ID <309deee127b77d45c9d2.1357676280@andromeda.kingswood-consulting.co.uk>
Download mbox | patch
Permalink /patch/499/
State Accepted
Commit 1b7b5975793f7b72d2ed9337be543389d7b1ba4c
Headers show

Comments

Frank Kingswood - Jan. 8, 2013, 8:18 p.m.
# HG changeset patch
# User Frank Kingswood <frank@kingswood-consulting.co.uk>
# Date 1357675880 0
# Node ID 309deee127b77d45c9d2f7574a4addcc3bb286f1
# Parent  a32255dee8599db1e1fdcca1a4d09f0ed25b6d60
cvsps: use commitids (when present) to detect changesets
Simplify core logic by no longer attempting to work around missing
class attributes. Instead always generate the attributes and ignore
the cache if the attributes are missing
Bryan O'Sullivan - Jan. 8, 2013, 11:38 p.m.
On Tue, Jan 08, 2013 at 08:18:00PM +0000, Frank Kingswood wrote:
> cvsps: use commitids (when present) to detect changesets

Applied, but ...

>          # first changeset and bar the next and MYBRANCH and MYBRANCH2
>          # should both start off of the bar changeset. No provisions are
>          # made to ensure that this is, in fact, what happens.
> +        if not (c and e.branchpoints == c.branchpoints and
> +                  (   # cvs commitids
> +                      (e.commitid is not None and e.commitid == c.commitid)
> +                      or
> +                      ( # no commitids, use fuzzy commit detection
> +                        (e.commitid is None or c.commitid is None) and
> +                        e.comment == c.comment and
> +                        e.author == c.author and
> +                        e.branch == c.branch and
> +                        ((c.date[0] + c.date[1]) <=
> +                         (e.date[0] + e.date[1]) <=
> +                         (c.date[0] + c.date[1]) + fuzz) and
> +                        e.file not in files
> +                      )
> +                  )):

This is the most incomprehensible if-expression I've seen in many a year :-(

(Don't worry, I'm aware you didn't make it worse.)
Bryan O'Sullivan - Jan. 8, 2013, 11:45 p.m.
On Tue, Jan 08, 2013 at 08:18:00PM +0000, Frank Kingswood wrote:
> cvsps: use commitids (when present) to detect changesets

Incidentally, your change broke some tests, which I had to fix after
the fact:

http://hg.intevation.org/mercurial/crew/rev/ed923a2d5ae9
Bryan O'Sullivan - Jan. 9, 2013, 12:51 a.m.
On Tue, Jan 8, 2013 at 3:45 PM, Bryan O'Sullivan <bos@serpentine.com> wrote:

> Incidentally, your change broke some tests, which I had to fix after
> the fact:
>
> http://hg.intevation.org/mercurial/crew/rev/ed923a2d5ae9
>

Ugh, it's worse than I thought. Your change made the test produce different
output depending on whether cvs 1.12 or older is installed on the system,
so my commit doesn't fix anything, it just changes the set of machines on
which the test fails.

Please advise.
Frank Kingswood - Jan. 9, 2013, 9:25 a.m.
On 09/01/13 00:51, Bryan O'Sullivan wrote:
> On Tue, Jan 8, 2013 at 3:45 PM, Bryan O'Sullivan <bos@serpentine.com 
> <mailto:bos@serpentine.com>> wrote:
>
>     Incidentally, your change broke some tests, which I had to fix after
>     the fact:
>
>     http://hg.intevation.org/mercurial/crew/rev/ed923a2d5ae9
>
>
> Ugh, it's worse than I thought. Your change made the test produce 
> different output depending on whether cvs 1.12 or older is installed 
> on the system, so my commit doesn't fix anything, it just changes the 
> set of machines on which the test fails.
Eww, that's awkward.

Of course the *user* will automagically get the right behaviour: if they 
have an old cvs then they probably have no commitids anyway, and if 
their repository has commitids then they probably have a new cvs.
For that reason I would prefer not to make this conditional - as you 
point out it already has the mother of all if expressions.

Could we make the changed testcase require cvs 1.12 or is there no 
mechanism to check dependent tool version numbers?

Frank
Bryan O'Sullivan - Jan. 9, 2013, 9:18 p.m.
On Wed, Jan 9, 2013 at 1:25 AM, Frank Kingswood <
frank@kingswood-consulting.co.uk> wrote:

> Could we make the changed testcase require cvs 1.12 or is there no
> mechanism to check dependent tool version numbers?
>

That's easily done: http://hg.intevation.org/mercurial/crew/rev/9589227657bc

However, there remains a problem with your patch.

Prior to your patch, the output of test-convert-cvs.t and
test-convert-cvs-synthetic.t were stable.

Your patch broke the output of test-convert-cvs.t, which I fixed in
http://hg.intevation.org/mercurial/crew/rev/ed923a2d5ae9

The output of that test is now stable under both CVS 1.11 and 1.12.

Your patch *also* broke the output of test-convert-cvs-synthetic.t. It now
produces different results than before under both CVS 1.11 and CVS 1.12,
and the 1.11 and 1.12 results differ from each other. I didn't catch this
distinction until this morning.

Since the new output is different from the old output no matter which
version of CVS I use, I suspect that your patch has introduced a regression.

I don't know squat about CVS, so I am not the right person to clean this
up. Please run the test suite from the crew repo, figure out what the
correct output from test-convert-cvs-synthetic.t ought to be (and which
version of CVS the test should use), and send a patch that fixes the
regression.
Frank Kingswood - Jan. 11, 2013, 5:38 p.m.
On 09/01/13 21:18, Bryan O'Sullivan wrote:

> Since the new output is different from the old output no matter which
> version of CVS I use, I suspect that your patch has introduced a regression.

Okay I have a patch which I thought I had sent yesterday but it seems it 
did not go to the list, apologies. That fixes a number of issues and 
does a bit of cleanup (for example that huge conditional).

I'll resend that patch this weekend but it is not actually correct 
either, but better. Someone who actually uses CVS should have a look. I 
think Martin wrote bits of that code, so I've CCed him in the hope that 
he might be the right person.

> I don't know squat about CVS, so I am not the right person to clean this
> up. Please run the test suite from the crew repo, figure out what the

So I'm hoping for help with this, I'm not able to spend more than 
trivial time on maintaining cvs conversion.

Frank

Patch

diff --git a/hgext/convert/cvsps.py b/hgext/convert/cvsps.py
--- a/hgext/convert/cvsps.py
+++ b/hgext/convert/cvsps.py
@@ -19,6 +19,7 @@ 
         .branch    - name of branch this revision is on
         .branches  - revision tuple of branches starting at this revision
         .comment   - commit message
+        .commitid  - CVS commitid or None
         .date      - the commit date as a (time, tz) tuple
         .dead      - true if file revision is dead
         .file      - Name of file
@@ -28,19 +29,17 @@ 
         .revision  - revision number as tuple
         .tags      - list of tags on the file
         .synthetic - is this a synthetic "file ... added on ..." revision?
-        .mergepoint- the branch that has been merged from
-                     (if present in rlog output)
-        .branchpoints- the branches that start at the current entry
+        .mergepoint - the branch that has been merged from (if present in
+                      rlog output) or None
+        .branchpoints - the branches that start at the current entry or empty
     '''
     def __init__(self, **entries):
         self.synthetic = False
         self.__dict__.update(entries)
 
     def __repr__(self):
-        return "<%s at 0x%x: %s %s>" % (self.__class__.__name__,
-                                        id(self),
-                                        self.file,
-                                        ".".join(map(str, self.revision)))
+        items = ("%s=%r"%(k, self.__dict__[k]) for k in sorted(self.__dict__))
+        return "%s(%s)"%(type(self).__name__, ", ".join(items))
 
 class logerror(Exception):
     pass
@@ -113,6 +112,7 @@ 
     re_50 = re.compile('revision ([\\d.]+)(\s+locked by:\s+.+;)?$')
     re_60 = re.compile(r'date:\s+(.+);\s+author:\s+(.+);\s+state:\s+(.+?);'
                        r'(\s+lines:\s+(\+\d+)?\s+(-\d+)?;)?'
+                       r'(\s+commitid:\s+([^;]+);)?'
                        r'(.*mergepoint:\s+([^;]+);)?')
     re_70 = re.compile('branches: (.+);$')
 
@@ -171,6 +171,14 @@ 
         try:
             ui.note(_('reading cvs log cache %s\n') % cachefile)
             oldlog = pickle.load(open(cachefile))
+            for e in oldlog:
+               if not (util.safehasattr(e, 'branchpoints') and
+                       util.safehasattr(e, 'commitid') and
+                       util.safehasattr(e, 'mergepoint')):
+                  ui.status(_('ignoring old cache\n'))
+                  oldlog = []
+                  break
+
             ui.note(_('cache has %d log entries\n') % len(oldlog))
         except Exception, e:
             ui.note(_('error reading cache: %r\n') % e)
@@ -298,7 +306,8 @@ 
             assert match, _('expected revision number')
             e = logentry(rcs=scache(rcs), file=scache(filename),
                     revision=tuple([int(x) for x in match.group(1).split('.')]),
-                    branches=[], parent=None)
+                    branches=[], parent=None, commitid=None, mergepoint=None, branchpoints=set())
+
             state = 6
 
         elif state == 6:
@@ -329,8 +338,11 @@ 
             else:
                 e.lines = None
 
-            if match.group(7): # cvsnt mergepoint
-                myrev = match.group(8).split('.')
+            if match.group(7): # cvs 1.12 commitid
+                e.commitid = match.group(8)
+
+            if match.group(9): # cvsnt mergepoint
+                myrev = match.group(10).split('.')
                 if len(myrev) == 2: # head
                     e.mergepoint = 'HEAD'
                 else:
@@ -339,8 +351,7 @@ 
                     assert len(branches) == 1, ('unknown branch: %s'
                                                 % e.mergepoint)
                     e.mergepoint = branches[0]
-            else:
-                e.mergepoint = None
+
             e.comment = []
             state = 7
 
@@ -469,23 +480,22 @@ 
         .author    - author name as CVS knows it
         .branch    - name of branch this changeset is on, or None
         .comment   - commit message
+        .commitid  - CVS commitid or None
         .date      - the commit date as a (time,tz) tuple
         .entries   - list of logentry objects in this changeset
         .parents   - list of one or two parent changesets
         .tags      - list of tags on this changeset
         .synthetic - from synthetic revision "file ... added on branch ..."
-        .mergepoint- the branch that has been merged from
-                     (if present in rlog output)
-        .branchpoints- the branches that start at the current entry
+        .mergepoint- the branch that has been merged from or None
+        .branchpoints- the branches that start at the current entry or empty
     '''
     def __init__(self, **entries):
         self.synthetic = False
         self.__dict__.update(entries)
 
     def __repr__(self):
-        return "<%s at 0x%x: %s>" % (self.__class__.__name__,
-                                     id(self),
-                                     getattr(self, 'id', "(no id)"))
+        items = ("%s=%r"%(k, self.__dict__[k]) for k in sorted(self.__dict__))
+        return "%s(%s)"%(type(self).__name__, ", ".join(items))
 
 def createchangeset(ui, log, fuzz=60, mergefrom=None, mergeto=None):
     '''Convert log into changesets.'''
@@ -493,8 +503,7 @@ 
     ui.status(_('creating changesets\n'))
 
     # Merge changesets
-
-    log.sort(key=lambda x: (x.comment, x.author, x.branch, x.date))
+    log.sort(key=lambda x: (x.commitid, x.comment, x.author, x.branch, x.date, x.branchpoints))
 
     changesets = []
     files = set()
@@ -517,22 +526,27 @@ 
         # first changeset and bar the next and MYBRANCH and MYBRANCH2
         # should both start off of the bar changeset. No provisions are
         # made to ensure that this is, in fact, what happens.
-        if not (c and
-                  e.comment == c.comment and
-                  e.author == c.author and
-                  e.branch == c.branch and
-                  (not util.safehasattr(e, 'branchpoints') or
-                    not util.safehasattr (c, 'branchpoints') or
-                    e.branchpoints == c.branchpoints) and
-                  ((c.date[0] + c.date[1]) <=
-                   (e.date[0] + e.date[1]) <=
-                   (c.date[0] + c.date[1]) + fuzz) and
-                  e.file not in files):
+        if not (c and e.branchpoints == c.branchpoints and
+                  (   # cvs commitids
+                      (e.commitid is not None and e.commitid == c.commitid)
+                      or
+                      ( # no commitids, use fuzzy commit detection
+                        (e.commitid is None or c.commitid is None) and
+                        e.comment == c.comment and
+                        e.author == c.author and
+                        e.branch == c.branch and
+                        ((c.date[0] + c.date[1]) <=
+                         (e.date[0] + e.date[1]) <=
+                         (c.date[0] + c.date[1]) + fuzz) and
+                        e.file not in files
+                      )
+                  )):
             c = changeset(comment=e.comment, author=e.author,
-                          branch=e.branch, date=e.date, entries=[],
-                          mergepoint=getattr(e, 'mergepoint', None),
-                          branchpoints=getattr(e, 'branchpoints', set()))
+                          branch=e.branch, date=e.date,
+                          entries=[], mergepoint=e.mergepoint,
+                          branchpoints=e.branchpoints, commitid=e.commitid)
             changesets.append(c)
+
             files = set()
             if len(changesets) % 100 == 0:
                 t = '%d %s' % (len(changesets), repr(e.comment)[1:-1])
@@ -808,9 +822,8 @@ 
             ui.write(('Branch: %s\n' % (cs.branch or 'HEAD')))
             ui.write(('Tag%s: %s \n' % (['', 's'][len(cs.tags) > 1],
                                   ','.join(cs.tags) or '(none)')))
-            branchpoints = getattr(cs, 'branchpoints', None)
-            if branchpoints:
-                ui.write(('Branchpoints: %s \n' % ', '.join(branchpoints)))
+            if cs.branchpoints:
+                ui.write('Branchpoints: %s \n' % ', '.join(cs.branchpoints))
             if opts["parents"] and cs.parents:
                 if len(cs.parents) > 1:
                     ui.write(('Parents: %s\n' %