Patchwork [2,of,5,STABLE] merge: new format for the state file

Submitter Pierre-Yves David
Date Feb. 26, 2014, 10:58 p.m.
Message ID <d229b5d2096e7fdcc362.1393455484@marginatus.alto.octopoid.net>
Permalink /patch/3774/
State Superseded

Comments

Pierre-Yves David - Feb. 26, 2014, 10:58 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1393382226 28800
#      Tue Feb 25 18:37:06 2014 -0800
# Branch stable
# Node ID d229b5d2096e7fdcc362fe22ff049ae257365c51
# Parent  a0c9e2941511a01624a0a75c000190f4ce3ed467
merge: new format for the state file

This new format will allow us to address common bugs with special merges
(graft, backout, rebase…) and to record user choices during conflict resolution.

The format is open, so we can add more record types for future use.

I'm keeping the hexified version of the node in this file to help humans who
want to debug it by hand. I do not expect the overhead or extra size to be an issue.
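
The record layout described in the patch's docstring, `[type][length][content]`, can be sketched as a standalone snippet. This is a minimal Python 3 illustration, not the Mercurial code itself: `pack_record` and `iter_records` are hypothetical helper names, and the length is assumed to be a big-endian unsigned 32-bit integer, matching the `'>I'` format the patch passes to `struct`.

```python
import struct


def pack_record(rtype: bytes, content: bytes) -> bytes:
    """Encode one record as [type][length][content]."""
    assert len(rtype) == 1
    # '>' selects big-endian with no padding, 'c' is a single byte,
    # 'I' is a 4-byte unsigned integer holding the content length.
    return struct.pack(">cI", rtype, len(content)) + content


def iter_records(data: bytes):
    """Yield (type, content) pairs from a concatenation of records."""
    off = 0
    while off < len(data):
        rtype = data[off:off + 1]
        (length,) = struct.unpack(">I", data[off + 1:off + 5])
        off += 5  # skip the 1-byte type and the 4-byte length
        yield rtype, data[off:off + length]
        off += length


# Illustrative values only: a hexified node and a NUL-separated file entry.
node = b"a0c9e2941511a01624a0a75c000190f4ce3ed467"
blob = pack_record(b"L", node) + pack_record(b"F", b"foo.txt\0u")
records = list(iter_records(blob))
```

Each record costs five bytes of framing on top of its content, so a 40-character hexified node is stored in 45 bytes.
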
Siddharth Agarwal - Feb. 26, 2014, 11:07 p.m.
On 02/26/2014 02:58 PM, pierre-yves.david@ens-lyon.org wrote:
> +_pack = struct.pack
> +_unpack = struct.unpack
> +
>   class mergestate(object):
> -    '''track 3-way merge state of individual files'''
> -    statepath = "merge/state"
> +    '''track 3-way merge state of individual files
>   
> +    current format is a list of arbitrary record of the form:
> +
> +        [type][length][content]
> +
> +    Type is a single character, length is a 4 bytes integer, content is an
> +    arbitrary suites of bytes of lenght `length`.
> +
> +    Type should be a letter. Capital letter are mandatory record, Mercurial
> +    should abort if they are unknown. lower case record can be safely ignored.
> +
> +    Currently known record:
> +
> +    L: the node of the "local" part of the merge (hexified version)
> +    F: a file to be merged entry
> +    '''
> +    statepath = "merge/state2"

So a merge started with a version of hg with this patch cannot be 
completed with a version of hg without this patch, and vice versa.
Pierre-Yves David - Feb. 26, 2014, 11:10 p.m.
On 02/26/2014 03:07 PM, Siddharth Agarwal wrote:
> On 02/26/2014 02:58 PM, pierre-yves.david@ens-lyon.org wrote:
>> +_pack = struct.pack
>> +_unpack = struct.unpack
>> +
>>   class mergestate(object):
>> -    '''track 3-way merge state of individual files'''
>> -    statepath = "merge/state"
>> +    '''track 3-way merge state of individual files
>> +    current format is a list of arbitrary record of the form:
>> +
>> +        [type][length][content]
>> +
>> +    Type is a single character, length is a 4 bytes integer, content
>> is an
>> +    arbitrary suites of bytes of lenght `length`.
>> +
>> +    Type should be a letter. Capital letter are mandatory record,
>> Mercurial
>> +    should abort if they are unknown. lower case record can be safely
>> ignored.
>> +
>> +    Currently known record:
>> +
>> +    L: the node of the "local" part of the merge (hexified version)
>> +    F: a file to be merged entry
>> +    '''
>> +    statepath = "merge/state2"
>
> So a merge started with a version of hg with this patch cannot be
> completed with a version of hg without this patch, and vice versa.

We have some ideas for compatibility. Matt wanted to see a simple version 
first.
Olle Lundberg - Feb. 27, 2014, 12:17 a.m.
On Wed, Feb 26, 2014 at 11:58 PM, <pierre-yves.david@ens-lyon.org> wrote:

> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1393382226 28800
> #      Tue Feb 25 18:37:06 2014 -0800
> # Branch stable
> # Node ID d229b5d2096e7fdcc362fe22ff049ae257365c51
> # Parent  a0c9e2941511a01624a0a75c000190f4ce3ed467
> merge: new format for the state file
>
> This new format will allow use to address common bugs with doing special
> merge
> (graft, backout, rebase...) and record user choice during conflict
> resolution.
>
> The format is open so we can add more record for future usage.
>
> I'm keeping hexified version of node in this file to help human willing to
> debug
> it by hand. I do not expect the overhead or oversize to be an issue.
>
> diff --git a/mercurial/merge.py b/mercurial/merge.py
> --- a/mercurial/merge.py
> +++ b/mercurial/merge.py
> @@ -3,20 +3,40 @@
>  # Copyright 2006, 2007 Matt Mackall <mpm@selenic.com>
>  #
>  # This software may be used and distributed according to the terms of the
>  # GNU General Public License version 2 or any later version.
>
> +import struct
> +
>  from node import nullid, nullrev, hex, bin
>  from i18n import _
>  from mercurial import obsolete
>  import error, util, filemerge, copies, subrepo, worker, dicthelpers
>  import errno, os, shutil
>
> +_pack = struct.pack
> +_unpack = struct.unpack
> +
>  class mergestate(object):
> -    '''track 3-way merge state of individual files'''
> -    statepath = "merge/state"
> +    '''track 3-way merge state of individual files
>
> +    current format is a list of arbitrary record of the form:
> +
> +        [type][length][content]
> +
> +    Type is a single character, length is a 4 bytes integer, content is an
> +    arbitrary suites of bytes of lenght `length`.
>
length

> +
> +    Type should be a letter. Capital letter are mandatory record,
> Mercurial
> +    should abort if they are unknown. lower case record can be safely
> ignored.
> +
> +    Currently known record:
> +
> +    L: the node of the "local" part of the merge (hexified version)
> +    F: a file to be merged entry
> +    '''
> +    statepath = "merge/state2"
>      def __init__(self, repo):
>          self._repo = repo
>          self._dirty = False
>          self._read()
>      def reset(self, node=None):
> @@ -27,29 +47,52 @@ class mergestate(object):
>          self._dirty = False
>      def _read(self):
>          self._state = {}
>          try:
>              f = self._repo.opener(self.statepath)
> -            for i, l in enumerate(f):
> -                if i == 0:
> -                    self._local = bin(l[:-1])
> -                else:
> -                    bits = l[:-1].split("\0")
> +            for rtype, record in self._readrecords(f):
> +                if rtype == 'L':
> +                    self._local = bin(record)
> +                elif rtype == "F":
> +                    bits = record.split("\0")
>                      self._state[bits[0]] = bits[1:]
> +                elif not rtype.islower():
> +                    raise util.Abort(_('unsupported merge state record:'
> +                                       % rtype))
>              f.close()
>          except IOError, err:
>              if err.errno != errno.ENOENT:
>                  raise
>          self._dirty = False
> +    def _readrecords(self, f):
> +        data = f.read()
> +        off = 0
> +        end = len(data)
> +        while off < end:
> +            rtype = data[off]
> +            off += 1
> +            lenght = _unpack('>I', data[off:(off + 4)])[0]
>
length

> +            off += 4
> +            record = data[off:(off + lenght)]
>
length

> +            off += lenght
>
length

> +            yield rtype, record
> +
>      def commit(self):
>          if self._dirty:
> +            records = []
> +            records.append(("L", hex(self._local)))
> +            for d, v in self._state.iteritems():
> +                records.append(("F", "\0".join([d] + v)))
> +            self._writerecords(records)
> +            self._dirty = False
> +    def _writerecords(self, records):
>              f = self._repo.opener(self.statepath, "w")
> -            f.write(hex(self._local) + "\n")
> -            for d, v in self._state.iteritems():
> -                f.write("\0".join([d] + v) + "\n")
> +            for key, data in records:
> +                assert len(key) == 1
> +                format = ">sI%is" % len(data)
> +                f.write(_pack(format, key, len(data), data))
>              f.close()
> -            self._dirty = False
>      def add(self, fcl, fco, fca, fd):
>          hash = util.sha1(fcl.path()).hexdigest()
>          self._repo.opener.write("merge/" + hash, fcl.data())
>          self._state[fd] = ['u', hash, fcl.path(), fca.path(),
>                             hex(fca.filenode()), fco.path(), fcl.flags()]

Patch

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -3,20 +3,40 @@ 
 # Copyright 2006, 2007 Matt Mackall <mpm@selenic.com>
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
+import struct
+
 from node import nullid, nullrev, hex, bin
 from i18n import _
 from mercurial import obsolete
 import error, util, filemerge, copies, subrepo, worker, dicthelpers
 import errno, os, shutil
 
+_pack = struct.pack
+_unpack = struct.unpack
+
 class mergestate(object):
-    '''track 3-way merge state of individual files'''
-    statepath = "merge/state"
+    '''track 3-way merge state of individual files
 
+    current format is a list of arbitrary record of the form:
+
+        [type][length][content]
+
+    Type is a single character, length is a 4 bytes integer, content is an
+    arbitrary suites of bytes of lenght `length`.
+
+    Type should be a letter. Capital letter are mandatory record, Mercurial
+    should abort if they are unknown. lower case record can be safely ignored.
+
+    Currently known record:
+
+    L: the node of the "local" part of the merge (hexified version)
+    F: a file to be merged entry
+    '''
+    statepath = "merge/state2"
     def __init__(self, repo):
         self._repo = repo
         self._dirty = False
         self._read()
     def reset(self, node=None):
@@ -27,29 +47,52 @@  class mergestate(object):
         self._dirty = False
     def _read(self):
         self._state = {}
         try:
             f = self._repo.opener(self.statepath)
-            for i, l in enumerate(f):
-                if i == 0:
-                    self._local = bin(l[:-1])
-                else:
-                    bits = l[:-1].split("\0")
+            for rtype, record in self._readrecords(f):
+                if rtype == 'L':
+                    self._local = bin(record)
+                elif rtype == "F":
+                    bits = record.split("\0")
                     self._state[bits[0]] = bits[1:]
+                elif not rtype.islower():
+                    raise util.Abort(_('unsupported merge state record:'
+                                       % rtype))
             f.close()
         except IOError, err:
             if err.errno != errno.ENOENT:
                 raise
         self._dirty = False
+    def _readrecords(self, f):
+        data = f.read()
+        off = 0
+        end = len(data)
+        while off < end:
+            rtype = data[off]
+            off += 1
+            lenght = _unpack('>I', data[off:(off + 4)])[0]
+            off += 4
+            record = data[off:(off + lenght)]
+            off += lenght
+            yield rtype, record
+
     def commit(self):
         if self._dirty:
+            records = []
+            records.append(("L", hex(self._local)))
+            for d, v in self._state.iteritems():
+                records.append(("F", "\0".join([d] + v)))
+            self._writerecords(records)
+            self._dirty = False
+    def _writerecords(self, records):
             f = self._repo.opener(self.statepath, "w")
-            f.write(hex(self._local) + "\n")
-            for d, v in self._state.iteritems():
-                f.write("\0".join([d] + v) + "\n")
+            for key, data in records:
+                assert len(key) == 1
+                format = ">sI%is" % len(data)
+                f.write(_pack(format, key, len(data), data))
             f.close()
-            self._dirty = False
     def add(self, fcl, fco, fca, fd):
         hash = util.sha1(fcl.path()).hexdigest()
         self._repo.opener.write("merge/" + hash, fcl.data())
         self._state[fd] = ['u', hash, fcl.path(), fca.path(),
                            hex(fca.filenode()), fco.path(), fcl.flags()]
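
The rule that uppercase record types are mandatory while lowercase ones may be ignored can be sketched outside Mercurial as follows. This is a minimal Python 3 illustration under stated assumptions: `UnsupportedRecord` stands in for `util.Abort`, and `rec` and `load_state` are hypothetical names that do not appear in the patch.

```python
import struct


class UnsupportedRecord(Exception):
    """Hypothetical stand-in for util.Abort in this sketch."""


def rec(rtype: bytes, content: bytes) -> bytes:
    """Build one [type][length][content] record (helper for the example)."""
    return struct.pack(">cI", rtype, len(content)) + content


def load_state(data: bytes):
    """Parse records, aborting on unknown mandatory (non-lowercase) types."""
    local = None
    files = {}
    off = 0
    while off < len(data):
        rtype = data[off:off + 1].decode("latin-1")
        (length,) = struct.unpack(">I", data[off + 1:off + 5])
        off += 5
        record = data[off:off + length]
        off += length
        if rtype == "L":
            # The node is stored hexified; convert it back to raw bytes.
            local = bytes.fromhex(record.decode("ascii"))
        elif rtype == "F":
            bits = record.split(b"\0")
            files[bits[0]] = bits[1:]
        elif not rtype.islower():
            raise UnsupportedRecord(
                "unsupported merge state record: %s" % rtype)
        # Unknown lowercase record types are advisory and skipped silently.
    return local, files


# A reader that does not know the lowercase "x" record still succeeds.
data = rec(b"L", b"00" * 20) + rec(b"x", b"future") + rec(b"F", b"a.txt\0u")
local, files = load_state(data)
```

The error message in this sketch carries a `%s` placeholder. The patch as posted applies `%` to a message string that has no placeholder, which would raise a `TypeError` instead of the intended abort.
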