Patchwork [RFC,v7] scmutil: add a simple key-value file helper

login
register
mail settings
Submitter Kostia Balytskyi
Date Dec. 6, 2016, 9:41 p.m.
Message ID <211cfe5a24d18218720c.1481060467@dev1902.lla1.facebook.com>
Download mbox | patch
Permalink /patch/17842/
State Changes Requested
Headers show

Comments

Kostia Balytskyi - Dec. 6, 2016, 9:41 p.m.
# HG changeset patch
# User Kostia Balytskyi <ikostia@fb.com>
# Date 1481060343 28800
#      Tue Dec 06 13:39:03 2016 -0800
# Node ID 211cfe5a24d18218720c38a601dfb2b88373b59a
# Parent  831d29deed083618bddc2fade7d2a6fe297c896d
scmutil: add a simple key-value file helper

The purpose of the added class is to serve purposes like save files of shelve
or state files of shelve, rebase and histedit. Keys of these files can be
alphanumeric and start with letters, while values must not contain newlines.
Keys which start with an uppercase letter are required, while other keys
are optional.

In light of Mercurial's reluctancy to use Python's json module, this tries
to provide a reasonable alternative for a non-nested named data.
Comparing to current approach of storing state in plain text files, where
semantic meaning of lines of text is only determined by their oreder,
simple key-value file allows for reordering lines and thus helps handle
optional values.

Initial use-case I see for this is obs-shelve's shelve files. Later we
can possibly migrate state files to this approach.

The test is in a new file beause I did not figure out where to put it
within existing test suite. If you give me a better idea, I will gladly
follow it.
Jun Wu - Dec. 6, 2016, 9:53 p.m.
Excerpts from Kostia Balytskyi's message of 2016-12-06 13:41:07 -0800:
>  class CorruptedState(Exception):
>      """error raised when a command is not able to read its state from file"""
> +
> +class MissingRequiredKeyInFileException(Exception):
> +    """error raised when simple key-value file misses a required key"""

I still think "CorruptedState" is better. It fits the use-case, is short and
concise. 6-word exception sounds strange to me, partially because everything
else in error.py is at most 4 words. If we have to use a new exception,
maybe just "MissingRequiredKey", or just "KeyError".
Kostia Balytskyi - Dec. 6, 2016, 10:04 p.m.
On 12/6/16 9:53 PM, Jun Wu wrote:
> Excerpts from Kostia Balytskyi's message of 2016-12-06 13:41:07 -0800:

>>   class CorruptedState(Exception):

>>       """error raised when a command is not able to read its state from file"""

>> +

>> +class MissingRequiredKeyInFileException(Exception):

>> +    """error raised when simple key-value file misses a required key"""

> I still think "CorruptedState" is better. It fits the use-case, is short and

> concise. 6-word exception sounds strange to me, partially because everything

> else in error.py is at most 4 words. If we have to use a new exception,

> maybe just "MissingRequiredKey", or just "KeyError".

I agree that the name I chose is too long. But CorruptedState is a 
significantly different thing than missing key and gut feeling tells me 
it should sometimes be treated differently. I can't think of an example 
right now though.
> _______________________________________________

> Mercurial-devel mailing list

> Mercurial-devel@mercurial-scm.org

> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - Dec. 14, 2016, 2:17 a.m.
On 12/06/2016 10:53 PM, Jun Wu wrote:
> Excerpts from Kostia Balytskyi's message of 2016-12-06 13:41:07 -0800:
>>  class CorruptedState(Exception):
>>      """error raised when a command is not able to read its state from file"""
>> +
>> +class MissingRequiredKeyInFileException(Exception):
>> +    """error raised when simple key-value file misses a required key"""
>
> I still think "CorruptedState" is better. It fits the use-case, is short and
> concise. 6-word exception sounds strange to me, partially because everything
> else in error.py is at most 4 words. If we have to use a new exception,
> maybe just "MissingRequiredKey", or just "KeyError".

I've not looked at any logic or any context for this, but 
"MissingRequiredKeyInFileException" is most certainly a 
TooLongNameToBeAccepted

See https://lwn.net/Articles/455265/ for details ;-)

Cheers,
Augie Fackler - Dec. 15, 2016, 4:49 p.m.
On Wed, Dec 14, 2016 at 03:17:42AM +0100, Pierre-Yves David wrote:
>
>
> On 12/06/2016 10:53 PM, Jun Wu wrote:
> >Excerpts from Kostia Balytskyi's message of 2016-12-06 13:41:07 -0800:
> >> class CorruptedState(Exception):
> >>     """error raised when a command is not able to read its state from file"""
> >>+
> >>+class MissingRequiredKeyInFileException(Exception):
> >>+    """error raised when simple key-value file misses a required key"""
> >
> >I still think "CorruptedState" is better. It fits the use-case, is short and
> >concise. 6-word exception sounds strange to me, partially because everything
> >else in error.py is at most 4 words. If we have to use a new exception,
> >maybe just "MissingRequiredKey", or just "KeyError".
>
> I've not looked at any logic or any context for this, but
> "MissingRequiredKeyInFileException" is most certainly a
> TooLongNameToBeAccepted

+1 to MissingRequiredKey (let's avoid KeyError since this is slightly
more domain-specific and that's a builtin name)

>
> See https://lwn.net/Articles/455265/ for details ;-)
>
> Cheers,
>
> --
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Kostia Balytskyi - Dec. 15, 2016, 6:52 p.m.
On 12/15/16 4:49 PM, Augie Fackler wrote:
> On Wed, Dec 14, 2016 at 03:17:42AM +0100, Pierre-Yves David wrote:

>>

>> On 12/06/2016 10:53 PM, Jun Wu wrote:

>>> Excerpts from Kostia Balytskyi's message of 2016-12-06 13:41:07 -0800:

>>>> class CorruptedState(Exception):

>>>>      """error raised when a command is not able to read its state from file"""

>>>> +

>>>> +class MissingRequiredKeyInFileException(Exception):

>>>> +    """error raised when simple key-value file misses a required key"""

>>> I still think "CorruptedState" is better. It fits the use-case, is short and

>>> concise. 6-word exception sounds strange to me, partially because everything

>>> else in error.py is at most 4 words. If we have to use a new exception,

>>> maybe just "MissingRequiredKey", or just "KeyError".

>> I've not looked at any logic or any context for this, but

>> "MissingRequiredKeyInFileException" is most certainly a

>> TooLongNameToBeAccepted

> +1 to MissingRequiredKey (let's avoid KeyError since this is slightly

> more domain-specific and that's a builtin name)

That's what I've sent in v8.
>

>> See https://lwn.net/Articles/455265/ for details ;-)

>>

>> Cheers,

>>

>> --

>> Pierre-Yves David

>> _______________________________________________

>> Mercurial-devel mailing list

>> Mercurial-devel@mercurial-scm.org

>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

> _______________________________________________

> Mercurial-devel mailing list

> Mercurial-devel@mercurial-scm.org

> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/error.py b/mercurial/error.py
--- a/mercurial/error.py
+++ b/mercurial/error.py
@@ -246,3 +246,6 @@  class UnsupportedBundleSpecification(Exc
 
 class CorruptedState(Exception):
     """error raised when a command is not able to read its state from file"""
+
+class MissingRequiredKeyInFileException(Exception):
+    """error raised when simple key-value file misses a required key"""
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -1571,3 +1571,123 @@  class checkambigatclosing(closewrapbase)
     def close(self):
         self._origfh.close()
         self._checkambig()
+
+class simplekeyvaluefile(object):
+    """A simple file with key=value lines
+
+    Keys must be alphanumerics and start with a letter, values must not
+    contain '\n' characters
+
+    >>> contents = {}
+    >>> class fileobj(object):
+    ...     def __init__(self, name):
+    ...         self.name = name
+    ...     def __enter__(self):
+    ...         return self
+    ...     def __exit__(self, *args, **kwargs):
+    ...         pass
+    ...     def write(self, text):
+    ...         contents[self.name] = text
+    ...     def read(self):
+    ...         return contents[self.name]
+    >>> class mockvfs(object):
+    ...     def read(self, path):
+    ...         return fileobj(path).read()
+    ...     def readlines(self, path):
+    ...         return fileobj(path).read().split('\\n')
+    ...     def __call__(self, path, mode, atomictemp):
+    ...         return fileobj(path)
+    >>> vfs = mockvfs()
+
+    Basic testing of whether simple key-value file works:
+    >>> d = {'key1': 'value1', 'Key2': 'value2'}
+    >>> simplekeyvaluefile(vfs, 'kvfile').write(d)
+    >>> print sorted(vfs.read('kvfile').split('\\n'))
+    ['', 'Key2=value2', 'key1=value1']
+
+    Testing of whether invalid keys are detected:
+    >>> d = {'0key1': 'value1', 'Key2': 'value2'}
+    >>> simplekeyvaluefile(vfs, 'kvfile').write(d)
+    Traceback (most recent call last):
+        ...
+    ProgrammingError: keys must start with a letter ...
+    >>> d = {'key1@': 'value1', 'Key2': 'value2'}
+    >>> simplekeyvaluefile(vfs, 'kvfile').write(d)
+    Traceback (most recent call last):
+        ...
+    ProgrammingError: invalid key name in a simple key-value file
+
+    Testing of whether invalid values are detected:
+    >>> d = {'key1': 'value1', 'Key2': 'value2\\n'}
+    >>> simplekeyvaluefile(vfs, 'kvfile').write(d)
+    Traceback (most recent call last):
+        ...
+    ProgrammingError: invalid value in a simple key-value file
+
+    Test cases when necessary keys are present
+    >>> d = {'key1': 'value1', 'Key2': 'value2'}
+    >>> simplekeyvaluefile(vfs, 'allkeyshere').write(d)
+    >>> class kvf(simplekeyvaluefile):
+    ...     KEYS = [('key3', False), ('Key2', True)]
+    >>> print sorted(kvf(vfs, 'allkeyshere').read().items())
+    [('Key2', 'value'), ('key1', 'value')]
+
+    Test cases when necessary keys are absent
+    >>> d = {'key1': 'value1', 'Key3': 'value2'}
+    >>> simplekeyvaluefile(vfs, 'missingkeys').write(d)
+    >>> class kvf(simplekeyvaluefile):
+    ...     KEYS = [('key3', False), ('Key2', True)]
+    >>> print sorted(kvf(vfs, 'missingkeys').read().items())
+    Traceback (most recent call last):
+        ...
+    MissingRequiredKeyInFileException: missing a required key: 'Key2'
+
+    Test cases when file is not really a simple key-value file
+    >>> contents['badfile'] = 'ababagalamaga\\n'
+    >>> simplekeyvaluefile(vfs, 'badfile').read()
+    Traceback (most recent call last):
+        ...
+    CorruptedState: dictionary ... element #0 has length 1; 2 is required
+    """
+
+    # if KEYS is non-empty, read values are validated against it:
+    # each key is a tuple (keyname, required)
+    KEYS = []
+
+    def __init__(self, vfs, path):
+        self.vfs = vfs
+        self.path = path
+
+    def validate(self, d):
+        for key, req in self.KEYS:
+            if req and key not in d:
+                e = "missing a required key: '%s'" % key
+                raise error.MissingRequiredKeyInFileException(e)
+
+    def read(self):
+        lines = self.vfs.readlines(self.path)
+        try:
+            d = dict(line[:-1].split('=', 1) for line in lines if line)
+        except ValueError as e:
+            raise error.CorruptedState(str(e))
+        self.validate(d)
+        return d
+
+    def write(self, data):
+        """Write key=>value mapping to a file
+        data is a dict. Keys must be alphanumerical and start with a letter.
+        Values must not contain newline characters."""
+        lines = []
+        for k, v in data.items():
+            if not k[0].isalpha():
+                e = "keys must start with a letter in a key-value file"
+                raise error.ProgrammingError(e)
+            if not k.isalnum():
+                e = "invalid key name in a simple key-value file"
+                raise error.ProgrammingError(e)
+            if '\n' in v:
+                e = "invalid value in a simple key-value file"
+                raise error.ProgrammingError(e)
+            lines.append("%s=%s\n" % (k, v))
+        with self.vfs(self.path, mode='wb', atomictemp=True) as fp:
+            fp.write(''.join(lines))
diff --git a/tests/test-doctest.py b/tests/test-doctest.py
--- a/tests/test-doctest.py
+++ b/tests/test-doctest.py
@@ -29,6 +29,7 @@  testmod('mercurial.patch')
 testmod('mercurial.pathutil')
 testmod('mercurial.parser')
 testmod('mercurial.revset')
+testmod('mercurial.scmutil', optionflags=doctest.ELLIPSIS)
 testmod('mercurial.store')
 testmod('mercurial.subrepo')
 testmod('mercurial.templatefilters')