Patchwork [RFC,v10] scmutil: add a simple key-value file helper

login
register
mail settings
Submitter Kostia Balytskyi
Date Feb. 3, 2017, 10:47 a.m.
Message ID <19a449c91ef14e691cf1.1486118846@devvm1416.lla2.facebook.com>
Download mbox | patch
Permalink /patch/18309/
State Changes Requested
Headers show

Comments

Kostia Balytskyi - Feb. 3, 2017, 10:47 a.m.
# HG changeset patch
# User Kostia Balytskyi <ikostia@fb.com>
# Date 1484824655 28800
#      Thu Jan 19 03:17:35 2017 -0800
# Node ID 19a449c91ef14e691cf1347748473e0094fedc86
# Parent  9f264adbe75bfae8551dc0e6e0fce8d43fc7b43a
scmutil: add a simple key-value file helper

The purpose of the added class is to serve purposes like save files of shelve
or state files of shelve, rebase and histedit. Keys of these files can be
alphanumeric and start with letters, while values must not contain newlines.
Keys which start with an uppercase letter are required, while other keys
are optional.

In light of Mercurial's reluctancy to use Python's json module, this tries
to provide a reasonable alternative for a non-nested named data.
Comparing to current approach of storing state in plain text files, where
semantic meaning of lines of text is only determined by their oreder,
simple key-value file allows for reordering lines and thus helps handle
optional values.

Initial use-case I see for this is obs-shelve's shelve files. Later we
can possibly migrate state files to this approach.

The test is in a new file beause I did not figure out where to put it
within existing test suite. If you give me a better idea, I will gladly
follow it.
Jun Wu - Feb. 13, 2017, 9:58 p.m.
As said before, I still prefer a simpler design without the "required"
boolean flag, and move the responsibility of checking requirements to the
higher-level code. Because:

  - As Yuya pointed out at the first place, "required" here will have
    trouble dealing with future schema changes. So it's not that useful
    unless you are 100% sure that the schema stays unchanged forever.
  - It'll be shorter, and easier to read - no need to jump here to
    understand what "True" means in "KEYS = ..." in shelve.py.
  - It'll be faster, although just slightly.

From another perspective, "Checking requirements" is part of "data
validation", which cannot be done in this low-level simple structure alone.
For example, some fields may have to be 40-byte hex string and that kind of
checks are probably done in a higher level. Since the higher level code will
do "data validation" anyway, it makes more sense to just not do "data
validation" here so they happen in a single place. The higher level code
could also print more user-friendly error messages.

If you agree these are reasonable, but have a long stack which is hard to
change, I could help fix them later.

Excerpts from Kostia Balytskyi's message of 2017-02-03 02:47:26 -0800:
> # HG changeset patch
> # User Kostia Balytskyi <ikostia@fb.com>
> # Date 1484824655 28800
> #      Thu Jan 19 03:17:35 2017 -0800
> # Node ID 19a449c91ef14e691cf1347748473e0094fedc86
> # Parent  9f264adbe75bfae8551dc0e6e0fce8d43fc7b43a
> scmutil: add a simple key-value file helper
> 
> The purpose of the added class is to serve purposes like save files of shelve
> or state files of shelve, rebase and histedit. Keys of these files can be
> alphanumeric and start with letters, while values must not contain newlines.
> Keys which start with an uppercase letter are required, while other keys
> are optional.
> 
> In light of Mercurial's reluctancy to use Python's json module, this tries
> to provide a reasonable alternative for a non-nested named data.
> Comparing to current approach of storing state in plain text files, where
> semantic meaning of lines of text is only determined by their oreder,
> simple key-value file allows for reordering lines and thus helps handle
> optional values.
> 
> Initial use-case I see for this is obs-shelve's shelve files. Later we
> can possibly migrate state files to this approach.
> 
> The test is in a new file beause I did not figure out where to put it
> within existing test suite. If you give me a better idea, I will gladly
> follow it.
> 
> diff --git a/mercurial/error.py b/mercurial/error.py
> --- a/mercurial/error.py
> +++ b/mercurial/error.py
> @@ -246,3 +246,6 @@ class UnsupportedBundleSpecification(Exc
>  
>  class CorruptedState(Exception):
>      """error raised when a command is not able to read its state from file"""
> +
> +class MissingRequiredKey(Exception):
> +    """error raised when simple key-value file misses a required key"""
> diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
> --- a/mercurial/scmutil.py
> +++ b/mercurial/scmutil.py
> @@ -1571,3 +1571,51 @@ class checkambigatclosing(closewrapbase)
>      def close(self):
>          self._origfh.close()
>          self._checkambig()
> +
> +class simplekeyvaluefile(object):
> +    """A simple file with key=value lines
> +
> +    Keys must be alphanumerics and start with a letter, values must not
> +    contain '\n' characters"""
> +
> +    # if KEYS is non-empty, read values are validated against it:
> +    # each key is a tuple (keyname, required)
> +    KEYS = []
> +
> +    def __init__(self, vfs, path):
> +        self.vfs = vfs
> +        self.path = path
> +
> +    def validate(self, d):
> +        for key, req in self.KEYS:
> +            if req and key not in d:
> +                e = "missing a required key: '%s'" % key
> +                raise error.MissingRequiredKey(e)
> +
> +    def read(self):
> +        lines = self.vfs.readlines(self.path)
> +        try:
> +            d = dict(line[:-1].split('=', 1) for line in lines if line)
> +        except ValueError as e:
> +            raise error.CorruptedState(str(e))
> +        self.validate(d)
> +        return d
> +
> +    def write(self, data):
> +        """Write key=>value mapping to a file
> +        data is a dict. Keys must be alphanumerical and start with a letter.
> +        Values must not contain newline characters."""
> +        lines = []
> +        for k, v in data.items():
> +            if not k[0].isalpha():
> +                e = "keys must start with a letter in a key-value file"
> +                raise error.ProgrammingError(e)
> +            if not k.isalnum():
> +                e = "invalid key name in a simple key-value file"
> +                raise error.ProgrammingError(e)
> +            if '\n' in v:
> +                e = "invalid value in a simple key-value file"
> +                raise error.ProgrammingError(e)
> +            lines.append("%s=%s\n" % (k, v))
> +        with self.vfs(self.path, mode='wb', atomictemp=True) as fp:
> +            fp.write(''.join(lines))
> diff --git a/tests/test-simplekeyvaluefile.py b/tests/test-simplekeyvaluefile.py
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-simplekeyvaluefile.py
> @@ -0,0 +1,87 @@
> +from __future__ import absolute_import
> +
> +import unittest
> +import silenttestrunner
> +
> +from mercurial import (
> +    error,
> +    scmutil,
> +)
> +
> +contents = {}
> +
> +class fileobj(object):
> +    def __init__(self, name):
> +        self.name = name
> +
> +    def __enter__(self):
> +        return self
> +
> +    def __exit__(self, *args, **kwargs):
> +        pass
> +
> +    def write(self, text):
> +        contents[self.name] = text
> +
> +    def read(self):
> +        return contents[self.name]
> +
> +class mockvfs(object):
> +    def read(self, path):
> +        return fileobj(path).read()
> +
> +    def readlines(self, path):
> +        return fileobj(path).read().split('\n')
> +
> +    def __call__(self, path, mode, atomictemp):
> +        return fileobj(path)
> +
> +class testsimplekeyvaluefile(unittest.TestCase):
> +    def setUp(self):
> +        self.vfs = mockvfs()
> +
> +    def testbasicwriting(self):
> +        d = {'key1': 'value1', 'Key2': 'value2'}
> +        scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +        self.assertEqual(sorted(self.vfs.read('kvfile').split('\n')),
> +                         ['', 'Key2=value2', 'key1=value1'])
> +
> +    def testinvalidkeys(self):
> +        d = {'0key1': 'value1', 'Key2': 'value2'}
> +        with self.assertRaisesRegexp(error.ProgrammingError,
> +                                     "keys must start with a letter.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +        d = {'key1@': 'value1', 'Key2': 'value2'}
> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid key.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +
> +    def testinvalidvalues(self):
> +        d = {'key1': 'value1', 'Key2': 'value2\n'}
> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid val.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +
> +    def testrequiredkeys(self):
> +        d = {'key1': 'value1', 'Key2': 'value2'}
> +        scmutil.simplekeyvaluefile(self.vfs, 'allkeyshere').write(d)
> +
> +        class kvf(scmutil.simplekeyvaluefile):
> +            KEYS = [('key3', False), ('Key2', True)]
> +        self.assertEqual(sorted(kvf(self.vfs, 'allkeyshere').read().items()),
> +                         [('Key2', 'value'), ('key1', 'value')])
> +
> +        d = {'key1': 'value1', 'Key3': 'value2'}
> +        scmutil.simplekeyvaluefile(self.vfs, 'missingkeys').write(d)
> +
> +        class kvf(scmutil.simplekeyvaluefile):
> +            KEYS = [('key3', False), ('Key2', True)]
> +        with self.assertRaisesRegexp(error.MissingRequiredKey, "missing a.*"):
> +            kvf(self.vfs, 'missingkeys').read()
> +
> +    def testcorruptedfile(self):
> +        contents['badfile'] = 'ababagalamaga\n'
> +        with self.assertRaisesRegexp(error.CorruptedState,
> +                                     "dictionary.*element.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'badfile').read()
> +
> +if __name__ == "__main__":
> +    silenttestrunner.main(__name__)
Kostia Balytskyi - Feb. 13, 2017, 10:44 p.m.
On 13/02/2017 21:58, Jun Wu wrote:
> As said before, I still prefer a simpler design without the "required"

> boolean flag, and move the responsibility of checking requirements to the

> higher-level code. Because:

>

>    - As Yuya pointed out at the first place, "required" here will have

>      trouble dealing with future schema changes. So it's not that useful

>      unless you are 100% sure that the schema stays unchanged forever.

That's the whole point of required - to mark fields which you are sure 
can't be removed.  At least the way I see it and the way I intended to 
use it - is to future-proof my schema.
>    - It'll be shorter, and easier to read - no need to jump here to

>      understand what "True" means in "KEYS = ..." in shelve.py.

We can use named tuples for this purpose to say "required=True" ...
>    - It'll be faster, although just slightly.

>

>  From another perspective, "Checking requirements" is part of "data

> validation", which cannot be done in this low-level simple structure alone.

> For example, some fields may have to be 40-byte hex string and that kind of

> checks are probably done in a higher level. Since the higher level code will

> do "data validation" anyway, it makes more sense to just not do "data

> validation" here so they happen in a single place. The higher level code

> could also print more user-friendly error messages.

Yes, fully functional data validation is not possible on this level, but 
we can make some people's lives easier by outsourcing requirements checking.
>

> If you agree these are reasonable, but have a long stack which is hard to

> change, I could help fix them later.

I will change and re-send withoug required-related functionality as this 
patch is taking forever to get in.
>

> Excerpts from Kostia Balytskyi's message of 2017-02-03 02:47:26 -0800:

>> # HG changeset patch

>> # User Kostia Balytskyi <ikostia@fb.com>

>> # Date 1484824655 28800

>> #      Thu Jan 19 03:17:35 2017 -0800

>> # Node ID 19a449c91ef14e691cf1347748473e0094fedc86

>> # Parent  9f264adbe75bfae8551dc0e6e0fce8d43fc7b43a

>> scmutil: add a simple key-value file helper

>>

>> The purpose of the added class is to serve purposes like save files of shelve

>> or state files of shelve, rebase and histedit. Keys of these files can be

>> alphanumeric and start with letters, while values must not contain newlines.

>> Keys which start with an uppercase letter are required, while other keys

>> are optional.

>>

>> In light of Mercurial's reluctancy to use Python's json module, this tries

>> to provide a reasonable alternative for a non-nested named data.

>> Comparing to current approach of storing state in plain text files, where

>> semantic meaning of lines of text is only determined by their oreder,

>> simple key-value file allows for reordering lines and thus helps handle

>> optional values.

>>

>> Initial use-case I see for this is obs-shelve's shelve files. Later we

>> can possibly migrate state files to this approach.

>>

>> The test is in a new file beause I did not figure out where to put it

>> within existing test suite. If you give me a better idea, I will gladly

>> follow it.

>>

>> diff --git a/mercurial/error.py b/mercurial/error.py

>> --- a/mercurial/error.py

>> +++ b/mercurial/error.py

>> @@ -246,3 +246,6 @@ class UnsupportedBundleSpecification(Exc

>>   

>>   class CorruptedState(Exception):

>>       """error raised when a command is not able to read its state from file"""

>> +

>> +class MissingRequiredKey(Exception):

>> +    """error raised when simple key-value file misses a required key"""

>> diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py

>> --- a/mercurial/scmutil.py

>> +++ b/mercurial/scmutil.py

>> @@ -1571,3 +1571,51 @@ class checkambigatclosing(closewrapbase)

>>       def close(self):

>>           self._origfh.close()

>>           self._checkambig()

>> +

>> +class simplekeyvaluefile(object):

>> +    """A simple file with key=value lines

>> +

>> +    Keys must be alphanumerics and start with a letter, values must not

>> +    contain '\n' characters"""

>> +

>> +    # if KEYS is non-empty, read values are validated against it:

>> +    # each key is a tuple (keyname, required)

>> +    KEYS = []

>> +

>> +    def __init__(self, vfs, path):

>> +        self.vfs = vfs

>> +        self.path = path

>> +

>> +    def validate(self, d):

>> +        for key, req in self.KEYS:

>> +            if req and key not in d:

>> +                e = "missing a required key: '%s'" % key

>> +                raise error.MissingRequiredKey(e)

>> +

>> +    def read(self):

>> +        lines = self.vfs.readlines(self.path)

>> +        try:

>> +            d = dict(line[:-1].split('=', 1) for line in lines if line)

>> +        except ValueError as e:

>> +            raise error.CorruptedState(str(e))

>> +        self.validate(d)

>> +        return d

>> +

>> +    def write(self, data):

>> +        """Write key=>value mapping to a file

>> +        data is a dict. Keys must be alphanumerical and start with a letter.

>> +        Values must not contain newline characters."""

>> +        lines = []

>> +        for k, v in data.items():

>> +            if not k[0].isalpha():

>> +                e = "keys must start with a letter in a key-value file"

>> +                raise error.ProgrammingError(e)

>> +            if not k.isalnum():

>> +                e = "invalid key name in a simple key-value file"

>> +                raise error.ProgrammingError(e)

>> +            if '\n' in v:

>> +                e = "invalid value in a simple key-value file"

>> +                raise error.ProgrammingError(e)

>> +            lines.append("%s=%s\n" % (k, v))

>> +        with self.vfs(self.path, mode='wb', atomictemp=True) as fp:

>> +            fp.write(''.join(lines))

>> diff --git a/tests/test-simplekeyvaluefile.py b/tests/test-simplekeyvaluefile.py

>> new file mode 100644

>> --- /dev/null

>> +++ b/tests/test-simplekeyvaluefile.py

>> @@ -0,0 +1,87 @@

>> +from __future__ import absolute_import

>> +

>> +import unittest

>> +import silenttestrunner

>> +

>> +from mercurial import (

>> +    error,

>> +    scmutil,

>> +)

>> +

>> +contents = {}

>> +

>> +class fileobj(object):

>> +    def __init__(self, name):

>> +        self.name = name

>> +

>> +    def __enter__(self):

>> +        return self

>> +

>> +    def __exit__(self, *args, **kwargs):

>> +        pass

>> +

>> +    def write(self, text):

>> +        contents[self.name] = text

>> +

>> +    def read(self):

>> +        return contents[self.name]

>> +

>> +class mockvfs(object):

>> +    def read(self, path):

>> +        return fileobj(path).read()

>> +

>> +    def readlines(self, path):

>> +        return fileobj(path).read().split('\n')

>> +

>> +    def __call__(self, path, mode, atomictemp):

>> +        return fileobj(path)

>> +

>> +class testsimplekeyvaluefile(unittest.TestCase):

>> +    def setUp(self):

>> +        self.vfs = mockvfs()

>> +

>> +    def testbasicwriting(self):

>> +        d = {'key1': 'value1', 'Key2': 'value2'}

>> +        scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)

>> +        self.assertEqual(sorted(self.vfs.read('kvfile').split('\n')),

>> +                         ['', 'Key2=value2', 'key1=value1'])

>> +

>> +    def testinvalidkeys(self):

>> +        d = {'0key1': 'value1', 'Key2': 'value2'}

>> +        with self.assertRaisesRegexp(error.ProgrammingError,

>> +                                     "keys must start with a letter.*"):

>> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)

>> +        d = {'key1@': 'value1', 'Key2': 'value2'}

>> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid key.*"):

>> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)

>> +

>> +    def testinvalidvalues(self):

>> +        d = {'key1': 'value1', 'Key2': 'value2\n'}

>> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid val.*"):

>> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)

>> +

>> +    def testrequiredkeys(self):

>> +        d = {'key1': 'value1', 'Key2': 'value2'}

>> +        scmutil.simplekeyvaluefile(self.vfs, 'allkeyshere').write(d)

>> +

>> +        class kvf(scmutil.simplekeyvaluefile):

>> +            KEYS = [('key3', False), ('Key2', True)]

>> +        self.assertEqual(sorted(kvf(self.vfs, 'allkeyshere').read().items()),

>> +                         [('Key2', 'value'), ('key1', 'value')])

>> +

>> +        d = {'key1': 'value1', 'Key3': 'value2'}

>> +        scmutil.simplekeyvaluefile(self.vfs, 'missingkeys').write(d)

>> +

>> +        class kvf(scmutil.simplekeyvaluefile):

>> +            KEYS = [('key3', False), ('Key2', True)]

>> +        with self.assertRaisesRegexp(error.MissingRequiredKey, "missing a.*"):

>> +            kvf(self.vfs, 'missingkeys').read()

>> +

>> +    def testcorruptedfile(self):

>> +        contents['badfile'] = 'ababagalamaga\n'

>> +        with self.assertRaisesRegexp(error.CorruptedState,

>> +                                     "dictionary.*element.*"):

>> +            scmutil.simplekeyvaluefile(self.vfs, 'badfile').read()

>> +

>> +if __name__ == "__main__":

>> +    silenttestrunner.main(__name__)

> _______________________________________________

> Mercurial-devel mailing list

> Mercurial-devel@mercurial-scm.org

> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Augie Fackler - Feb. 13, 2017, 11:27 p.m.
On Fri, Feb 03, 2017 at 02:47:26AM -0800, Kostia Balytskyi wrote:
> # HG changeset patch
> # User Kostia Balytskyi <ikostia@fb.com>
> # Date 1484824655 28800
> #      Thu Jan 19 03:17:35 2017 -0800
> # Node ID 19a449c91ef14e691cf1347748473e0094fedc86
> # Parent  9f264adbe75bfae8551dc0e6e0fce8d43fc7b43a
> scmutil: add a simple key-value file helper

Overall I'm +1 on this, may as well include it in the next round of
whatever depends on it? I'm hesitant to land it without an immediate
consumer.

(I agree that it seems like a reasonable fit for your shelve thing,
but we've got some bits to sort out over there I'm afraid.)

>
> The purpose of the added class is to serve purposes like save files of shelve
> or state files of shelve, rebase and histedit. Keys of these files can be
> alphanumeric and start with letters, while values must not contain newlines.
> Keys which start with an uppercase letter are required, while other keys
> are optional.
>
> In light of Mercurial's reluctancy to use Python's json module, this tries
> to provide a reasonable alternative for a non-nested named data.
> Comparing to current approach of storing state in plain text files, where
> semantic meaning of lines of text is only determined by their oreder,
> simple key-value file allows for reordering lines and thus helps handle
> optional values.
>
> Initial use-case I see for this is obs-shelve's shelve files. Later we
> can possibly migrate state files to this approach.
>
> The test is in a new file beause I did not figure out where to put it
> within existing test suite. If you give me a better idea, I will gladly
> follow it.
>
> diff --git a/mercurial/error.py b/mercurial/error.py
> --- a/mercurial/error.py
> +++ b/mercurial/error.py
> @@ -246,3 +246,6 @@ class UnsupportedBundleSpecification(Exc
>
>  class CorruptedState(Exception):
>      """error raised when a command is not able to read its state from file"""
> +
> +class MissingRequiredKey(Exception):
> +    """error raised when simple key-value file misses a required key"""
> diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
> --- a/mercurial/scmutil.py
> +++ b/mercurial/scmutil.py
> @@ -1571,3 +1571,51 @@ class checkambigatclosing(closewrapbase)
>      def close(self):
>          self._origfh.close()
>          self._checkambig()
> +
> +class simplekeyvaluefile(object):
> +    """A simple file with key=value lines
> +
> +    Keys must be alphanumerics and start with a letter, values must not
> +    contain '\n' characters"""
> +
> +    # if KEYS is non-empty, read values are validated against it:
> +    # each key is a tuple (keyname, required)
> +    KEYS = []

Nit: this is something you're expecting subclasses to specify? Could
it instead be injected at construction time, or is that too
error-prone? (I'm envisioning some sort of factory method that lets us
skip making subclasses, eg:

def obsmarkerstorage(vfs, path):
  return simplekeyvaluefile(vfs, path, ['Required', 'optional'])

or similar as an API.)

> +    def __init__(self, vfs, path):
> +        self.vfs = vfs
> +        self.path = path
> +
> +    def validate(self, d):
> +        for key, req in self.KEYS:
> +            if req and key not in d:
> +                e = "missing a required key: '%s'" % key
> +                raise error.MissingRequiredKey(e)
> +
> +    def read(self):
> +        lines = self.vfs.readlines(self.path)
> +        try:
> +            d = dict(line[:-1].split('=', 1) for line in lines if line)
> +        except ValueError as e:
> +            raise error.CorruptedState(str(e))
> +        self.validate(d)
> +        return d
> +
> +    def write(self, data):
> +        """Write key=>value mapping to a file
> +        data is a dict. Keys must be alphanumerical and start with a letter.
> +        Values must not contain newline characters."""
> +        lines = []
> +        for k, v in data.items():
> +            if not k[0].isalpha():
> +                e = "keys must start with a letter in a key-value file"
> +                raise error.ProgrammingError(e)
> +            if not k.isalnum():
> +                e = "invalid key name in a simple key-value file"
> +                raise error.ProgrammingError(e)
> +            if '\n' in v:
> +                e = "invalid value in a simple key-value file"
> +                raise error.ProgrammingError(e)
> +            lines.append("%s=%s\n" % (k, v))
> +        with self.vfs(self.path, mode='wb', atomictemp=True) as fp:
> +            fp.write(''.join(lines))
> diff --git a/tests/test-simplekeyvaluefile.py b/tests/test-simplekeyvaluefile.py
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-simplekeyvaluefile.py
> @@ -0,0 +1,87 @@
> +from __future__ import absolute_import
> +
> +import unittest
> +import silenttestrunner
> +
> +from mercurial import (
> +    error,
> +    scmutil,
> +)
> +
> +contents = {}

Rather than a module-global for this could you have a
set_file_contents_for_testing method (but pick a legal name!) on the
mockvfs and then the state is all contained in the vfs instance for
each test method? This feels too spooky.

> +
> +class fileobj(object):

nit: fakefile or mockfile for this?

> +    def __init__(self, name):
> +        self.name = name
> +
> +    def __enter__(self):
> +        return self
> +
> +    def __exit__(self, *args, **kwargs):
> +        pass
> +
> +    def write(self, text):
> +        contents[self.name] = text
> +
> +    def read(self):
> +        return contents[self.name]
> +
> +class mockvfs(object):
> +    def read(self, path):
> +        return fileobj(path).read()
> +
> +    def readlines(self, path):
> +        return fileobj(path).read().split('\n')
> +
> +    def __call__(self, path, mode, atomictemp):
> +        return fileobj(path)
> +
> +class testsimplekeyvaluefile(unittest.TestCase):
> +    def setUp(self):
> +        self.vfs = mockvfs()
> +
> +    def testbasicwriting(self):
> +        d = {'key1': 'value1', 'Key2': 'value2'}
> +        scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +        self.assertEqual(sorted(self.vfs.read('kvfile').split('\n')),
> +                         ['', 'Key2=value2', 'key1=value1'])
> +
> +    def testinvalidkeys(self):
> +        d = {'0key1': 'value1', 'Key2': 'value2'}
> +        with self.assertRaisesRegexp(error.ProgrammingError,
> +                                     "keys must start with a letter.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +        d = {'key1@': 'value1', 'Key2': 'value2'}
> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid key.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +
> +    def testinvalidvalues(self):
> +        d = {'key1': 'value1', 'Key2': 'value2\n'}
> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid val.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
> +
> +    def testrequiredkeys(self):
> +        d = {'key1': 'value1', 'Key2': 'value2'}
> +        scmutil.simplekeyvaluefile(self.vfs, 'allkeyshere').write(d)
> +
> +        class kvf(scmutil.simplekeyvaluefile):
> +            KEYS = [('key3', False), ('Key2', True)]
> +        self.assertEqual(sorted(kvf(self.vfs, 'allkeyshere').read().items()),
> +                         [('Key2', 'value'), ('key1', 'value')])
> +
> +        d = {'key1': 'value1', 'Key3': 'value2'}
> +        scmutil.simplekeyvaluefile(self.vfs, 'missingkeys').write(d)
> +
> +        class kvf(scmutil.simplekeyvaluefile):
> +            KEYS = [('key3', False), ('Key2', True)]
> +        with self.assertRaisesRegexp(error.MissingRequiredKey, "missing a.*"):
> +            kvf(self.vfs, 'missingkeys').read()
> +
> +    def testcorruptedfile(self):
> +        contents['badfile'] = 'ababagalamaga\n'
> +        with self.assertRaisesRegexp(error.CorruptedState,
> +                                     "dictionary.*element.*"):
> +            scmutil.simplekeyvaluefile(self.vfs, 'badfile').read()
> +
> +if __name__ == "__main__":
> +    silenttestrunner.main(__name__)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Augie Fackler - Feb. 13, 2017, 11:29 p.m.
On Fri, Feb 03, 2017 at 02:47:26AM -0800, Kostia Balytskyi wrote:
> # HG changeset patch
> # User Kostia Balytskyi <ikostia@fb.com>
> # Date 1484824655 28800
> #      Thu Jan 19 03:17:35 2017 -0800
> # Node ID 19a449c91ef14e691cf1347748473e0094fedc86
> # Parent  9f264adbe75bfae8551dc0e6e0fce8d43fc7b43a
> scmutil: add a simple key-value file helper

(Also, protip: it was confusing to me to get this as an RFC but then
have other substantive work that's not RFC depend on it, because I
typically put RFCs at the bottom of my priority queue. Not sure if
others work that way or not....)

Patch

diff --git a/mercurial/error.py b/mercurial/error.py
--- a/mercurial/error.py
+++ b/mercurial/error.py
@@ -246,3 +246,6 @@  class UnsupportedBundleSpecification(Exc
 
 class CorruptedState(Exception):
     """error raised when a command is not able to read its state from file"""
+
+class MissingRequiredKey(Exception):
+    """error raised when simple key-value file misses a required key"""
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -1571,3 +1571,51 @@  class checkambigatclosing(closewrapbase)
     def close(self):
         self._origfh.close()
         self._checkambig()
+
+class simplekeyvaluefile(object):
+    """A simple file with key=value lines
+
+    Keys must be alphanumerics and start with a letter, values must not
+    contain '\n' characters"""
+
+    # if KEYS is non-empty, read values are validated against it:
+    # each key is a tuple (keyname, required)
+    KEYS = []
+
+    def __init__(self, vfs, path):
+        self.vfs = vfs
+        self.path = path
+
+    def validate(self, d):
+        for key, req in self.KEYS:
+            if req and key not in d:
+                e = "missing a required key: '%s'" % key
+                raise error.MissingRequiredKey(e)
+
+    def read(self):
+        lines = self.vfs.readlines(self.path)
+        try:
+            d = dict(line[:-1].split('=', 1) for line in lines if line)
+        except ValueError as e:
+            raise error.CorruptedState(str(e))
+        self.validate(d)
+        return d
+
+    def write(self, data):
+        """Write key=>value mapping to a file
+        data is a dict. Keys must be alphanumerical and start with a letter.
+        Values must not contain newline characters."""
+        lines = []
+        for k, v in data.items():
+            if not k[0].isalpha():
+                e = "keys must start with a letter in a key-value file"
+                raise error.ProgrammingError(e)
+            if not k.isalnum():
+                e = "invalid key name in a simple key-value file"
+                raise error.ProgrammingError(e)
+            if '\n' in v:
+                e = "invalid value in a simple key-value file"
+                raise error.ProgrammingError(e)
+            lines.append("%s=%s\n" % (k, v))
+        with self.vfs(self.path, mode='wb', atomictemp=True) as fp:
+            fp.write(''.join(lines))
diff --git a/tests/test-simplekeyvaluefile.py b/tests/test-simplekeyvaluefile.py
new file mode 100644
--- /dev/null
+++ b/tests/test-simplekeyvaluefile.py
@@ -0,0 +1,87 @@ 
+from __future__ import absolute_import
+
+import unittest
+import silenttestrunner
+
+from mercurial import (
+    error,
+    scmutil,
+)
+
+contents = {}
+
+class fileobj(object):
+    def __init__(self, name):
+        self.name = name
+
+    def __enter__(self):
+        return self
+
+    def __exit__(self, *args, **kwargs):
+        pass
+
+    def write(self, text):
+        contents[self.name] = text
+
+    def read(self):
+        return contents[self.name]
+
+class mockvfs(object):
+    def read(self, path):
+        return fileobj(path).read()
+
+    def readlines(self, path):
+        return fileobj(path).read().split('\n')
+
+    def __call__(self, path, mode, atomictemp):
+        return fileobj(path)
+
+class testsimplekeyvaluefile(unittest.TestCase):
+    def setUp(self):
+        self.vfs = mockvfs()
+
+    def testbasicwriting(self):
+        d = {'key1': 'value1', 'Key2': 'value2'}
+        scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
+        self.assertEqual(sorted(self.vfs.read('kvfile').split('\n')),
+                         ['', 'Key2=value2', 'key1=value1'])
+
+    def testinvalidkeys(self):
+        d = {'0key1': 'value1', 'Key2': 'value2'}
+        with self.assertRaisesRegexp(error.ProgrammingError,
+                                     "keys must start with a letter.*"):
+            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
+        d = {'key1@': 'value1', 'Key2': 'value2'}
+        with self.assertRaisesRegexp(error.ProgrammingError, "invalid key.*"):
+            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
+
+    def testinvalidvalues(self):
+        d = {'key1': 'value1', 'Key2': 'value2\n'}
+        with self.assertRaisesRegexp(error.ProgrammingError, "invalid val.*"):
+            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
+
+    def testrequiredkeys(self):
+        d = {'key1': 'value1', 'Key2': 'value2'}
+        scmutil.simplekeyvaluefile(self.vfs, 'allkeyshere').write(d)
+
+        class kvf(scmutil.simplekeyvaluefile):
+            KEYS = [('key3', False), ('Key2', True)]
+        self.assertEqual(sorted(kvf(self.vfs, 'allkeyshere').read().items()),
+                         [('Key2', 'value'), ('key1', 'value')])
+
+        d = {'key1': 'value1', 'Key3': 'value2'}
+        scmutil.simplekeyvaluefile(self.vfs, 'missingkeys').write(d)
+
+        class kvf(scmutil.simplekeyvaluefile):
+            KEYS = [('key3', False), ('Key2', True)]
+        with self.assertRaisesRegexp(error.MissingRequiredKey, "missing a.*"):
+            kvf(self.vfs, 'missingkeys').read()
+
+    def testcorruptedfile(self):
+        contents['badfile'] = 'ababagalamaga\n'
+        with self.assertRaisesRegexp(error.CorruptedState,
+                                     "dictionary.*element.*"):
+            scmutil.simplekeyvaluefile(self.vfs, 'badfile').read()
+
+if __name__ == "__main__":
+    silenttestrunner.main(__name__)