Patchwork [1,of,2] config: introduce load order tracking

login
register
mail settings
Submitter Boris Feld
Date July 9, 2018, 10:12 a.m.
Message ID <1019d8a4f6b810aaa636.1531131141@FB-lair>
Download mbox | patch
Permalink /patch/32696/
State Accepted
Headers show

Comments

Boris Feld - July 9, 2018, 10:12 a.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1530887625 -7200
#      Fri Jul 06 16:33:45 2018 +0200
# Node ID 1019d8a4f6b810aaa63651ed56b29668650f590e
# Parent  8c38d29482177cd40243a008057d6762c7d23c6f
# EXP-Topic config-order
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1019d8a4f6b8
config: introduce load order tracking

Configuration values reads from disk are now associated with the index of the
file they came from. (First loaded file has index 0, second file index 1,
etc…).

This will ultimately allow us to use the alias value of a configuration item
if it was defined in a higher priority file than the configuration item value
itself.

See next changeset for details.

Value set directly through the code or through the command line have the highest
priority.
Yuya Nishihara - July 9, 2018, 12:50 p.m.
On Mon, 09 Jul 2018 12:12:21 +0200, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1530887625 -7200
> #      Fri Jul 06 16:33:45 2018 +0200
> # Node ID 1019d8a4f6b810aaa63651ed56b29668650f590e
> # Parent  8c38d29482177cd40243a008057d6762c7d23c6f
> # EXP-Topic config-order
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1019d8a4f6b8
> config: introduce load order tracking
> 
> Configuration values reads from disk are now associated with the index of the
> file they came from. (First loaded file has index 0, second file index 1,
> etc…).
> 
> This will ultimately allow us to use the alias value of a configuration item
> if it was defined in a higher priority file than the configuration item value
> itself.
> 
> See next changeset for details.
> 
> Value set directly through the code or through the command line have the highest
> priority.

Didn't read this thoroughly, but have you tried making configs layered (e.g.
overlay -> repo -> user -> env -> system)?

https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095733.html

The previous RFC series seems a bit over engineered, but I think the idea
behind it is probably better overall than keep track of the "level" for each
item. We can get rid of the "cow" hack, configoverride() can be instant, etc.

> --- a/mercurial/config.py
> +++ b/mercurial/config.py

>      def restore(self, data):
>          """restore data returned by self.backup"""
>          self._source = self._source.preparewrite()
> -        if len(data) == 4:
> +        self._level = self._level.preparewrite()
> +        if len(data) == 5:
>              # restore old data
> -            section, item, value, source = data
> +            section, item, value, source, level = data
>              self._data[section] = self._data[section].preparewrite()
>              self._data[section][item] = value
>              self._source[(section, item)] = source
> +            self._source[(section, item)] = level

self._level ?

> --- a/mercurial/ui.py
> +++ b/mercurial/ui.py
> @@ -191,6 +191,9 @@ def _catchterm(*args):
>  # _reqexithandlers: callbacks run at the end of a request
>  _reqexithandlers = []
>  
> +# config level set directly have higher level than those from disk
> +directconfig = sys.maxint

sys.maxint no longer exists in Python 3.
Boris Feld - July 10, 2018, 1:55 p.m.
On 09/07/2018 14:50, Yuya Nishihara wrote:
> On Mon, 09 Jul 2018 12:12:21 +0200, Boris Feld wrote:
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1530887625 -7200
>> #      Fri Jul 06 16:33:45 2018 +0200
>> # Node ID 1019d8a4f6b810aaa63651ed56b29668650f590e
>> # Parent  8c38d29482177cd40243a008057d6762c7d23c6f
>> # EXP-Topic config-order
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1019d8a4f6b8
>> config: introduce load order tracking
>>
>> Configuration values reads from disk are now associated with the index of the
>> file they came from. (First loaded file has index 0, second file index 1,
>> etc…).
>>
>> This will ultimately allow us to use the alias value of a configuration item
>> if it was defined in a higher priority file than the configuration item value
>> itself.
>>
>> See next changeset for details.
>>
>> Value set directly through the code or through the command line have the highest
>> priority.
> Didn't read this thoroughly, but have you tried making configs layered (e.g.
> overlay -> repo -> user -> env -> system)?
>
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095733.html
>
> The previous RFC series seems a bit over engineered, but I think the idea
> behind it is probably better overall than keep track of the "level" for each
> item. We can get rid of the "cow" hack, configoverride() can be instant, etc.

The layered approach seems to be the right architecture but will require 
more refactoring and careful works to make it right. We would like to 
favor for the current simpler, hackier, approach and prepare the road to 
the layered configuration way.

The layers would have the biggest impact with chg and we plan to take a 
look at it once we have chg works in our schedule.

>
>> --- a/mercurial/config.py
>> +++ b/mercurial/config.py
>>       def restore(self, data):
>>           """restore data returned by self.backup"""
>>           self._source = self._source.preparewrite()
>> -        if len(data) == 4:
>> +        self._level = self._level.preparewrite()
>> +        if len(data) == 5:
>>               # restore old data
>> -            section, item, value, source = data
>> +            section, item, value, source, level = data
>>               self._data[section] = self._data[section].preparewrite()
>>               self._data[section][item] = value
>>               self._source[(section, item)] = source
>> +            self._source[(section, item)] = level
> self._level ?
>
>> --- a/mercurial/ui.py
>> +++ b/mercurial/ui.py
>> @@ -191,6 +191,9 @@ def _catchterm(*args):
>>   # _reqexithandlers: callbacks run at the end of a request
>>   _reqexithandlers = []
>>   
>> +# config level set directly have higher level than those from disk
>> +directconfig = sys.maxint
> sys.maxint no longer exists in Python 3.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - July 10, 2018, 2:58 p.m.
I'd also be happy to see the immutable configs series land. Yuya, do you
remember what about it you found over engineered? Perhaps it wouldn't be
hard to fix those things and land it.

On Tue, Jul 10, 2018, 06:55 Boris FELD <boris.feld@octobus.net> wrote:

> On 09/07/2018 14:50, Yuya Nishihara wrote:
> > On Mon, 09 Jul 2018 12:12:21 +0200, Boris Feld wrote:
> >> # HG changeset patch
> >> # User Boris Feld <boris.feld@octobus.net>
> >> # Date 1530887625 -7200
> >> #      Fri Jul 06 16:33:45 2018 +0200
> >> # Node ID 1019d8a4f6b810aaa63651ed56b29668650f590e
> >> # Parent  8c38d29482177cd40243a008057d6762c7d23c6f
> >> # EXP-Topic config-order
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
> -r 1019d8a4f6b8
> >> config: introduce load order tracking
> >>
> >> Configuration values reads from disk are now associated with the index
> of the
> >> file they came from. (First loaded file has index 0, second file index
> 1,
> >> etc…).
> >>
> >> This will ultimately allow us to use the alias value of a configuration
> item
> >> if it was defined in a higher priority file than the configuration item
> value
> >> itself.
> >>
> >> See next changeset for details.
> >>
> >> Value set directly through the code or through the command line have
> the highest
> >> priority.
> > Didn't read this thoroughly, but have you tried making configs layered
> (e.g.
> > overlay -> repo -> user -> env -> system)?
> >
> >
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095733.html
> >
> > The previous RFC series seems a bit over engineered, but I think the idea
> > behind it is probably better overall than keep track of the "level" for
> each
> > item. We can get rid of the "cow" hack, configoverride() can be instant,
> etc.
>
> The layered approach seems to be the right architecture but will require
> more refactoring and careful works to make it right. We would like to
> favor for the current simpler, hackier, approach and prepare the road to
> the layered configuration way.
>
> The layers would have the biggest impact with chg and we plan to take a
> look at it once we have chg works in our schedule.
>
> >
> >> --- a/mercurial/config.py
> >> +++ b/mercurial/config.py
> >>       def restore(self, data):
> >>           """restore data returned by self.backup"""
> >>           self._source = self._source.preparewrite()
> >> -        if len(data) == 4:
> >> +        self._level = self._level.preparewrite()
> >> +        if len(data) == 5:
> >>               # restore old data
> >> -            section, item, value, source = data
> >> +            section, item, value, source, level = data
> >>               self._data[section] = self._data[section].preparewrite()
> >>               self._data[section][item] = value
> >>               self._source[(section, item)] = source
> >> +            self._source[(section, item)] = level
> > self._level ?
> >
> >> --- a/mercurial/ui.py
> >> +++ b/mercurial/ui.py
> >> @@ -191,6 +191,9 @@ def _catchterm(*args):
> >>   # _reqexithandlers: callbacks run at the end of a request
> >>   _reqexithandlers = []
> >>
> >> +# config level set directly have higher level than those from disk
> >> +directconfig = sys.maxint
> > sys.maxint no longer exists in Python 3.
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel@mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Yuya Nishihara - July 10, 2018, 3:01 p.m.
On Tue, 10 Jul 2018 15:55:36 +0200, Boris FELD wrote:
> On 09/07/2018 14:50, Yuya Nishihara wrote:
> > On Mon, 09 Jul 2018 12:12:21 +0200, Boris Feld wrote:
> >> # HG changeset patch
> >> # User Boris Feld <boris.feld@octobus.net>
> >> # Date 1530887625 -7200
> >> #      Fri Jul 06 16:33:45 2018 +0200
> >> # Node ID 1019d8a4f6b810aaa63651ed56b29668650f590e
> >> # Parent  8c38d29482177cd40243a008057d6762c7d23c6f
> >> # EXP-Topic config-order
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1019d8a4f6b8
> >> config: introduce load order tracking
> >>
> >> Configuration values reads from disk are now associated with the index of the
> >> file they came from. (First loaded file has index 0, second file index 1,
> >> etc…).
> >>
> >> This will ultimately allow us to use the alias value of a configuration item
> >> if it was defined in a higher priority file than the configuration item value
> >> itself.
> >>
> >> See next changeset for details.
> >>
> >> Value set directly through the code or through the command line have the highest
> >> priority.
> > Didn't read this thoroughly, but have you tried making configs layered (e.g.
> > overlay -> repo -> user -> env -> system)?
> >
> > https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095733.html
> >
> > The previous RFC series seems a bit over engineered, but I think the idea
> > behind it is probably better overall than keep track of the "level" for each
> > item. We can get rid of the "cow" hack, configoverride() can be instant, etc.
> 
> The layered approach seems to be the right architecture but will require 
> more refactoring and careful works to make it right. We would like to 
> favor for the current simpler, hackier, approach and prepare the road to 
> the layered configuration way.

If the problem to be addressed by this hack is important, and if the hack is
really simple, sure I would say go for it. But I, as an old Mercurial user,
see the bulk renaming of the config keys is minor improvement.

Thoughts?
Yuya Nishihara - July 10, 2018, 3:19 p.m.
On Tue, 10 Jul 2018 07:58:04 -0700, Martin von Zweigbergk wrote:
> I'd also be happy to see the immutable configs series land. Yuya, do you
> remember what about it you found over engineered? Perhaps it wouldn't be
> hard to fix those things and land it.

It isn't landed just because Jun got busy for the other work. I don't
have any particular concern, but I just felt it could be slightly simpler.
David Demelier - July 11, 2018, 11:32 a.m.
On Wed, 2018-07-11 at 00:01 +0900, Yuya Nishihara wrote:
> If the problem to be addressed by this hack is important, and if the
> hack is
> really simple, sure I would say go for it. But I, as an old Mercurial
> user,
> see the bulk renaming of the config keys is minor improvement.

I think it's important, consistency in a software shows a strong core
guideline applied and strict reviewing. The current hgrc manual shows
three different styles along with unreadable names like
backgroundclosethreadcount.

Obviously my opinion is largely biased since I'm half author of the
ConfigConsolidationPlan [0].

What bothers me is that we agreed on a new syntax for new option style
[1] and it's still not applied, see server.streamunbundle and D3893.
However I think you're aware since you personally pushed my deprecated
topic.

Hopefully av6 helps me with it's reviewing :-)

[0]: https://www.mercurial-scm.org/wiki/ConfigConsolidationPlan
[1]: https://www.mercurial-scm.org/wiki/UIGuideline#adding_new_options

Regards,
Boris Feld - July 13, 2018, 8:30 a.m.
On 11/07/2018 13:32, David Demelier wrote:
> On Wed, 2018-07-11 at 00:01 +0900, Yuya Nishihara wrote:
>> If the problem to be addressed by this hack is important, and if the
>> hack is
>> really simple, sure I would say go for it. But I, as an old Mercurial
>> user,
>> see the bulk renaming of the config keys is minor improvement.
> I think it's important, consistency in a software shows a strong core
> guideline applied and strict reviewing. The current hgrc manual shows
> three different styles along with unreadable names like
> backgroundclosethreadcount.

We agree with David here that consistency helps the user experience.

Even for more experienced contributors, configuration inconsistency 
leads to waste time at multiple occurrences on mistakes. This happens on 
a regular basis at octobus or at places we provide support to.

We would like to improve the situation in order to improve Mercurial 
contributors and users productivity by avoiding configuration mistakes.

>
> Obviously my opinion is largely biased since I'm half author of the
> ConfigConsolidationPlan [0].
>
> What bothers me is that we agreed on a new syntax for new option style
> [1] and it's still not applied, see server.streamunbundle and D3893.
> However I think you're aware since you personally pushed my deprecated
> topic.
>
> Hopefully av6 helps me with it's reviewing :-)
>
> [0]: https://www.mercurial-scm.org/wiki/ConfigConsolidationPlan
> [1]: https://www.mercurial-scm.org/wiki/UIGuideline#adding_new_options
>
> Regards,
>
Yuya Nishihara - July 13, 2018, 1:01 p.m.
On Fri, 13 Jul 2018 10:30:33 +0200, Boris FELD wrote:
> On 11/07/2018 13:32, David Demelier wrote:
> > On Wed, 2018-07-11 at 00:01 +0900, Yuya Nishihara wrote:
> >> If the problem to be addressed by this hack is important, and if the
> >> hack is
> >> really simple, sure I would say go for it. But I, as an old Mercurial
> >> user,
> >> see the bulk renaming of the config keys is minor improvement.
> > I think it's important, consistency in a software shows a strong core
> > guideline applied and strict reviewing. The current hgrc manual shows
> > three different styles along with unreadable names like
> > backgroundclosethreadcount.
> 
> We agree with David here that consistency helps the user experience.

I know. I'm just saying it isn't an immediate problem enough to add a
one-off hack to the config system.

> Even for more experienced contributors, configuration inconsistency
> leads to waste time at multiple occurrences on mistakes. This happens on
> a regular basis at octobus or at places we provide support to.

For that purpose, I think it's better to add a linter (e.g. hg config --check)
to detect unknown (but similar to registered) names. Inserting dashes to every
config name is merely a partial solution for ambiguous naming. There's always
inconsistency such as "X-count" vs "Xs" for the number of X.

Patch

diff --git a/mercurial/config.py b/mercurial/config.py
--- a/mercurial/config.py
+++ b/mercurial/config.py
@@ -26,8 +26,11 @@  class config(object):
             for k in data._data:
                 self._data[k] = data[k].copy()
             self._source = data._source.copy()
+            self._level = data._level.copy()
         else:
             self._source = util.cowdict()
+            self._level = util.cowdict()
+
     def copy(self):
         return config(self)
     def __contains__(self, section):
@@ -47,6 +50,7 @@  class config(object):
                 self._data[s] = ds.preparewrite()
                 del self._data[s][n]
                 del self._source[(s, n)]
+                self._level.pop((s, n), None)
         for s in src:
             ds = self._data.get(s, None)
             if ds:
@@ -55,6 +59,7 @@  class config(object):
                 self._data[s] = util.cowsortdict()
             self._data[s].update(src._data[s])
         self._source.update(src._source)
+        self._level.update(src._level)
     def get(self, section, item, default=None):
         return self._data.get(section, {}).get(item, default)
 
@@ -66,17 +71,20 @@  class config(object):
         try:
             value = self._data[section][item]
             source = self.source(section, item)
-            return (section, item, value, source)
+            level = self.level(section, item)
+            return (section, item, value, source, level)
         except KeyError:
             return (section, item)
 
     def source(self, section, item):
         return self._source.get((section, item), "")
+    def level(self, section, item):
+        return self._level.get((section, item), None)
     def sections(self):
         return sorted(self._data.keys())
     def items(self, section):
         return list(self._data.get(section, {}).iteritems())
-    def set(self, section, item, value, source=""):
+    def set(self, section, item, value, source="", level=None):
         if pycompat.ispy3:
             assert not isinstance(value, str), (
                 'config values may not be unicode strings on Python 3')
@@ -88,24 +96,30 @@  class config(object):
         if source:
             self._source = self._source.preparewrite()
             self._source[(section, item)] = source
+        if level is not None:
+            self._level[(section, item)] = level
 
     def restore(self, data):
         """restore data returned by self.backup"""
         self._source = self._source.preparewrite()
-        if len(data) == 4:
+        self._level = self._level.preparewrite()
+        if len(data) == 5:
             # restore old data
-            section, item, value, source = data
+            section, item, value, source, level = data
             self._data[section] = self._data[section].preparewrite()
             self._data[section][item] = value
             self._source[(section, item)] = source
+            self._source[(section, item)] = level
         else:
             # no data before, remove everything
             section, item = data
             if section in self._data:
                 self._data[section].pop(item, None)
             self._source.pop((section, item), None)
+            self._level.pop((section, item), None)
 
-    def parse(self, src, data, sections=None, remap=None, include=None):
+    def parse(self, src, data, sections=None, remap=None, include=None,
+            level=None):
         sectionre = util.re.compile(br'\[([^\[]+)\]')
         itemre = util.re.compile(br'([^=\s][^=]*?)\s*=\s*(.*\S|)')
         contre = util.re.compile(br'\s+(\S|\S.*\S)\s*$')
@@ -134,7 +148,8 @@  class config(object):
                     if sections and section not in sections:
                         continue
                     v = self.get(section, item) + "\n" + m.group(1)
-                    self.set(section, item, v, "%s:%d" % (src, line))
+                    self.set(section, item, v, "%s:%d" % (src, line),
+                             level=level)
                     continue
                 item = None
                 cont = False
@@ -172,7 +187,8 @@  class config(object):
                 cont = True
                 if sections and section not in sections:
                     continue
-                self.set(section, item, m.group(2), "%s:%d" % (src, line))
+                self.set(section, item, m.group(2), "%s:%d" % (src, line),
+                         level=level)
                 continue
             m = unsetre.match(l)
             if m:
@@ -187,14 +203,14 @@  class config(object):
 
             raise error.ParseError(l.rstrip(), ("%s:%d" % (src, line)))
 
-    def read(self, path, fp=None, sections=None, remap=None):
+    def read(self, path, fp=None, sections=None, remap=None, level=None):
         if not fp:
             fp = util.posixfile(path, 'rb')
         assert getattr(fp, 'mode', r'rb') == r'rb', (
             'config files must be opened in binary mode, got fp=%r mode=%r' % (
                 fp, fp.mode))
-        self.parse(path, fp.read(),
-                   sections=sections, remap=remap, include=self.read)
+        self.parse(path, fp.read(), sections=sections, remap=remap,
+                   include=self.read, level=level)
 
 def parselist(value):
     """parse a configuration value as a list of comma/space separated strings
diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -191,6 +191,9 @@  def _catchterm(*args):
 # _reqexithandlers: callbacks run at the end of a request
 _reqexithandlers = []
 
+# config level set directly have higher level than those from disk
+directconfig = sys.maxint
+
 class ui(object):
     def __init__(self, src=None):
         """Create a fresh new ui object if no src given
@@ -210,6 +213,7 @@  class ui(object):
         self.quiet = self.verbose = self.debugflag = self.tracebackflag = False
         self._reportuntrusted = True
         self._knownconfig = configitems.coreitems
+        self._configlevel = 0
         self._ocfg = config.config() # overlay
         self._tcfg = config.config() # trusted
         self._ucfg = config.config() # untrusted
@@ -234,6 +238,7 @@  class ui(object):
             self._disablepager = src._disablepager
             self._tweaked = src._tweaked
 
+            self._configlevel = src._configlevel
             self._tcfg = src._tcfg.copy()
             self._ucfg = src._ucfg.copy()
             self._ocfg = src._ocfg.copy()
@@ -282,12 +287,14 @@  class ui(object):
             if t == 'path':
                 u.readconfig(f, trust=True)
             elif t == 'items':
+                level = u._configlevel
+                u._configlevel += 1
                 sections = set()
                 for section, name, value, source in f:
                     # do not set u._ocfg
                     # XXX clean this up once immutable config object is a thing
-                    u._tcfg.set(section, name, value, source)
-                    u._ucfg.set(section, name, value, source)
+                    u._tcfg.set(section, name, value, source, level=level)
+                    u._ucfg.set(section, name, value, source, level=level)
                     sections.add(section)
                 for section in sections:
                     u.fixconfig(section=section)
@@ -314,7 +321,9 @@  class ui(object):
         for section in tmpcfg:
             for name, value in tmpcfg.items(section):
                 if not self.hasconfig(section, name):
-                    self.setconfig(section, name, value, "<tweakdefaults>")
+                    self.setconfig(section, name, value, "<tweakdefaults>",
+                                   level=self._configlevel)
+        self._configlevel += 1
 
     def copy(self):
         return self.__class__(self)
@@ -390,6 +399,8 @@  class ui(object):
 
     def readconfig(self, filename, root=None, trust=False,
                    sections=None, remap=None):
+        level = self._configlevel
+        self._configlevel += 1
         try:
             fp = open(filename, u'rb')
         except IOError:
@@ -401,7 +412,7 @@  class ui(object):
         trusted = sections or trust or self._trusted(fp, filename)
 
         try:
-            cfg.read(filename, fp, sections=sections, remap=remap)
+            cfg.read(filename, fp, sections=sections, remap=remap, level=level)
             fp.close()
         except error.ConfigError as inst:
             if trusted:
@@ -459,7 +470,7 @@  class ui(object):
                     p = util.expandpath(p)
                     if not util.hasscheme(p) and not os.path.isabs(p):
                         p = os.path.normpath(os.path.join(root, p))
-                    c.set("paths", n, p)
+                    c.set("paths", n, p, level=c.level("paths", n))
 
         if section in (None, 'ui'):
             # update ui options
@@ -487,9 +498,9 @@  class ui(object):
         self._tcfg.restore(data[1])
         self._ucfg.restore(data[2])
 
-    def setconfig(self, section, name, value, source=''):
+    def setconfig(self, section, name, value, source='', level=directconfig):
         for cfg in (self._ocfg, self._tcfg, self._ucfg):
-            cfg.set(section, name, value, source)
+            cfg.set(section, name, value, source, level=level)
         self.fixconfig(section=section)
         self._maybetweakdefaults()