Patchwork [2,of,2] configitems: document the choice of using 'match' instead of 'search'

login
register
mail settings
Submitter Boris Feld
Date Oct. 18, 2017, 2:04 p.m.
Message ID <e4c6f469eb215986ae8b.1508335472@FB>
Download mbox | patch
Permalink /patch/25165/
State Accepted
Headers show

Comments

Boris Feld - Oct. 18, 2017, 2:04 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1508322983 -7200
#      Wed Oct 18 12:36:23 2017 +0200
# Node ID e4c6f469eb215986ae8b1d5624c5867501462a58
# Parent  a3110a4c8aa98ebd9fc26702eb81b9366edf7335
# EXP-Topic config.register.fixup
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r e4c6f469eb21
configitems: document the choice of using 'match' instead of 'search'
Yuya Nishihara - Oct. 21, 2017, 5:14 a.m.
On Wed, 18 Oct 2017 16:04:32 +0200, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1508322983 -7200
> #      Wed Oct 18 12:36:23 2017 +0200
> # Node ID e4c6f469eb215986ae8b1d5624c5867501462a58
> # Parent  a3110a4c8aa98ebd9fc26702eb81b9366edf7335
> # EXP-Topic config.register.fixup
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r e4c6f469eb21
> configitems: document the choice of using 'match' instead of 'search'
> 
> diff --git a/mercurial/configitems.py b/mercurial/configitems.py
> --- a/mercurial/configitems.py
> +++ b/mercurial/configitems.py
> @@ -74,6 +74,17 @@
>          # search for a matching generic item
>          generics = sorted(self._generics, key=(lambda x: (x.priority, x.name)))
>          for item in generics:
> +            # we use 'match' instead of 'search' to make the matching simpler
> +            # for people unfamiliar with regular expression. Having the match
> +            # rooted to the start of the string will produce less surprising
> +            # result for user writing simple regex for sub-attribute.

My concern was theoretical cost of backtracking caused by '.*<suffix>', though
it would be ignorable compared to compilation cost.

> +            # For example using "color\..*" match produces an unsurprising
> +            # result, while using search could suddenly match apparently
> +            # unrelated configuration that happens to contains "color."
> +            # anywhere. This is a tradeoff where we favor requiring ".*" on
> +            # some match to avoid the need to prefix most pattern with "^".
> +            # The "^" seems more error prone.

For this purpose, I think a restricted glob pattern is easier to read/write
and faster. All generic configitems in core can be rewritten as follows:

  '*'     # True
  'pfx*'  # pfx, sfx = name.split('*', 1)
  '*sfx'  # key.startswith(pfx) and key.endswith(sfx)

Patch

diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -74,6 +74,17 @@ 
         # search for a matching generic item
         generics = sorted(self._generics, key=(lambda x: (x.priority, x.name)))
         for item in generics:
+            # we use 'match' instead of 'search' to make the matching simpler
+            # for people unfamiliar with regular expression. Having the match
+            # rooted to the start of the string will produce less surprising
+            # result for user writing simple regex for sub-attribute.
+            #
+            # For example using "color\..*" match produces an unsurprising
+            # result, while using search could suddenly match apparently
+            # unrelated configuration that happens to contains "color."
+            # anywhere. This is a tradeoff where we favor requiring ".*" on
+            # some match to avoid the need to prefix most pattern with "^".
+            # The "^" seems more error prone.
             if item._re.match(key):
                 return item