Patchwork [12,of,12,RFC] policy: add helper to import cext/pure module

login
register
mail settings
Submitter Yuya Nishihara
Date May 7, 2017, 1:45 a.m.
Message ID <7d94765310f0b8fb0476.1494121553@mimosa>
Download mbox | patch
Permalink /patch/20510/
State Accepted
Headers show

Comments

Yuya Nishihara - May 7, 2017, 1:45 a.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1470969017 -32400
#      Fri Aug 12 11:30:17 2016 +0900
# Node ID 7d94765310f0b8fb04765ce0bc958edfc3f12960
# Parent  89045bec86dcb51ac0641e06abd59e82d3c22172
policy: add helper to import cext/pure module

These functions are sysstr API since __import__() and getattr() hate byte
strings on Python 3.

There's a minor BC, which is ImportError will be raised if invalid
HGMODULEPOLICY is specified. I think this is more desirable behavior.

THIS PATCH IS RFC because the current implementation uses a proxy module
to dispatch attribute accesses to cext or pure. This appears the most
controversial part in this series. We could instead add version checks
if we plan to switch to per-module versioning (proposed by Jun Wu):

  def _importversioned(pkgname, modname, locals):
      mod = _importfrom(pkgname, modname, locals)
      ver = getattr(mod, r'version', 0)  # TODO: select name of constant
      if ver != 0:
          raise ImportError(r'module version mismatch')
      return mod

Where a mod is loaded before assigned to locals, so we should disable
demandimport in policy.importmod(). This means we won't need locals dict.
Augie Fackler - May 8, 2017, 7:22 p.m.
On Sun, May 07, 2017 at 10:45:53AM +0900, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1470969017 -32400
> #      Fri Aug 12 11:30:17 2016 +0900
> # Node ID 7d94765310f0b8fb04765ce0bc958edfc3f12960
> # Parent  89045bec86dcb51ac0641e06abd59e82d3c22172
> policy: add helper to import cext/pure module

I've taken patches 1-10 since they seem to clean up a lot of things. I
don't have strong feelings on patch 12, and patch 11 looks fine to me,
but I didn't take either one.

Many thanks!

>
> These functions are sysstr API since __import__() and getattr() hate byte
> strings on Python 3.
>
> There's a minor BC, which is ImportError will be raised if invalid
> HGMODULEPOLICY is specified. I think this is more desirable behavior.
>
> THIS PATCH IS RFC because the current implementation uses a proxy module
> to dispatch attribute accesses to cext or pure. This appears the most
> controversial part in this series. We could instead add version checks
> if we plan to switch to per-module versioning (proposed by Jun Wu):
>
>   def _importversioned(pkgname, modname, locals):
>       mod = _importfrom(pkgname, modname, locals)
>       ver = getattr(mod, r'version', 0)  # TODO: select name of constant
>       if ver != 0:
>           raise ImportError(r'module version mismatch')
>       return mod
>
> Where a mod is loaded before assigned to locals, so we should disable
> demandimport in policy.importmod(). This means we won't need locals dict.
>
> diff --git a/mercurial/policy.py b/mercurial/policy.py
> --- a/mercurial/policy.py
> +++ b/mercurial/policy.py
> @@ -24,6 +24,13 @@ import sys
>  policy = b'allow'
>  policynoc = (b'cffi', b'cffi-allow', b'py')
>  policynocffi = (b'c', b'py')
> +_packageprefs = {
> +    b'c': [r'cext'],
> +    b'allow': [r'cext', r'pure'],
> +    b'cffi': [r'pure'],  # TODO: [r'cffi']
> +    b'cffi-allow': [r'pure'],  # TODO: [r'cffi', r'pure']
> +    b'py': [r'pure'],
> +}
>
>  try:
>      from . import __modulepolicy__
> @@ -49,3 +56,53 @@ if sys.version_info[0] >= 3:
>          policy = os.environ[r'HGMODULEPOLICY'].encode(r'utf-8')
>  else:
>      policy = os.environ.get(r'HGMODULEPOLICY', policy)
> +
> +class _dualmod(object):
> +    """Proxy loading named attribute from mod1 or mod2
> +
> +    mod2 is kept unloaded as long as the given name can be found in mod1.
> +    """
> +
> +    def __init__(self, mod1, mod2):
> +        self._mod1 = mod1
> +        self._mod2 = mod2
> +
> +    def __getattr__(self, name):
> +        try:
> +            obj = getattr(self._mod1, name)  # may be unloaded module
> +        except (AttributeError, ImportError):
> +            obj = getattr(self._mod2, name)
> +        self.__dict__[name] = obj
> +        return obj
> +
> +    @property
> +    def __doc__(self):
> +        return self.__getattr__(r'__doc__')
> +
> +    def __repr__(self):
> +        return r'<dual module %r, %r>' % (self._mod1, self._mod2)
> +
> +def _importfrom(pkgname, modname, locals):
> +    # pass globals() to resolve name relative to this module
> +    pkg = __import__(pkgname, globals(), locals, [modname], level=1)
> +    return getattr(pkg, modname)
> +
> +def importmod(modname, locals):
> +    """Import module according to policy
> +
> +    locals dict must be specified so demandimport can update the module
> +    reference in place.
> +    """
> +    try:
> +        prefs = _packageprefs[policy]
> +    except KeyError:
> +        raise ImportError(r'invalid HGMODULEPOLICY %r' % policy)
> +
> +    if len(prefs) == 1:
> +        return _importfrom(prefs[0], modname, locals)
> +    try:
> +        mods = [_importfrom(p, modname, locals) for p in prefs]
> +        return _dualmod(*mods)
> +    except ImportError:
> +        # fallback module must exist
> +        return _importfrom(prefs[-1], modname, locals)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Jun Wu - May 8, 2017, 11:19 p.m.
Excerpts from Augie Fackler's message of 2017-05-08 15:22:41 -0400:
> I've taken patches 1-10 since they seem to clean up a lot of things. I
> don't have strong feelings on patch 12, and patch 11 looks fine to me,
> but I didn't take either one.

Patch 11 looks good to me too. It's solving a similar problem with the auto
rebuild patch [1]. If we take patch 11, then the auto rebuild patch becomes
optional, although still nice to have for people wanting to test C code (for
reasons like perf, correctness, etc) explicitly.

Patch 12 is where I disagree. _dualmod still requires careful renaming and
could cause subtle problems that does not get warned by any code checking
tools. They also require extra effort in writing and reviewing related code.
I'd like to go through the explicit versioning way so even the renaming
churn could be avoided.

[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097329.html
Yuya Nishihara - May 9, 2017, 2:09 p.m.
On Mon, 8 May 2017 16:19:50 -0700, Jun Wu wrote:
> Excerpts from Augie Fackler's message of 2017-05-08 15:22:41 -0400:
> > I've taken patches 1-10 since they seem to clean up a lot of things. I
> > don't have strong feelings on patch 12, and patch 11 looks fine to me,
> > but I didn't take either one.
> 
> Patch 11 looks good to me too. It's solving a similar problem with the auto
> rebuild patch [1]. If we take patch 11, then the auto rebuild patch becomes
> optional, although still nice to have for people wanting to test C code (for
> reasons like perf, correctness, etc) explicitly.

The auto-rebuild could be HGMODULEPOLICY=compile if we want.

> Patch 12 is where I disagree. _dualmod still requires careful renaming and
> could cause subtle problems that does not get warned by any code checking
> tools. They also require extra effort in writing and reviewing related code.
> I'd like to go through the explicit versioning way so even the renaming
> churn could be avoided.
> 
> [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097329.html

I'm okay for the explicit versioning. I can adjust the last patch for that.

I don't have strong preference since managing API versions would be as
simple as managing explicit version constants unless we have internal
dependency in C layer.
Jun Wu - May 9, 2017, 6:08 p.m.
Excerpts from Yuya Nishihara's message of 2017-05-09 23:09:07 +0900:
> The auto-rebuild could be HGMODULEPOLICY=compile if we want.

I actually prefer "allow" by default to auto build. People wanting
up-to-date binaries should run `make local` manually.

> > Patch 12 is where I disagree. _dualmod still requires careful renaming and
> > could cause subtle problems that does not get warned by any code checking
> > tools. They also require extra effort in writing and reviewing related code.
> > I'd like to go through the explicit versioning way so even the renaming
> > churn could be avoided.
> > 
> > [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097329.html 
> 
> I'm okay for the explicit versioning. I can adjust the last patch for that.
> 
> I don't have strong preference since managing API versions would be as
> simple as managing explicit version constants unless we have internal
> dependency in C layer.

We do have complex cases: module-level exception (mpatch.mpatchError), and
parsers.c with complex Python objects. Think about if someone wants to catch
mpatchError, or change the return value of lmiter_iterkeysnext in manifest.c
for example.
Yuya Nishihara - May 10, 2017, 1:56 p.m.
On Tue, 9 May 2017 11:08:03 -0700, Jun Wu wrote:
> > > Patch 12 is where I disagree. _dualmod still requires careful renaming and
> > > could cause subtle problems that does not get warned by any code checking
> > > tools. They also require extra effort in writing and reviewing related code.
> > > I'd like to go through the explicit versioning way so even the renaming
> > > churn could be avoided.
> > > 
> > > [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097329.html 
> > 
> > I'm okay for the explicit versioning. I can adjust the last patch for that.
> > 
> > I don't have strong preference since managing API versions would be as
> > simple as managing explicit version constants unless we have internal
> > dependency in C layer.
> 
> We do have complex cases: module-level exception (mpatch.mpatchError), and
> parsers.c with complex Python objects. Think about if someone wants to catch
> mpatchError, or change the return value of lmiter_iterkeysnext in manifest.c
> for example.

Yeah, I know. My opinion is that I won't care much about that as we don't
right now. FWIW, mpatchError isn't exported by the C module.

That said, there seems no objection to switching to the per-module versioning,
so maybe we can go for it.
Sean Farley - May 10, 2017, 6:04 p.m.
Yuya Nishihara <yuya@tcha.org> writes:

> On Tue, 9 May 2017 11:08:03 -0700, Jun Wu wrote:
>> > > Patch 12 is where I disagree. _dualmod still requires careful renaming and
>> > > could cause subtle problems that does not get warned by any code checking
>> > > tools. They also require extra effort in writing and reviewing related code.
>> > > I'd like to go through the explicit versioning way so even the renaming
>> > > churn could be avoided.
>> > > 
>> > > [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097329.html 
>> > 
>> > I'm okay for the explicit versioning. I can adjust the last patch for that.
>> > 
>> > I don't have strong preference since managing API versions would be as
>> > simple as managing explicit version constants unless we have internal
>> > dependency in C layer.
>> 
>> We do have complex cases: module-level exception (mpatch.mpatchError), and
>> parsers.c with complex Python objects. Think about if someone wants to catch
>> mpatchError, or change the return value of lmiter_iterkeysnext in manifest.c
>> for example.
>
> Yeah, I know. My opinion is that I won't care much about that as we don't
> right now. FWIW, mpatchError isn't exported by the C module.
>
> That said, there seems no objection to switching to the per-module versioning,
> so maybe we can go for it.

Maybe would could have a discussion of all the pros and cons? If that's
overkill, then I trust Yuya to make a good call here.
Yuya Nishihara - May 11, 2017, 3:13 p.m.
On Wed, 10 May 2017 11:04:41 -0700, Sean Farley wrote:
> Yuya Nishihara <yuya@tcha.org> writes:
> > On Tue, 9 May 2017 11:08:03 -0700, Jun Wu wrote:
> >> > > Patch 12 is where I disagree. _dualmod still requires careful renaming and
> >> > > could cause subtle problems that does not get warned by any code checking
> >> > > tools. They also require extra effort in writing and reviewing related code.
> >> > > I'd like to go through the explicit versioning way so even the renaming
> >> > > churn could be avoided.
> >> > > 
> >> > > [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097329.html 
> >> > 
> >> > I'm okay for the explicit versioning. I can adjust the last patch for that.
> >> > 
> >> > I don't have strong preference since managing API versions would be as
> >> > simple as managing explicit version constants unless we have internal
> >> > dependency in C layer.
> >> 
> >> We do have complex cases: module-level exception (mpatch.mpatchError), and
> >> parsers.c with complex Python objects. Think about if someone wants to catch
> >> mpatchError, or change the return value of lmiter_iterkeysnext in manifest.c
> >> for example.
> >
> > Yeah, I know. My opinion is that I won't care much about that as we don't
> > right now. FWIW, mpatchError isn't exported by the C module.
> >
> > That said, there seems no objection to switching to the per-module versioning,
> > so maybe we can go for it.
> 
> Maybe would could have a discussion of all the pros and cons? If that's
> overkill, then I trust Yuya to make a good call here.

Pros of module versioning (cons of dualdmod):

 a. more consistent for inner dependencies (attributes in pure/cext modules
    will never be mixed)
 b. no need to rename a function (and its callers) on API change
 c. module loader will be slightly simpler (unless we add the auto-compile
    option)

Pros of dualmod (cons of module versioning):

 a. more permissive (only missing attributes will fallback to pure's)
 b. no need to maintain module version for API changes

Patch

diff --git a/mercurial/policy.py b/mercurial/policy.py
--- a/mercurial/policy.py
+++ b/mercurial/policy.py
@@ -24,6 +24,13 @@  import sys
 policy = b'allow'
 policynoc = (b'cffi', b'cffi-allow', b'py')
 policynocffi = (b'c', b'py')
+_packageprefs = {
+    b'c': [r'cext'],
+    b'allow': [r'cext', r'pure'],
+    b'cffi': [r'pure'],  # TODO: [r'cffi']
+    b'cffi-allow': [r'pure'],  # TODO: [r'cffi', r'pure']
+    b'py': [r'pure'],
+}
 
 try:
     from . import __modulepolicy__
@@ -49,3 +56,53 @@  if sys.version_info[0] >= 3:
         policy = os.environ[r'HGMODULEPOLICY'].encode(r'utf-8')
 else:
     policy = os.environ.get(r'HGMODULEPOLICY', policy)
+
+class _dualmod(object):
+    """Proxy loading named attribute from mod1 or mod2
+
+    mod2 is kept unloaded as long as the given name can be found in mod1.
+    """
+
+    def __init__(self, mod1, mod2):
+        self._mod1 = mod1
+        self._mod2 = mod2
+
+    def __getattr__(self, name):
+        try:
+            obj = getattr(self._mod1, name)  # may be unloaded module
+        except (AttributeError, ImportError):
+            obj = getattr(self._mod2, name)
+        self.__dict__[name] = obj
+        return obj
+
+    @property
+    def __doc__(self):
+        return self.__getattr__(r'__doc__')
+
+    def __repr__(self):
+        return r'<dual module %r, %r>' % (self._mod1, self._mod2)
+
+def _importfrom(pkgname, modname, locals):
+    # pass globals() to resolve name relative to this module
+    pkg = __import__(pkgname, globals(), locals, [modname], level=1)
+    return getattr(pkg, modname)
+
+def importmod(modname, locals):
+    """Import module according to policy
+
+    locals dict must be specified so demandimport can update the module
+    reference in place.
+    """
+    try:
+        prefs = _packageprefs[policy]
+    except KeyError:
+        raise ImportError(r'invalid HGMODULEPOLICY %r' % policy)
+
+    if len(prefs) == 1:
+        return _importfrom(prefs[0], modname, locals)
+    try:
+        mods = [_importfrom(p, modname, locals) for p in prefs]
+        return _dualmod(*mods)
+    except ImportError:
+        # fallback module must exist
+        return _importfrom(prefs[-1], modname, locals)