Patchwork [3,of,6] import-checker: guess what module is imported by list of locally defined ones

login
register
mail settings
Submitter Katsunori FUJIWARA
Date May 13, 2015, 4:53 p.m.
Message ID <be2b8d8ae72918035dbb.1431536036@juju>
Download mbox | patch
Permalink /patch/9048/
State Accepted
Headers show

Comments

Katsunori FUJIWARA - May 13, 2015, 4:53 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1431535750 -32400
#      Thu May 14 01:49:10 2015 +0900
# Node ID be2b8d8ae72918035dbb1ab15bf6c6b3919dd40e
# Parent  d96e5eeff38853397e32a354a94052c4687535c0
import-checker: guess what module is imported by list of locally defined ones

Before this patch, to guess what module is imported in each modules,
"verify_stdlib_on_own_line()" examines whether imported module is one
of standard Python library or not.

But this causes some problems below:

  - "mixed imports" detection doesn't work as expected in some cases

    name of some mercurial specific modules collides against one of
    standard library (e.g. commands, parser and formatter).

  - it is difficult to list up enough module names of standard
    library in the portable and robust way

    for example, see fbdbff1b486a of Windows environment

To avoid problems above, this patch guesses what module is imported by
list of locally defined (= mercurial specific) ones.

This logic is reasonable, because locally defined module should be
imported via "import xxxx" (without "from yyyy"), even if its name
collides against one of standard library.

It is assumed that all locally defined modules are correctly specified
to "import-checker.py" at once.

"getprefix()" and "fromlocalfunc()" are defined as an individual
function to be reused in subsequent patch.
Augie Fackler - May 14, 2015, 12:08 a.m.
On Thu, May 14, 2015 at 01:53:56AM +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1431535750 -32400
> #      Thu May 14 01:49:10 2015 +0900
> # Node ID be2b8d8ae72918035dbb1ab15bf6c6b3919dd40e
> # Parent  d96e5eeff38853397e32a354a94052c4687535c0
> import-checker: guess what module is imported by list of locally defined ones
>
> Before this patch, to guess what module is imported in each modules,
> "verify_stdlib_on_own_line()" examines whether imported module is one
> of standard Python library or not.
>
> But this causes some problems below:
>
>   - "mixed imports" detection doesn't work as expected in some cases
>
>     name of some mercurial specific modules collides against one of
>     standard library (e.g. commands, parser and formatter).

So, that's actually a problem in Python 3: we'll need to move to
explicit relative imports if we want modules with these names to work
right there.

I'm -0 on this patch because it hides Python 3 problems. If others
feel strongly, let's keep it, but I'm not super-thrilled with the result.

>
>   - it is difficult to list up enough module names of standard
>     library in the portable and robust way
>
>     for example, see fbdbff1b486a of Windows environment
>
> To avoid problems above, this patch guesses what module is imported by
> list of locally defined (= mercurial specific) ones.
>
> This logic is reasonable, because locally defined module should be
> imported via "import xxxx" (without "from yyyy"), even if its name
> collides against one of standard library.
>
> It is assumed that all locally defined modules are correctly specified
> to "import-checker.py" at once.
>
> "getprefix()" and "fromlocalfunc()" are defined as an individual
> function to be reused in subsequent patch.
>
> diff --git a/contrib/import-checker.py b/contrib/import-checker.py
> --- a/contrib/import-checker.py
> +++ b/contrib/import-checker.py
> @@ -137,7 +137,24 @@ def imported_modules(source, ignore_nest
>              for n in node.names:
>                  yield prefix + n.name
>
> -def verify_stdlib_on_own_line(source):
> +def getprefix(modulename):
> +    """Get prefix of specified `modulename`
> +    """
> +    prefix = '.'.join(modulename.split('.')[:-1])
> +    if prefix:
> +        prefix += '.'
> +    return prefix
> +
> +def fromlocalfunc(modulename, localmods):
> +    """Get a function to examine whether specified name is imported locally
> +    """
> +    prefix = getprefix(modulename)
> +    def fromlocal(name):
> +        return ((prefix + name) in localmods or
> +                (prefix + name + ".__init__") in localmods)
> +    return fromlocal
> +
> +def verify_stdlib_on_own_line(source, modname, localmods):
>      """Given some python source, verify that stdlib imports are done
>      in separate statements from relative local module imports.
>
> @@ -145,18 +162,53 @@ def verify_stdlib_on_own_line(source):
>      annoying lib2to3 bug in relative import rewrites:
>      http://bugs.python.org/issue19510.
>
> -    >>> list(verify_stdlib_on_own_line('import sys, foo'))
> +    ``modname`` is the dotted module name of ``source``. This is used
> +    to guess full (dotted) module name of ones listed in "import xxxx"
> +    (without "from yyyy") line.
> +
> +    ``localmods`` is used to examine whether composed dotted module
> +    name is locally defined (= mercurial specific) one or not.
> +
> +    >>> # relative importing in top-level module
> +    >>> localmods = {'foo.__init__': True, 'bar': True}
> +    >>> list(verify_stdlib_on_own_line('import sys, foo',
> +    ...                                'top', localmods))
>      ['mixed imports\\n   stdlib:    sys\\n   relative:  foo']
> -    >>> list(verify_stdlib_on_own_line('import sys, os'))
> -    []
> -    >>> list(verify_stdlib_on_own_line('import foo, bar'))
> -    []
> -    """
> +    >>> list(verify_stdlib_on_own_line('import sys, os',
> +    ...                                'top', localmods))
> +    []
> +    >>> list(verify_stdlib_on_own_line('import foo, bar',
> +    ...                                'top', localmods))
> +    []
> +    >>> # relative importing in sub module
> +    >>> localmods = {'sub.foo': True, 'sub.bar': True}
> +    >>> list(verify_stdlib_on_own_line('import sys, foo',
> +    ...                                'sub.baz', localmods))
> +    ['mixed imports\\n   stdlib:    sys\\n   relative:  foo']
> +    >>> list(verify_stdlib_on_own_line('import sys, os',
> +    ...                                'sub.baz', localmods))
> +    []
> +    >>> list(verify_stdlib_on_own_line('import foo, bar',
> +    ...                                'sub.baz', localmods))
> +    []
> +    >>> # relative importing in sub module: with "__init__"
> +    >>> localmods = {'sub.foo.__init__': True, 'sub.bar': True}
> +    >>> list(verify_stdlib_on_own_line('import sys, foo',
> +    ...                                'sub.baz', localmods))
> +    ['mixed imports\\n   stdlib:    sys\\n   relative:  foo']
> +    >>> list(verify_stdlib_on_own_line('import sys, os',
> +    ...                                'sub.baz', localmods))
> +    []
> +    >>> list(verify_stdlib_on_own_line('import foo, bar',
> +    ...                                'sub.baz', localmods))
> +    []
> +    """
> +    fromlocal = fromlocalfunc(modname, localmods)
>      for node in ast.walk(ast.parse(source)):
>          if isinstance(node, ast.Import):
>              from_stdlib = {False: [], True: []}
>              for n in node.names:
> -                from_stdlib[n.name in stdlib_modules].append(n.name)
> +                from_stdlib[not fromlocal(n.name)].append(n.name)
>              if from_stdlib[True] and from_stdlib[False]:
>                  yield ('mixed imports\n   stdlib:    %s\n   relative:  %s' %
>                         (', '.join(sorted(from_stdlib[True])),
> @@ -232,7 +284,7 @@ def main(argv):
>          src = f.read()
>          used_imports[modname] = sorted(
>              imported_modules(src, ignore_nested=True))
> -        for error in verify_stdlib_on_own_line(src):
> +        for error in verify_stdlib_on_own_line(src, modname, localmods):
>              any_errors = True
>              print source_path, error
>          f.close()
> diff --git a/tests/test-module-imports.t b/tests/test-module-imports.t
> --- a/tests/test-module-imports.t
> +++ b/tests/test-module-imports.t
> @@ -14,26 +14,10 @@ it's working correctly.
>
>    $ cd "$TESTDIR"/..
>
> -There are a handful of cases here that require renaming a module so it
> -doesn't overlap with a stdlib module name. There are also some cycles
> +There are some cycles
>  here that we should still endeavor to fix, and some cycles will be
>  hidden by deduplication algorithm in the cycle detector, so fixing
>  these may expose other cycles.
>
>    $ hg locate 'mercurial/**.py' | sed 's-\\-/-g' | python "$import_checker" -
> -  mercurial/dispatch.py mixed imports
> -     stdlib:    commands
> -     relative:  error, extensions, fancyopts, hg, hook, util
> -  mercurial/fileset.py mixed imports
> -     stdlib:    parser
> -     relative:  error, merge, util
> -  mercurial/revset.py mixed imports
> -     stdlib:    parser
> -     relative:  error, hbisect, phases, util
> -  mercurial/templater.py mixed imports
> -     stdlib:    parser
> -     relative:  config, error, templatefilters, templatekw, util
> -  mercurial/ui.py mixed imports
> -     stdlib:    formatter
> -     relative:  config, error, scmutil, util
>    Import cycle: mercurial.cmdutil -> mercurial.context -> mercurial.subrepo -> mercurial.cmdutil
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/contrib/import-checker.py b/contrib/import-checker.py
--- a/contrib/import-checker.py
+++ b/contrib/import-checker.py
@@ -137,7 +137,24 @@  def imported_modules(source, ignore_nest
             for n in node.names:
                 yield prefix + n.name
 
-def verify_stdlib_on_own_line(source):
+def getprefix(modulename):
+    """Get prefix of specified `modulename`
+    """
+    prefix = '.'.join(modulename.split('.')[:-1])
+    if prefix:
+        prefix += '.'
+    return prefix
+
+def fromlocalfunc(modulename, localmods):
+    """Get a function to examine whether specified name is imported locally
+    """
+    prefix = getprefix(modulename)
+    def fromlocal(name):
+        return ((prefix + name) in localmods or
+                (prefix + name + ".__init__") in localmods)
+    return fromlocal
+
+def verify_stdlib_on_own_line(source, modname, localmods):
     """Given some python source, verify that stdlib imports are done
     in separate statements from relative local module imports.
 
@@ -145,18 +162,53 @@  def verify_stdlib_on_own_line(source):
     annoying lib2to3 bug in relative import rewrites:
     http://bugs.python.org/issue19510.
 
-    >>> list(verify_stdlib_on_own_line('import sys, foo'))
+    ``modname`` is the dotted module name of ``source``. This is used
+    to guess full (dotted) module name of ones listed in "import xxxx"
+    (without "from yyyy") line.
+
+    ``localmods`` is used to examine whether composed dotted module
+    name is locally defined (= mercurial specific) one or not.
+
+    >>> # relative importing in top-level module
+    >>> localmods = {'foo.__init__': True, 'bar': True}
+    >>> list(verify_stdlib_on_own_line('import sys, foo',
+    ...                                'top', localmods))
     ['mixed imports\\n   stdlib:    sys\\n   relative:  foo']
-    >>> list(verify_stdlib_on_own_line('import sys, os'))
-    []
-    >>> list(verify_stdlib_on_own_line('import foo, bar'))
-    []
-    """
+    >>> list(verify_stdlib_on_own_line('import sys, os',
+    ...                                'top', localmods))
+    []
+    >>> list(verify_stdlib_on_own_line('import foo, bar',
+    ...                                'top', localmods))
+    []
+    >>> # relative importing in sub module
+    >>> localmods = {'sub.foo': True, 'sub.bar': True}
+    >>> list(verify_stdlib_on_own_line('import sys, foo',
+    ...                                'sub.baz', localmods))
+    ['mixed imports\\n   stdlib:    sys\\n   relative:  foo']
+    >>> list(verify_stdlib_on_own_line('import sys, os',
+    ...                                'sub.baz', localmods))
+    []
+    >>> list(verify_stdlib_on_own_line('import foo, bar',
+    ...                                'sub.baz', localmods))
+    []
+    >>> # relative importing in sub module: with "__init__"
+    >>> localmods = {'sub.foo.__init__': True, 'sub.bar': True}
+    >>> list(verify_stdlib_on_own_line('import sys, foo',
+    ...                                'sub.baz', localmods))
+    ['mixed imports\\n   stdlib:    sys\\n   relative:  foo']
+    >>> list(verify_stdlib_on_own_line('import sys, os',
+    ...                                'sub.baz', localmods))
+    []
+    >>> list(verify_stdlib_on_own_line('import foo, bar',
+    ...                                'sub.baz', localmods))
+    []
+    """
+    fromlocal = fromlocalfunc(modname, localmods)
     for node in ast.walk(ast.parse(source)):
         if isinstance(node, ast.Import):
             from_stdlib = {False: [], True: []}
             for n in node.names:
-                from_stdlib[n.name in stdlib_modules].append(n.name)
+                from_stdlib[not fromlocal(n.name)].append(n.name)
             if from_stdlib[True] and from_stdlib[False]:
                 yield ('mixed imports\n   stdlib:    %s\n   relative:  %s' %
                        (', '.join(sorted(from_stdlib[True])),
@@ -232,7 +284,7 @@  def main(argv):
         src = f.read()
         used_imports[modname] = sorted(
             imported_modules(src, ignore_nested=True))
-        for error in verify_stdlib_on_own_line(src):
+        for error in verify_stdlib_on_own_line(src, modname, localmods):
             any_errors = True
             print source_path, error
         f.close()
diff --git a/tests/test-module-imports.t b/tests/test-module-imports.t
--- a/tests/test-module-imports.t
+++ b/tests/test-module-imports.t
@@ -14,26 +14,10 @@  it's working correctly.
 
   $ cd "$TESTDIR"/..
 
-There are a handful of cases here that require renaming a module so it
-doesn't overlap with a stdlib module name. There are also some cycles
+There are some cycles
 here that we should still endeavor to fix, and some cycles will be
 hidden by deduplication algorithm in the cycle detector, so fixing
 these may expose other cycles.
 
   $ hg locate 'mercurial/**.py' | sed 's-\\-/-g' | python "$import_checker" -
-  mercurial/dispatch.py mixed imports
-     stdlib:    commands
-     relative:  error, extensions, fancyopts, hg, hook, util
-  mercurial/fileset.py mixed imports
-     stdlib:    parser
-     relative:  error, merge, util
-  mercurial/revset.py mixed imports
-     stdlib:    parser
-     relative:  error, hbisect, phases, util
-  mercurial/templater.py mixed imports
-     stdlib:    parser
-     relative:  config, error, templatefilters, templatekw, util
-  mercurial/ui.py mixed imports
-     stdlib:    formatter
-     relative:  config, error, scmutil, util
   Import cycle: mercurial.cmdutil -> mercurial.context -> mercurial.subrepo -> mercurial.cmdutil