Patchwork parsers: fail fast if Python has wrong minor version (issue4110)

Submitter Chris Jerdonek
Date Nov. 29, 2013, 8:47 p.m.
Message ID <0f89711072592d7dd738.1385758046@stonewall.local>
Permalink /patch/3192/
State Superseded
Commit 3681de20b0a7351ffb949db32c1859ab2f2b6105

Comments

Chris Jerdonek - Nov. 29, 2013, 8:47 p.m.
# HG changeset patch
# User Chris Jerdonek <chris.jerdonek@gmail.com>
# Date 1385757388 28800
#      Fri Nov 29 12:36:28 2013 -0800
# Node ID 0f89711072592d7dd738fb89e4653534a042aa42
# Parent  1df77035c8141d4586ff5af84c34d54cb9912402
parsers: fail fast if Python has wrong minor version (issue4110)

This change causes an informative ImportError to be raised when importing
the extension module parsers if the minor version of the currently-running
Python interpreter doesn't match that of the Python that was used when
compiling the extension module.  Here is an example of what the new error
looks like:

  Traceback (most recent call last):
    File "test.py", line 1, in <module>
      import mercurial.parsers
  ImportError: Python minor version mismatch: The Mercurial extension
  modules were compiled with Python 2.7.6, but Mercurial is currently using
  Python with sys.hexversion=33883888: Python 2.5.6
  (r256:88840, Nov 18 2012, 05:37:10)
  [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]
   at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/
    Python.app/Contents/MacOS/Python
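The sys.hexversion value in the message above can be decoded by hand. The following is a minimal sketch and not part of the patch; the helper name decode_hexversion is hypothetical. It assumes only the documented layout of sys.hexversion: major, minor, and micro in the top three bytes, then release level and serial in the two halves of the low byte.

```python
def decode_hexversion(hexversion):
    """Split a sys.hexversion-style integer into its documented fields."""
    major = (hexversion >> 24) & 0xFF
    minor = (hexversion >> 16) & 0xFF
    micro = (hexversion >> 8) & 0xFF
    # Release level: 0xA alpha, 0xB beta, 0xC release candidate, 0xF final.
    releaselevel = (hexversion >> 4) & 0xF
    serial = hexversion & 0xF
    return major, minor, micro, releaselevel, serial

# The value reported in the example ImportError above.
print(decode_hexversion(33883888))  # (2, 5, 6, 15, 0), i.e. Python 2.5.6 final
```
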

The reason for raising an error in this scenario is that Python's C API
is known not to be compatible from minor version to minor version, even
if sys.api_version is the same.  See for example this Python bug report
about incompatibilities between 2.5 and 2.6+:

  http://bugs.python.org/issue8118

These incompatibilities can cause Mercurial to break in mysterious,
unforeseen ways.  For example, when Mercurial compiled with Python 2.7 was
run with 2.5, the following crash occurred when running "hg status":

  http://bz.selenic.com/show_bug.cgi?id=4110

After this crash was fixed, running with Python 2.5 no longer crashes, but
the following puzzling behavior still occurs:

    $ hg status
      ...
      File ".../mercurial/changelog.py", line 123, in __init__
        revlog.revlog.__init__(self, opener, "00changelog.i")
      File ".../mercurial/revlog.py", line 251, in __init__
        d = self._io.parseindex(i, self._inline)
      File ".../mercurial/revlog.py", line 158, in parseindex
        index, cache = parsers.parse_index2(data, inline)
    TypeError: data is not a string

which can be reproduced more simply with:

    import mercurial.parsers as parsers
    parsers.parse_index2("", True)

Both the crash and the TypeError occurred because the Python C API's
PyString_Check returns the wrong value when an extension module compiled
against the C header files from Python 2.7 is run with Python 2.5.  This is
an example of an incompatibility of the sort mentioned in the Python bug
report above.

Failing fast with an informative error message will result in a better
user experience in cases like the above.  The information in the ImportError
will also simplify troubleshooting for those on Mercurial mailing lists,
the bug tracker, etc.

This patch adds the version check only to parsers.c, which is sufficient
to cover command-line commands like "hg status" and "hg summary".
A possible future improvement is to move the version-checking C code
to a more central location and run it when importing any Mercurial
extension module, not just parsers.
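
The comparison that the check performs can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the patch: the function name same_minor_version is hypothetical, and COMPILED_WITH is a stand-in for the PY_VERSION_HEX of a Python 2.7.6 build.

```python
def same_minor_version(running, compiled):
    """Return True if two hexversion values share major and minor version."""
    # Shifting right by 16 bits drops micro, release level, and serial,
    # leaving only the major and minor bytes to compare.
    return (running >> 16) == (compiled >> 16)

COMPILED_WITH = 0x020706F0  # hypothetical stand-in for PY_VERSION_HEX (2.7.6)

print(same_minor_version(0x020705F0, COMPILED_WITH))  # True: 2.7.5 vs 2.7.6
print(same_minor_version(33883888, COMPILED_WITH))    # False: 2.5.6 vs 2.7.6
```

A difference in micro version alone passes the check, while any difference in major or minor version fails it.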
Matt Mackall - Dec. 2, 2013, 1:59 a.m.
On Fri, 2013-11-29 at 12:47 -0800, Chris Jerdonek wrote:
> # HG changeset patch
> # User Chris Jerdonek <chris.jerdonek@gmail.com>
> # Date 1385757388 28800
> #      Fri Nov 29 12:36:28 2013 -0800
> # Node ID 0f89711072592d7dd738fb89e4653534a042aa42
> # Parent  1df77035c8141d4586ff5af84c34d54cb9912402
> parsers: fail fast if Python has wrong minor version (issue4110)

Queued, thanks. I've put this on default as it's a little complex/obscure.
Chris Jerdonek - Dec. 2, 2013, 2:02 a.m.
On Sun, Dec 1, 2013 at 5:59 PM, Matt Mackall <mpm@selenic.com> wrote:
>
> On Fri, 2013-11-29 at 12:47 -0800, Chris Jerdonek wrote:
>> # HG changeset patch
>> # User Chris Jerdonek <chris.jerdonek@gmail.com>
>> # Date 1385757388 28800
>> #      Fri Nov 29 12:36:28 2013 -0800
>> # Node ID 0f89711072592d7dd738fb89e4653534a042aa42
>> # Parent  1df77035c8141d4586ff5af84c34d54cb9912402
>> parsers: fail fast if Python has wrong minor version (issue4110)
>
> Queued, thanks. I've put this on default as it's a little complex/obscure.

Yeah, I agree with you re: default.  Thanks a lot for the quick turnaround! :)

--Chris




Patch

diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -1941,6 +1941,25 @@ 
 	dirstate_unset = Py_BuildValue("ciii", 'n', 0, -1, -1);
 }
 
+static int check_python_version(void)
+{
+	PyObject *sys = PyImport_ImportModule("sys");
+	PyObject *hexversion = PyObject_GetAttrString(sys, "hexversion");
+	long version = PyInt_AsLong(hexversion);
+	/* sys.hexversion is a 32-bit number by default, so the -1 case
+	 * should only occur in unusual circumstances (e.g. if sys.hexversion
+	 * is manually set to an invalid value). */
+	if ((version == -1) || (version >> 16 != PY_VERSION_HEX >> 16)) {
+		PyErr_Format(PyExc_ImportError, "Python minor version mismatch: "
+			"The Mercurial extension modules were compiled with Python "
+			PY_VERSION ", but Mercurial is currently using Python with "
+			"sys.hexversion=%ld: Python %s\n at: %s", version,
+			Py_GetVersion(), Py_GetProgramFullPath());
+		return -1;
+	}
+	return 0;
+}
+
 #ifdef IS_PY3K
 static struct PyModuleDef parsers_module = {
 	PyModuleDef_HEAD_INIT,
@@ -1952,6 +1971,8 @@ 
 
 PyMODINIT_FUNC PyInit_parsers(void)
 {
+	if (check_python_version() == -1)
+		return NULL;
 	PyObject *mod = PyModule_Create(&parsers_module);
 	module_init(mod);
 	return mod;
@@ -1959,6 +1980,8 @@ 
 #else
 PyMODINIT_FUNC initparsers(void)
 {
+	if (check_python_version() == -1)
+		return;
 	PyObject *mod = Py_InitModule3("parsers", methods, parsers_doc);
 	module_init(mod);
 }
diff --git a/tests/test-parseindex2.py b/tests/test-parseindex2.py
--- a/tests/test-parseindex2.py
+++ b/tests/test-parseindex2.py
@@ -1,6 +1,8 @@ 
 from mercurial import parsers
 from mercurial.node import nullid, nullrev
 import struct
+import subprocess
+import sys
 
 # This unit test compares the return value of the original Python
 # implementation of parseindex and the new C implementation for
@@ -97,7 +99,62 @@ 
     index, chunkcache = parsers.parse_index2(data, inline)
     return list(index), chunkcache
 
+def importparsers(hexversion):
+    """Import mercurial.parsers with the given sys.hexversion."""
+    # The file parsers.c inspects sys.hexversion to determine the version
+    # of the currently-running Python interpreter, so we monkey-patch
+    # sys.hexversion to simulate using different versions.
+    code = ("import sys; sys.hexversion=%s; "
+            "import mercurial.parsers" % hexversion)
+    cmd = "python -c \"%s\"" % code
+    # We need to do these tests inside a subprocess because parsers.c's
+    # version-checking code happens inside the module init function, and
+    # when using reload() to reimport an extension module, "The init function
+    # of extension modules is not called a second time"
+    # (from http://docs.python.org/2/library/functions.html?#reload).
+    p = subprocess.Popen(cmd, shell=True,
+                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+    return p.communicate()  # (stdout, None): stderr is merged into stdout
+
+def printhexfail(testnumber, hexversion, msg):
+    try:
+        hexstring = hex(hexversion)
+    except TypeError:
+        hexstring = None
+    print ("%s) using Python %s and patched sys.hexversion %r (%r): %s" %
+           (testnumber, sys.version_info, hexversion, hexstring, msg))
+
+def testversionokay(testnumber, hexversion):
+    stdout, stderr = importparsers(hexversion)
+    if stdout:
+        printhexfail(testnumber, hexversion,
+                     "Expected no stdout but got: %r" % stdout)
+
+def testversionfail(testnumber, hexversion):
+    stdout, stderr = importparsers(hexversion)
+    errstring = "ImportError: Python minor version mismatch"
+    if errstring not in stdout:
+        printhexfail(testnumber, hexversion, "Expected stdout to contain "
+                     "%r but got: %r" % (errstring, stdout))
+
+def makehex(major, minor, micro):
+    return int("%x%02x%02x00" % (major, minor, micro), 16)
+
+def runversiontests():
+    """Test importing parsers using different Python versions."""
+    info = sys.version_info
+    major, minor, micro = info[0], info[1], info[2]
+    # Test same major-minor versions.
+    testversionokay(1, makehex(major, minor, micro))
+    testversionokay(2, makehex(major, minor, micro + 1))
+    # Test different major-minor versions.
+    testversionfail(3, makehex(major + 1, minor, micro))
+    testversionfail(4, makehex(major, minor + 1, micro))
+    testversionfail(5, "'foo'")
+
 def runtest() :
+    runversiontests()
+
     # Check that parse_index2() raises TypeError on bad arguments.
     try:
         parse_index2(0, True)