Patchwork [5,of,6] dirs: reuse private ints in _incdir

Submitter Bryan O'Sullivan
Date March 29, 2013, 1:22 a.m.
Message ID <96da5c8a645016dc0b68.1364520167@australite.local>
Permalink /patch/1212/
State Superseded

Comments

Bryan O'Sullivan - March 29, 2013, 1:22 a.m.
# HG changeset patch
# User Bryan O'Sullivan <bryano@fb.com>
# Date 1364520157 25200
#      Thu Mar 28 18:22:37 2013 -0700
# Node ID 96da5c8a645016dc0b68855ff26293ceb1188cf2
# Parent  492566bf24e94dcbb87bdfcfb10c102742c90ab5
dirs: reuse private ints in _incdir

We do not publicly expose our refcounts, so it is safe to violate Python's
assumption that ints are immutable.

Incrementing a refcount in place avoids allocating a new int and
deallocating the old one every time we revisit a directory that we have
already seen.

perfdirs performance in a working dir with 170,000 files:

  previously  248 msec
  now         194 msec
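
The idea can be sketched in plain C without the Python API. This is only an
illustration under stated assumptions, not the code from the patch: the names
dirtable, finddir, incdirs, dircount and MAXDIRS are hypothetical, and a small
linear table stands in for the Python dict. Each directory owns one
heap-allocated counter that is bumped in place on every revisit, so a new
counter is allocated only the first time a directory is seen.

```c
#include <stdlib.h>
#include <string.h>

#define MAXDIRS 64 /* hypothetical fixed capacity, for illustration only */

struct dirtable {
	struct {
		char *name;  /* directory prefix, NUL-terminated */
		long *count; /* privately owned, mutated in place */
	} entries[MAXDIRS];
	size_t used;
};

/* Scan backwards from pos for a '/'; return its index, or -1 if none. */
static long finddir(const char *s, long pos)
{
	while (pos != -1) {
		if (s[pos] == '/')
			break;
		pos -= 1;
	}
	return pos;
}

/* Linear lookup standing in for a dict lookup. */
static long *lookup(struct dirtable *t, const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < t->used; i++) {
		if (strlen(t->entries[i].name) == len &&
		    memcmp(t->entries[i].name, s, len) == 0)
			return t->entries[i].count;
	}
	return NULL;
}

/* Count every parent directory of path. Returns 0 on success, -1 on error. */
int incdirs(struct dirtable *t, const char *path)
{
	long pos = (long)strlen(path);

	while ((pos = finddir(path, pos - 1)) != -1) {
		long *count = lookup(t, path, (size_t)pos);
		char *name;

		if (count != NULL) {
			/* Seen before: bump the existing counter, allocate nothing. */
			*count += 1;
			continue;
		}

		if (t->used == MAXDIRS)
			return -1;

		/* First visit: allocate a private counter starting at 1. */
		name = malloc((size_t)pos + 1);
		count = malloc(sizeof(*count));
		if (name == NULL || count == NULL) {
			free(name);
			free(count);
			return -1;
		}
		memcpy(name, path, (size_t)pos);
		name[pos] = '\0';
		*count = 1;
		t->entries[t->used].name = name;
		t->entries[t->used].count = count;
		t->used++;
	}
	return 0;
}

/* Return the current count for a directory, or 0 if it was never seen. */
long dircount(struct dirtable *t, const char *name)
{
	long *count = lookup(t, name, strlen(name));

	return count != NULL ? *count : 0;
}
```

With a zero-initialized table, adding "a/b/c", "a/b/d" and "a/e" leaves two
entries: "a/b" with a count of 2 and "a" with a count of 3. Only the first
visit to each directory allocates. The sketch sidesteps the shared small-int
pool entirely because each counter is a private allocation; the patch has to
work around that pool because its counters are Python int objects.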

Patch

diff --git a/mercurial/dirs.c b/mercurial/dirs.c
--- a/mercurial/dirs.c
+++ b/mercurial/dirs.c
@@ -11,6 +11,12 @@ 
 #include <Python.h>
 #include "util.h"
 
+/*
+ * We violate the Python rule that integers are immutable. Said
+ * integers are used only for internal refcounting by this code, and
+ * are not (and must not be) used by Python code.
+ */
+
 static inline Py_ssize_t _finddir(PyObject *path, Py_ssize_t pos)
 {
 	const char *s = PyString_AS_STRING(path);
@@ -32,7 +38,6 @@  static int _incdirs(PyObject *dirs, PyOb
 
 	while ((pos = _finddir(path, pos - 1)) != -1) {
 		PyObject *val;
-		long v = 0;
 
 		key = PyString_FromStringAndSize(PyString_AS_STRING(path), pos);
 
@@ -40,20 +45,30 @@  static int _incdirs(PyObject *dirs, PyOb
 			goto bail;
 
 		val = PyDict_GetItem(dirs, key);
+		/* Avoid allocating and deallocating an int every time
+		   we revisit a directory that we have seen already,
+		   by directly incrementing our internal refcount.
+		   (This mutation is why Python code must not look at
+		   our refcounts.) */
 		if (val != NULL) {
 			if (!PyInt_Check(val)) {
 				PyErr_SetString(PyExc_TypeError,
 						"expected int value");
 				goto bail;
 			}
-			v = PyInt_AS_LONG(val);
+			PyInt_AS_LONG(val) += 1;
+			Py_CLEAR(key);
+			continue;
 		}
 
-		newval = PyInt_FromLong(v + 1);
+		/* Force Python to not reuse a value from its shared
+		   pool of small ints. */
+		newval = PyInt_FromLong(0x1eadbeef);
 
 		if (newval == NULL)
 			goto bail;
 
+		PyInt_AS_LONG(newval) = 1;
 		ret = PyDict_SetItem(dirs, key, newval);
 		if (ret == -1)
 			goto bail;