Patchwork [5,of,6,V7] dirstate: add a C implementation for nonnormalentries

login
register
mail settings
Submitter Laurent Charignon
Date Dec. 23, 2015, 9:20 p.m.
Message ID <755fe13eac4245420e8e.1450905620@mbuchanan-mbp.DHCP.thefacebook.com>
Download mbox | patch
Permalink /patch/12328/
State Accepted
Headers show

Comments

Laurent Charignon - Dec. 23, 2015, 9:20 p.m.
# HG changeset patch
# User Laurent Charignon <lcharignon@fb.com>
# Date 1450744036 28800
#      Mon Dec 21 16:27:16 2015 -0800
# Node ID 755fe13eac4245420e8e8065864abcbd9a3fdaa9
# Parent  968b926e9e546d7d09d7d68474b4c9235ce634a3
dirstate: add a C implementation for nonnormalentries

Before this patch, there was only a python version of nonnormalentries.
On mozilla-central we have a 10x win by putting this function in C:
% python -m timeit -s \
        'from mercurial import hg, ui, parsers; \
        repo = hg.repository(ui.ui(), "mozilla-central"); \
        m = repo.dirstate._map' \
        'parsers.nonnormalentries(m)'

100 loops, best of 3: 3.15 msec per loop

The python implementation runs in 31ms, a similar test gives:
10 loops, best of 3: 31.7 msec per loop

On our big repos, the win is still of 10x with the python implementation running
in 350ms and the C implementation running in 30ms.
Bryan O'Sullivan - Jan. 2, 2016, 12:58 a.m.
On Wed, Dec 23, 2015 at 1:20 PM, Laurent Charignon <lcharignon@fb.com>
wrote:

> +       if (!PyArg_ParseTuple(args, "O!:nonnormalentries",
> +                             &PyDict_Type, &dmap))
> +               goto bail;
> +
> +       nonnset = PySet_New(NULL);
> +       if (nonnset == NULL)
> +               goto bail;
>

We normally reserve "goto bail" for cases where there's cleanup needed. In
these two cases there's no cleanup required, so it's clearer to simply
"return NULL".
Yuya Nishihara - Jan. 2, 2016, 9 a.m.
On Fri, 1 Jan 2016 16:58:00 -0800, Bryan O'Sullivan wrote:
> On Wed, Dec 23, 2015 at 1:20 PM, Laurent Charignon <lcharignon@fb.com>
> wrote:
> > +       if (!PyArg_ParseTuple(args, "O!:nonnormalentries",
> > +                             &PyDict_Type, &dmap))
> > +               goto bail;
> > +
> > +       nonnset = PySet_New(NULL);
> > +       if (nonnset == NULL)
> > +               goto bail;
> >
> 
> We normally reserve "goto bail" for cases where there's cleanup needed. In
> these two cases there's no cleanup required, so it's clearer to simply
> "return NULL".

I wrote it. I prefer using a common exit point after allocation. But this
function is simple enough to see if cleanup is necessary or not, so please
feel free to change them to "return NULL".

Patch

diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -547,6 +547,44 @@  quit:
 }
 
 /*
+ * Build a set of non-normal entries from the dirstate dmap
+*/
+static PyObject *nonnormalentries(PyObject *self, PyObject *args)
+{
+	PyObject *dmap, *nonnset = NULL, *fname, *v;
+	Py_ssize_t pos;
+
+	if (!PyArg_ParseTuple(args, "O!:nonnormalentries",
+			      &PyDict_Type, &dmap))
+		goto bail;
+
+	nonnset = PySet_New(NULL);
+	if (nonnset == NULL)
+		goto bail;
+
+	pos = 0;
+	while (PyDict_Next(dmap, &pos, &fname, &v)) {
+		dirstateTupleObject *t;
+		if (!dirstate_tuple_check(v)) {
+			PyErr_SetString(PyExc_TypeError,
+					"expected a dirstate tuple");
+			goto bail;
+		}
+		t = (dirstateTupleObject *)v;
+
+		if (t->state == 'n' && t->mtime != -1)
+			continue;
+		if (PySet_Add(nonnset, fname) == -1)
+			goto bail;
+	}
+
+	return nonnset;
+bail:
+	Py_XDECREF(nonnset);
+	return NULL;
+}
+
+/*
  * Efficiently pack a dirstate object into its on-disk format.
  */
 static PyObject *pack_dirstate(PyObject *self, PyObject *args)
@@ -2740,6 +2778,8 @@  PyObject *lowerencode(PyObject *self, Py
 
 static PyMethodDef methods[] = {
 	{"pack_dirstate", pack_dirstate, METH_VARARGS, "pack a dirstate\n"},
+	{"nonnormalentries", nonnormalentries, METH_VARARGS,
+	"create a set containing non-normal entries of given dirstate\n"},
 	{"parse_manifest", parse_manifest, METH_VARARGS, "parse a manifest\n"},
 	{"parse_dirstate", parse_dirstate, METH_VARARGS, "parse a dirstate\n"},
 	{"parse_index2", parse_index2, METH_VARARGS, "parse a revlog index\n"},