Patchwork [6,of,7,V6] dirstate: add a C implementation for nonnormalentries

Submitter Laurent Charignon
Date Dec. 22, 2015, 12:33 a.m.
Message ID <cc3b2338b18a45db45a2.1450744395@mbuchanan-mbp.DHCP.thefacebook.com>
Permalink /patch/12218/
State Superseded, archived
Commit 7c9eb2927879613e6bdd924c58aaa80c0fde7aba

Comments

Laurent Charignon - Dec. 22, 2015, 12:33 a.m.
# HG changeset patch
# User Laurent Charignon <lcharignon@fb.com>
# Date 1450744036 28800
#      Mon Dec 21 16:27:16 2015 -0800
# Node ID cc3b2338b18a45db45a2dcab757455c63e6de0d4
# Parent  ea469e3797c1ea6c20f3b30efe5ef532279dd3ce
dirstate: add a C implementation for nonnormalentries

Before this patch, there was only a Python version of nonnormalentries.
On mozilla-central, moving this function to C gives a 10x speedup:
% python -m timeit -s \
        'from mercurial import hg, ui, parsers; \
        repo = hg.repository(ui.ui(), "mozilla-central"); \
        m = repo.dirstate._map' \
        'parsers.nonnormalentries(m)'

100 loops, best of 3: 3.15 msec per loop

The Python implementation takes about 31 ms in a similar test:
10 loops, best of 3: 31.7 msec per loop

On our big repos, the speedup is still about 10x: the Python implementation runs
in 350 ms and the C implementation in 30 ms.
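
For readers without a Mercurial checkout at hand, the kind of measurement quoted above can be approximated with the timeit module directly. This is only a sketch under stated assumptions: build_synthetic_dmap and nonnormal_py are hypothetical helpers, the map holds plain (state, mode, size, mtime) tuples rather than real dirstate entries, and the numbers it prints say nothing about the C implementation.

```python
import timeit


def build_synthetic_dmap(nfiles, every=100):
    """Hypothetical stand-in for repo.dirstate._map.

    Maps a filename to a plain (state, mode, size, mtime) tuple.
    """
    dmap = {}
    for i in range(nfiles):
        # Mark one entry in `every` as added ('a'); the rest are normal ('n').
        state = "a" if i % every == 0 else "n"
        dmap["dir%d/file%d" % (i % 50, i)] = (state, 0o644, 10, 1450744036)
    return dmap


def nonnormal_py(dmap):
    """Pure-Python filter: keep entries that are not 'n' or whose mtime is -1."""
    return {f for f, e in dmap.items() if e[0] != "n" or e[3] == -1}


dmap = build_synthetic_dmap(20000)

# Mirror the "best of 3" reported by the timeit command line:
# run 3 batches of 10 calls and keep the fastest batch.
loops = 10
best = min(timeit.repeat(lambda: nonnormal_py(dmap), number=loops, repeat=3)) / loops
print("%d entries, %d non-normal, best of 3: %.3f msec per loop"
      % (len(dmap), len(nonnormal_py(dmap)), best * 1e3))
```

Taking the minimum over repeats, as the command-line tool does, reduces the influence of unrelated system load on the reported figure.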
Martin von Zweigbergk - Dec. 22, 2015, 4:31 p.m.
What does the above mean in practice? How much faster does "hg status" get
in the normal case of a few modified files? (I'm not suggesting you should
remove the above, but I'm personally more interested in what optimizations
mean in practice.)
Martin von Zweigbergk - Dec. 22, 2015, 4:32 p.m.
Oh, that was in the next patch. Never mind...

Patch

diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -547,6 +547,44 @@  quit:
 }
 
 /*
+ * Build a set of non-normal entries from the dirstate dmap
+ */
+static PyObject *nonnormalentries(PyObject *self, PyObject *args)
+{
+	PyObject *dmap, *nonnset = NULL, *fname, *v;
+	Py_ssize_t pos;
+
+	if (!PyArg_ParseTuple(args, "O!:nonnormalentries",
+			      &PyDict_Type, &dmap))
+		goto bail;
+
+	nonnset = PySet_New(NULL);
+	if (nonnset == NULL)
+		goto bail;
+
+	pos = 0;
+	while (PyDict_Next(dmap, &pos, &fname, &v)) {
+		dirstateTupleObject *t;
+		if (!dirstate_tuple_check(v)) {
+			PyErr_SetString(PyExc_TypeError,
+					"expected a dirstate tuple");
+			goto bail;
+		}
+		t = (dirstateTupleObject *)v;
+
+		if (t->state == 'n' && t->mtime != -1)
+			continue;
+		if (PySet_Add(nonnset, fname) == -1)
+			goto bail;
+	}
+
+	return nonnset;
+bail:
+	Py_XDECREF(nonnset);
+	return NULL;
+}
+
+/*
  * Efficiently pack a dirstate object into its on-disk format.
  */
 static PyObject *pack_dirstate(PyObject *self, PyObject *args)
@@ -2740,6 +2778,8 @@  PyObject *lowerencode(PyObject *self, Py
 
 static PyMethodDef methods[] = {
 	{"pack_dirstate", pack_dirstate, METH_VARARGS, "pack a dirstate\n"},
+	{"nonnormalentries", nonnormalentries, METH_VARARGS,
+	"create a set containing non-normal entries of given dirstate\n"},
 	{"parse_manifest", parse_manifest, METH_VARARGS, "parse a manifest\n"},
 	{"parse_dirstate", parse_dirstate, METH_VARARGS, "parse a dirstate\n"},
 	{"parse_index2", parse_index2, METH_VARARGS, "parse a revlog index\n"},
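
The selection rule in the C loop above can be read more easily in Python. The following is a minimal sketch of the same logic, not the Python implementation that the patch replaces: DirstateEntry is a hypothetical stand-in for dirstateTupleObject, and its field names are assumed for illustration. An entry is skipped only when its state is 'n' and its mtime is not -1; every other filename goes into the set.

```python
from collections import namedtuple

# Hypothetical stand-in for the C dirstateTupleObject; field names are assumed.
DirstateEntry = namedtuple("DirstateEntry", ["state", "mode", "size", "mtime"])


def nonnormalentries(dmap):
    """Return the set of filenames whose dirstate entry is non-normal."""
    # Counterpart of the "O!" format with PyDict_Type in PyArg_ParseTuple.
    if not isinstance(dmap, dict):
        raise TypeError("expected a dict")

    nonnorm = set()
    for fname, entry in dmap.items():
        # Counterpart of dirstate_tuple_check() in the C loop.
        if not isinstance(entry, DirstateEntry):
            raise TypeError("expected a dirstate tuple")
        # Normal entries with a known mtime are skipped.
        if entry.state == "n" and entry.mtime != -1:
            continue
        nonnorm.add(fname)
    return nonnorm


example = {
    "clean.txt": DirstateEntry("n", 0o644, 12, 1450744036),
    "added.txt": DirstateEntry("a", 0o644, 0, 0),
    "removed.txt": DirstateEntry("r", 0, 0, 0),
    "lookup.txt": DirstateEntry("n", 0o644, 12, -1),
}
result = nonnormalentries(example)
print(sorted(result))
```

In this example only clean.txt is skipped: lookup.txt has state 'n' but an mtime of -1, so it is collected along with the added and removed files.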