Submitter | Laurent Charignon |
---|---|
Date | Dec. 22, 2015, 12:33 a.m. |
Message ID | <cc3b2338b18a45db45a2.1450744395@mbuchanan-mbp.DHCP.thefacebook.com> |
Permalink | /patch/12218/ |
State | Superseded, archived |
Commit | 7c9eb2927879613e6bdd924c58aaa80c0fde7aba |
Comments
On Mon, Dec 21, 2015 at 4:38 PM Laurent Charignon <lcharignon@fb.com> wrote:

> # HG changeset patch
> # User Laurent Charignon <lcharignon@fb.com>
> # Date 1450744036 28800
> #      Mon Dec 21 16:27:16 2015 -0800
> # Node ID cc3b2338b18a45db45a2dcab757455c63e6de0d4
> # Parent  ea469e3797c1ea6c20f3b30efe5ef532279dd3ce
> dirstate: add a C implementation for nonnormalentries
>
> Before this patch, there was only a python version of nonnormalentries.
> On mozilla-central we have a 10x win by putting this function in C:
> % python -m timeit -s \
>   'from mercurial import hg, ui, parsers; \
>   repo = hg.repository(ui.ui(), "mozilla-central"); \
>   m = repo.dirstate._map' \
>   'parsers.nonnormalentries(m)'
>
> 100 loops, best of 3: 3.15 msec per loop
>
> The python implementation runs in 31ms, a similar test gives:
> 10 loops, best of 3: 31.7 msec per loop
>
> On our big repos, the win is still of 10x with the python implementation
> running in 350ms and the C implementation running in 30ms.

What does the above mean in practice? How much faster does "hg status" get
in the normal case of a few modified files? (I'm not suggesting you should
remove the above, but I'm personally more interested in what optimizations
mean in practice.)

On Tue, Dec 22, 2015 at 8:31 AM Martin von Zweigbergk <martinvonz@google.com> wrote:

> On Mon, Dec 21, 2015 at 4:38 PM Laurent Charignon <lcharignon@fb.com>
> wrote:
>
>> # HG changeset patch
>> # User Laurent Charignon <lcharignon@fb.com>
>> # Date 1450744036 28800
>> #      Mon Dec 21 16:27:16 2015 -0800
>> # Node ID cc3b2338b18a45db45a2dcab757455c63e6de0d4
>> # Parent  ea469e3797c1ea6c20f3b30efe5ef532279dd3ce
>> dirstate: add a C implementation for nonnormalentries
>>
>> Before this patch, there was only a python version of nonnormalentries.
>> On mozilla-central we have a 10x win by putting this function in C:
>> % python -m timeit -s \
>>   'from mercurial import hg, ui, parsers; \
>>   repo = hg.repository(ui.ui(), "mozilla-central"); \
>>   m = repo.dirstate._map' \
>>   'parsers.nonnormalentries(m)'
>>
>> 100 loops, best of 3: 3.15 msec per loop
>>
>> The python implementation runs in 31ms, a similar test gives:
>> 10 loops, best of 3: 31.7 msec per loop
>>
>> On our big repos, the win is still of 10x with the python implementation
>> running in 350ms and the C implementation running in 30ms.
>
> What does the above mean in practice? How much faster does "hg status" get
> in the normal case of a few modified files? (I'm not suggesting you should
> remove the above, but I'm personally more interested in what optimizations
> mean in practice.)

Oh, that was in the next patch. Never mind...

Patch
diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -547,6 +547,44 @@ quit:
 }
 
 /*
+ * Build a set of non-normal entries from the dirstate dmap
+*/
+static PyObject *nonnormalentries(PyObject *self, PyObject *args)
+{
+	PyObject *dmap, *nonnset = NULL, *fname, *v;
+	Py_ssize_t pos;
+
+	if (!PyArg_ParseTuple(args, "O!:nonnormalentries",
+			      &PyDict_Type, &dmap))
+		goto bail;
+
+	nonnset = PySet_New(NULL);
+	if (nonnset == NULL)
+		goto bail;
+
+	pos = 0;
+	while (PyDict_Next(dmap, &pos, &fname, &v)) {
+		dirstateTupleObject *t;
+		if (!dirstate_tuple_check(v)) {
+			PyErr_SetString(PyExc_TypeError,
+					"expected a dirstate tuple");
+			goto bail;
+		}
+		t = (dirstateTupleObject *)v;
+
+		if (t->state == 'n' && t->mtime != -1)
+			continue;
+		if (PySet_Add(nonnset, fname) == -1)
+			goto bail;
+	}
+
+	return nonnset;
+bail:
+	Py_XDECREF(nonnset);
+	return NULL;
+}
+
+/*
  * Efficiently pack a dirstate object into its on-disk format.
  */
 static PyObject *pack_dirstate(PyObject *self, PyObject *args)
@@ -2740,6 +2778,8 @@ PyObject *lowerencode(PyObject *self, Py
 
 static PyMethodDef methods[] = {
 	{"pack_dirstate", pack_dirstate, METH_VARARGS, "pack a dirstate\n"},
+	{"nonnormalentries", nonnormalentries, METH_VARARGS,
+	"create a set containing non-normal entries of given dirstate\n"},
 	{"parse_manifest", parse_manifest, METH_VARARGS, "parse a manifest\n"},
 	{"parse_dirstate", parse_dirstate, METH_VARARGS, "parse a dirstate\n"},
 	{"parse_index2", parse_index2, METH_VARARGS, "parse a revlog index\n"},
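
For readers who want to follow the selection rule without reading the C, here is a rough Python sketch of the logic the C function in this patch applies: an entry is skipped only when its state is 'n' and its mtime is not -1, and every other filename goes into the result set. This is not Mercurial's own Python version, which is not shown in this thread. The DirstateTuple namedtuple, its field order, and the sample entries are hypothetical stand-ins for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for Mercurial's dirstate tuple; the field names and
# their order are an assumption made for this sketch.
DirstateTuple = namedtuple("DirstateTuple", ["state", "mode", "size", "mtime"])


def nonnormalentries(dmap):
    """Return the set of filenames whose entry is not 'normal'.

    Mirrors the loop in the patch: an entry is skipped only when its
    state is 'n' and its mtime is not -1.
    """
    nonnorm = set()
    for fname, entry in dmap.items():
        if not isinstance(entry, DirstateTuple):
            # Same error the C code raises for unexpected values.
            raise TypeError("expected a dirstate tuple")
        if entry.state == "n" and entry.mtime != -1:
            continue
        nonnorm.add(fname)
    return nonnorm


# Made-up sample data, not taken from any real repository.
example = {
    "clean.txt": DirstateTuple("n", 0o644, 12, 1450744036),
    "added.txt": DirstateTuple("a", 0, -1, -1),
    "unsure.txt": DirstateTuple("n", 0o644, 12, -1),
}
result = nonnormalentries(example)
```

With the sample data above, only "clean.txt" is treated as normal. "unsure.txt" has state 'n' but an mtime of -1, so it lands in the set alongside "added.txt".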