Submitter | phabricator |
---|---|
Date | March 24, 2018, 6:30 p.m. |
Message ID | <differential-rev-PHID-DREV-zs3s7r4oup7ffmbifdru-req@phab.mercurial-scm.org> |
Download | mbox | patch |
Permalink | /patch/29820/ |
State | Superseded |
Headers | show |
Comments
durin42 added a subscriber: indygreg. durin42 added a comment. @indygreg I think you saw this failure mode, I'd appreciate it if you could check if this fixes the watchman failures I introduced (I don't use watchman, and so I'm not quite sure how to reproduce.) REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers Cc: indygreg, mercurial-devel
indygreg requested changes to this revision. indygreg added a comment. This revision now requires changes to proceed. This still fails for me with: File "/home/gps/src/hg/mercurial/context.py", line 1590, in _dirstatestatus clean=clean, unknown=unknown) File "/home/gps/src/hg/mercurial/dirstate.py", line 1073, in status elif (time != st[stat.ST_MTIME] IndexError: tuple index out of range Also, hacking the low-level bser data type at the bser level feels **very** wrong. I would rather we roll our own C type that holds a reference to the `bserObject` and provides the necessary object API. Although there are already Mercurial hacks in `bser.c` for `st_*` attribute access. So I could be convinced that this hack is acceptable. I can try debugging this failure if that would be helpful. Perhaps something in CPython land is evaluating `len()` and short-circuiting the `[8]` access. INLINE COMMENTS > bser.c:105 > > + if (i == 8 && PySequence_Size(obj->values) < 7) { > + // Hack alert: Python 3 removed support for os.stat().st_mtime Is there an off by 2 with `7`? Shouldn't it be `9` (since accesses up to `[7]` - an object with 8 elements) should be allowed? > bser.c:112 > + // st_mtime instead. > + return bserobj_getattrro(o, PyBytes_FromString("st_mtime")); > + } This leaks the `st_mtime` PyObject. Also, strictly speaking, we should verify that the return value is not `NULL`. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel
durin42 added a comment. In https://phab.mercurial-scm.org/D2939#47601, @indygreg wrote: > This still fails for me with: > > File "/home/gps/src/hg/mercurial/context.py", line 1590, in _dirstatestatus > clean=clean, unknown=unknown) > File "/home/gps/src/hg/mercurial/dirstate.py", line 1073, in status > elif (time != st[stat.ST_MTIME] > IndexError: tuple index out of range > > > Also, hacking the low-level bser data type at the bser level feels **very** wrong. I would rather we roll our own C type that holds a reference to the `bserObject` and provides the necessary object API. Although there are already Mercurial hacks in `bser.c` for `st_*` attribute access. So I could be convinced that this hack is acceptable. Yeah, I mostly wanted to verify a fix before trying to do something principled. > I can try debugging this failure if that would be helpful. Perhaps something in CPython land is evaluating `len()` and short-circuiting the `[8]` access. Yes, debugging would help a ton. I don't really have enough understanding to make enough headway here :( REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel
indygreg added a comment. I may look into this today. Having fsmonitor disabled makes operations on Firefox repos much slower and this is hampering my ability to hack on that repo :/ REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel
indygreg added a comment. I managed to debug this. `st[state.ST_MTIME]` will invoke `PyObject_GetItem`. Its implementation does this: m = o->ob_type->tp_as_mapping; if (m && m->mp_subscript) return m->mp_subscript(o, key); if (o->ob_type->tp_as_sequence) { if (PyIndex_Check(key)) { Py_ssize_t key_value; key_value = PyNumber_AsSsize_t(key, PyExc_IndexError); if (key_value == -1 && PyErr_Occurred()) return NULL; return PySequence_GetItem(o, key_value); } Our `bserObjectType` defines `tp_as_mapping` and `st[state.ST_MTIME]` ends up calling `bserobj_getattrro()`. That function sees the argument is an integer and ends up calling `PySequence_GetItem()`: if (PyIndex_Check(name)) { i = PyNumber_AsSsize_t(name, PyExc_IndexError); if (i == -1 && PyErr_Occurred()) { goto bail; } ret = PySequence_GetItem(obj->values, i); goto bail; } `PySequence_GetItem()` does the following: m = s->ob_type->tp_as_sequence; if (m && m->sq_item) { if (i < 0) { if (m->sq_length) { Py_ssize_t l = (*m->sq_length)(s); if (l < 0) return NULL; i += l; } } return m->sq_item(s, i); } `sq_item` here is `tupleitem()`, not `bserobj_tuple_item()`. It tells the truth and `IndexError` is raised. So, I think we want the hack to live in `bserobj_getattrro()`. I don't think the hack needs to live in `bserobj_tuple_item()` because that function only ever gets called through `bserobj_getattrro()`. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers, indygreg Cc: spectral, indygreg, mercurial-devel
durin42 added a comment. PTAL - I think this should be ready to go. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers, indygreg Cc: spectral, indygreg, mercurial-devel
indygreg accepted this revision. indygreg added a comment. This revision is now accepted and ready to land. This is super hacky. That's par for the course for this file, sadly. Also, I'm kinda surprised we're fully realizing the Python tuple to represent stat results. If we made tuple attribute access lazy, that would probably make `hg status` a bit faster, since we never access every field of that tuple. These kinds of things never show up in Python profilers though. And it is difficult to pin performance problems on Python object creation. But we know it is a very real thing. Obviously work for another day. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2939 To: durin42, #hg-reviewers, indygreg Cc: spectral, indygreg, mercurial-devel
Patch
diff --git a/hgext/fsmonitor/pywatchman/bser.c b/hgext/fsmonitor/pywatchman/bser.c --- a/hgext/fsmonitor/pywatchman/bser.c +++ b/hgext/fsmonitor/pywatchman/bser.c @@ -99,6 +99,15 @@ static PyObject* bserobj_tuple_item(PyObject* o, Py_ssize_t i) { bserObject* obj = (bserObject*)o; + if (i == 8 && PySequence_Size(obj->values) < 7) { + // Hack alert: Python 3 removed support for os.stat().st_mtime + // being an integer.Instead, if you need an integer, you have to + // use os.stat()[stat.ST_MTIME] instead. stat.ST_MTIME is 8, and + // our stat tuples are shorter than that, so we can detect + // requests for index 8 on tuples shorter than that and return + // st_mtime instead. + return bserobj_getattrro(o, "st_mtime"); + } return PySequence_GetItem(obj->values, i); }