Patchwork dirs: remove mutable string optimization at all

login
register
mail settings
Submitter Yuya Nishihara
Date Oct. 13, 2019, 11:27 a.m.
Message ID <87dfcdb8394e377049b5.1570966052@mimosa>
Download mbox | patch
Permalink /patch/42294/
State Superseded
Headers show

Comments

Yuya Nishihara - Oct. 13, 2019, 11:27 a.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1570964769 -32400
#      Sun Oct 13 20:06:09 2019 +0900
# Node ID 87dfcdb8394e377049b5a337585309ae4c652c7d
# Parent  ce20b870041fc4d6ba6989acbb9373797ce9b3d6
dirs: remove mutable string optimization at all

As far as I can see, the optimization trick has been dead since 42e89b87ca79
"dirs: speed up by storing number of direct children per dir". After
42e89b87ca79, the key variable is cleared to NULL at every iteration.
Matt Harbison - Oct. 13, 2019, 7:33 p.m.
On Sun, 13 Oct 2019 07:27:32 -0400, Yuya Nishihara <yuya@tcha.org> wrote:

> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1570964769 -32400
> #      Sun Oct 13 20:06:09 2019 +0900
> # Node ID 87dfcdb8394e377049b5a337585309ae4c652c7d
> # Parent  ce20b870041fc4d6ba6989acbb9373797ce9b3d6
> dirs: remove mutable string optimization at all

Should this original comment go away too?

https://www.mercurial-scm.org/repo/hg/file/52781d57313d/mercurial/cext/dirs.c#l30
Yuya Nishihara - Oct. 14, 2019, 3:45 a.m.
On Sun, 13 Oct 2019 15:33:28 -0400, Matt Harbison wrote:
> On Sun, 13 Oct 2019 07:27:32 -0400, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> > # HG changeset patch
> > # User Yuya Nishihara <yuya@tcha.org>
> > # Date 1570964769 -32400
> > #      Sun Oct 13 20:06:09 2019 +0900
> > # Node ID 87dfcdb8394e377049b5a337585309ae4c652c7d
> > # Parent  ce20b870041fc4d6ba6989acbb9373797ce9b3d6
> > dirs: remove mutable string optimization at all
> 
> Should this original comment go away too?
> 
> https://www.mercurial-scm.org/repo/hg/file/52781d57313d/mercurial/cext/dirs.c#l30

Good catch. I'll send V2.

Patch

diff --git a/mercurial/cext/dirs.c b/mercurial/cext/dirs.c
--- a/mercurial/cext/dirs.c
+++ b/mercurial/cext/dirs.c
@@ -63,46 +63,18 @@  static int _addpath(PyObject *dirs, PyOb
 	* "protocol" such as mutating immutable objects. But since we only
 	* mutate objects created in this function or in other well-defined
 	* locations, the references are known so these violations should go
-	* unnoticed. The code for adjusting the length of a PyBytesObject is
-	* essentially a minimal version of _PyBytes_Resize. */
+	* unnoticed. */
 	while ((pos = _finddir(cpath, pos - 1)) != -1) {
 		PyObject *val;
 
-		if (pos < 2) {
-			key = PyBytes_FromStringAndSize(cpath, pos);
-			if (key == NULL)
-				goto bail;
-		} else {
-			/* It's likely that every prefix already has an entry
-			   in our dict. Try to avoid allocating and
-			   deallocating a string for each prefix we check. */
-			if (key != NULL)
-				((PyBytesObject *)key)->ob_shash = -1;
-			else {
-				/* We know pos >= 2, so we won't get a small
-				 * shared string. */
-				key = PyBytes_FromStringAndSize(cpath, pos);
-				if (key == NULL)
-					goto bail;
-			}
-			/* Py_SIZE(o) refers to the ob_size member of
-			 * the struct. Yes, assigning to what looks
-			 * like a function seems wrong. */
-			Py_SIZE(key) = pos;
-			((PyBytesObject *)key)->ob_sval[pos] = '\0';
-		}
+		key = PyBytes_FromStringAndSize(cpath, pos);
+		if (key == NULL)
+			goto bail;
 
 		val = PyDict_GetItem(dirs, key);
 		if (val != NULL) {
 			PYLONG_VALUE(val) += 1;
-			if (pos < 2) {
-				/* This was a short string, so we
-				 * probably got a small shared string
-				 * we can't mutate on the next loop
-				 * iteration. Clear it.
-				 */
-				Py_CLEAR(key);
-			}
+			Py_CLEAR(key);
 			break;
 		}