Patchwork [1,of,7,V4] sparse-revlog: add a `index_get_start` function in C

login
register
mail settings
Submitter Boris Feld
Date Nov. 20, 2018, 8:44 p.m.
Message ID <60a55da39befa4996c1a.1542746672@localhost.localdomain>
Download mbox | patch
Permalink /patch/36677/
State Superseded
Headers show

Comments

Boris Feld - Nov. 20, 2018, 8:44 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1542725358 0
#      Tue Nov 20 14:49:18 2018 +0000
# Node ID 60a55da39befa4996c1a88ca6f663765ea143ef2
# Parent  337a389953366f059f2cb88129031ba3e67e0bbc
# EXP-Topic sparse-perf
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 60a55da39bef
sparse-revlog: add a `index_get_start` function in C

We are about to implement a native version of `slicechunktodensity`. For
clarity, we introduce the helper functions first. This new function provides
an efficient way to retrieve some of the information needed by
`slicechunktodensity`.
Yuya Nishihara - Nov. 21, 2018, 12:30 p.m.
On Tue, 20 Nov 2018 21:44:32 +0100, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1542725358 0
> #      Tue Nov 20 14:49:18 2018 +0000
> # Node ID 60a55da39befa4996c1a88ca6f663765ea143ef2
> # Parent  337a389953366f059f2cb88129031ba3e67e0bbc
> # EXP-Topic sparse-perf
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 60a55da39bef
> sparse-revlog: add a `index_get_start` function in C

> diff --git a/mercurial/cext/revlog.c b/mercurial/cext/revlog.c
> --- a/mercurial/cext/revlog.c
> +++ b/mercurial/cext/revlog.c
> @@ -185,6 +185,46 @@ static inline int index_get_parents(inde
>  	return 0;
>  }
>  
> +static inline int64_t index_get_start(indexObject *self, Py_ssize_t rev)

I meant "int index_get_start(*self, rev, *offset)" just like index_get_parents(),
but this is also good.

> +{
> +	uint64_t offset;
> +	if (rev >= self->length) {
> +		PyObject *tuple;
> +		PyObject *pylong;
> +		unsigned PY_LONG_LONG tmp;
> +		tuple = PyList_GET_ITEM(self->added, rev - self->length);
> +		pylong = PyTuple_GET_ITEM(tuple, 0);
> +#ifndef IS_PY3K
> +		if (PyInt_Check(pylong)) {
> +			long tmp2 = PyInt_AsLong(pylong);
> +			if (tmp2 < 0) {
> +				return -1;
> +			}
> +			tmp = (unsigned PY_LONG_LONG)tmp2;
> +		} else {
> +#endif
> +			tmp = PyLong_AsUnsignedLongLong(pylong);
> +			if (tmp == (unsigned PY_LONG_LONG) - 1) {
> +				return -1;
> +			}
> +
> +#ifndef IS_PY3K
> +		}
> +#endif
> +		offset = (uint64_t)tmp;

Sigh, PyLong API is so inconsistent on Python 2. Let's switch to
PyLong_AsLongLong(), which can handle PyInt as well. I think it's
okay to raise OverflowError if offset >= 1 << (63 - 16).

FWIW, PyLong_AsUnsignedLongLong() of Python 2 would raise BadInternalCall
if non-long type were passed.
Boris Feld - Nov. 21, 2018, 3:50 p.m.
On 21/11/2018 13:30, Yuya Nishihara wrote:
> On Tue, 20 Nov 2018 21:44:32 +0100, Boris Feld wrote:
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1542725358 0
>> #      Tue Nov 20 14:49:18 2018 +0000
>> # Node ID 60a55da39befa4996c1a88ca6f663765ea143ef2
>> # Parent  337a389953366f059f2cb88129031ba3e67e0bbc
>> # EXP-Topic sparse-perf
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 60a55da39bef
>> sparse-revlog: add a `index_get_start` function in C
>> diff --git a/mercurial/cext/revlog.c b/mercurial/cext/revlog.c
>> --- a/mercurial/cext/revlog.c
>> +++ b/mercurial/cext/revlog.c
>> @@ -185,6 +185,46 @@ static inline int index_get_parents(inde
>>  	return 0;
>>  }
>>  
>> +static inline int64_t index_get_start(indexObject *self, Py_ssize_t rev)
> I meant "int index_get_start(*self, rev, *offset)" just like index_get_parents(),
> but this is also good.
Since the return is simpler than index_get_parent, it seemed simpler to
return the value directly.
>
>> +{
>> +	uint64_t offset;
>> +	if (rev >= self->length) {
>> +		PyObject *tuple;
>> +		PyObject *pylong;
>> +		unsigned PY_LONG_LONG tmp;
>> +		tuple = PyList_GET_ITEM(self->added, rev - self->length);
>> +		pylong = PyTuple_GET_ITEM(tuple, 0);
>> +#ifndef IS_PY3K
>> +		if (PyInt_Check(pylong)) {
>> +			long tmp2 = PyInt_AsLong(pylong);
>> +			if (tmp2 < 0) {
>> +				return -1;
>> +			}
>> +			tmp = (unsigned PY_LONG_LONG)tmp2;
>> +		} else {
>> +#endif
>> +			tmp = PyLong_AsUnsignedLongLong(pylong);
>> +			if (tmp == (unsigned PY_LONG_LONG) - 1) {
>> +				return -1;
>> +			}
>> +
>> +#ifndef IS_PY3K
>> +		}
>> +#endif
>> +		offset = (uint64_t)tmp;
> Sigh, PyLong API is so inconsistent on Python 2. Let's switch to
> PyLong_AsLongLong(), which can handle PyInt as well. I think it's
> okay to raise OverflowError if offset >= 1 << (63 - 16).

Okay, so dropping the special case and directly using:

  offset = PyLong_AsLongLong(pylong);
  if (offset < 0){ … }

Right?
>
> FWIW, PyLong_AsUnsignedLongLong() of Python 2 would raise BadInternalCall
> if non-long type were passed.
Yep, I saw that, and it's how I ended up adding the PyInt case.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - Nov. 21, 2018, 11:03 p.m.
On Wed, 21 Nov 2018 16:50:35 +0100, Boris FELD wrote:
> > Sigh, PyLong API is so inconsistent on Python 2. Let's switch to
> > PyLong_AsLongLong(), which can handle PyInt as well. I think it's
> > okay to raise OverflowError if offset >= 1 << (63 - 16).
> 
> Okay, so dropping the special case and directly using:
> 
>   offset = PyLong_AsLongLong(pylong);
>   if (offset < 0){ … }
> 
> Right?

Yep, something similar to the one I proposed for the PATCH 2.

Patch

diff --git a/mercurial/cext/revlog.c b/mercurial/cext/revlog.c
--- a/mercurial/cext/revlog.c
+++ b/mercurial/cext/revlog.c
@@ -185,6 +185,46 @@  static inline int index_get_parents(inde
 	return 0;
 }
 
+static inline int64_t index_get_start(indexObject *self, Py_ssize_t rev)
+{
+	uint64_t offset;
+	if (rev >= self->length) {
+		PyObject *tuple;
+		PyObject *pylong;
+		unsigned PY_LONG_LONG tmp;
+		tuple = PyList_GET_ITEM(self->added, rev - self->length);
+		pylong = PyTuple_GET_ITEM(tuple, 0);
+#ifndef IS_PY3K
+		if (PyInt_Check(pylong)) {
+			long tmp2 = PyInt_AsLong(pylong);
+			if (tmp2 < 0) {
+				return -1;
+			}
+			tmp = (unsigned PY_LONG_LONG)tmp2;
+		} else {
+#endif
+			tmp = PyLong_AsUnsignedLongLong(pylong);
+			if (tmp == (unsigned PY_LONG_LONG) - 1) {
+				return -1;
+			}
+
+#ifndef IS_PY3K
+		}
+#endif
+		offset = (uint64_t)tmp;
+	} else {
+		const char *data = index_deref(self, rev);
+		offset = getbe32(data + 4);
+		if (rev == 0) /* mask out version number for the first entry */
+			offset &= 0xFFFF;
+		else {
+			uint32_t offset_high = getbe32(data);
+			offset |= ((uint64_t)offset_high) << 32;
+		}
+	}
+	return (int64_t)(offset >> 16);
+}
+
 /*
  * RevlogNG format (all in big endian, data may be inlined):
  *    6 bytes: offset