Patchwork [07,of,10,V2] sparse-revlog: add a `_trimchunk` function in C

login
register
mail settings
Submitter Boris Feld
Date Nov. 15, 2018, 10:38 a.m.
Message ID <f76fc4ed86fd2072c861.1542278325@localhost.localdomain>
Download mbox | patch
Permalink /patch/36597/
State Accepted
Headers show

Comments

Boris Feld - Nov. 15, 2018, 10:38 a.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1541785523 -3600
#      Fri Nov 09 18:45:23 2018 +0100
# Node ID f76fc4ed86fd2072c861f57a3c1c3e892648fc92
# Parent  d8b70c216567aebbd5aeabf0cbee2ce192212e0e
# EXP-Topic sparse-perf
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r f76fc4ed86fd
sparse-revlog: add a `_trimchunk` function in C

We are about to implement a native version of `slicechunktodensity`. For
clarity, we introduce the helper functions first.

This function is a native implementation of the python function `_trimchunk`
in `mercurial/revlogutils/deltas.py`.
Yuya Nishihara - Nov. 16, 2018, 1:31 p.m.
On Thu, 15 Nov 2018 11:38:45 +0100, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1541785523 -3600
> #      Fri Nov 09 18:45:23 2018 +0100
> # Node ID f76fc4ed86fd2072c861f57a3c1c3e892648fc92
> # Parent  d8b70c216567aebbd5aeabf0cbee2ce192212e0e
> # EXP-Topic sparse-perf
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r f76fc4ed86fd
> sparse-revlog: add a `_trimchunk` function in C
> 
> We are about to implement a native version of `slicechunktodensity`. For
> clarity, we introduce the helper functions first.
> 
> This function is a native implementation of the python function `_trimchunk`
> in `mercurial/revlogutils/deltas.py`.
> 
> diff --git a/mercurial/cext/revlog.c b/mercurial/cext/revlog.c
> --- a/mercurial/cext/revlog.c
> +++ b/mercurial/cext/revlog.c
> @@ -1031,6 +1031,25 @@ static inline long index_segment_span(in
>  	        index_get_length(self, end_rev));
>  }
>  
> +/* returns revs[startidx:endidx] without empty trailing revs */
> +static PyObject *_trimchunk(indexObject *self, PyObject *revs, long startidx,
> +                            long endidx)

Perhaps start/endidx should be Py_ssize_t.

> +{
> +	while (endidx > 1 && endidx > startidx) {

Nit: I slightly prefer using "for" statement, but I'm fine with "while"
to be aligned with the pure Python implementation.

> +		PyObject *rev = PyList_GET_ITEM(revs, endidx - 1);
> +		Py_ssize_t r = PyInt_AsLong(rev);
> +		if (r == -1 && PyErr_Occurred()) {
> +			return NULL;
> +		}

Can't we use a native revs list instead of PyObject one?

> +		if (index_get_length(self, r - 1) != 0) {

What would happen if r == 0?

FWIW, the pure-Python implementation does length(revs[endidx -1]), not
length(revs[endidx - 1] - 1).

> +			break;
> +		}
> +		endidx -= 1;
> +	}
> +	PyObject *chunk = PyList_GetSlice(revs, startidx, endidx);
> +	return chunk;

Variable can't be declared here in C89, and Windows compiler is really strict
about that. Just return the chunk without naming it.
Boris Feld - Nov. 19, 2018, 9:17 a.m.
On 16/11/2018 14:31, Yuya Nishihara wrote:
> On Thu, 15 Nov 2018 11:38:45 +0100, Boris Feld wrote:
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1541785523 -3600
>> #      Fri Nov 09 18:45:23 2018 +0100
>> # Node ID f76fc4ed86fd2072c861f57a3c1c3e892648fc92
>> # Parent  d8b70c216567aebbd5aeabf0cbee2ce192212e0e
>> # EXP-Topic sparse-perf
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r f76fc4ed86fd
>> sparse-revlog: add a `_trimchunk` function in C
>>
>> We are about to implement a native version of `slicechunktodensity`. For
>> clarity, we introduce the helper functions first.
>>
>> This function is a native implementation of the python function `_trimchunk`
>> in `mercurial/revlogutils/deltas.py`.
>>
>> diff --git a/mercurial/cext/revlog.c b/mercurial/cext/revlog.c
>> --- a/mercurial/cext/revlog.c
>> +++ b/mercurial/cext/revlog.c
>> @@ -1031,6 +1031,25 @@ static inline long index_segment_span(in
>>  	        index_get_length(self, end_rev));
>>  }
>>  
>> +/* returns revs[startidx:endidx] without empty trailing revs */
>> +static PyObject *_trimchunk(indexObject *self, PyObject *revs, long startidx,
>> +                            long endidx)
> Perhaps start/endidx should be Py_ssize_t.
Indeed, good catch.
>
>> +{
>> +	while (endidx > 1 && endidx > startidx) {
> Nit: I slightly prefer using "for" statement, but I'm fine with "while"
> to be aligned with the pure Python implementation.
Yeah, while was picked to mirror the Python implementation. I would
rather keep that.
>
>> +		PyObject *rev = PyList_GET_ITEM(revs, endidx - 1);
>> +		Py_ssize_t r = PyInt_AsLong(rev);
>> +		if (r == -1 && PyErr_Occurred()) {
>> +			return NULL;
>> +		}
> Can't we use a native revs list instead of PyObject one?
With a significant semantic change for that function we can. I've
updated that patch to introduce a function that changes the endidx
value. The Python list slicing is now done one level higher.
>
>> +		if (index_get_length(self, r - 1) != 0) {
> What would happen if r == 0?
>
> FWIW, the pure-Python implementation does length(revs[endidx -1]), not
> length(revs[endidx - 1] - 1).
Oops, thank for spotting it.
>
>> +			break;
>> +		}
>> +		endidx -= 1;
>> +	}
>> +	PyObject *chunk = PyList_GetSlice(revs, startidx, endidx);
>> +	return chunk;
Fixed, thanks for spotting that too.
> Variable can't be declared here in C89, and Windows compiler is really strict
> about that. Just return the chunk without naming it.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/cext/revlog.c b/mercurial/cext/revlog.c
--- a/mercurial/cext/revlog.c
+++ b/mercurial/cext/revlog.c
@@ -1031,6 +1031,25 @@  static inline long index_segment_span(in
 	        index_get_length(self, end_rev));
 }
 
+/* returns revs[startidx:endidx] without empty trailing revs */
+static PyObject *_trimchunk(indexObject *self, PyObject *revs, long startidx,
+                            long endidx)
+{
+	while (endidx > 1 && endidx > startidx) {
+		PyObject *rev = PyList_GET_ITEM(revs, endidx - 1);
+		Py_ssize_t r = PyInt_AsLong(rev);
+		if (r == -1 && PyErr_Occurred()) {
+			return NULL;
+		}
+		if (index_get_length(self, r - 1) != 0) {
+			break;
+		}
+		endidx -= 1;
+	}
+	PyObject *chunk = PyList_GetSlice(revs, startidx, endidx);
+	return chunk;
+}
+
 static inline int nt_level(const char *node, Py_ssize_t level)
 {
 	int v = node[level >> 1];