Patchwork [7,of,7] ancestors: return all ancestors - not just those who are furthest from a root

login
register
mail settings
Submitter Mads Kiilerich
Date Feb. 24, 2014, 10:19 p.m.
Message ID <2b44103c2e866a3a06ad.1393280356@mk-desktop>
Download mbox | patch
Permalink /patch/3750/
State Deferred
Headers show

Comments

Mads Kiilerich - Feb. 24, 2014, 10:19 p.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1393278134 -3600
#      Mon Feb 24 22:42:14 2014 +0100
# Node ID 2b44103c2e866a3a06adc3b7f91cdb609754d4db
# Parent  c5b597a62f2231919df87fe2915a4edf1023166b
ancestors: return all ancestors - not just those who are furthest from a root

The function seemed to solve a harder problem than necessary. It gave some
common ancestors priority over others. This is a low level function and it
should consider all equal and leave it to the higher levels to decide how to
handle the multiple heads. Now they are all equal.

This can in some cases make a positive difference. The lack of test suite
changes shows that it in general won't.
Mads Kiilerich - Feb. 24, 2014, 10:23 p.m.
The last patches are mostly a request for comments. The first patches 
seems more stable and are candidates for default.

/Mads
Matt Mackall - Feb. 24, 2014, 10:31 p.m.
On Mon, 2014-02-24 at 23:19 +0100, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1393278134 -3600
> #      Mon Feb 24 22:42:14 2014 +0100
> # Node ID 2b44103c2e866a3a06adc3b7f91cdb609754d4db
> # Parent  c5b597a62f2231919df87fe2915a4edf1023166b
> ancestors: return all ancestors - not just those who are furthest from a root
> 
> The function seemed to solve a harder problem than necessary. It gave some
> common ancestors priority over others. This is a low level function and it
> should consider all equal and leave it to the higher levels to decide how to
> handle the multiple heads. Now they are all equal.

I'll need to think on this. But you should probably share some
benchmarks.
Mads Kiilerich - Feb. 25, 2014, 11:54 p.m.
On 02/24/2014 11:31 PM, Matt Mackall wrote:
> On Mon, 2014-02-24 at 23:19 +0100, Mads Kiilerich wrote:
>> # HG changeset patch
>> # User Mads Kiilerich <madski@unity3d.com>
>> # Date 1393278134 -3600
>> #      Mon Feb 24 22:42:14 2014 +0100
>> # Node ID 2b44103c2e866a3a06adc3b7f91cdb609754d4db
>> # Parent  c5b597a62f2231919df87fe2915a4edf1023166b
>> ancestors: return all ancestors - not just those who are furthest from a root
>>
>> The function seemed to solve a harder problem than necessary. It gave some
>> common ancestors priority over others. This is a low level function and it
>> should consider all equal and leave it to the higher levels to decide how to
>> handle the multiple heads. Now they are all equal.
> I'll need to think on this. But you should probably share some
> benchmarks.

What benchmarks do you have in mind?

My concern here is only correctness, not performance. The patch just 
removes a post processing step in the gca calculation and can thus only 
make it faster. The algorithm did two linear iterations over the full 
repo with the option of early exit - now it do one.

/Mads
Giovanni Gherdovich - Feb. 26, 2014, 12:07 a.m.
On Wed, Feb 26, 2014 at 12:54 AM, Mads Kiilerich <mads@kiilerich.com> wrote:
>
> On 02/24/2014 11:31 PM, Matt Mackall wrote:
>>
>> On Mon, 2014-02-24 at 23:19 +0100, Mads Kiilerich wrote:
>>>
>>> [...]
>>> The function seemed to solve a harder problem than necessary. It gave
some
>>> common ancestors priority over others. [...]
>>
>> I'll need to think on this. But you should probably share some
>> benchmarks.
>
>
> What benchmarks do you have in mind?
>
> My concern here is only correctness, not performance.
> The patch just removes a post processing step in
> the gca calculation [...]

When bos added the function you're removing,
he was very happy of the speedup he got:
http://selenic.com/hg/rev/2f7186400a07

The previous ancestor() code was renamed to genericancestor()
(dead code you are removing in another patch).
The reason that code wasn't deleted I guess
is in this 0/N message of that same series by bos:
http://markmail.org/message/64gbckmjmmwvdt3b

I think an important point is:
does mercurial need to bend the definition of Greater Common Ancestor,
which is heads(::X and ::Y), adding the clause that the base
of merges needs also to be "the furthest from the root" ?

One can say that it kinda make sense because "furthest from a root"
can mean "more recent" in the topological sense.
But if we say that we drop this, you're indeed saving some work
with your code removal.
Otherwise, it has to be done elsewhere and you might not have all info you
need at that point.

Cheers,
GGhh

Patch

diff --git a/mercurial/ancestor.py b/mercurial/ancestor.py
--- a/mercurial/ancestor.py
+++ b/mercurial/ancestor.py
@@ -11,8 +11,7 @@ 
 
 def ancestors(pfunc, *orignodes):
     """
-    Returns the common ancestors of a and b that are furthest from a
-    root (as measured by longest path).
+    Returns a set with the heads of all common ancestors of all orignodes.
 
     pfunc must return a list of parent vertices for a given vertex.
     """
@@ -67,67 +66,9 @@ 
                     seen[p] = sv
         return gca
 
-    def deepest(nodes):
-        interesting = {}
-        count = max(nodes) + 1
-        depth = [0] * count
-        seen = [0] * count
-        mapping = []
-        for (i, n) in enumerate(sorted(nodes)):
-            depth[n] = 1
-            b = 1 << i
-            seen[n] = b
-            interesting[b] = 1
-            mapping.append((b, n))
-        nv = count - 1
-        while nv >= 0 and len(interesting) > 1:
-            v = nv
-            nv -= 1
-            dv = depth[v]
-            if dv == 0:
-                continue
-            sv = seen[v]
-            for p in pfunc(v):
-                if p == nullrev:
-                    continue
-                dp = depth[p]
-                nsp = sp = seen[p]
-                if dp <= dv:
-                    depth[p] = dv + 1
-                    if sp != sv:
-                        interesting[sv] += 1
-                        nsp = seen[p] = sv
-                        if sp:
-                            interesting[sp] -= 1
-                            if interesting[sp] == 0:
-                                del interesting[sp]
-                elif dv == dp - 1:
-                    nsp = sp | sv
-                    if nsp == sp:
-                        continue
-                    seen[p] = nsp
-                    interesting.setdefault(nsp, 0)
-                    interesting[nsp] += 1
-                    interesting[sp] -= 1
-                    if interesting[sp] == 0:
-                        del interesting[sp]
-            interesting[sv] -= 1
-            if interesting[sv] == 0:
-                del interesting[sv]
-
-        if len(interesting) != 1:
-            return []
-
-        k = 0
-        for i in interesting:
-            k |= i
-        return set(n for (i, n) in mapping if k & i)
-
     gca = candidates(orignodes)
 
-    if len(gca) <= 1:
-        return gca
-    return deepest(gca)
+    return gca
 
 def genericancestor(a, b, pfunc):
     """
diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -1288,162 +1288,8 @@ 
 }
 
 /*
- * Given a disjoint set of revs, return the subset with the longest
- * path to the root.
- */
-static PyObject *find_deepest(indexObject *self, PyObject *revs)
-{
-	const Py_ssize_t revcount = PyList_GET_SIZE(revs);
-	static const Py_ssize_t capacity = 24;
-	int *depth, *interesting = NULL;
-	int i, j, v, ninteresting;
-	PyObject *dict = NULL, *keys;
-	long *seen = NULL;
-	int maxrev = -1;
-	long final;
-
-	if (revcount > capacity) {
-		PyErr_Format(PyExc_OverflowError,
-			     "bitset size (%ld) > capacity (%ld)",
-			     (long)revcount, (long)capacity);
-		return NULL;
-	}
-
-	for (i = 0; i < revcount; i++) {
-		int n = (int)PyInt_AsLong(PyList_GET_ITEM(revs, i));
-		if (n > maxrev)
-			maxrev = n;
-	}
-
-	depth = calloc(sizeof(*depth), maxrev + 1);
-	if (depth == NULL)
-		return PyErr_NoMemory();
-
-	seen = calloc(sizeof(*seen), maxrev + 1);
-	if (seen == NULL) {
-		PyErr_NoMemory();
-		goto bail;
-	}
-
-	interesting = calloc(sizeof(*interesting), 2 << revcount);
-	if (interesting == NULL) {
-		PyErr_NoMemory();
-		goto bail;
-	}
-
-	if (PyList_Sort(revs) == -1)
-		goto bail;
-
-	for (i = 0; i < revcount; i++) {
-		int n = (int)PyInt_AsLong(PyList_GET_ITEM(revs, i));
-		long b = 1l << i;
-		depth[n] = 1;
-		seen[n] = b;
-		interesting[b] = 1;
-	}
-
-	ninteresting = (int)revcount;
-
-	for (v = maxrev; v >= 0 && ninteresting > 1; v--) {
-		int dv = depth[v];
-		int parents[2];
-		long sv;
-
-		if (dv == 0)
-			continue;
-
-		sv = seen[v];
-		index_get_parents(self, v, parents);
-
-		for (i = 0; i < 2; i++) {
-			int p = parents[i];
-			long nsp, sp;
-			int dp;
-
-			if (p == -1)
-				continue;
-
-			dp = depth[p];
-			nsp = sp = seen[p];
-			if (dp <= dv) {
-				depth[p] = dv + 1;
-				if (sp != sv) {
-					interesting[sv] += 1;
-					nsp = seen[p] = sv;
-					if (sp) {
-						interesting[sp] -= 1;
-						if (interesting[sp] == 0)
-							ninteresting -= 1;
-					}
-				}
-			}
-			else if (dv == dp - 1) {
-				nsp = sp | sv;
-				if (nsp == sp)
-					continue;
-				seen[p] = nsp;
-				interesting[sp] -= 1;
-				if (interesting[sp] == 0 && interesting[nsp] > 0)
-					ninteresting -= 1;
-				interesting[nsp] += 1;
-			}
-		}
-		interesting[sv] -= 1;
-		if (interesting[sv] == 0)
-			ninteresting -= 1;
-	}
-
-	final = 0;
-	j = ninteresting;
-	for (i = 0; i < (int)(2 << revcount) && j > 0; i++) {
-		if (interesting[i] == 0)
-			continue;
-		final |= i;
-		j -= 1;
-	}
-	if (final == 0)
-		return PyList_New(0);
-
-	dict = PyDict_New();
-	if (dict == NULL)
-		goto bail;
-
-	for (i = 0; i < revcount; i++) {
-		PyObject *key;
-
-		if ((final & (1 << i)) == 0)
-			continue;
-
-		key = PyList_GET_ITEM(revs, i);
-		Py_INCREF(key);
-		Py_INCREF(Py_None);
-		if (PyDict_SetItem(dict, key, Py_None) == -1) {
-			Py_DECREF(key);
-			Py_DECREF(Py_None);
-			goto bail;
-		}
-	}
-
-	keys = PyDict_Keys(dict);
-
-	free(depth);
-	free(seen);
-	free(interesting);
-	Py_DECREF(dict);
-
-	return keys;
-bail:
-	free(depth);
-	free(seen);
-	free(interesting);
-	Py_XDECREF(dict);
-
-	return NULL;
-}
-
-/*
- * Given a (possibly overlapping) set of revs, return the greatest
- * common ancestors: those with the longest path to the root.
+ * Given a (possibly overlapping) set of revs, return all the greatest
+ * common ancestors
  */
 static PyObject *index_ancestors(indexObject *self, PyObject *args)
 {
@@ -1522,11 +1368,8 @@ 
 	if (gca == NULL)
 		goto bail;
 
-	if (PyList_GET_SIZE(gca) <= 1) {
-		ret = gca;
-		Py_INCREF(gca);
-	}
-	else ret = find_deepest(self, gca);
+	ret = gca;
+	Py_INCREF(gca);
 
 done:
 	free(revs);