Patchwork parsers: ensure revlog index node tree is initialized before insertion

login
register
mail settings
Submitter adgar@google.com
Date Dec. 4, 2014, 6:53 p.m.
Message ID <4c8cd4678e1cf6992b7d.1417719221@adgar.nyc.corp.google.com>
Download mbox | patch
Permalink /patch/7008/
State Accepted
Headers show

Comments

adgar@google.com - Dec. 4, 2014, 6:53 p.m.
# HG changeset patch
# User Mike Edgar <adgar@google.com>
# Date 1417712522 18000
#      Thu Dec 04 12:02:02 2014 -0500
# Node ID 4c8cd4678e1cf6992b7d0a07b87b60775f0dc2c0
# Parent  c237499a7fba65c88a2da721a22b66df4f39cf4e
parsers: ensure revlog index node tree is initialized before insertion

Currently, the revlog index C implementation assumes its node tree will be
initialized before a new element is inserted by revnum. For example, revlog.py
executes 'self.index.insert(-1, e)' in _addrevision(). This is only safe
because the node tree has been initialized by a "node in self.nodemap"
check made in addrevision().

(For context, this was discovered while developing an experimental revlog
mixin which stores "elided nodes" via a separate code path from
_addrevision(); that new code path segfaults without this patch.)
Matt Mackall - Dec. 4, 2014, 11:43 p.m.
On Thu, 2014-12-04 at 13:53 -0500, Mike Edgar wrote:
> # HG changeset patch
> # User Mike Edgar <adgar@google.com>
> # Date 1417712522 18000
> #      Thu Dec 04 12:02:02 2014 -0500
> # Node ID 4c8cd4678e1cf6992b7d0a07b87b60775f0dc2c0
> # Parent  c237499a7fba65c88a2da721a22b66df4f39cf4e
> parsers: ensure revlog index node tree is initialized before insertion
> 
> Currently, the revlog index C implementation assumes its node tree will be
> initialized before a new element is inserted by revnum. For example, revlog.py
> executes 'self.index.insert(-1, e)' in _addrevision(). This is only safe
> because the node tree has been initialized by a "node in self.nodemap"
> check made in addrevision().
> 
> (For context, this was discovered while developing an experimental revlog
> mixin which stores "elided nodes" via a separate code path from
> _addrevision(); that new code path segfaults without this patch.)

I take this to mean that's there's no way to hit this in normal
Mercurial, right? Queued for default, thanks.
adgar@google.com - Dec. 5, 2014, 3:25 p.m.
On Thu, Dec 4, 2014 at 6:43 PM, Matt Mackall <mpm@selenic.com> wrote:

> On Thu, 2014-12-04 at 13:53 -0500, Mike Edgar wrote:
> > # HG changeset patch
> > # User Mike Edgar <adgar@google.com>
> > # Date 1417712522 18000
> > #      Thu Dec 04 12:02:02 2014 -0500
> > # Node ID 4c8cd4678e1cf6992b7d0a07b87b60775f0dc2c0
> > # Parent  c237499a7fba65c88a2da721a22b66df4f39cf4e
> > parsers: ensure revlog index node tree is initialized before insertion
> >
> > Currently, the revlog index C implementation assumes its node tree will
> be
> > initialized before a new element is inserted by revnum. For example,
> revlog.py
> > executes 'self.index.insert(-1, e)' in _addrevision(). This is only safe
> > because the node tree has been initialized by a "node in self.nodemap"
> > check made in addrevision().
> >
> > (For context, this was discovered while developing an experimental revlog
> > mixin which stores "elided nodes" via a separate code path from
> > _addrevision(); that new code path segfaults without this patch.)
>
> I take this to mean that's there's no way to hit this in normal
> Mercurial, right? Queued for default, thanks.


Correct - existing Mercurial code appears to be unaffected. This patch is
purely precautionary.


> --
> Mathematics is the supreme nostalgia of our time.
>
>
>
Augie Fackler - Dec. 5, 2014, 6:05 p.m.
On Thu, Dec 04, 2014 at 05:43:32PM -0600, Matt Mackall wrote:
> On Thu, 2014-12-04 at 13:53 -0500, Mike Edgar wrote:
> > # HG changeset patch
> > # User Mike Edgar <adgar@google.com>
> > # Date 1417712522 18000
> > #      Thu Dec 04 12:02:02 2014 -0500
> > # Node ID 4c8cd4678e1cf6992b7d0a07b87b60775f0dc2c0
> > # Parent  c237499a7fba65c88a2da721a22b66df4f39cf4e
> > parsers: ensure revlog index node tree is initialized before insertion
> >
> > Currently, the revlog index C implementation assumes its node tree will be
> > initialized before a new element is inserted by revnum. For example, revlog.py
> > executes 'self.index.insert(-1, e)' in _addrevision(). This is only safe
> > because the node tree has been initialized by a "node in self.nodemap"
> > check made in addrevision().
> >
> > (For context, this was discovered while developing an experimental revlog
> > mixin which stores "elided nodes" via a separate code path from
> > _addrevision(); that new code path segfaults without this patch.)
>
> I take this to mean that's there's no way to hit this in normal
> Mercurial, right? Queued for default, thanks.

I don't believe it's possible yet, no.

>
> --
> Mathematics is the supreme nostalgia of our time.
>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff -r c237499a7fba -r 4c8cd4678e1c mercurial/parsers.c
--- a/mercurial/parsers.c	Wed Dec 03 22:56:42 2014 +0900
+++ b/mercurial/parsers.c	Thu Dec 04 12:02:02 2014 -0500
@@ -1978,6 +1978,9 @@ 
 			PyErr_SetString(PyExc_ValueError, "rev out of range");
 		return -1;
 	}
+
+	if (nt_init(self) == -1)
+		return -1;
 	return nt_insert(self, node, (int)rev);
 }