Patchwork [2,of,2] subrepo: check that subrepo paths are valid (issue4363)

login
register
mail settings
Submitter Jordi Gutiérrez Hermoso
Date Sept. 9, 2014, 5:56 p.m.
Message ID <3db2b7dd861d71344126.1410285402@Iris>
Download mbox | patch
Permalink /patch/5746/
State Changes Requested
Headers show

Comments

Jordi Gutiérrez Hermoso - Sept. 9, 2014, 5:56 p.m.
# HG changeset patch
# User Jordi Gutiérrez Hermoso <jordigh@octave.org>
# Date 1410279641 14400
#      Tue Sep 09 12:20:41 2014 -0400
# Node ID 3db2b7dd861d7134412678f96df7b7bb1cfe9c82
# Parent  7611a7833657301d9a881095828d5e75558d82f9
subrepo: check that subrepo paths are valid (issue4363)

This checks that proposed subrepo paths are valid using
pathutil.pathauditor. This check is already done when creating a
subrepo object in the subrepo.subrepo function, but I think this is
too late and doesn't catch the case where a modified .hgsub has had
new paths added but hasn't been committed yet.

Moving this check earlier and catching the exception to provide a
better error message both seem like improvements.
Mads Kiilerich - Sept. 9, 2014, 11:31 p.m.
On 09/09/2014 07:56 PM, Jordi Gutiérrez Hermoso wrote:
> # HG changeset patch
> # User Jordi Gutiérrez Hermoso <jordigh@octave.org>
> # Date 1410279641 14400
> #      Tue Sep 09 12:20:41 2014 -0400
> # Node ID 3db2b7dd861d7134412678f96df7b7bb1cfe9c82
> # Parent  7611a7833657301d9a881095828d5e75558d82f9
> subrepo: check that subrepo paths are valid (issue4363)
>
> This checks that proposed subrepo paths are valid using
> pathutil.pathauditor. This check is already done when creating a
> subrepo object in the subrepo.subrepo function, but I think this is
> too late and doesn't catch the case where a modified .hgsub has had
> new paths added but hasn't been committed yet.

Why do you think that is too late? You don't have a test case to show it 
so you really need a good explanation. But there should be a test case 
anyway.

> Moving this check earlier and catching the exception to provide a
> better error message both seem like improvements.

"and" = two patches.

> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> --- a/mercurial/subrepo.py
> +++ b/mercurial/subrepo.py
> @@ -130,6 +130,12 @@ def state(ctx, ui):
>           if os.path.isfile(path):
>               raise util.Abort(_('subrepository path is an existing file: %s')
>                                % path)
> +        try:
> +            pathutil.pathauditor(ctx._repo.root)(path)
> +        except util.Abort, e:
> +            raise util.Abort(_('bad subrepository path; %s') % e.message,
> +                             hint=e.hint)
> +
>           kind = 'hg'
>           if src.startswith('['):
>               if ']' not in src:
> @@ -344,7 +350,6 @@ def subrepo(ctx, path):
>       import hg as h
>       hg = h
>   
> -    pathutil.pathauditor(ctx._repo.root)(path)

Are you 200% confident that removing this pathauditor in another method 
doesn't open up for any un-audited code paths?

/Mads
Jordi Gutiérrez Hermoso - Sept. 10, 2014, 2:57 p.m.
On Wed, 2014-09-10 at 01:31 +0200, Mads Kiilerich wrote:
> On 09/09/2014 07:56 PM, Jordi Gutiérrez Hermoso wrote:
> > # HG changeset patch
> > # User Jordi Gutiérrez Hermoso <jordigh@octave.org>
> > # Date 1410279641 14400
> > #      Tue Sep 09 12:20:41 2014 -0400
> > # Node ID 3db2b7dd861d7134412678f96df7b7bb1cfe9c82
> > # Parent  7611a7833657301d9a881095828d5e75558d82f9
> > subrepo: check that subrepo paths are valid (issue4363)
> >
> > This checks that proposed subrepo paths are valid using
> > pathutil.pathauditor. This check is already done when creating a
> > subrepo object in the subrepo.subrepo function, but I think this is
> > too late and doesn't catch the case where a modified .hgsub has had
> > new paths added but hasn't been committed yet.
> 
> Why do you think that is too late?

I say right here, "doesn't catch the case where a modified..." Testing
for bad paths later doesn't catch this case.

> > Moving this check earlier and catching the exception to provide a
> > better error message both seem like improvements.
> 
> "and" = two patches.

Okay.

> > diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> > --- a/mercurial/subrepo.py
> > +++ b/mercurial/subrepo.py
> > @@ -130,6 +130,12 @@ def state(ctx, ui):
> >           if os.path.isfile(path):
> >               raise util.Abort(_('subrepository path is an existing file: %s')
> >                                % path)
> > +        try:
> > +            pathutil.pathauditor(ctx._repo.root)(path)
> > +        except util.Abort, e:
> > +            raise util.Abort(_('bad subrepository path; %s') % e.message,
> > +                             hint=e.hint)
> > +
> >           kind = 'hg'
> >           if src.startswith('['):
> >               if ']' not in src:
> > @@ -344,7 +350,6 @@ def subrepo(ctx, path):
> >       import hg as h
> >       hg = h
> >   
> > -    pathutil.pathauditor(ctx._repo.root)(path)
> 
> Are you 200% confident that removing this pathauditor in another method 
> doesn't open up for any un-audited code paths?

It passes the test suite. And I know that before creating a subrepo
object, .hgsub must be parsed, so erroring out when parsing .hgsub
seems to be the best thing to do.

There's a bigger problem here, really, that I changed the error
message in order to actually make it clear what the bad path was
about. The error message before was really cryptic and made it
impossible to know that it was about subrepos. So, that really is a
BC, which I guess passed the test suite because we don't have a test
for it.

Assuming this BC can happen (and my guess is it will be shot down),
Augie raised the point that it means I should change how
pathutil.pathauditor works, since apparently catching util.Abort is
undesirable, although I'm not 100% clear on why. I suppose it's too
generic of a catch. If that is the problem, then I would prefer to
modify pathauditor to raise a subclass of util.Abort, but Augie
indicated this was bad technical debt too. This explanation is still
obscure to me.

So what's the problem with catching util.Abort? I'd rather not make
the more annoying change of modifying pathauditor and all of its
callers to return status and error messages instead of raising
exceptions.

- Jordi G. H.
Mads Kiilerich - Sept. 10, 2014, 5:31 p.m.
On 09/10/2014 04:57 PM, Jordi Gutiérrez Hermoso wrote:
> On Wed, 2014-09-10 at 01:31 +0200, Mads Kiilerich wrote:
>> On 09/09/2014 07:56 PM, Jordi Gutiérrez Hermoso wrote:
>>> # HG changeset patch
>>> # User Jordi Gutiérrez Hermoso <jordigh@octave.org>
>>> # Date 1410279641 14400
>>> #      Tue Sep 09 12:20:41 2014 -0400
>>> # Node ID 3db2b7dd861d7134412678f96df7b7bb1cfe9c82
>>> # Parent  7611a7833657301d9a881095828d5e75558d82f9
>>> subrepo: check that subrepo paths are valid (issue4363)
>>>
>>> This checks that proposed subrepo paths are valid using
>>> pathutil.pathauditor. This check is already done when creating a
>>> subrepo object in the subrepo.subrepo function, but I think this is
>>> too late and doesn't catch the case where a modified .hgsub has had
>>> new paths added but hasn't been committed yet.
>> Why do you think that is too late?
> I say right here, "doesn't catch the case where a modified..." Testing
> for bad paths later doesn't catch this case.

Path auditor is primarily to protect the user from bad commits from 
other uses, secondarily to protect users from making such commits. It is 
not possible to protect the user from himself and we shouldn't try too 
hard to do it.

Mentioning what would happen and having a test case for it would help.

>>> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
>>> --- a/mercurial/subrepo.py
>>> +++ b/mercurial/subrepo.py
>>> @@ -130,6 +130,12 @@ def state(ctx, ui):
>>>            if os.path.isfile(path):
>>>                raise util.Abort(_('subrepository path is an existing file: %s')
>>>                                 % path)
>>> +        try:
>>> +            pathutil.pathauditor(ctx._repo.root)(path)
>>> +        except util.Abort, e:
>>> +            raise util.Abort(_('bad subrepository path; %s') % e.message,
>>> +                             hint=e.hint)
>>> +
>>>            kind = 'hg'
>>>            if src.startswith('['):
>>>                if ']' not in src:
>>> @@ -344,7 +350,6 @@ def subrepo(ctx, path):
>>>        import hg as h
>>>        hg = h
>>>    
>>> -    pathutil.pathauditor(ctx._repo.root)(path)
>> Are you 200% confident that removing this pathauditor in another method
>> doesn't open up for any un-audited code paths?
> It passes the test suite.

Path auditor is a security mechanism. Tests can prove expected behaviour 
in some defined cases, but they cannot prove that there in no cases will 
be "wrong" behaviour - and that is essentially what security is.

> And I know that before creating a subrepo
> object, .hgsub must be parsed, so erroring out when parsing .hgsub
> seems to be the best thing to do.

If you have reviewed that all execution paths that would have hit the 
old pathauditor now will hit the new one instead, then say so.

> There's a bigger problem here, really, that I changed the error
> message in order to actually make it clear what the bad path was
> about. The error message before was really cryptic and made it
> impossible to know that it was about subrepos. So, that really is a
> BC, which I guess passed the test suite because we don't have a test
> for it.
>
> Assuming this BC can happen (and my guess is it will be shot down),
> Augie raised the point that it means I should change how
> pathutil.pathauditor works, since apparently catching util.Abort is
> undesirable, although I'm not 100% clear on why. I suppose it's too
> generic of a catch. If that is the problem, then I would prefer to
> modify pathauditor to raise a subclass of util.Abort, but Augie
> indicated this was bad technical debt too. This explanation is still
> obscure to me.
>
> So what's the problem with catching util.Abort? I'd rather not make
> the more annoying change of modifying pathauditor and all of its
> callers to return status and error messages instead of raising
> exceptions.

That is discussion of the third and not so crucial patch - don't let 
that block the first two patches.

If it really is relevant to provide context for such exceptions, I would 
suggest catching the abort (or pretty much all exceptions), show a 
ui.error "error happened processing X:", and re-raise.

/Mads

Patch

diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -130,6 +130,12 @@  def state(ctx, ui):
         if os.path.isfile(path):
             raise util.Abort(_('subrepository path is an existing file: %s')
                              % path)
+        try:
+            pathutil.pathauditor(ctx._repo.root)(path)
+        except util.Abort, e:
+            raise util.Abort(_('bad subrepository path; %s') % e.message,
+                             hint=e.hint)
+
         kind = 'hg'
         if src.startswith('['):
             if ']' not in src:
@@ -344,7 +350,6 @@  def subrepo(ctx, path):
     import hg as h
     hg = h
 
-    pathutil.pathauditor(ctx._repo.root)(path)
     state = ctx.substate[path]
     if state[2] not in types:
         raise util.Abort(_('unknown subrepo type %s') % state[2])