Submitter | Angel Ezquerra |
---|---|
Date | March 20, 2013, 10:15 p.m. |
Message ID | <84f24bbe0c3e97e4e175.1363817741@Angel-PC.localdomain> |
Download | mbox | patch |
Permalink | /patch/1144/ |
State | Rejected, archived |
Headers | show |
Comments
On Wed, Mar 20, 2013 at 11:15 PM, Angel Ezquerra <angel.ezquerra@gmail.com> wrote: > # HG changeset patch > # User Angel Ezquerra <angel.ezquerra@gmail.com> > # Date 1363809088 -3600 > # Wed Mar 20 20:51:28 2013 +0100 > # Node ID 84f24bbe0c3e97e4e1753c894e3c963d1e6c6d63 > # Parent 136516cd3d6902aaa2edc9befc65763c56a6dbfc > hgweb: require files and directory links to begin with 'path:' > > If they don't the server will reply with a 403 HTTP forbidden error. This gets > rid of the need to explicitly check for the known pattern types. > > Note that the templater has access to a "path" variable which is a path to the > current file or directory relative to the root of the repository, and which > begins with a "/". However archival.archive() expects 'path:' to not begin with > a '/'. To cope with this webcommands.archive must remove the extra '/' which is > passed by the templater. Sorry, I forgot to mark this as RFC as I have not updated the tests. This patch is in response to the recent thread between me, Kevin and Pierre-Yves. The code is a little clunky because I had to manually remove the extra slash that the templater adds to the current path. I didn't find a way to remove the extra slash using the template engine, so I had to remove it on the webcommands.archive side. Comments, particularly on how to improve this last part would be nice. Angel
On Wed, Mar 20, 2013 at 11:15:41PM +0100, Angel Ezquerra wrote: > # HG changeset patch > # User Angel Ezquerra <angel.ezquerra@gmail.com> > # Date 1363809088 -3600 > # Wed Mar 20 20:51:28 2013 +0100 > # Node ID 84f24bbe0c3e97e4e1753c894e3c963d1e6c6d63 > # Parent 136516cd3d6902aaa2edc9befc65763c56a6dbfc > hgweb: require files and directory links to begin with 'path:' > > If they don't the server will reply with a 403 HTTP forbidden error. This gets > rid of the need to explicitly check for the known pattern types. I really fails to grasp why you complexify for whole argument process for the shake of being able to replay a 403 in some rare error case. In my opinion, the "file" argument should be documented as a plain path and processed as such. It make not sense to allows any pattern here. They should not be recognised. Returning a 404 in this case is fine by me for the shake of simplicity. > Note that the templater has access to a "path" variable which is a path to the > current file or directory relative to the root of the repository, and which > begins with a "/". However archival.archive() expects 'path:' to not begin with > a '/'. To cope with this webcommands.archive must remove the extra '/' which is > passed by the templater. > > diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py > --- a/mercurial/hgweb/webcommands.py > +++ b/mercurial/hgweb/webcommands.py > @@ -822,12 +822,11 @@ > file = req.form.get('file', None) > if file: > file = file[0] > - patandfile = file.split(':') > - if len(patandfile) > 1 and patandfile[0].lower() in ('glob', 'relglob', > - 'path', 'relpath', 're', 'relre', 'set'): > - msg = 'Archive pattern not allowed: %s' % file > + if not file.lower().startswith('path:/'): > + msg = "Archive path must begin with 'path:'" > raise ErrorResponse(HTTP_FORBIDDEN, msg) > - pats = ['path:' + file] > + # The file path has an extra "/" that must be removed > + pats = ['path:' + file[6:]] consider using lstrip('/')
On Thu, Mar 21, 2013 at 11:31 AM, Pierre-Yves David <pierre-yves.david@logilab.fr> wrote: > On Wed, Mar 20, 2013 at 11:15:41PM +0100, Angel Ezquerra wrote: >> # HG changeset patch >> # User Angel Ezquerra <angel.ezquerra@gmail.com> >> # Date 1363809088 -3600 >> # Wed Mar 20 20:51:28 2013 +0100 >> # Node ID 84f24bbe0c3e97e4e1753c894e3c963d1e6c6d63 >> # Parent 136516cd3d6902aaa2edc9befc65763c56a6dbfc >> hgweb: require files and directory links to begin with 'path:' >> >> If they don't the server will reply with a 403 HTTP forbidden error. This gets >> rid of the need to explicitly check for the known pattern types. > > I really fails to grasp why you complexify for whole argument process for the > shake of being able to replay a 403 in some rare error case. > > In my opinion, the "file" argument should be documented as a plain path and > processed as such. It make not sense to allows any pattern here. They should > not be recognised. Returning a 404 in this case is fine by me for the shake of > simplicity. I'm confused. I thought this patch _simplified_ the code. If we did not have to remove the extra starting slash the code would be as simple as doing a simple file.startswith('path:') check. Note that without this check archival will happily return an empty archive. The user cannot tell whether its input patter is wrong or the file simply doesn't exist. >> Note that the templater has access to a "path" variable which is a path to the >> current file or directory relative to the root of the repository, and which >> begins with a "/". However archival.archive() expects 'path:' to not begin with >> a '/'. To cope with this webcommands.archive must remove the extra '/' which is >> passed by the templater. >> >> diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py >> --- a/mercurial/hgweb/webcommands.py >> +++ b/mercurial/hgweb/webcommands.py >> @@ -822,12 +822,11 @@ >> file = req.form.get('file', None) >> if file: >> file = file[0] >> - patandfile = file.split(':') >> - if len(patandfile) > 1 and patandfile[0].lower() in ('glob', 'relglob', >> - 'path', 'relpath', 're', 'relre', 'set'): >> - msg = 'Archive pattern not allowed: %s' % file >> + if not file.lower().startswith('path:/'): >> + msg = "Archive path must begin with 'path:'" >> raise ErrorResponse(HTTP_FORBIDDEN, msg) >> - pats = ['path:' + file] >> + # The file path has an extra "/" that must be removed >> + pats = ['path:' + file[6:]] > > consider using lstrip('/') > Neat, I did not know about that. Should I resend the patch with this little change or are you agains the patch as a whole? Cheers, Angel
On Thu, Mar 21, 2013 at 3:51 PM, Kevin Bullock <kbullock+mercurial@ringworld.org> wrote: > On 20 Mar 2013, at 5:15 PM, Angel Ezquerra wrote: > >> # HG changeset patch >> # User Angel Ezquerra <angel.ezquerra@gmail.com> >> # Date 1363809088 -3600 >> # Wed Mar 20 20:51:28 2013 +0100 >> # Node ID 84f24bbe0c3e97e4e1753c894e3c963d1e6c6d63 >> # Parent 136516cd3d6902aaa2edc9befc65763c56a6dbfc >> hgweb: require files and directory links to begin with 'path:' > > I'm -1 on this. It simplifies the code slightly but leaks an implementation detail. In fact I'm becoming more convinced that we should just be returning a 404 here anyway, and document the new feature as only taking explicit path names. > > Either way, we still need to turn an empty archive into a 404. OK, it seems actually quite easy to through an error.Abort when archival.archive finds no files. I will prepare a patch and send it. Note that I think we should only throw the exception if we pass a pattern and it matches no files. If we pass no pattern and there are no files (e.g. the repo is empty) then there should be no exception thrown IMO. Cheers, Angel
On Thu, Mar 21, 2013 at 10:07 PM, Angel Ezquerra <angel.ezquerra@gmail.com> wrote: > On Thu, Mar 21, 2013 at 3:51 PM, Kevin Bullock > <kbullock+mercurial@ringworld.org> wrote: >> On 20 Mar 2013, at 5:15 PM, Angel Ezquerra wrote: >> >>> # HG changeset patch >>> # User Angel Ezquerra <angel.ezquerra@gmail.com> >>> # Date 1363809088 -3600 >>> # Wed Mar 20 20:51:28 2013 +0100 >>> # Node ID 84f24bbe0c3e97e4e1753c894e3c963d1e6c6d63 >>> # Parent 136516cd3d6902aaa2edc9befc65763c56a6dbfc >>> hgweb: require files and directory links to begin with 'path:' >> >> I'm -1 on this. It simplifies the code slightly but leaks an implementation detail. In fact I'm becoming more convinced that we should just be returning a 404 here anyway, and document the new feature as only taking explicit path names. >> >> Either way, we still need to turn an empty archive into a 404. > > OK, > > it seems actually quite easy to through an error.Abort when > archival.archive finds no files. I will prepare a patch and send it. > Note that I think we should only throw the exception if we pass a > pattern and it matches no files. If we pass no pattern and there are > no files (e.g. the repo is empty) then there should be no exception > thrown IMO. > > Cheers, > > Angel Umm, this is a little less straight forward that it seemd. It is easy to make archival.archive raise an exception when no files are match. The problem is that when webcommands.archive calls archival.archive, it has already sent an HTTP_OK response. Thus, the exception is risen, but the client never sees it. I think you must send HTTP_OK before a succesful archival.archive(). The only way I can see to generate the exception earlier is to look for files matching the pattern outside of archival.archive. That is something like: files = [f for f in ctx.manifest().keys() if matchfn(f)] But then we'd be doing this twice (once outside archival.archive and once within). Would that be acceptable? Cheers, Angel
Patch
diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py --- a/mercurial/hgweb/webcommands.py +++ b/mercurial/hgweb/webcommands.py @@ -822,12 +822,11 @@ file = req.form.get('file', None) if file: file = file[0] - patandfile = file.split(':') - if len(patandfile) > 1 and patandfile[0].lower() in ('glob', 'relglob', - 'path', 'relpath', 're', 'relre', 'set'): - msg = 'Archive pattern not allowed: %s' % file + if not file.lower().startswith('path:/'): + msg = "Archive path must begin with 'path:'" raise ErrorResponse(HTTP_FORBIDDEN, msg) - pats = ['path:' + file] + # The file path has an extra "/" that must be removed + pats = ['path:' + file[6:]] mimetype, artype, extension, encoding = web.archive_specs[type_] headers = [ diff --git a/mercurial/templates/coal/map b/mercurial/templates/coal/map --- a/mercurial/templates/coal/map +++ b/mercurial/templates/coal/map @@ -224,7 +224,7 @@ index = ../paper/index.tmpl archiveentry = ' <li> - <a href="{url|urlescape}archive/{node|short}{extension|urlescape}{ifeq(path,'/','',path|urlescape)}">{type|escape}</a> + <a href="{url|urlescape}archive/{node|short}{extension|urlescape}{ifeq(path,'/','','path:{path|urlescape}')}">{type|escape}</a> </li>' notfound = ../paper/notfound.tmpl error = ../paper/error.tmpl diff --git a/mercurial/templates/gitweb/map b/mercurial/templates/gitweb/map --- a/mercurial/templates/gitweb/map +++ b/mercurial/templates/gitweb/map @@ -289,7 +289,7 @@ <td class="link"> <a href="{url|urlescape}file/{node|short}/{file|urlescape}{sessionvars%urlparameter}">file</a> | <a href="{url|urlescape}diff/{node|short}/{file|urlescape}{sessionvars%urlparameter}">diff</a> | <a href="{url|urlescape}annotate/{node|short}/{file|urlescape}{sessionvars%urlparameter}">annotate</a> {rename%filelogrename}</td> </tr>' -archiveentry = ' | <a href="{url|urlescape}archive/{node|short}{extension}{ifeq(path,'/','',path|urlescape)}">{type|escape}</a> ' +archiveentry = ' | <a href="{url|urlescape}archive/{node|short}{extension}{ifeq(path,'/','','/path:{path|urlescape}')}">{type|escape}</a> ' indexentry = ' <tr class="parity{parity}"> <td> diff --git a/mercurial/templates/monoblue/map b/mercurial/templates/monoblue/map --- a/mercurial/templates/monoblue/map +++ b/mercurial/templates/monoblue/map @@ -245,7 +245,7 @@ {rename%filelogrename} </td> </tr>' -archiveentry = '<li><a href="{url|urlescape}archive/{node|short}{extension}{ifeq(path,'/','',path|urlescape)}">{type|escape}</a></li>' +archiveentry = '<li><a href="{url|urlescape}archive/{node|short}{extension}{ifeq(path,'/','','/path:{path|urlescape}')}">{type|escape}</a></li>' indexentry = ' <tr class="parity{parity}"> <td><a href="{url|urlescape}{sessionvars%urlparameter}">{name|escape}</a></td> diff --git a/mercurial/templates/paper/map b/mercurial/templates/paper/map --- a/mercurial/templates/paper/map +++ b/mercurial/templates/paper/map @@ -232,7 +232,7 @@ index = index.tmpl archiveentry = ' <li> - <a href="{url|urlescape}archive/{node|short}{extension|urlescape}{ifeq(path,'/','',path|urlescape)}">{type|escape}</a> + <a href="{url|urlescape}archive/{node|short}{extension|urlescape}{ifeq(path,'/','','/path:{path|urlescape}')}">{type|escape}</a> </li>' notfound = notfound.tmpl error = error.tmpl