Patchwork [1,of,4,V2] serve: add support for Mercurial subrepositories

login
register
mail settings
Submitter Yuya Nishihara
Date Feb. 26, 2017, 1:28 p.m.
Message ID <20170226222801.8e30297cb51cf9521e713933@tcha.org>
Download mbox | patch
Permalink /patch/18798/
State Not Applicable
Headers show

Comments

Yuya Nishihara - Feb. 26, 2017, 1:28 p.m.
On Sat, 25 Feb 2017 15:25:30 -0500, Matt Harbison wrote:
> On Tue, 21 Feb 2017 09:33:58 -0500, Yuya Nishihara <yuya@tcha.org> wrote:
> >> - At http://localhost:8000, the links on the left side were corrupt  
> >> (graph
> >> is simply 'graph/tip', tags is 'http://tags', etc.)  The pattern seemed  
> >> to
> >> be that things ending in 'tip' are relative URLs, and the rest get an
> >> http: prefix.  It looks from the sources like it is just a matter of '/'
> >> not translating for the {repo} template.
> >
> > [snip]
> >
> > I saw some of these problems in V1. However, the standalone hgweb can  
> > host
> > a repository at '/', so I don't think it's quite difficult to host the  
> > subrepo
> > parent at '/'.
> 
> That's a good point, so now I'm really puzzled over the subtle  
> differences.  It looks like when it finds a hosted repo, it just farms out  
> to hgweb, like the single serve would do.

Yes. hgwebdir is just a dispatcher except for generating an index page.

This is a dirty hack to make hgwebdir serve a repo at '/'.


> >> To me, the benefits of this series over the original is:
> >>
> >> 1) The repos aren't moved, and therefore don't incur these subtle bugs
> >> 2) These changes are totally isolated to subrepo serving, so there  
> >> should
> >> be no risk to mainstream hgweb usage
> >> 3) This is completely compatible with existing serves with a simple
> >> '/=$projects_dir/**' config file, because the real repos are not  
> >> anchored
> >> to '/'.
> >
> > I think (2) and (3) are up to how we build a repository map.
> 
> What I mean for (2) is that in V2, none of the code changes are hit  
> without -S.  That seems like a good thing, given the subtle breakage in V1.
> 
> What I mean by (3) is that the transition for current clones is to simply  
> drop --web-conf=.. and start using -S on the server side.  Clients won't  
> see a difference.

No code change is nice, but I don't buy it much since the redirection thing
seemed unnecessarily complicated.

> I assume that the hardcoded web.staticurl in V1 also means that it won't  
> be able to serve a subrepo named 'static'?  (That's admittedly an edge  
> case.)

Yes. Also, you can't see /rev/ page if there's a subrepo named 'rev'.

> I only see the two ways to build the repo map, so I'm not sure what you  
> are saying here.

I meant we could delete the root repo from self.repos map on plain hgwebdir
server if that's considered a behavior change. (otherwise the root index
page would be replaced by the hgweb for the root repo.)
Matt Harbison - Feb. 27, 2017, 4:09 a.m.
On Sun, 26 Feb 2017 08:28:01 -0500, Yuya Nishihara <yuya@tcha.org> wrote:

> On Sat, 25 Feb 2017 15:25:30 -0500, Matt Harbison wrote:
>> On Tue, 21 Feb 2017 09:33:58 -0500, Yuya Nishihara <yuya@tcha.org>  
>> wrote:
>> >> - At http://localhost:8000, the links on the left side were corrupt
>> >> (graph
>> >> is simply 'graph/tip', tags is 'http://tags', etc.)  The pattern  
>> seemed
>> >> to
>> >> be that things ending in 'tip' are relative URLs, and the rest get an
>> >> http: prefix.  It looks from the sources like it is just a matter of  
>> '/'
>> >> not translating for the {repo} template.
>> >
>> > [snip]
>> >
>> > I saw some of these problems in V1. However, the standalone hgweb can
>> > host
>> > a repository at '/', so I don't think it's quite difficult to host the
>> > subrepo
>> > parent at '/'.
>>
>> That's a good point, so now I'm really puzzled over the subtle
>> differences.  It looks like when it finds a hosted repo, it just farms  
>> out
>> to hgweb, like the single serve would do.
>
> Yes. hgwebdir is just a dispatcher except for generating an index page.
>
> This is a dirty hack to make hgwebdir serve a repo at '/'.
>
> diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
> --- a/mercurial/hgweb/hgweb_mod.py
> +++ b/mercurial/hgweb/hgweb_mod.py
> @@ -335,7 +335,7 @@ class hgweb(object):
>          req.url = req.env['SCRIPT_NAME']
>          if not req.url.endswith('/'):
>              req.url += '/'
> -        if 'REPO_NAME' in req.env:
> +        if req.env.get('REPO_NAME'):
>              req.url += req.env['REPO_NAME'] + '/'
>         if 'PATH_INFO' in req.env:
> diff --git a/mercurial/hgweb/hgwebdir_mod.py  
> b/mercurial/hgweb/hgwebdir_mod.py
> --- a/mercurial/hgweb/hgwebdir_mod.py
> +++ b/mercurial/hgweb/hgwebdir_mod.py
> @@ -254,18 +254,19 @@ class hgwebdir(object):
>                  return []
>             # top-level index
> -            elif not virtual:
> +            elif False:
>                  req.respond(HTTP_OK, ctype)
>                  return self.makeindex(req, tmpl)
>             # nested indexes and hgwebs
>             repos = dict(self.repos)
> -            virtualrepo = virtual
> -            while virtualrepo:
> -                real = repos.get(virtualrepo)
> +            # XXX dirty hack to handle ''. IIRC, we have a helper  
> function
> +            virtualrepo = '/' + virtual
> +            while True:
> +                real = repos.get(virtualrepo[1:])
>                  if real:
> -                    req.env['REPO_NAME'] = virtualrepo
> +                    req.env['REPO_NAME'] = virtualrepo[1:]
>                      try:
>                          # ensure caller gets private copy of ui
>                          repo = hg.repository(self.ui.copy(), real)

Oh, nice! Thanks for this.

Can you clarify what you mean by the hack? It seems like adding '/' to  
paths isn't all that uncommon (especially in subrepo code), and the  
closest I could find to a helper for removing '/' is  
hgwebdir_mod.cleannames().  But that takes a list and also does a  
util.pconvert().

Just a couple lines above this, 'virtual' is cleaned up with  
string.strip('/').  (That's interesting, because then I can't see how the  
"virtual.startswith('static/')" tests can ever be true.  It looks like  
this was papered over in 74f65f44a9aa?)

>> >> To me, the benefits of this series over the original is:
>> >>
>> >> 1) The repos aren't moved, and therefore don't incur these subtle  
>> bugs
>> >> 2) These changes are totally isolated to subrepo serving, so there
>> >> should
>> >> be no risk to mainstream hgweb usage
>> >> 3) This is completely compatible with existing serves with a simple
>> >> '/=$projects_dir/**' config file, because the real repos are not
>> >> anchored
>> >> to '/'.
>> >
>> > I think (2) and (3) are up to how we build a repository map.
>>
>> What I mean for (2) is that in V2, none of the code changes are hit
>> without -S.  That seems like a good thing, given the subtle breakage in  
>> V1.
>>
>> What I mean by (3) is that the transition for current clones is to  
>> simply
>> drop --web-conf=.. and start using -S on the server side.  Clients won't
>> see a difference.
>
> No code change is nice, but I don't buy it much since the redirection  
> thing
> seemed unnecessarily complicated.
>
>> I assume that the hardcoded web.staticurl in V1 also means that it won't
>> be able to serve a subrepo named 'static'?  (That's admittedly an edge
>> case.)
>
> Yes. Also, you can't see /rev/ page if there's a subrepo named 'rev'.
>
>> I only see the two ways to build the repo map, so I'm not sure what you
>> are saying here.
>
> I meant we could delete the root repo from self.repos map on plain  
> hgwebdir
> server if that's considered a behavior change. (otherwise the root index
> page would be replaced by the hgweb for the root repo.)
Yuya Nishihara - Feb. 27, 2017, 12:57 p.m.
On Sun, 26 Feb 2017 23:09:59 -0500, Matt Harbison wrote:
> On Sun, 26 Feb 2017 08:28:01 -0500, Yuya Nishihara <yuya@tcha.org> wrote:
> > This is a dirty hack to make hgwebdir serve a repo at '/'.
> >
> > diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
> > --- a/mercurial/hgweb/hgweb_mod.py
> > +++ b/mercurial/hgweb/hgweb_mod.py
> > @@ -335,7 +335,7 @@ class hgweb(object):
> >          req.url = req.env['SCRIPT_NAME']
> >          if not req.url.endswith('/'):
> >              req.url += '/'
> > -        if 'REPO_NAME' in req.env:
> > +        if req.env.get('REPO_NAME'):
> >              req.url += req.env['REPO_NAME'] + '/'
> >         if 'PATH_INFO' in req.env:
> > diff --git a/mercurial/hgweb/hgwebdir_mod.py  
> > b/mercurial/hgweb/hgwebdir_mod.py
> > --- a/mercurial/hgweb/hgwebdir_mod.py
> > +++ b/mercurial/hgweb/hgwebdir_mod.py
> > @@ -254,18 +254,19 @@ class hgwebdir(object):
> >                  return []
> >             # top-level index
> > -            elif not virtual:
> > +            elif False:
> >                  req.respond(HTTP_OK, ctype)
> >                  return self.makeindex(req, tmpl)
> >             # nested indexes and hgwebs
> >             repos = dict(self.repos)
> > -            virtualrepo = virtual
> > -            while virtualrepo:
> > -                real = repos.get(virtualrepo)
> > +            # XXX dirty hack to handle ''. IIRC, we have a helper  
> > function
> > +            virtualrepo = '/' + virtual
> > +            while True:
> > +                real = repos.get(virtualrepo[1:])
> >                  if real:
> > -                    req.env['REPO_NAME'] = virtualrepo
> > +                    req.env['REPO_NAME'] = virtualrepo[1:]
> >                      try:
> >                          # ensure caller gets private copy of ui
> >                          repo = hg.repository(self.ui.copy(), real)
> 
> Oh, nice! Thanks for this.
> 
> Can you clarify what you mean by the hack?

I wrote that code without thinking an edge case nor what would be the best way.
That's all.

> It seems like adding '/' to  
> paths isn't all that uncommon (especially in subrepo code), and the  
> closest I could find to a helper for removing '/' is  
> hgwebdir_mod.cleannames().  But that takes a list and also does a  
> util.pconvert().

Ok, found it. What I had in mind was util.finddirs(), which requires the first
'/' though.

> Just a couple lines above this, 'virtual' is cleaned up with  
> string.strip('/').  (That's interesting, because then I can't see how the  
> "virtual.startswith('static/')" tests can ever be true.  It looks like  
> this was papered over in 74f65f44a9aa?)

'virtual/foo', maybe.

Patch

diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py
--- a/mercurial/hgweb/hgweb_mod.py
+++ b/mercurial/hgweb/hgweb_mod.py
@@ -335,7 +335,7 @@  class hgweb(object):
         req.url = req.env['SCRIPT_NAME']
         if not req.url.endswith('/'):
             req.url += '/'
-        if 'REPO_NAME' in req.env:
+        if req.env.get('REPO_NAME'):
             req.url += req.env['REPO_NAME'] + '/'
 
         if 'PATH_INFO' in req.env:
diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py
--- a/mercurial/hgweb/hgwebdir_mod.py
+++ b/mercurial/hgweb/hgwebdir_mod.py
@@ -254,18 +254,19 @@  class hgwebdir(object):
                 return []
 
             # top-level index
-            elif not virtual:
+            elif False:
                 req.respond(HTTP_OK, ctype)
                 return self.makeindex(req, tmpl)
 
             # nested indexes and hgwebs
 
             repos = dict(self.repos)
-            virtualrepo = virtual
-            while virtualrepo:
-                real = repos.get(virtualrepo)
+            # XXX dirty hack to handle ''. IIRC, we have a helper function
+            virtualrepo = '/' + virtual
+            while True:
+                real = repos.get(virtualrepo[1:])
                 if real:
-                    req.env['REPO_NAME'] = virtualrepo
+                    req.env['REPO_NAME'] = virtualrepo[1:]
                     try:
                         # ensure caller gets private copy of ui
                         repo = hg.repository(self.ui.copy(), real)