Patchwork [1,of,6,path,intent] ui: add an intent to expandpath

login
register
mail settings
Submitter Gregory Szorc
Date Sept. 29, 2014, 5:30 a.m.
Message ID <2f06625fe4d502f43294.1411968621@3.1.168.192.in-addr.arpa>
Download mbox | patch
Permalink /patch/6020/
State Changes Requested
Headers show

Comments

Gregory Szorc - Sept. 29, 2014, 5:30 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1411967336 25200
#      Sun Sep 28 22:08:56 2014 -0700
# Node ID 2f06625fe4d502f43294e9087d9a83398a36fe9f
# Parent  4109cc16279ef0e04dc70e7f4c9ab7415e6a22d3
ui: add an intent to expandpath

This patch adds an optional named argument to ui.expandpath() that
defines the intended use of a repository.

The impetus behind this patch is to allow extensions to provide
more advanced and dynamic repository lookup functionality. For
example, there exists an extension at Mozilla which automagically
defines common repository URLs for the many Firefox repositories.
We want read operations hitting read-only slaves over HTTP(S) and
write operations hitting the master over SSH. Today, we have to
use separate names for separate URLs, which increases cognitive
burden and sometimes results in accidental use of the wrong repo.
We desire built-in hg operations to automagically use the URL for
the task at hand. This change and subsequent additions to define
the intent parameter will enable this through monkeypatching
ui.expandpath().
Mads Kiilerich - Sept. 29, 2014, 9:43 p.m.
On 09/29/2014 07:30 AM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1411967336 25200
> #      Sun Sep 28 22:08:56 2014 -0700
> # Node ID 2f06625fe4d502f43294e9087d9a83398a36fe9f
> # Parent  4109cc16279ef0e04dc70e7f4c9ab7415e6a22d3
> ui: add an intent to expandpath
>
> This patch adds an optional named argument to ui.expandpath() that
> defines the intended use of a repository.
>
> The impetus behind this patch is to allow extensions to provide
> more advanced and dynamic repository lookup functionality. For
> example, there exists an extension at Mozilla which automagically
> defines common repository URLs for the many Firefox repositories.
> We want read operations hitting read-only slaves over HTTP(S) and
> write operations hitting the master over SSH. Today, we have to
> use separate names for separate URLs, which increases cognitive
> burden and sometimes results in accidental use of the wrong repo.
> We desire built-in hg operations to automagically use the URL for
> the task at hand. This change and subsequent additions to define
> the intent parameter will enable this through monkeypatching
> ui.expandpath().

Meh...

First: It doesn't work to add "features" to Mercurial without any usage 
or test in Mercurial - especially not without any reference to exactly 
what it is that need this feature. That would be unmaintainable.

I assume the intent parameter should be considered some enumeration of 
some well-known path types - that do not seem like a nice API, but we 
would at least need a definition of which values are valid and what they 
mean.

Also, Incoming should be like a preview of pull and outgoing like a 
preview of push. They should thus be treated exactly the same and it 
seems wrong that they have different "intents".

The existing handling of push/pull paths is also a bit not-so-elegant. 
The default and -push handling is duplicated in all the places it is used.

Instead, I suggest giving the ui class a couple of extra methods:

     def pushpath(self, loc):
         """Return path (which might be a URL) for pushing to loc.
         loc will be 'default' if not specified.
         If paths.loc is configured, use that instead of loc.
         However, if paths.loc-push is configured, prefer that.
         """
     def pullpath(loc):
         """Return path (which might be a URL) for pulling from loc.
         loc will be 'default' if not specified.
         If paths.loc is configured, use that instead of loc.
         """

(This docstring also specify that the -push suffix should apply to other 
paths than default. I find that an obvious extension that it would be 
hard not to implement accidentally.)

(I have always found the special -push suffix a bit odd. Having a 
special -pull suffix seemed more obvious and safe to me. For symmetry 
reasons I would like to have both so I could specify default-push and 
default-pull and know that they "always" only would be used for their 
intended purpose.)

I assume your extensions could do their job by monkeypatching these 
methods? Or why not? What do these extensions do?

/Mads
Gregory Szorc - Sept. 29, 2014, 10:53 p.m.
On 9/29/14 2:43 PM, Mads Kiilerich wrote:
> On 09/29/2014 07:30 AM, Gregory Szorc wrote:
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1411967336 25200
>> #      Sun Sep 28 22:08:56 2014 -0700
>> # Node ID 2f06625fe4d502f43294e9087d9a83398a36fe9f
>> # Parent  4109cc16279ef0e04dc70e7f4c9ab7415e6a22d3
>> ui: add an intent to expandpath
>>
>> This patch adds an optional named argument to ui.expandpath() that
>> defines the intended use of a repository.
>>
>> The impetus behind this patch is to allow extensions to provide
>> more advanced and dynamic repository lookup functionality. For
>> example, there exists an extension at Mozilla which automagically
>> defines common repository URLs for the many Firefox repositories.
>> We want read operations hitting read-only slaves over HTTP(S) and
>> write operations hitting the master over SSH. Today, we have to
>> use separate names for separate URLs, which increases cognitive
>> burden and sometimes results in accidental use of the wrong repo.
>> We desire built-in hg operations to automagically use the URL for
>> the task at hand. This change and subsequent additions to define
>> the intent parameter will enable this through monkeypatching
>> ui.expandpath().
>
> Meh...
>
> First: It doesn't work to add "features" to Mercurial without any usage
> or test in Mercurial - especially not without any reference to exactly
> what it is that need this feature. That would be unmaintainable.
>
> I assume the intent parameter should be considered some enumeration of
> some well-known path types - that do not seem like a nice API, but we
> would at least need a definition of which values are valid and what they
> mean.
>
> Also, Incoming should be like a preview of pull and outgoing like a
> preview of push. They should thus be treated exactly the same and it
> seems wrong that they have different "intents".
>
> The existing handling of push/pull paths is also a bit not-so-elegant.
> The default and -push handling is duplicated in all the places it is used.
>
> Instead, I suggest giving the ui class a couple of extra methods:
>
>      def pushpath(self, loc):
>          """Return path (which might be a URL) for pushing to loc.
>          loc will be 'default' if not specified.
>          If paths.loc is configured, use that instead of loc.
>          However, if paths.loc-push is configured, prefer that.
>          """
>      def pullpath(loc):
>          """Return path (which might be a URL) for pulling from loc.
>          loc will be 'default' if not specified.
>          If paths.loc is configured, use that instead of loc.
>          """
>
> (This docstring also specify that the -push suffix should apply to other
> paths than default. I find that an obvious extension that it would be
> hard not to implement accidentally.)
>
> (I have always found the special -push suffix a bit odd. Having a
> special -pull suffix seemed more obvious and safe to me. For symmetry
> reasons I would like to have both so I could specify default-push and
> default-pull and know that they "always" only would be used for their
> intended purpose.)
>
> I assume your extensions could do their job by monkeypatching these
> methods? Or why not? What do these extensions do?

My primary case was documented in the commit message.

I think adding explicit APIs for pushpath() and pullpath() would be 
sufficient for my needs. Although, outgoing could fall into either camp 
and having an explicit intent for the operation at hand could be useful. 
Not sure I can justify it though.

I don't feel comfortable bloating scope to include adding magical 
"-push" for every path. For one, I believe there are proposals around a 
more robust [paths] definition (defining which branches/bookmarks to 
"track," etc) and I don't want to implement something that could 
interfere too much with that future. But if I have buy-in, I'll do what 
it takes to land this feature.
Mads Kiilerich - Sept. 29, 2014, 11:15 p.m.
On 09/30/2014 12:53 AM, Gregory Szorc wrote:
> On 9/29/14 2:43 PM, Mads Kiilerich wrote:
>> On 09/29/2014 07:30 AM, Gregory Szorc wrote:
>>> # HG changeset patch
>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>> # Date 1411967336 25200
>>> #      Sun Sep 28 22:08:56 2014 -0700
>>> # Node ID 2f06625fe4d502f43294e9087d9a83398a36fe9f
>>> # Parent  4109cc16279ef0e04dc70e7f4c9ab7415e6a22d3
>>> ui: add an intent to expandpath
>>>
>>> This patch adds an optional named argument to ui.expandpath() that
>>> defines the intended use of a repository.
>>>
>>> The impetus behind this patch is to allow extensions to provide
>>> more advanced and dynamic repository lookup functionality. For
>>> example, there exists an extension at Mozilla which automagically
>>> defines common repository URLs for the many Firefox repositories.
>>> We want read operations hitting read-only slaves over HTTP(S) and
>>> write operations hitting the master over SSH. Today, we have to
>>> use separate names for separate URLs, which increases cognitive
>>> burden and sometimes results in accidental use of the wrong repo.
>>> We desire built-in hg operations to automagically use the URL for
>>> the task at hand. This change and subsequent additions to define
>>> the intent parameter will enable this through monkeypatching
>>> ui.expandpath().
>>
>> Meh...
>>
>> First: It doesn't work to add "features" to Mercurial without any usage
>> or test in Mercurial - especially not without any reference to exactly
>> what it is that need this feature. That would be unmaintainable.
>>
>> I assume the intent parameter should be considered some enumeration of
>> some well-known path types - that do not seem like a nice API, but we
>> would at least need a definition of which values are valid and what they
>> mean.
>>
>> Also, Incoming should be like a preview of pull and outgoing like a
>> preview of push. They should thus be treated exactly the same and it
>> seems wrong that they have different "intents".
>>
>> The existing handling of push/pull paths is also a bit not-so-elegant.
>> The default and -push handling is duplicated in all the places it is 
>> used.
>>
>> Instead, I suggest giving the ui class a couple of extra methods:
>>
>>      def pushpath(self, loc):
>>          """Return path (which might be a URL) for pushing to loc.
>>          loc will be 'default' if not specified.
>>          If paths.loc is configured, use that instead of loc.
>>          However, if paths.loc-push is configured, prefer that.
>>          """
>>      def pullpath(loc):
>>          """Return path (which might be a URL) for pulling from loc.
>>          loc will be 'default' if not specified.
>>          If paths.loc is configured, use that instead of loc.
>>          """
>>
>> (This docstring also specify that the -push suffix should apply to other
>> paths than default. I find that an obvious extension that it would be
>> hard not to implement accidentally.)
>>
>> (I have always found the special -push suffix a bit odd. Having a
>> special -pull suffix seemed more obvious and safe to me. For symmetry
>> reasons I would like to have both so I could specify default-push and
>> default-pull and know that they "always" only would be used for their
>> intended purpose.)
>>
>> I assume your extensions could do their job by monkeypatching these
>> methods? Or why not? What do these extensions do?
>
> My primary case was documented in the commit message.

Anecdotal evidence is no evidence. Do you have a link to the extension 
so we can see what problem you are solving? How will it look when using 
the API you suggest?

> I think adding explicit APIs for pushpath() and pullpath() would be 
> sufficient for my needs. Although, outgoing could fall into either camp

How is that? Outgoing is the planning phase of a push. How can it be 
pull-ish?

> I don't feel comfortable bloating scope to include adding magical 
> "-push" for every path.

It sounds to me like that is exactly what you need to solve the problem 
you describe. That would make it possible to replace your extension with 
upstream functionality with trivial configuration.

/Mads
Gregory Szorc - Sept. 30, 2014, 12:30 a.m.
On 9/29/14 4:15 PM, Mads Kiilerich wrote:
> On 09/30/2014 12:53 AM, Gregory Szorc wrote:
>> On 9/29/14 2:43 PM, Mads Kiilerich wrote:
>>> On 09/29/2014 07:30 AM, Gregory Szorc wrote:
>>>> # HG changeset patch
>>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>>> # Date 1411967336 25200
>>>> #      Sun Sep 28 22:08:56 2014 -0700
>>>> # Node ID 2f06625fe4d502f43294e9087d9a83398a36fe9f
>>>> # Parent  4109cc16279ef0e04dc70e7f4c9ab7415e6a22d3
>>>> ui: add an intent to expandpath
>>>>
>>>> This patch adds an optional named argument to ui.expandpath() that
>>>> defines the intended use of a repository.
>>>>
>>>> The impetus behind this patch is to allow extensions to provide
>>>> more advanced and dynamic repository lookup functionality. For
>>>> example, there exists an extension at Mozilla which automagically
>>>> defines common repository URLs for the many Firefox repositories.
>>>> We want read operations hitting read-only slaves over HTTP(S) and
>>>> write operations hitting the master over SSH. Today, we have to
>>>> use separate names for separate URLs, which increases cognitive
>>>> burden and sometimes results in accidental use of the wrong repo.
>>>> We desire built-in hg operations to automagically use the URL for
>>>> the task at hand. This change and subsequent additions to define
>>>> the intent parameter will enable this through monkeypatching
>>>> ui.expandpath().
>>>
>>> Meh...
>>>
>>> First: It doesn't work to add "features" to Mercurial without any usage
>>> or test in Mercurial - especially not without any reference to exactly
>>> what it is that need this feature. That would be unmaintainable.
>>>
>>> I assume the intent parameter should be considered some enumeration of
>>> some well-known path types - that do not seem like a nice API, but we
>>> would at least need a definition of which values are valid and what they
>>> mean.
>>>
>>> Also, Incoming should be like a preview of pull and outgoing like a
>>> preview of push. They should thus be treated exactly the same and it
>>> seems wrong that they have different "intents".
>>>
>>> The existing handling of push/pull paths is also a bit not-so-elegant.
>>> The default and -push handling is duplicated in all the places it is
>>> used.
>>>
>>> Instead, I suggest giving the ui class a couple of extra methods:
>>>
>>>      def pushpath(self, loc):
>>>          """Return path (which might be a URL) for pushing to loc.
>>>          loc will be 'default' if not specified.
>>>          If paths.loc is configured, use that instead of loc.
>>>          However, if paths.loc-push is configured, prefer that.
>>>          """
>>>      def pullpath(loc):
>>>          """Return path (which might be a URL) for pulling from loc.
>>>          loc will be 'default' if not specified.
>>>          If paths.loc is configured, use that instead of loc.
>>>          """
>>>
>>> (This docstring also specify that the -push suffix should apply to other
>>> paths than default. I find that an obvious extension that it would be
>>> hard not to implement accidentally.)
>>>
>>> (I have always found the special -push suffix a bit odd. Having a
>>> special -pull suffix seemed more obvious and safe to me. For symmetry
>>> reasons I would like to have both so I could specify default-push and
>>> default-pull and know that they "always" only would be used for their
>>> intended purpose.)
>>>
>>> I assume your extensions could do their job by monkeypatching these
>>> methods? Or why not? What do these extensions do?
>>
>> My primary case was documented in the commit message.
>
> Anecdotal evidence is no evidence. Do you have a link to the extension
> so we can see what problem you are solving? How will it look when using
> the API you suggest?
>
>> I think adding explicit APIs for pushpath() and pullpath() would be
>> sufficient for my needs. Although, outgoing could fall into either camp
>
> How is that? Outgoing is the planning phase of a push. How can it be
> pull-ish?

outgoing is read-only despite the association with push. In Mozilla's 
use case, only people with push access (a subset) have access to the 
push URL (SSH). Everybody has access to the HTTP endpoint. People 
without SSH access should still be able to use outgoing. So using the 
push/SSH URL for outgoing would be wrong in our case, even though it is 
logically the most appropriate URL to use.

To further complicate matters, our HTTP layer is currently rejecting 
some Mercurial HTTP requests because the encoding of 100+ heads during 
discovery is exceeding HTTP header limits. We have some workarounds to 
use SSH instead of HTTP for discovery, when appropriate.
Mads Kiilerich - Sept. 30, 2014, 4:53 p.m.
On 09/30/2014 02:30 AM, Gregory Szorc wrote:
> outgoing is read-only despite the association with push. In Mozilla's 
> use case, only people with push access (a subset) have access to the 
> push URL (SSH). Everybody has access to the HTTP endpoint. People 
> without SSH access should still be able to use outgoing. So using the 
> push/SSH URL for outgoing would be wrong in our case, even though it 
> is logically the most appropriate URL to use.

I think you put it backwards and describe your current setup as a 
requirement.

Yes, outgoing is a read-only (and very cheap - especially when using 
ssh) subset of push. It is intended to show exactly what will be pushed. 
Trying to treat them differently would be a confusing misfeature. I 
don't think Mercurial itself should do anything to encourage that 
misfeature - quite the opposite.

In the setup you describe, people with push access should use ssh for 
outgoing, just like they do for push. People without push access should 
just use http for everything.

> To further complicate matters, our HTTP layer is currently rejecting 
> some Mercurial HTTP requests because the encoding of 100+ heads during 
> discovery is exceeding HTTP header limits.

We only see that problem when people have full clones with all the 
gazillion of heads that they never will have any use for. Do you support 
that case or do you see it in other cases as well?

/Mads
Gregory Szorc - Sept. 30, 2014, 5:18 p.m.
On 9/30/14 9:53 AM, Mads Kiilerich wrote:
> On 09/30/2014 02:30 AM, Gregory Szorc wrote:
>> outgoing is read-only despite the association with push. In Mozilla's
>> use case, only people with push access (a subset) have access to the
>> push URL (SSH). Everybody has access to the HTTP endpoint. People
>> without SSH access should still be able to use outgoing. So using the
>> push/SSH URL for outgoing would be wrong in our case, even though it
>> is logically the most appropriate URL to use.
>
> I think you put it backwards and describe your current setup as a
> requirement.
>
> Yes, outgoing is a read-only (and very cheap - especially when using
> ssh) subset of push. It is intended to show exactly what will be pushed.
> Trying to treat them differently would be a confusing misfeature. I
> don't think Mercurial itself should do anything to encourage that
> misfeature - quite the opposite.
>
> In the setup you describe, people with push access should use ssh for
> outgoing, just like they do for push. People without push access should
> just use http for everything.

We don't want to overload our SSH server. We have N>1 HTTP servers to 
facilitate scaling. Therefore we'd prefer to use HTTP for all non-push 
operations.

>> To further complicate matters, our HTTP layer is currently rejecting
>> some Mercurial HTTP requests because the encoding of 100+ heads during
>> discovery is exceeding HTTP header limits.
>
> We only see that problem when people have full clones with all the
> gazillion of heads that they never will have any use for. Do you support
> that case or do you see it in other cases as well?

We see problems with N < 1000 heads. (This is separate from the N > 
10000 heads problem we have with Try.) If you have checked out the 
Firefox repositories with each release on its own branch/head, you get 
several hundred heads. Along that vein, we're considering doing "fake 
merges" to reduce the number of heads in our repos to mitigate this problem.
Mads Kiilerich - Sept. 30, 2014, 5:26 p.m.
On 09/30/2014 07:18 PM, Gregory Szorc wrote:
> On 9/30/14 9:53 AM, Mads Kiilerich wrote:
>> On 09/30/2014 02:30 AM, Gregory Szorc wrote:
>>> outgoing is read-only despite the association with push. In Mozilla's
>>> use case, only people with push access (a subset) have access to the
>>> push URL (SSH). Everybody has access to the HTTP endpoint. People
>>> without SSH access should still be able to use outgoing. So using the
>>> push/SSH URL for outgoing would be wrong in our case, even though it
>>> is logically the most appropriate URL to use.
>>
>> I think you put it backwards and describe your current setup as a
>> requirement.
>>
>> Yes, outgoing is a read-only (and very cheap - especially when using
>> ssh) subset of push. It is intended to show exactly what will be pushed.
>> Trying to treat them differently would be a confusing misfeature. I
>> don't think Mercurial itself should do anything to encourage that
>> misfeature - quite the opposite.
>>
>> In the setup you describe, people with push access should use ssh for
>> outgoing, just like they do for push. People without push access should
>> just use http for everything.
>
> We don't want to overload our SSH server. We have N>1 HTTP servers to 
> facilitate scaling. Therefore we'd prefer to use HTTP for all non-push 
> operations.

Do you have any numbers to support that the load from users with push 
access running outgoing would be significant? I doubt it will make a 
real difference - and certainly not enough to justify the added 
complexity and breaking the users expectations.

/Mads
Gregory Szorc - Sept. 30, 2014, 6 p.m.
On 9/30/14 10:26 AM, Mads Kiilerich wrote:
> On 09/30/2014 07:18 PM, Gregory Szorc wrote:
>> On 9/30/14 9:53 AM, Mads Kiilerich wrote:
>>> On 09/30/2014 02:30 AM, Gregory Szorc wrote:
>>>> outgoing is read-only despite the association with push. In Mozilla's
>>>> use case, only people with push access (a subset) have access to the
>>>> push URL (SSH). Everybody has access to the HTTP endpoint. People
>>>> without SSH access should still be able to use outgoing. So using the
>>>> push/SSH URL for outgoing would be wrong in our case, even though it
>>>> is logically the most appropriate URL to use.
>>>
>>> I think you put it backwards and describe your current setup as a
>>> requirement.
>>>
>>> Yes, outgoing is a read-only (and very cheap - especially when using
>>> ssh) subset of push. It is intended to show exactly what will be pushed.
>>> Trying to treat them differently would be a confusing misfeature. I
>>> don't think Mercurial itself should do anything to encourage that
>>> misfeature - quite the opposite.
>>>
>>> In the setup you describe, people with push access should use ssh for
>>> outgoing, just like they do for push. People without push access should
>>> just use http for everything.
>>
>> We don't want to overload our SSH server. We have N>1 HTTP servers to
>> facilitate scaling. Therefore we'd prefer to use HTTP for all non-push
>> operations.
>
> Do you have any numbers to support that the load from users with push
> access running outgoing would be significant? I doubt it will make a
> real difference - and certainly not enough to justify the added
> complexity and breaking the users expectations.

No. However the principle of it is that we shouldn't be providing a 
vector to overwhelm our master if it can be avoided. We already have 
enough scaling problems: I'd prefer to not introduce another one.

That being said, in the grand scheme of things I doubt outgoing requests 
are frequent enough to cause any real world issue. That being said, our 
high-volume server is our "try" repo. And we're moving that to a 
bundle-based approach. That will generate a lot of outgoing requests. 
But we'll have a custom extension for that and we can pin the URL used 
for discovery easily enough.
Augie Fackler - Oct. 14, 2014, 8:17 p.m.
On Tue, Sep 30, 2014 at 11:00:52AM -0700, Gregory Szorc wrote:
> On 9/30/14 10:26 AM, Mads Kiilerich wrote:
> >On 09/30/2014 07:18 PM, Gregory Szorc wrote:
> >>On 9/30/14 9:53 AM, Mads Kiilerich wrote:
> >>>On 09/30/2014 02:30 AM, Gregory Szorc wrote:
> >>>>outgoing is read-only despite the association with push. In Mozilla's
> >>>>use case, only people with push access (a subset) have access to the
> >>>>push URL (SSH). Everybody has access to the HTTP endpoint. People
> >>>>without SSH access should still be able to use outgoing. So using the
> >>>>push/SSH URL for outgoing would be wrong in our case, even though it
> >>>>is logically the most appropriate URL to use.
> >>>
> >>>I think you put it backwards and describe your current setup as a
> >>>requirement.
> >>>
> >>>Yes, outgoing is a read-only (and very cheap - especially when using
> >>>ssh) subset of push. It is intended to show exactly what will be pushed.
> >>>Trying to treat them differently would be a confusing misfeature. I
> >>>don't think Mercurial itself should do anything to encourage that
> >>>misfeature - quite the opposite.
> >>>
> >>>In the setup you describe, people with push access should use ssh for
> >>>outgoing, just like they do for push. People without push access should
> >>>just use http for everything.
> >>
> >>We don't want to overload our SSH server. We have N>1 HTTP servers to
> >>facilitate scaling. Therefore we'd prefer to use HTTP for all non-push
> >>operations.
> >
> >Do you have any numbers to support that the load from users with push
> >access running outgoing would be significant? I doubt it will make a
> >real difference - and certainly not enough to justify the added
> >complexity and breaking the users expectations.
>
> No. However the principle of it is that we shouldn't be providing a vector
> to overwhelm our master if it can be avoided. We already have enough scaling
> problems: I'd prefer to not introduce another one.
>
> That being said, in the grand scheme of things I doubt outgoing requests are
> frequent enough to cause any real world issue. That being said, our
> high-volume server is our "try" repo. And we're moving that to a
> bundle-based approach. That will generate a lot of outgoing requests. But
> we'll have a custom extension for that and we can pin the URL used for
> discovery easily enough.

FWIW, I can sort of see this making sense for things like lookaside
bundles. Think large, globally distributed company that doesn't have
enormous pipes to all their engineering offices - you could redirect
most pull traffic to a local replica transparently, but then do all
pushes to a central master, without having to do some sort of
complicated replica system.

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Pierre-Yves David - Oct. 17, 2014, 8:59 p.m.
On 09/29/2014 03:53 PM, Gregory Szorc wrote:
> I don't feel comfortable bloating scope to include adding magical
> "-push" for every path. For one, I believe there are proposals around a
> more robust [paths] definition (defining which branches/bookmarks to
> "track," etc) and I don't want to implement something that could
> interfere too much with that future. But if I have buy-in, I'll do what
> it takes to land this feature.

Maybe it is time to refresh the idea of having metadata attached to 
path. We could stick your special stuff there.
Pierre-Yves David - Oct. 17, 2014, 8:59 p.m.
On 09/30/2014 09:53 AM, Mads Kiilerich wrote:
> Yes, outgoing is a read-only (and very cheap - especially when using
> ssh) subset of push. It is intended to show exactly what will be pushed.
> Trying to treat them differently would be a confusing misfeature. I
> don't think Mercurial itself should do anything to encourage that
> misfeature - quite the opposite.

Note that outgoing will have to become more complex to handle other data 
than changegroup (phase, bookmarks, obsmarkers).

This does not change much to your statement. It is still a cheap 
operation. Just wanted to point than outgoing is changing.
Pierre-Yves David - Oct. 17, 2014, 8:59 p.m.
On 09/30/2014 11:00 AM, Gregory Szorc wrote:
> No. However the principle of it is that we shouldn't be providing a
> vector to overwhelm our master if it can be avoided. We already have
> enough scaling problems: I'd prefer to not introduce another one.

Did Mozilla looked into the hgsql extension that would allow them to 
spread their loads between multiple masters?
Pierre-Yves David - Oct. 17, 2014, 8:59 p.m.
On 10/14/2014 01:17 PM, Augie Fackler wrote:
> FWIW, I can sort of see this making sense for things like lookaside
> bundles. Think large, globally distributed company that doesn't have
> enormous pipes to all their engineering offices - you could redirect
> most pull traffic to a local replica transparently, but then do all
> pushes to a central master, without having to do some sort of
> complicated replica system.

This is now covered into bundle2 by the special bundle2 part Mike Hommey 
have been adding.
Pierre-Yves David - Oct. 17, 2014, 9 p.m.
On 09/28/2014 10:30 PM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1411967336 25200
> #      Sun Sep 28 22:08:56 2014 -0700
> # Node ID 2f06625fe4d502f43294e9087d9a83398a36fe9f
> # Parent  4109cc16279ef0e04dc70e7f4c9ab7415e6a22d3
> ui: add an intent to expandpath

I'm dropping this series from patchwork. It need reworking at minimal 
and probably more discussion about the right approach here.
Gregory Szorc - Oct. 18, 2014, 12:48 a.m.
On 10/17/2014 1:59 PM, Pierre-Yves David wrote:
> 
> 
> On 09/30/2014 11:00 AM, Gregory Szorc wrote:
>> No. However the principle of it is that we shouldn't be providing a
>> vector to overwhelm our master if it can be avoided. We already have
>> enough scaling problems: I'd prefer to not introduce another one.
> 
> Did Mozilla looked into the hgsql extension that would allow them to
> spread their loads between multiple masters?
> 

Not seriously. We recently upgraded the read-only hgweb servers to 3.1.x
from 2.5.4. We're in the process of getting the SSH master upgraded to
3.1.x (still running 2.5.4). Once we're running a modern stack, we can
start some real investigations. Even then, short term effort will likely
be invested in hosting our Try and and code review bundles in S3 rather
than mega-headed repositories. SQL or alternative revlog storage will
likely have to wait until 2015.
Mike Hommey - Oct. 18, 2014, 1:13 a.m.
On Tue, Sep 30, 2014 at 07:26:22PM +0200, Mads Kiilerich wrote:
> On 09/30/2014 07:18 PM, Gregory Szorc wrote:
> >On 9/30/14 9:53 AM, Mads Kiilerich wrote:
> >>On 09/30/2014 02:30 AM, Gregory Szorc wrote:
> >>>outgoing is read-only despite the association with push. In Mozilla's
> >>>use case, only people with push access (a subset) have access to the
> >>>push URL (SSH). Everybody has access to the HTTP endpoint. People
> >>>without SSH access should still be able to use outgoing. So using the
> >>>push/SSH URL for outgoing would be wrong in our case, even though it
> >>>is logically the most appropriate URL to use.
> >>
> >>I think you put it backwards and describe your current setup as a
> >>requirement.
> >>
> >>Yes, outgoing is a read-only (and very cheap - especially when using
> >>ssh) subset of push. It is intended to show exactly what will be pushed.
> >>Trying to treat them differently would be a confusing misfeature. I
> >>don't think Mercurial itself should do anything to encourage that
> >>misfeature - quite the opposite.
> >>
> >>In the setup you describe, people with push access should use ssh for
> >>outgoing, just like they do for push. People without push access should
> >>just use http for everything.
> >
> >We don't want to overload our SSH server. We have N>1 HTTP servers to
> >facilitate scaling. Therefore we'd prefer to use HTTP for all non-push
> >operations.
> 
> Do you have any numbers to support that the load from users with push access
> running outgoing would be significant? I doubt it will make a real
> difference - and certainly not enough to justify the added complexity and
> breaking the users expectations.

It's not all about the servers. As a user I find it annoying to have to
open a ssh connection (which usually means typing a passphrase) to use
things like outgoing("foo") in a revset, or simply the hg outgoing
command.

Mike

Patch

diff --git a/mercurial/ui.py b/mercurial/ui.py
--- a/mercurial/ui.py
+++ b/mercurial/ui.py
@@ -496,10 +496,18 @@  class ui(object):
         if not self.verbose:
             user = util.shortuser(user)
         return user
 
-    def expandpath(self, loc, default=None):
-        """Return repository location relative to cwd or from [paths]"""
+    def expandpath(self, loc, default=None, intent=None):
+        '''Return repository location relative to cwd or from [paths].
+
+        loc is the location to expand. It can be a URL, filesystem path,
+        or symbolic name.
+
+        intent is a string describing the intended use of this repo. It is
+        intended for extensions so that extensions may do things like return
+        different URLs for push versus pull.
+        '''
         if util.hasscheme(loc) or os.path.isdir(os.path.join(loc, '.hg')):
             return loc
 
         path = self.config('paths', loc)