Patchwork [3,of,3,STABLE] clone: process 'lookup' return as an arbitrary symbol

login
register
mail settings
Submitter Boris Feld
Date July 26, 2018, 12:21 p.m.
Message ID <88a0bf46a3ffb78aaab2.1532607680@FB-lair>
Download mbox | patch
Permalink /patch/32946/
State New
Headers show

Comments

Boris Feld - July 26, 2018, 12:21 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1532595350 -7200
#      Thu Jul 26 10:55:50 2018 +0200
# Branch stable
# Node ID 88a0bf46a3ffb78aaab203d13a7c9f53e244282b
# Parent  a920f2620726ef26e6caed3d72b24297699b5b39
# EXP-Topic compat-hggit
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 88a0bf46a3ff
clone: process 'lookup' return as an arbitrary symbol

In theory, checkout is expected to be a node here because it was returned by
peer.lookup.

In practice, multiple important extensions (like hg-git, hg-subversion) use
peers not backed by a mercurial repository where lookup cannot return a node.

Allowing arbitrary symbols is necessary to make these extensions working with
4.7.

We should probably introduce a new API in Core to have these extensions to
work without abusing the lookup API. In the meantime, a small change to
restore compatibility in 4.7 seems in order.
Yuya Nishihara - July 26, 2018, 1:06 p.m.
On Thu, 26 Jul 2018 14:21:20 +0200, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1532595350 -7200
> #      Thu Jul 26 10:55:50 2018 +0200
> # Branch stable
> # Node ID 88a0bf46a3ffb78aaab203d13a7c9f53e244282b
> # Parent  a920f2620726ef26e6caed3d72b24297699b5b39
> # EXP-Topic compat-hggit
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 88a0bf46a3ff
> clone: process 'lookup' return as an arbitrary symbol
> 
> In theory, checkout is expected to be a node here because it was returned by
> peer.lookup.
> 
> In practice, multiple important extensions (like hg-git, hg-subversion) use
> peers not backed by a mercurial repository where lookup cannot return a node.
> 
> Allowing arbitrary symbols is necessary to make these extensions working with
> 4.7.
> 
> We should probably introduce a new API in Core to have these extensions to
> work without abusing the lookup API. In the meantime, a small change to
> restore compatibility in 4.7 seems in order.
> 
> diff --git a/mercurial/hg.py b/mercurial/hg.py
> --- a/mercurial/hg.py
> +++ b/mercurial/hg.py
> @@ -730,6 +730,19 @@ def clone(ui, peeropts, source, dest=Non
>  
>                  uprev = None
>                  status = None
> +                if checkout is not None and not node.isnode(checkout):

IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?

> +                    # checkout was expected to be a node here because it was
> +                    # returned by peer.lookup.
> +                    #
> +                    # However, some extension (like hg-git) introduce peer not
> +                    # backed by a mercurial repository where lookup cannot
> +                    # return a node. Processing 'checkout' as an arbitrary
> +                    # symbols make it possible with these extensions keep
> +                    # working.
> +                    if scmutil.isrevsymbol(destrepo, checkout):
> +                        checkout = scmutil.revsymbol(destrepo, checkout).node()
> +                    else:
> +                        checkout = None

Do you have any plan to deprecate this hack?
Boris FELD - July 26, 2018, 1:25 p.m.
On 26/07/2018 15:06, Yuya Nishihara wrote:
> On Thu, 26 Jul 2018 14:21:20 +0200, Boris Feld wrote:
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1532595350 -7200
>> #      Thu Jul 26 10:55:50 2018 +0200
>> # Branch stable
>> # Node ID 88a0bf46a3ffb78aaab203d13a7c9f53e244282b
>> # Parent  a920f2620726ef26e6caed3d72b24297699b5b39
>> # EXP-Topic compat-hggit
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 88a0bf46a3ff
>> clone: process 'lookup' return as an arbitrary symbol
>>
>> In theory, checkout is expected to be a node here because it was returned by
>> peer.lookup.
>>
>> In practice, multiple important extensions (like hg-git, hg-subversion) use
>> peers not backed by a mercurial repository where lookup cannot return a node.
>>
>> Allowing arbitrary symbols is necessary to make these extensions working with
>> 4.7.
>>
>> We should probably introduce a new API in Core to have these extensions to
>> work without abusing the lookup API. In the meantime, a small change to
>> restore compatibility in 4.7 seems in order.
>>
>> diff --git a/mercurial/hg.py b/mercurial/hg.py
>> --- a/mercurial/hg.py
>> +++ b/mercurial/hg.py
>> @@ -730,6 +730,19 @@ def clone(ui, peeropts, source, dest=Non
>>   
>>                   uprev = None
>>                   status = None
>> +                if checkout is not None and not node.isnode(checkout):
> IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
Yes, this is unfortunate. Having everything working but 20-lenght names 
is a first step, we can follow-up with more logic to check if the symbol 
is a valid name.
>
>> +                    # checkout was expected to be a node here because it was
>> +                    # returned by peer.lookup.
>> +                    #
>> +                    # However, some extension (like hg-git) introduce peer not
>> +                    # backed by a mercurial repository where lookup cannot
>> +                    # return a node. Processing 'checkout' as an arbitrary
>> +                    # symbols make it possible with these extensions keep
>> +                    # working.
>> +                    if scmutil.isrevsymbol(destrepo, checkout):
>> +                        checkout = scmutil.revsymbol(destrepo, checkout).node()
>> +                    else:
>> +                        checkout = None
> Do you have any plan to deprecate this hack?
Our plan is to introduce a clear separate API that could be used by such 
extensions for this specific case in 4.8.
via Mercurial-devel - July 26, 2018, 3:24 p.m.
On Thu, Jul 26, 2018 at 6:26 AM Boris FELD <lothiraldan@gmail.com> wrote:

> On 26/07/2018 15:06, Yuya Nishihara wrote:
> > On Thu, 26 Jul 2018 14:21:20 +0200, Boris Feld wrote:
> >> # HG changeset patch
> >> # User Boris Feld <boris.feld@octobus.net>
> >> # Date 1532595350 -7200
> >> #      Thu Jul 26 10:55:50 2018 +0200
> >> # Branch stable
> >> # Node ID 88a0bf46a3ffb78aaab203d13a7c9f53e244282b
> >> # Parent  a920f2620726ef26e6caed3d72b24297699b5b39
> >> # EXP-Topic compat-hggit
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
> -r 88a0bf46a3ff
> >> clone: process 'lookup' return as an arbitrary symbol
> >>
> >> In theory, checkout is expected to be a node here because it was
> returned by
> >> peer.lookup.
> >>
> >> In practice, multiple important extensions (like hg-git, hg-subversion)
> use
> >> peers not backed by a mercurial repository where lookup cannot return a
> node.
> >>
> >> Allowing arbitrary symbols is necessary to make these extensions
> working with
> >> 4.7.
> >>
> >> We should probably introduce a new API in Core to have these extensions
> to
> >> work without abusing the lookup API. In the meantime, a small change to
> >> restore compatibility in 4.7 seems in order.
> >>
> >> diff --git a/mercurial/hg.py b/mercurial/hg.py
> >> --- a/mercurial/hg.py
> >> +++ b/mercurial/hg.py
> >> @@ -730,6 +730,19 @@ def clone(ui, peeropts, source, dest=Non
> >>
> >>                   uprev = None
> >>                   status = None
> >> +                if checkout is not None and not node.isnode(checkout):
> > IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
> Yes, this is unfortunate. Having everything working but 20-lenght names
> is a first step, we can follow-up with more logic to check if the symbol
> is a valid name.
>

I think Yuya is suggesting that you simply remove the "and not
node.isnode(checkout)", and I agree with that. That way 20-character names
would also work. There are two downsides that I can see: 1) the cost of an
extra check, and 2) a 20-character binary nodeid (from a well-behaving
peer) that happens to also be a valid name will now be interpreted as the
name. (1) seems negligible given the cost of a clone. (2) seems extremely
unlikely to happen (i.e. that a binary nodeid happens to be only readable
characters that someone also happened to use in a name). Am I missing
something?

>
> >> +                    # checkout was expected to be a node here because
> it was
> >> +                    # returned by peer.lookup.
> >> +                    #
> >> +                    # However, some extension (like hg-git) introduce
> peer not
> >> +                    # backed by a mercurial repository where lookup
> cannot
> >> +                    # return a node. Processing 'checkout' as an
> arbitrary
> >> +                    # symbols make it possible with these extensions
> keep
> >> +                    # working.
> >> +                    if scmutil.isrevsymbol(destrepo, checkout):
> >> +                        checkout = scmutil.revsymbol(destrepo,
> checkout).node()
> >> +                    else:
> >> +                        checkout = None
> > Do you have any plan to deprecate this hack?
> Our plan is to introduce a clear separate API that could be used by such
> extensions for this specific case in 4.8.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Boris Feld - July 27, 2018, 7:30 a.m.
On 26/07/2018 17:24, Martin von Zweigbergk wrote:
>
>
> On Thu, Jul 26, 2018 at 6:26 AM Boris FELD <lothiraldan@gmail.com 
> <mailto:lothiraldan@gmail.com>> wrote:
>
>     On 26/07/2018 15:06, Yuya Nishihara wrote:
>     > On Thu, 26 Jul 2018 14:21:20 +0200, Boris Feld wrote:
>     >> # HG changeset patch
>     >> # User Boris Feld <boris.feld@octobus.net
>     <mailto:boris.feld@octobus.net>>
>     >> # Date 1532595350 -7200
>     >> #      Thu Jul 26 10:55:50 2018 +0200
>     >> # Branch stable
>     >> # Node ID 88a0bf46a3ffb78aaab203d13a7c9f53e244282b
>     >> # Parent  a920f2620726ef26e6caed3d72b24297699b5b39
>     >> # EXP-Topic compat-hggit
>     >> # Available At https://bitbucket.org/octobus/mercurial-devel/
>     >> #              hg pull
>     https://bitbucket.org/octobus/mercurial-devel/ -r 88a0bf46a3ff
>     >> clone: process 'lookup' return as an arbitrary symbol
>     >>
>     >> In theory, checkout is expected to be a node here because it
>     was returned by
>     >> peer.lookup.
>     >>
>     >> In practice, multiple important extensions (like hg-git,
>     hg-subversion) use
>     >> peers not backed by a mercurial repository where lookup cannot
>     return a node.
>     >>
>     >> Allowing arbitrary symbols is necessary to make these
>     extensions working with
>     >> 4.7.
>     >>
>     >> We should probably introduce a new API in Core to have these
>     extensions to
>     >> work without abusing the lookup API. In the meantime, a small
>     change to
>     >> restore compatibility in 4.7 seems in order.
>     >>
>     >> diff --git a/mercurial/hg.py b/mercurial/hg.py
>     >> --- a/mercurial/hg.py
>     >> +++ b/mercurial/hg.py
>     >> @@ -730,6 +730,19 @@ def clone(ui, peeropts, source, dest=Non
>     >>
>     >>                   uprev = None
>     >>                   status = None
>     >> +                if checkout is not None and not
>     node.isnode(checkout):
>     > IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
>     Yes, this is unfortunate. Having everything working but 20-lenght
>     names
>     is a first step, we can follow-up with more logic to check if the
>     symbol
>     is a valid name.
>
>
> I think Yuya is suggesting that you simply remove the "and not 
> node.isnode(checkout)", and I agree with that. That way 20-character 
> names would also work. There are two downsides that I can see: 1) the 
> cost of an extra check, and 2) a 20-character binary nodeid (from a 
> well-behaving peer) that happens to also be a valid name will now be 
> interpreted as the name. (1) seems negligible given the cost of a 
> clone. (2) seems extremely unlikely to happen (i.e. that a binary 
> nodeid happens to be only readable characters that someone also 
> happened to use in a name). Am I missing something?

We need to detect valid node because they cannot go through revsymbol, 
it breaks the tests without that check. The previous API was able to 
handle any cases, it is now hard to get the same result when needed.

If a 20 chars string is a valid node (ie: exists in the repo) it should 
probably take precedence over the name. It seems extremely unlikely to 
not care about that case.

>
>     >
>     >> +                    # checkout was expected to be a node here
>     because it was
>     >> +                    # returned by peer.lookup.
>     >> +                    #
>     >> +                    # However, some extension (like hg-git)
>     introduce peer not
>     >> +                    # backed by a mercurial repository where
>     lookup cannot
>     >> +                    # return a node. Processing 'checkout' as
>     an arbitrary
>     >> +                    # symbols make it possible with these
>     extensions keep
>     >> +                    # working.
>     >> +                    if scmutil.isrevsymbol(destrepo, checkout):
>     >> +                        checkout = scmutil.revsymbol(destrepo,
>     checkout).node()
>     >> +                    else:
>     >> +                        checkout = None
>     > Do you have any plan to deprecate this hack?
>     Our plan is to introduce a clear separate API that could be used
>     by such
>     extensions for this specific case in 4.8.
>     _______________________________________________
>     Mercurial-devel mailing list
>     Mercurial-devel@mercurial-scm.org
>     <mailto:Mercurial-devel@mercurial-scm.org>
>     https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Yuya Nishihara - July 27, 2018, 2 p.m.
On Fri, 27 Jul 2018 09:30:26 +0200, Boris FELD wrote:
> On 26/07/2018 17:24, Martin von Zweigbergk wrote:
> >     >> +                if checkout is not None and not
> >     node.isnode(checkout):
> >     > IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
> >     Yes, this is unfortunate. Having everything working but 20-lenght
> >     names
> >     is a first step, we can follow-up with more logic to check if the
> >     symbol
> >     is a valid name.
> >
> >
> > I think Yuya is suggesting that you simply remove the "and not 
> > node.isnode(checkout)",

Yes.

> We need to detect valid node because they cannot go through revsymbol, 
> it breaks the tests without that check. The previous API was able to 
> handle any cases, it is now hard to get the same result when needed.
> 
> If a 20 chars string is a valid node (ie: exists in the repo) it should 
> probably take precedence over the name. It seems extremely unlikely to 
> not care about that case.

Yes. So we need to check if 'checkout in destrepo' instead of isnode(checkout),
right?

  if checkout in destrepo:
      # binary nodeid
  elif isrevsymbol(destrepo, checkout):
      # work around for hggit/hgsubversion
Augie Fackler - July 27, 2018, 2:46 p.m.
On Fri, Jul 27, 2018 at 11:00:55PM +0900, Yuya Nishihara wrote:
> On Fri, 27 Jul 2018 09:30:26 +0200, Boris FELD wrote:
> > On 26/07/2018 17:24, Martin von Zweigbergk wrote:
> > >     >> +                if checkout is not None and not
> > >     node.isnode(checkout):
> > >     > IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
> > >     Yes, this is unfortunate. Having everything working but 20-lenght
> > >     names
> > >     is a first step, we can follow-up with more logic to check if the
> > >     symbol
> > >     is a valid name.
> > >
> > >
> > > I think Yuya is suggesting that you simply remove the "and not
> > > node.isnode(checkout)",
>
> Yes.
>
> > We need to detect valid node because they cannot go through revsymbol,
> > it breaks the tests without that check. The previous API was able to
> > handle any cases, it is now hard to get the same result when needed.
> >
> > If a 20 chars string is a valid node (ie: exists in the repo) it should
> > probably take precedence over the name. It seems extremely unlikely to
> > not care about that case.
>
> Yes. So we need to check if 'checkout in destrepo' instead of isnode(checkout),
> right?
>
>   if checkout in destrepo:
>       # binary nodeid
>   elif isrevsymbol(destrepo, checkout):
>       # work around for hggit/hgsubversion

Something like that, I think. You can see the failure this causes in
hgsubversion (if you like) with `nosetests
tests/test_fetch_branches.py` in the hgsubversion repository with
4.7rc0 installed (you'll also need working svn bindings), in case
seeing what's going on in hgsubversion is helpful.

Sorry I didn't catch this earlier in the cycle. :(

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - July 27, 2018, 3:04 p.m.
On Fri, Jul 27, 2018 at 7:02 AM Yuya Nishihara <yuya@tcha.org> wrote:

> On Fri, 27 Jul 2018 09:30:26 +0200, Boris FELD wrote:
> > On 26/07/2018 17:24, Martin von Zweigbergk wrote:
> > >     >> +                if checkout is not None and not
> > >     node.isnode(checkout):
> > >     > IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
> > >     Yes, this is unfortunate. Having everything working but 20-lenght
> > >     names
> > >     is a first step, we can follow-up with more logic to check if the
> > >     symbol
> > >     is a valid name.
> > >
> > >
> > > I think Yuya is suggesting that you simply remove the "and not
> > > node.isnode(checkout)",
>
> Yes.
>
> > We need to detect valid node because they cannot go through revsymbol,
> > it breaks the tests without that check. The previous API was able to
> > handle any cases, it is now hard to get the same result when needed.
> >
> > If a 20 chars string is a valid node (ie: exists in the repo) it should
> > probably take precedence over the name. It seems extremely unlikely to
> > not care about that case.
>
> Yes. So we need to check if 'checkout in destrepo' instead of
> isnode(checkout),
> right?
>
>   if checkout in destrepo:
>       # binary nodeid
>   elif isrevsymbol(destrepo, checkout):
>       # work around for hggit/hgsubversion
>


Oh, the destrepo is available here? Then yes, it's definitely worth
checking. I suggested not worrying about the risk of conflicts because I
assumed it was awkward to check if the node was in the destination repo at
this point in the code.
Boris Feld - July 27, 2018, 10:26 p.m.
On 27/07/2018 16:00, Yuya Nishihara wrote:
> On Fri, 27 Jul 2018 09:30:26 +0200, Boris FELD wrote:
>> On 26/07/2018 17:24, Martin von Zweigbergk wrote:
>>>      >> +                if checkout is not None and not
>>>      node.isnode(checkout):
>>>      > IIUC, hggit/hgsubversion may pass in a 20-length symbolic name?
>>>      Yes, this is unfortunate. Having everything working but 20-lenght
>>>      names
>>>      is a first step, we can follow-up with more logic to check if the
>>>      symbol
>>>      is a valid name.
>>>
>>>
>>> I think Yuya is suggesting that you simply remove the "and not
>>> node.isnode(checkout)",
> Yes.
>
>> We need to detect valid node because they cannot go through revsymbol,
>> it breaks the tests without that check. The previous API was able to
>> handle any cases, it is now hard to get the same result when needed.
>>
>> If a 20 chars string is a valid node (ie: exists in the repo) it should
>> probably take precedence over the name. It seems extremely unlikely to
>> not care about that case.
> Yes. So we need to check if 'checkout in destrepo' instead of isnode(checkout),
> right?

`checkout in destrepo` no longer works because 'repo[checkout]' no 
longer works. This whole series is about resolving `checkout` to a node 
before calling `checkout in destrepo`.

So the above proposal will not work, we need something else
>
>    if checkout in destrepo:
>        # binary nodeid
>    elif isrevsymbol(destrepo, checkout):
>        # work around for hggit/hgsubversion
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -730,6 +730,19 @@  def clone(ui, peeropts, source, dest=Non
 
                 uprev = None
                 status = None
+                if checkout is not None and not node.isnode(checkout):
+                    # checkout was expected to be a node here because it was
+                    # returned by peer.lookup.
+                    #
+                    # However, some extension (like hg-git) introduce peer not
+                    # backed by a mercurial repository where lookup cannot
+                    # return a node. Processing 'checkout' as an arbitrary
+                    # symbols make it possible with these extensions keep
+                    # working.
+                    if scmutil.isrevsymbol(destrepo, checkout):
+                        checkout = scmutil.revsymbol(destrepo, checkout).node()
+                    else:
+                        checkout = None
                 if checkout is not None:
                     if checkout in destrepo:
                         uprev = checkout