Patchwork [4,of,8] revset: enforce "%d" to be interpreted as literal revision number (API)

login
register
mail settings
Submitter Boris Feld
Date Jan. 11, 2019, 11:29 a.m.
Message ID <38733dd8559578267617.1547206146@Laptop-Boris.lan>
Download mbox | patch
Permalink /patch/37661/
State Accepted
Headers show

Comments

Yuya Nishihara - Jan. 12, 2019, 4:04 a.m.
On Fri, 11 Jan 2019 12:29:06 +0100, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1547130238 -3600
> #      Thu Jan 10 15:23:58 2019 +0100
> # Node ID 38733dd85595782676175141111a42f253efabb6
> # Parent  427247e84e29c144321d21a825d371458b5d3e1a
> # EXP-Topic revs-efficiency
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 38733dd85595
> revset: enforce "%d" to be interpreted as literal revision number (API)

New behavior looks saner. Please also flag this as (BC). It's exposed as
revset() template function.

>      %r = revset expression, parenthesized
> -    %d = int(arg), no quoting
> +    %d = rev(int(arg)), no quoting

'rev(n)' returns an empty set if n is out of range, whereas 'n' aborts.
Suppose it's pretty much a coding error to pass in an invalid revision to
repo.revs(), we'll probably want aborts.

Maybe we'll need an internal '_rev()' function?

>      %s = string(arg), escaped and single-quoted
>      %b = arg.branch(), escaped and single-quoted
>      %n = hex(arg), single-quoted

We might want to map %n to _node() as well, but that isn't required in this
series.
Boris Feld - Jan. 13, 2019, 7:37 a.m.
On 12/01/2019 05:04, Yuya Nishihara wrote:
> On Fri, 11 Jan 2019 12:29:06 +0100, Boris Feld wrote:
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1547130238 -3600
>> #      Thu Jan 10 15:23:58 2019 +0100
>> # Node ID 38733dd85595782676175141111a42f253efabb6
>> # Parent  427247e84e29c144321d21a825d371458b5d3e1a
>> # EXP-Topic revs-efficiency
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 38733dd85595
>> revset: enforce "%d" to be interpreted as literal revision number (API)
> New behavior looks saner. Please also flag this as (BC). It's exposed as
> revset() template function.
>
>>      %r = revset expression, parenthesized
>> -    %d = int(arg), no quoting
>> +    %d = rev(int(arg)), no quoting
> 'rev(n)' returns an empty set if n is out of range, whereas 'n' aborts.
> Suppose it's pretty much a coding error to pass in an invalid revision to
> repo.revs(), we'll probably want aborts.
>
> Maybe we'll need an internal '_rev()' function?
%ld silently pass on these too, so I would rather have %ld and %d
consistent here (and align on the silence %ld behavior).
>
>>      %s = string(arg), escaped and single-quoted
>>      %b = arg.branch(), escaped and single-quoted
>>      %n = hex(arg), single-quoted
> We might want to map %n to _node() as well, but that isn't required in this
> series.
Yeah, good point. I'll keep this series focused on %d/%ld but will
submit a follow up once this is in.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - Jan. 13, 2019, 9:12 a.m.
On Sun, 13 Jan 2019 08:37:47 +0100, Boris FELD wrote:
> 
> On 12/01/2019 05:04, Yuya Nishihara wrote:
> > On Fri, 11 Jan 2019 12:29:06 +0100, Boris Feld wrote:
> >> # HG changeset patch
> >> # User Boris Feld <boris.feld@octobus.net>
> >> # Date 1547130238 -3600
> >> #      Thu Jan 10 15:23:58 2019 +0100
> >> # Node ID 38733dd85595782676175141111a42f253efabb6
> >> # Parent  427247e84e29c144321d21a825d371458b5d3e1a
> >> # EXP-Topic revs-efficiency
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 38733dd85595
> >> revset: enforce "%d" to be interpreted as literal revision number (API)
> > New behavior looks saner. Please also flag this as (BC). It's exposed as
> > revset() template function.
> >
> >>      %r = revset expression, parenthesized
> >> -    %d = int(arg), no quoting
> >> +    %d = rev(int(arg)), no quoting
> > 'rev(n)' returns an empty set if n is out of range, whereas 'n' aborts.
> > Suppose it's pretty much a coding error to pass in an invalid revision to
> > repo.revs(), we'll probably want aborts.
> >
> > Maybe we'll need an internal '_rev()' function?
> %ld silently pass on these too, so I would rather have %ld and %d
> consistent here (and align on the silence %ld behavior).

Indeed. I never thought %ld would behave in such way. Consistent behavior
should be better.
Boris Feld - Jan. 17, 2019, 7:30 p.m.
On 13/01/2019 10:12, Yuya Nishihara wrote:
> On Sun, 13 Jan 2019 08:37:47 +0100, Boris FELD wrote:
>> On 12/01/2019 05:04, Yuya Nishihara wrote:
>>> On Fri, 11 Jan 2019 12:29:06 +0100, Boris Feld wrote:
>>>> # HG changeset patch
>>>> # User Boris Feld <boris.feld@octobus.net>
>>>> # Date 1547130238 -3600
>>>> #      Thu Jan 10 15:23:58 2019 +0100
>>>> # Node ID 38733dd85595782676175141111a42f253efabb6
>>>> # Parent  427247e84e29c144321d21a825d371458b5d3e1a
>>>> # EXP-Topic revs-efficiency
>>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 38733dd85595
>>>> revset: enforce "%d" to be interpreted as literal revision number (API)
>>> New behavior looks saner. Please also flag this as (BC). It's exposed as
>>> revset() template function.
>>>
>>>>      %r = revset expression, parenthesized
>>>> -    %d = int(arg), no quoting
>>>> +    %d = rev(int(arg)), no quoting
>>> 'rev(n)' returns an empty set if n is out of range, whereas 'n' aborts.
>>> Suppose it's pretty much a coding error to pass in an invalid revision to
>>> repo.revs(), we'll probably want aborts.
>>>
>>> Maybe we'll need an internal '_rev()' function?
>> %ld silently pass on these too, so I would rather have %ld and %d
>> consistent here (and align on the silence %ld behavior).
> Indeed. I never thought %ld would behave in such way. Consistent behavior
> should be better.

Through some buggy revset usage in evolve we discovered that the
optimization to passthrough input directly for %ld changed this
behavior. In these cases, the revisions are passed as is and
filtered/out-of-bound revision raise their usual error.

Reintroducing the silent filtering behavior for %ld seems possible (it
will have a performance impact, but still faster than the previous
serialization).

However, it is probably more efficient and saner to have these errors
raised. In this case, we should align all %ld case and %d behaviors on
the error raising version. What do you think?

I'll try to get a patch out tomorrow to align things in a direction or
the other.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - Jan. 17, 2019, 8:36 p.m.
Makes sense to me. That's how it would work on the command line too.

On Thu, Jan 17, 2019, 11:32 Boris FELD <boris.feld@octobus.net> wrote:

>
> On 13/01/2019 10:12, Yuya Nishihara wrote:
> > On Sun, 13 Jan 2019 08:37:47 +0100, Boris FELD wrote:
> >> On 12/01/2019 05:04, Yuya Nishihara wrote:
> >>> On Fri, 11 Jan 2019 12:29:06 +0100, Boris Feld wrote:
> >>>> # HG changeset patch
> >>>> # User Boris Feld <boris.feld@octobus.net>
> >>>> # Date 1547130238 -3600
> >>>> #      Thu Jan 10 15:23:58 2019 +0100
> >>>> # Node ID 38733dd85595782676175141111a42f253efabb6
> >>>> # Parent  427247e84e29c144321d21a825d371458b5d3e1a
> >>>> # EXP-Topic revs-efficiency
> >>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
> -r 38733dd85595
> >>>> revset: enforce "%d" to be interpreted as literal revision number
> (API)
> >>> New behavior looks saner. Please also flag this as (BC). It's exposed
> as
> >>> revset() template function.
> >>>
> >>>>      %r = revset expression, parenthesized
> >>>> -    %d = int(arg), no quoting
> >>>> +    %d = rev(int(arg)), no quoting
> >>> 'rev(n)' returns an empty set if n is out of range, whereas 'n' aborts.
> >>> Suppose it's pretty much a coding error to pass in an invalid revision
> to
> >>> repo.revs(), we'll probably want aborts.
> >>>
> >>> Maybe we'll need an internal '_rev()' function?
> >> %ld silently pass on these too, so I would rather have %ld and %d
> >> consistent here (and align on the silence %ld behavior).
> > Indeed. I never thought %ld would behave in such way. Consistent behavior
> > should be better.
>
> Through some buggy revset usage in evolve we discovered that the
> optimization to passthrough input directly for %ld changed this
> behavior. In these cases, the revisions are passed as is and
> filtered/out-of-bound revision raise their usual error.
>
> Reintroducing the silent filtering behavior for %ld seems possible (it
> will have a performance impact, but still faster than the previous
> serialization).
>
> However, it is probably more efficient and saner to have these errors
> raised. In this case, we should align all %ld case and %d behaviors on
> the error raising version. What do you think?
>
> I'll try to get a patch out tomorrow to align things in a direction or
> the other.
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel@mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Yuya Nishihara - Jan. 18, 2019, 12:26 p.m.
> On Thu, Jan 17, 2019, 11:32 Boris FELD <boris.feld@octobus.net> wrote:
> > On 13/01/2019 10:12, Yuya Nishihara wrote:
> > > On Sun, 13 Jan 2019 08:37:47 +0100, Boris FELD wrote:
> > >> On 12/01/2019 05:04, Yuya Nishihara wrote:
> > >>> On Fri, 11 Jan 2019 12:29:06 +0100, Boris Feld wrote:
> > >>>> # HG changeset patch
> > >>>> # User Boris Feld <boris.feld@octobus.net>
> > >>>> # Date 1547130238 -3600
> > >>>> #      Thu Jan 10 15:23:58 2019 +0100
> > >>>> # Node ID 38733dd85595782676175141111a42f253efabb6
> > >>>> # Parent  427247e84e29c144321d21a825d371458b5d3e1a
> > >>>> # EXP-Topic revs-efficiency
> > >>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
> > >>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
> > -r 38733dd85595
> > >>>> revset: enforce "%d" to be interpreted as literal revision number
> > (API)
> > >>> New behavior looks saner. Please also flag this as (BC). It's exposed
> > as
> > >>> revset() template function.
> > >>>
> > >>>>      %r = revset expression, parenthesized
> > >>>> -    %d = int(arg), no quoting
> > >>>> +    %d = rev(int(arg)), no quoting
> > >>> 'rev(n)' returns an empty set if n is out of range, whereas 'n' aborts.
> > >>> Suppose it's pretty much a coding error to pass in an invalid revision
> > to
> > >>> repo.revs(), we'll probably want aborts.
> > >>>
> > >>> Maybe we'll need an internal '_rev()' function?
> > >> %ld silently pass on these too, so I would rather have %ld and %d
> > >> consistent here (and align on the silence %ld behavior).
> > > Indeed. I never thought %ld would behave in such way. Consistent behavior
> > > should be better.
> >
> > Through some buggy revset usage in evolve we discovered that the
> > optimization to passthrough input directly for %ld changed this
> > behavior. In these cases, the revisions are passed as is and
> > filtered/out-of-bound revision raise their usual error.
> >
> > Reintroducing the silent filtering behavior for %ld seems possible (it
> > will have a performance impact, but still faster than the previous
> > serialization).
> >
> > However, it is probably more efficient and saner to have these errors
> > raised. In this case, we should align all %ld case and %d behaviors on
> > the error raising version. What do you think?

On Thu, 17 Jan 2019 12:36:50 -0800, Martin von Zweigbergk wrote:
> Makes sense to me. That's how it would work on the command line too.

+1

Patch

diff --git a/mercurial/revsetlang.py b/mercurial/revsetlang.py
--- a/mercurial/revsetlang.py
+++ b/mercurial/revsetlang.py
@@ -583,7 +583,7 @@  def _quote(s):
 
 def _formatargtype(c, arg):
     if c == 'd':
-        return '%d' % int(arg)
+        return 'rev(%d)' % int(arg)
     elif c == 's':
         return _quote(arg)
     elif c == 'r':
@@ -638,7 +638,7 @@  def formatspec(expr, *args):
     Supported arguments:
 
     %r = revset expression, parenthesized
-    %d = int(arg), no quoting
+    %d = rev(int(arg)), no quoting
     %s = string(arg), escaped and single-quoted
     %b = arg.branch(), escaped and single-quoted
     %n = hex(arg), single-quoted
@@ -650,9 +650,9 @@  def formatspec(expr, *args):
     >>> formatspec(b'%r:: and %lr', b'10 or 11', (b"this()", b"that()"))
     '(10 or 11):: and ((this()) or (that()))'
     >>> formatspec(b'%d:: and not %d::', 10, 20)
-    '10:: and not 20::'
+    'rev(10):: and not rev(20)::'
     >>> formatspec(b'%ld or %ld', [], [1])
-    "_list('') or 1"
+    "_list('') or rev(1)"
     >>> formatspec(b'keyword(%s)', b'foo\\xe9')
     "keyword('foo\\\\xe9')"
     >>> b = lambda: b'default'
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -723,7 +723,7 @@  def revrange(repo, specs, localalias=Non
     allspecs = []
     for spec in specs:
         if isinstance(spec, int):
-            spec = revsetlang.formatspec('rev(%d)', spec)
+            spec = revsetlang.formatspec('%d', spec)
         allspecs.append(spec)
     return repo.anyrevs(allspecs, user=True, localalias=localalias)