Patchwork [1,of,2] template: add shortestnode keyword

login
register
mail settings
Submitter Durham Goode
Date Jan. 17, 2014, 8:45 a.m.
Message ID <b18359a70b640d2aeb3f.1389948329@dev010.prn1.facebook.com>
Download mbox | patch
Permalink /patch/3365/
State Deferred
Headers show

Comments

Durham Goode - Jan. 17, 2014, 8:45 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1389946237 28800
#      Fri Jan 17 00:10:37 2014 -0800
# Node ID b18359a70b640d2aeb3f4afd1c8e47d775402b8e
# Parent  6545770bd37991b4ff0400479455a6e3ffa5976b
template: add shortestnode keyword

Adds a '{shortestnode}' template keyword that results in the shortest hex node
that uniquely identifies the changeset at that time.

I wanted to do this as a filter, like '{node|shortest}', but filters don't
have access to the repo.
Mads Kiilerich - Jan. 17, 2014, 1:24 p.m.
On 01/17/2014 09:45 AM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1389946237 28800
> #      Fri Jan 17 00:10:37 2014 -0800
> # Node ID b18359a70b640d2aeb3f4afd1c8e47d775402b8e
> # Parent  6545770bd37991b4ff0400479455a6e3ffa5976b
> template: add shortestnode keyword
>
> Adds a '{shortestnode}' template keyword that results in the shortest hex node
> that uniquely identifies the changeset at that time.

Nice.
(I could imagine that it could be nice to have a way specify that I want 
n extra digits to make it more likely that it stay unique in the future 
or with other developer's local changes. But I haven't tried this yet 
and do not if that would solve a real problem.)

> I wanted to do this as a filter, like '{node|shortest}', but filters don't
> have access to the repo.

Could that be fixed so we could do it the preferred way instead?

> diff --git a/mercurial/templatekw.py b/mercurial/templatekw.py
> --- a/mercurial/templatekw.py
> +++ b/mercurial/templatekw.py
> @@ -307,6 +307,28 @@
>       """
>       return ctx.hex()
>   
> +def showshortestnode(repo, ctx, templ, **args):
> +    """:shortestnode: String. The shortest changeset identification hash that
> +    uniquely identifies the changeset.
> +    """
> +    node = ctx.hex()
> +
> +    shortest = node
> +    length = 6
> +    cl = repo.changelog
> +    while True:
> +        test = node[:length]
> +        try:
> +            cl.index.partialmatch(test)
> +            if length == 4:
> +                return test
> +            length -= 1
> +            shortest = test
> +        except error.RevlogError:
> +            length += 1
> +            if len(shortest) == length:
> +                return shortest

AFAICS, if the shortest hash has length 7, it will make a lookup with 
prefix 6 twice. A deliberate trade-off?

Wouldn't it be possible to do this much more efficiently using 
partialmatch internals?

/Mads
Augie Fackler - Jan. 17, 2014, 2:20 p.m.
On Fri, Jan 17, 2014 at 12:45:29AM -0800, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1389946237 28800
> #      Fri Jan 17 00:10:37 2014 -0800
> # Node ID b18359a70b640d2aeb3f4afd1c8e47d775402b8e
> # Parent  6545770bd37991b4ff0400479455a6e3ffa5976b
> template: add shortestnode keyword

I see what, but not a why. What's the motivation for having some sort
of shortnode less than 12 bytes?

>
> Adds a '{shortestnode}' template keyword that results in the shortest hex node
> that uniquely identifies the changeset at that time.
>
> I wanted to do this as a filter, like '{node|shortest}', but filters don't
> have access to the repo.
>
> diff --git a/mercurial/templatekw.py b/mercurial/templatekw.py
> --- a/mercurial/templatekw.py
> +++ b/mercurial/templatekw.py
> @@ -307,6 +307,28 @@
>      """
>      return ctx.hex()
>
> +def showshortestnode(repo, ctx, templ, **args):
> +    """:shortestnode: String. The shortest changeset identification hash that
> +    uniquely identifies the changeset.
> +    """
> +    node = ctx.hex()
> +
> +    shortest = node
> +    length = 6
> +    cl = repo.changelog
> +    while True:
> +        test = node[:length]
> +        try:
> +            cl.index.partialmatch(test)
> +            if length == 4:
> +                return test
> +            length -= 1
> +            shortest = test
> +        except error.RevlogError:
> +            length += 1
> +            if len(shortest) == length:
> +                return shortest
> +
>  def showp1rev(repo, ctx, templ, **args):
>      """:p1rev: Integer. The repository-local revision number of the changeset's
>      first parent, or -1 if the changeset has no parents."""
> @@ -381,6 +403,7 @@
>      'phase': showphase,
>      'phaseidx': showphaseidx,
>      'rev': showrev,
> +    'shortestnode': showshortestnode,
>      'tags': showtags,
>  }
>
> diff --git a/tests/test-command-template.t b/tests/test-command-template.t
> --- a/tests/test-command-template.t
> +++ b/tests/test-command-template.t
> @@ -1626,3 +1626,12 @@
>
>    $ hg log -r 0 --template '{if(branches, "yes", "no")}\n'
>    no
> +
> +Test shortestnode keyword:
> +
> +  $ echo b > b
> +  $ hg ci -qAm b
> +  $ hg log --template '{shortestnode}\n'
> +  d97c
> +  f776
> +
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Durham Goode - Jan. 17, 2014, 6:16 p.m.
On 1/17/14 6:20 AM, "Augie Fackler" <raf@durin42.com> wrote:

>On Fri, Jan 17, 2014 at 12:45:29AM -0800, Durham Goode wrote:
>> # HG changeset patch
>> # User Durham Goode <durham@fb.com>
>> # Date 1389946237 28800
>> #      Fri Jan 17 00:10:37 2014 -0800
>> # Node ID b18359a70b640d2aeb3f4afd1c8e47d775402b8e
>> # Parent  6545770bd37991b4ff0400479455a6e3ffa5976b
>> template: add shortestnode keyword
>
>I see what, but not a why. What's the motivation for having some sort
>of shortnode less than 12 bytes?

Because it's easier to read and type.  In particular, on small repos (like
the mercurial repo), it ends up being 4 characters, which I can fit in my
human memory for typing.  Rev numbers don't work for this because there's
a lot of repetition in the first few digits, making it hard to remember
for typing.

It's the difference between my log looking like this:

@  durham   shortestnode
|  77cf62dd2af0 template: add pad function for padding output
|
o  durham
|  b18359a70b64 template: add shortestnode keyword
|
o  pierre-yves @
|  6545770bd379 backout: add a message after backout that need ...
|
| o  durham   manifestcache
|/   93f07171bf64 manifest cache
|
| o  durham   catperf
| |  c7652078e0d4 cat: increase perf when catting single files
| |
| o  durham
|/   9c531e73a03e changectx: increase perf of walk function
|



And like this:

@  durham  shortestnode
|  77cf    template: add pad function for padding output
|
o  durham
|  b183    template: add shortestnode keyword
|
o  pierre-yves @
|  6545    backout: add a message after backout that need manual commit
|
| o  durham   manifestcache
|/   93f0    manifest cache
|
| o  durham   catperf
| |  c765    cat: increase perf when catting single files
| |
| o  durham
|/   9c53    changectx: increase perf of walk function
|
Durham Goode - Jan. 17, 2014, 6:29 p.m.
On 1/17/14 5:24 AM, "Mads Kiilerich" <mads@kiilerich.com> wrote:

>On 01/17/2014 09:45 AM, Durham Goode wrote:
>> # HG changeset patch
>> # User Durham Goode <durham@fb.com>
>> # Date 1389946237 28800
>> #      Fri Jan 17 00:10:37 2014 -0800
>> # Node ID b18359a70b640d2aeb3f4afd1c8e47d775402b8e
>> # Parent  6545770bd37991b4ff0400479455a6e3ffa5976b
>> template: add shortestnode keyword
>>
>> I wanted to do this as a filter, like '{node|shortest}', but filters
>>don't
>> have access to the repo.
>
>Could that be fixed so we could do it the preferred way instead?

I wasn't sure if filters were meant to be super simple string
manipulations. I also thought the current version would be more likely to
get accepted before the code freeze than a version that touched every
filter method.

>
>> diff --git a/mercurial/templatekw.py b/mercurial/templatekw.py
>> --- a/mercurial/templatekw.py
>> +++ b/mercurial/templatekw.py
>> @@ -307,6 +307,28 @@
>>       """
>>       return ctx.hex()
>>   
>> +def showshortestnode(repo, ctx, templ, **args):
>> +    """:shortestnode: String. The shortest changeset identification
>>hash that
>> +    uniquely identifies the changeset.
>> +    """
>> +    node = ctx.hex()
>> +
>> +    shortest = node
>> +    length = 6
>> +    cl = repo.changelog
>> +    while True:
>> +        test = node[:length]
>> +        try:
>> +            cl.index.partialmatch(test)
>> +            if length == 4:
>> +                return test
>> +            length -= 1
>> +            shortest = test
>> +        except error.RevlogError:
>> +            length += 1
>> +            if len(shortest) == length:
>> +                return shortest
>
>AFAICS, if the shortest hash has length 7, it will make a lookup with
>prefix 6 twice. A deliberate trade-off?

Yes, but it's unlikely that 7 will be needed (3% chance in our big repo),
so I figured the algorithmic cleanliness was preferrable.

>
>Wouldn't it be possible to do this much more efficiently using
>partialmatch internals?

Probably, but the performance of this implementation seems pretty good.
Logging 1000 commits in hg-crew using {shortestnode} takes only 0.03
seconds longer than using {node} (0.37s vs 0.34s).  Even on our internal
large repo it only adds 0.15 seconds.
Matt Mackall - Jan. 18, 2014, 3:35 a.m.
On Fri, 2014-01-17 at 18:29 +0000, Durham Goode wrote:
> On 1/17/14 5:24 AM, "Mads Kiilerich" <mads@kiilerich.com> wrote:
> 
> >On 01/17/2014 09:45 AM, Durham Goode wrote:
> >> # HG changeset patch
> >> # User Durham Goode <durham@fb.com>
> >> # Date 1389946237 28800
> >> #      Fri Jan 17 00:10:37 2014 -0800
> >> # Node ID b18359a70b640d2aeb3f4afd1c8e47d775402b8e
> >> # Parent  6545770bd37991b4ff0400479455a6e3ffa5976b
> >> template: add shortestnode keyword
> >>
> >> I wanted to do this as a filter, like '{node|shortest}', but filters
> >>don't
> >> have access to the repo.
> >
> >Could that be fixed so we could do it the preferred way instead?
> 
> I wasn't sure if filters were meant to be super simple string
> manipulations.

Less and less so. I'm actually trying to move away from the filter-style
syntax to a function-style (eg the equivalent {shortest(node)}) and
functions can be much richer.

>  I also thought the current version would be more likely to
> get accepted before the code freeze than a version that touched every
> filter method.

I think there's maybe a way to do this that avoids that. For instance,
make it a template function instead of a filter and pass the repo or
changectx around in the context or mapping. 

Having access to the repo or context from templates opens up a whole
bunch of possibilities.

Patch

diff --git a/mercurial/templatekw.py b/mercurial/templatekw.py
--- a/mercurial/templatekw.py
+++ b/mercurial/templatekw.py
@@ -307,6 +307,28 @@ 
     """
     return ctx.hex()
 
+def showshortestnode(repo, ctx, templ, **args):
+    """:shortestnode: String. The shortest changeset identification hash that
+    uniquely identifies the changeset.
+    """
+    node = ctx.hex()
+
+    shortest = node
+    length = 6
+    cl = repo.changelog
+    while True:
+        test = node[:length]
+        try:
+            cl.index.partialmatch(test)
+            if length == 4:
+                return test
+            length -= 1
+            shortest = test
+        except error.RevlogError:
+            length += 1
+            if len(shortest) == length:
+                return shortest
+
 def showp1rev(repo, ctx, templ, **args):
     """:p1rev: Integer. The repository-local revision number of the changeset's
     first parent, or -1 if the changeset has no parents."""
@@ -381,6 +403,7 @@ 
     'phase': showphase,
     'phaseidx': showphaseidx,
     'rev': showrev,
+    'shortestnode': showshortestnode,
     'tags': showtags,
 }
 
diff --git a/tests/test-command-template.t b/tests/test-command-template.t
--- a/tests/test-command-template.t
+++ b/tests/test-command-template.t
@@ -1626,3 +1626,12 @@ 
 
   $ hg log -r 0 --template '{if(branches, "yes", "no")}\n'
   no
+
+Test shortestnode keyword:
+
+  $ echo b > b
+  $ hg ci -qAm b
+  $ hg log --template '{shortestnode}\n'
+  d97c
+  f776
+