Patchwork [2,of,3,STABLE,V2] templater: use unfiltered changelog to calculate shortest() at O(log(N))

login
register
mail settings
Submitter Yuya Nishihara
Date Oct. 25, 2016, 2:17 p.m.
Message ID <ecbce2fe4dea116c925a.1477405033@mimosa>
Download mbox | patch
Permalink /patch/17192/
State Accepted
Headers show

Comments

Yuya Nishihara - Oct. 25, 2016, 2:17 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1477399770 -32400
#      Tue Oct 25 21:49:30 2016 +0900
# Branch stable
# Node ID ecbce2fe4dea116c925a2fecd1b7b50df0a62589
# Parent  869574e70105ec60b88b1bb85a12369e5e560279
templater: use unfiltered changelog to calculate shortest() at O(log(N))

cl._partialmatch() can be pretty slow if hidden revisions are involved. This
patch cancels the slowdown introduced by the previous patch by using an
unfiltered changelog, which means shortest(node) isn't always the shortest.

The result isn't perfect, but seems okay as long as shortest(node) is short
enough to type and can be used as an identifier.

  (with hidden revisions)
  % hg log -R hg-committed -r0:20000 -T '{node|shortest}\n' --time > /dev/null
  (.^^) time: real 1.530 secs (user 1.480+0.000 sys 0.040+0.000)
  (.^)  time: real 43.080 secs (user 43.060+0.000 sys 0.030+0.000)
  (.)   time: real 1.680 secs (user 1.650+0.000 sys 0.020+0.000)
Yuya Nishihara - Oct. 26, 2016, 11:33 a.m.
On Tue, 25 Oct 2016 23:17:13 +0900, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1477399770 -32400
> #      Tue Oct 25 21:49:30 2016 +0900
> # Branch stable
> # Node ID ecbce2fe4dea116c925a2fecd1b7b50df0a62589
> # Parent  869574e70105ec60b88b1bb85a12369e5e560279
> templater: use unfiltered changelog to calculate shortest() at O(log(N))
                                                                 ^^^^^^^^^

The summary line was wrong. It should say s/O(log(N))/constant time/ since
we use prefix tree and the length of node characters is constant.
Pierre-Yves David - Oct. 28, 2016, 8:31 a.m.
On 10/26/2016 01:33 PM, Yuya Nishihara wrote:
> On Tue, 25 Oct 2016 23:17:13 +0900, Yuya Nishihara wrote:
>> # HG changeset patch
>> # User Yuya Nishihara <yuya@tcha.org>
>> # Date 1477399770 -32400
>> #      Tue Oct 25 21:49:30 2016 +0900
>> # Branch stable
>> # Node ID ecbce2fe4dea116c925a2fecd1b7b50df0a62589
>> # Parent  869574e70105ec60b88b1bb85a12369e5e560279
>> templater: use unfiltered changelog to calculate shortest() at O(log(N))
>                                                                  ^^^^^^^^^
>
> The summary line was wrong. It should say s/O(log(N))/constant time/ since
> we use prefix tree and the length of node characters is constant.

I've pushed this. Thanks for taking care of this.

Cheers,

Patch

diff --git a/mercurial/templater.py b/mercurial/templater.py
--- a/mercurial/templater.py
+++ b/mercurial/templater.py
@@ -837,7 +837,10 @@  def shortest(context, mapping, args):
                                 # i18n: "shortest" is a keyword
                                 _("shortest() expects an integer minlength"))
 
-    cl = mapping['ctx']._repo.changelog
+    # _partialmatch() of filtered changelog could take O(len(repo)) time,
+    # which would be unacceptably slow. so we look for hash collision in
+    # unfiltered space, which means some hashes may be slightly longer.
+    cl = mapping['ctx']._repo.unfiltered().changelog
     def isvalid(test):
         try:
             if cl._partialmatch(test) is None:
diff --git a/tests/test-command-template.t b/tests/test-command-template.t
--- a/tests/test-command-template.t
+++ b/tests/test-command-template.t
@@ -3469,9 +3469,10 @@  Test shortest(node) with the repo having
   4:107
 
  node 'c562' should be unique if the other 'c562' nodes are hidden
+ (but we don't try the slow path to filter out hidden nodes for now)
 
   $ hg log -r 8 -T '{rev}:{node|shortest}\n'
-  8:c562
+  8:c5625
   $ hg log -r 8:10 -T '{rev}:{node|shortest}\n' --hidden
   8:c5625
   9:c5623