Patchwork namespace: fastpath name lookup on invalid name

Submitter Boris Feld
Date Feb. 22, 2018, 7:33 p.m.
Message ID <b65a85952c09cf4c71a1.1519328016@FB>
Permalink /patch/28261/
State New

Comments

Boris Feld - Feb. 22, 2018, 7:33 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1519313522 -3600
#      Thu Feb 22 16:32:02 2018 +0100
# Node ID b65a85952c09cf4c71a1458fbc4ec77c49683314
# Parent  428de1a59f2df3d6d07ff1d7164c8ee56cbb7825
# EXP-Topic noname
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r b65a85952c09
namespace: fastpath name lookup on invalid name

Since labels cannot contain leading or trailing whitespace, we can skip the
lookup for such names. This is useful in repositories with slow labels (e.g.
special types of tags). Short commands running on a specific revision can
benefit from such a shortcut.

e.g. on a repository where loading tags takes 0.4s:

1: hg log --template '{node}\n' --rev 'rev(0)'
   0.560 seconds

2: hg log --template '{node}\n' --rev ' rev(0)'
   0.109 seconds

This changeset introduces a generic way to do such fast-pathing, to help
extension writers apply the same principle to their own extensions.
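
To make the mechanism concrete, here is a minimal standalone sketch of the
wrapper idea, not the patch itself. It assumes Python 3 and therefore uses a
str pattern where the patch uses a bytes pattern. The names `slowtaglookup`,
`calls`, and the dict standing in for a repository are hypothetical and exist
only for illustration.

```python
import re


def lazynamemap(regexp, function):
    """Wrap a namemap function so it is only called for matching names."""
    namefilter = re.compile(regexp)

    def namemap(repo, name):
        if namefilter.match(name):
            return function(repo, name)
        # Invalid name: answer immediately, never touching the slow lookup.
        return []
    return namemap


# Same pattern as commonfilter in the patch, written as a str pattern.
commonfilter = '^[^ :\0\r\n]+([^:\0\r\n]*[^ :\0\r\n]+)?$'

# Records every name that actually reaches the slow lookup.
calls = []


def slowtaglookup(repo, name):
    # Hypothetical stand-in for an expensive tag lookup.
    calls.append(name)
    return repo.get(name, [])


# Hypothetical stand-in for a repository: a plain dict of tag -> nodes.
repo = {'v1.0': ['node-a']}
tagnamemap = lazynamemap(commonfilter, slowtaglookup)

valid = tagnamemap(repo, 'v1.0')        # well-formed name: lookup runs
skipped = tagnamemap(repo, ' rev(0)')   # leading space: lookup is skipped
```

In this sketch only 'v1.0' reaches `slowtaglookup`; the name with a leading
space is rejected by the filter first, which mirrors the timing difference
between the two commands above.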
Yuya Nishihara - Feb. 24, 2018, 5:03 a.m.
On Thu, 22 Feb 2018 20:33:36 +0100, Boris Feld wrote:

So is this basically the same as the previous version in that we have to
suggest using a weird syntax (leading/trailing space) to get to the fast path?

https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-February/111432.html
> Instead, maybe we can make lookup() to not search slow labels assuming these
> labeling schemes didn't exist in pre-revset era. Alternatively, we could add
> a config knob to switch off the old-style range support.
Boris Feld - Feb. 26, 2018, 10:47 a.m.
On 24/02/2018 06:03, Yuya Nishihara wrote:
> So is this basically the same as the previous version in that we have to
> suggest using a weird syntax (leading/trailing space) to get to the fast path?
>
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-February/111432.html
Yes, it's a cleaner implementation of the first proposal.

I just sent the email that should have accompanied this patch; it
proposes other potential solutions.
>> Instead, maybe we can make lookup() to not search slow labels assuming these
>> labeling schemes didn't exist in pre-revset era. Alternatively, we could add
>> a config knob to switch off the old-style range support.

Patch

diff --git a/mercurial/namespaces.py b/mercurial/namespaces.py
--- a/mercurial/namespaces.py
+++ b/mercurial/namespaces.py
@@ -1,5 +1,7 @@ 
 from __future__ import absolute_import
 
+import re
+
 from .i18n import _
 from . import (
     templatekw,
@@ -15,6 +17,23 @@  def tolist(val):
     else:
         return [val]
 
+def lazynamemap(regexp, function):
+    """wrap a namemap function in order to call it only if the name matches a
+    regexp
+    """
+    namefilter = re.compile(regexp)
+
+    def namemap(repo, name):
+        if namefilter.match(name):
+            return function(repo, name)
+        return []
+    return namemap
+
+# no ":", "\n", "\0", "\r" within the name, no " " around it.
+commonfilter = b'^[^ :\0\r\n]+([^:\0\r\n]*[^ :\0\r\n]+)?$'
+# at some point we allowed ":" in branch names.
+branchfilter = b'^[^ \0\r\n]+([^\0\r\n]*[^ \0\r\n]+)?$'
+
 class namespaces(object):
     """provides an interface to register and operate on multiple namespaces. See
     the namespace class below for details on the namespace object.
@@ -30,7 +49,8 @@  class namespaces(object):
         # we need current mercurial named objects (bookmarks, tags, and
         # branches) to be initialized somewhere, so that place is here
         bmknames = lambda repo: repo._bookmarks.keys()
-        bmknamemap = lambda repo, name: tolist(repo._bookmarks.get(name))
+        _bmknamemap = lambda repo, name: tolist(repo._bookmarks.get(name))
+        bmknamemap = lazynamemap(commonfilter, _bmknamemap)
         bmknodemap = lambda repo, node: repo.nodebookmarks(node)
         n = namespace("bookmarks", templatename="bookmark",
                       logfmt=columns['bookmark'],
@@ -40,7 +60,8 @@  class namespaces(object):
         self.addnamespace(n)
 
         tagnames = lambda repo: [t for t, n in repo.tagslist()]
-        tagnamemap = lambda repo, name: tolist(repo._tagscache.tags.get(name))
+        _tagnamemap = lambda repo, name: tolist(repo._tagscache.tags.get(name))
+        tagnamemap = lazynamemap(commonfilter, _tagnamemap)
         tagnodemap = lambda repo, node: repo.nodetags(node)
         n = namespace("tags", templatename="tag",
                       logfmt=columns['tag'],
@@ -51,7 +72,8 @@  class namespaces(object):
         self.addnamespace(n)
 
         bnames = lambda repo: repo.branchmap().keys()
-        bnamemap = lambda repo, name: tolist(repo.branchtip(name, True))
+        _bnamemap = lambda repo, name: tolist(repo.branchtip(name, True))
+        bnamemap = lazynamemap(branchfilter, _bnamemap)
         bnodemap = lambda repo, node: [repo[node].branch()]
         n = namespace("branches", templatename="branch",
                       logfmt=columns['branch'],