Patchwork [7,of,8] match: provide and use a quick way to escape a single byte

Submitter Boris Feld
Date Nov. 21, 2018, 6:33 p.m.
Message ID <8992601f1d1a54a42c98.1542825237@localhost.localdomain>
Permalink /patch/36705/
State Accepted

Comments

Boris Feld - Nov. 21, 2018, 6:33 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1542653684 0
#      Mon Nov 19 18:54:44 2018 +0000
# Node ID 8992601f1d1a54a42c984518097a246b49c06c12
# Parent  03b60ccca50c77a552de85ac3402c5174539f150
# EXP-Topic perf-ignore
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 8992601f1d1a
match: provide and use a quick way to escape a single byte

The previous function, `util.stringutil.reescape`, carries a lot of overhead
(not least the function call itself). In the `_globre` case, we always escape
a single byte, so we provide a dictionary dedicated to this use case. We use
the dictionary's `get` method directly to avoid an extra Python-level function
call, since those are expensive in Python.

Again, this yields a very significant performance gain:

Before: ! wall 0.059793 comb 0.060000 user 0.060000 sys 0.000000 (median of 100)
After:  ! wall 0.020390 comb 0.020000 user 0.020000 sys 0.000000 (median of 146)

Total improvement for the full series:

Before: ! wall 0.153153 comb 0.150000 user 0.150000 sys 0.000000 (median of 66)
After:  ! wall 0.020390 comb 0.020000 user 0.020000 sys 0.000000 (median of 146)
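
For readers who want to see the effect in isolation, the following is a small `timeit` sketch, not a reproduction of the measurements above. It assumes `re.escape` as a stand-in for the previous `reescape` function, and `via_function`, `via_dict`, and the sample pattern are hypothetical; absolute timings will vary by machine and Python version.

```python
import re
import timeit

_special = b'()[]{}?*+-|^$\\.&~# \t\n\r\v\f'
escapemap = {_special[i:i + 1]: b'\\' + _special[i:i + 1]
             for i in range(len(_special))}

def via_function(pat):
    """Escape each byte with a full function call (re.escape as stand-in)."""
    escape = re.escape
    res = b''
    for i in range(len(pat)):
        res += escape(pat[i:i + 1])
    return res

def via_dict(pat):
    """Escape each byte with a direct dictionary lookup."""
    escape = escapemap.get
    res = b''
    for i in range(len(pat)):
        c = pat[i:i + 1]
        res += escape(c, c)
    return res

sample = b'build*[abc]?.o'  # hypothetical input, chosen only for illustration
t_func = timeit.timeit(lambda: via_function(sample), number=20000)
t_dict = timeit.timeit(lambda: via_dict(sample), number=20000)
print('function: %.4fs  dict: %.4fs' % (t_func, t_dict))
```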
Yuya Nishihara - Nov. 22, 2018, 1:04 p.m.
On Wed, 21 Nov 2018 19:33:57 +0100, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1542653684 0
> #      Mon Nov 19 18:54:44 2018 +0000
> # Node ID 8992601f1d1a54a42c984518097a246b49c06c12
> # Parent  03b60ccca50c77a552de85ac3402c5174539f150
> # EXP-Topic perf-ignore
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 8992601f1d1a
> match: provide and use a quick way to escape a single byte

Queued 7-8, thanks.

Patch

diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -1057,14 +1057,14 @@  def _globre(pat):
     i, n = 0, len(pat)
     res = ''
     group = 0
-    escape = util.stringutil.reescape
+    escape = util.stringutil.regexbytesescapemap.get
     def peek():
         return i < n and pat[i:i + 1]
     while i < n:
         c = pat[i:i + 1]
         i += 1
         if c not in '*?[{},\\':
-            res += escape(c)
+            res += escape(c, c)
         elif c == '*':
             if peek() == '*':
                 i += 1
@@ -1105,11 +1105,11 @@  def _globre(pat):
             p = peek()
             if p:
                 i += 1
-                res += escape(p)
+                res += escape(p, p)
             else:
-                res += escape(c)
+                res += escape(c, c)
         else:
-            res += escape(c)
+            res += escape(c, c)
     return res
 
 def _regex(kind, pat, globsuffix):
diff --git a/mercurial/utils/stringutil.py b/mercurial/utils/stringutil.py
--- a/mercurial/utils/stringutil.py
+++ b/mercurial/utils/stringutil.py
@@ -28,6 +28,7 @@  from .. import (
 # which was part of Python 3.7.
 _respecial = pycompat.bytestr(b'()[]{}?*+-|^$\\.&~# \t\n\r\v\f')
 _regexescapemap = {ord(i): (b'\\' + i).decode('latin1') for i in _respecial}
+regexbytesescapemap = {i: (b'\\' + i) for i in _respecial}
 
 def reescape(pat):
     """Drop-in replacement for re.escape."""