Patchwork D3841: stringutil: add a new function to do minimal regex escaping

Submitter phabricator
Date June 26, 2018, 5:51 p.m.
Message ID <a99c9d27ee91556897398c613bcbff88@localhost.localdomain>
Permalink /patch/32444/
State Not Applicable

Comments

phabricator - June 26, 2018, 5:51 p.m.
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG96f65bdf0bf4: stringutil: add a new function to do minimal regex escaping (authored by durin42, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D3841?vs=9308&id=9312

REVISION DETAIL
  https://phab.mercurial-scm.org/D3841

AFFECTED FILES
  mercurial/utils/stringutil.py

CHANGE DETAILS




To: durin42, #hg-reviewers, pulkit
Cc: mercurial-devel

Patch

diff --git a/mercurial/utils/stringutil.py b/mercurial/utils/stringutil.py
--- a/mercurial/utils/stringutil.py
+++ b/mercurial/utils/stringutil.py
@@ -23,6 +23,25 @@ 
     pycompat,
 )
 
+# regex special chars pulled from https://bugs.python.org/issue29995
+# which was part of Python 3.7.
+_respecial = pycompat.bytestr(b'()[]{}?*+-|^$\\.# \t\n\r\v\f')
+_regexescapemap = {ord(i): (b'\\' + i).decode('latin1') for i in _respecial}
+
+def reescape(pat):
+    """Drop-in replacement for re.escape."""
+    # NOTE: it is intentional that this works on unicodes and not
+    # bytes, as it's only possible to do the escaping with
+    # unicode.translate, not bytes.translate. Sigh.
+    wantuni = True
+    if isinstance(pat, bytes):
+        wantuni = False
+        pat = pat.decode('latin1')
+    pat = pat.translate(_regexescapemap)
+    if wantuni:
+        return pat
+    return pat.encode('latin1')
+
 def pprint(o, bprefix=False):
     """Pretty print an object."""
     if isinstance(o, bytes):
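
For illustration only, here is a minimal standalone sketch of the translate-based escaping the patch adds. It assumes Python 3 and drops the pycompat.bytestr wrapper, so it is not the code as committed. The special-character set is copied from the patch. The bytes path round-trips through latin1 so that str.translate can do the work, as the NOTE in the patch explains.

```python
import re

# Same special-character set as the patch, written as a plain bytes literal
# (the patch wraps it in pycompat.bytestr, which is omitted in this sketch).
_respecial = b'()[]{}?*+-|^$\\.# \t\n\r\v\f'

# Iterating bytes on Python 3 yields ints, which is the key type
# str.translate expects. Each code point maps to its escaped form.
_regexescapemap = {c: '\\' + chr(c) for c in _respecial}

def reescape(pat):
    """Escape only the regex-special characters in a str or bytes pattern."""
    if isinstance(pat, bytes):
        # latin1 maps every byte to one code point, so the round trip is lossless.
        return pat.decode('latin1').translate(_regexescapemap).encode('latin1')
    return pat.translate(_regexescapemap)

print(reescape('a.b*c'))     # a\.b\*c
print(reescape(b'foo/bar'))  # b'foo/bar'
```

Characters outside the special set, such as '/', pass through unchanged, and the return type matches the input type.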