comparison mercurial/utils/stringutil.py @ 38478:96f65bdf0bf4

stringutil: add a new function to do minimal regex escaping

Per https://bugs.python.org/issue29995, re.escape() used to over-escape regular expression strings, but in Python 3.7 that's been fixed, which also improved the performance of re.escape(). Since it's both an output change for us *and* a performance win, let's just effectively backport the new behavior to hg on all Python versions.

Differential Revision: https://phab.mercurial-scm.org/D3841
author Augie Fackler <augie@google.com>
date Tue, 26 Jun 2018 10:33:52 -0400
parents fbb2eddea4d2
children de275ab362cb
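
The minimal escaping described in the commit message can be sketched outside of Mercurial as follows. This is an illustrative standalone version, not the patch itself: the names `_SPECIAL`, `_ESCAPE_MAP` and `reescape_sketch` are hypothetical, and it builds the table from a plain `str` instead of `pycompat.bytestr`, assuming Python 3 only.

```python
import re

# Same special characters the patch lists (taken from bugs.python.org/issue29995).
_SPECIAL = '()[]{}?*+-|^$\\.# \t\n\r\v\f'

# str.translate() accepts a dict mapping code points to replacement strings.
_ESCAPE_MAP = {ord(c): '\\' + c for c in _SPECIAL}


def reescape_sketch(pat):
    """Escape only regex-special characters, for either str or bytes input."""
    if isinstance(pat, bytes):
        # latin-1 maps bytes 0-255 one-to-one onto code points, so this
        # decode/encode round trip is lossless for arbitrary byte strings.
        return pat.decode('latin1').translate(_ESCAPE_MAP).encode('latin1')
    return pat.translate(_ESCAPE_MAP)


# '/' and ':' are left alone; only '.' is escaped.
escaped = reescape_sketch('a/b:c.d')

# Bytes in, bytes out.
escaped_bytes = reescape_sketch(b'foo bar*')

# The escaped pattern still matches the original text literally.
literal_match = re.fullmatch(reescape_sketch('x+y (z)'), 'x+y (z)')
```

The bytes branch decodes through latin-1 for the reason the patch's own comment gives: the table-based replacement is only possible with `unicode.translate`, not `bytes.translate`.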
comparison
38477:622f79e3a1cb 38478:96f65bdf0bf4
 from .. import (
     encoding,
     error,
     pycompat,
 )
+
+# regex special chars pulled from https://bugs.python.org/issue29995
+# which was part of Python 3.7.
+_respecial = pycompat.bytestr(b'()[]{}?*+-|^$\\.# \t\n\r\v\f')
+_regexescapemap = {ord(i): (b'\\' + i).decode('latin1') for i in _respecial}
+
+def reescape(pat):
+    """Drop-in replacement for re.escape."""
+    # NOTE: it is intentional that this works on unicodes and not
+    # bytes, as it's only possible to do the escaping with
+    # unicode.translate, not bytes.translate. Sigh.
+    wantuni = True
+    if isinstance(pat, bytes):
+        wantuni = False
+        pat = pat.decode('latin1')
+    pat = pat.translate(_regexescapemap)
+    if wantuni:
+        return pat
+    return pat.encode('latin1')
 
 def pprint(o, bprefix=False):
     """Pretty print an object."""
     if isinstance(o, bytes):
         if bprefix: