mercurial-scm/hg: mercurial/patch.py comparison

comparison mercurial/patch.py @ 35383:82c3762349ac

patch: do not break up multibyte character when highlighting word

This changes {\W} to {\W - any 8bit characters} so that multibyte sequences are taken as words. Since we don't know the encoding of user content, this is the most sensible definition of a non-word.
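The effect of the new definition can be seen by splitting a UTF-8 encoded line with both patterns. This is a minimal sketch, not part of the changeset: the sample line and the names old_nonwordre and new_nonwordre are made up for illustration, and only the second pattern is copied from the diff.

```python
import re

# Pattern used before the change: any byte matched by \W is a separator.
# For byte patterns, \W matches every byte >= 0x80 (hypothetical name).
old_nonwordre = re.compile(br'(\W)')

# Pattern introduced by the change (copied from the diff): bytes in the
# range 0x80-0xff are no longer treated as separators.
new_nonwordre = re.compile(br'([^a-zA-Z0-9_\x80-\xff])')

# Illustrative input: "café au lait" encoded as UTF-8, where the
# accented letter occupies the two bytes 0xc3 0xa9.
line = u'caf\u00e9 au lait'.encode('utf-8')

old_tokens = old_nonwordre.split(line)
new_tokens = new_nonwordre.split(line)

# The old pattern separates the two bytes of the accented letter into
# tokens of their own; the new pattern keeps them inside the word.
print(old_tokens)
print(new_tokens)
```

Both splits are lossless, since the capturing group keeps the separators in the result. The difference is only in where the token boundaries fall, which is what decides how much of a word gets highlighted.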

author	Yuya Nishihara <yuya@tcha.org>
date	Mon, 11 Dec 2017 22:38:31 +0900
parents	dce761558329
children	72b91f905065

-:dfae14354660
+:82c3762349ac
 diffhelpers = policy.importmod(r'diffhelpers')
 stringio = util.stringio
 gitre = re.compile(br'diff --git a/(.*) b/(.*)')
 tabsplitter = re.compile(br'(\t+|[^\t]+)')
+_nonwordre = re.compile(br'([^a-zA-Z0-9_\x80-\xff])')
 PatchError = error.PatchError
 # public functions
         s1 = s1[1:]
     else:
         raise error.ProgrammingError("Case not expected, operation = %s" %
                                      operation)
-    s = difflib.ndiff(re.split(br'(\W)', s2), re.split(br'(\W)', s1))
+    s = difflib.ndiff(_nonwordre.split(s2), _nonwordre.split(s1))
     for part in s:
         if part[0] in operation_skip or len(part) == 2:
             continue
         l = operation + '.highlight'
         if part[0] in ' ':
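
The loop above walks the output of difflib.ndiff over the two token lists and skips unchanged, hint, and empty entries. The following is a rough standalone sketch of that idea, not the code in patch.py: the function changed_tokens and the sample lines are hypothetical, and the tokens are decoded as latin-1 only so that the sketch also runs on Python 3, where ndiff expects text rather than bytes.

```python
import difflib
import re

# Separator pattern copied from the diff.
_nonwordre = re.compile(br'([^a-zA-Z0-9_\x80-\xff])')


def changed_tokens(old, new):
    """Return (marker, token) pairs for tokens that differ (hypothetical helper)."""
    # latin-1 maps every byte to exactly one code point, so decoding and
    # re-encoding is lossless whatever the real encoding of the content is.
    a = [t.decode('latin-1') for t in _nonwordre.split(old)]
    b = [t.decode('latin-1') for t in _nonwordre.split(new)]
    result = []
    for part in difflib.ndiff(a, b):
        # Each entry is a marker, a space, then the token. Skip unchanged
        # tokens (' '), ndiff hint lines ('?'), and empty tokens, whose
        # entries are exactly two characters long.
        if part[0] in ' ?' or len(part) == 2:
            continue
        result.append((part[0], part[2:].encode('latin-1')))
    return result


old = u'caf\u00e9 au lait'.encode('utf-8')
new = u'th\u00e9 au lait'.encode('utf-8')
for marker, token in changed_tokens(old, new):
    print(marker, token)
```

With the new separator pattern, the changed word is reported as a whole token, so its multibyte character is never split between a highlighted and a non-highlighted region.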
