diff mercurial/mdiff.py @ 36444:44c4a38bf563
diff: do not split function name if character encoding is unknown
Only ASCII text can be split reliably at an arbitrary byte position, so let's
just leave a long multi-byte sequence long. It's probably less bad than putting
an invalid byte sequence into a diff.
This doesn't try to split off the leading ASCII slice from a multi-byte
sequence, because a combining character may follow.
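To make the reasoning concrete, here is a minimal standalone sketch of the truncation rule, written for Python 3 rather than taken from Mercurial itself. The helper `is_ascii_bytes` is a hypothetical stand-in for `encoding.isasciistr`, and `truncate_func_name` and the sample lines are invented for illustration. It assumes UTF-8 input only to demonstrate the failure mode; the patch itself makes no assumption about the encoding.

```python
def is_ascii_bytes(s):
    """Hypothetical stand-in for encoding.isasciistr: True if s is pure ASCII."""
    try:
        s.decode('ascii')
        return True
    except UnicodeDecodeError:
        return False


def truncate_func_name(line, limit=40):
    """Illustrative helper: prefix a space, truncate only pure-ASCII names."""
    func = b' ' + line.rstrip()
    if is_ascii_bytes(func):
        # Keep the leading space plus `limit` bytes of the line itself.
        func = func[:limit + 1]
    return func


# A long ASCII line and a long line containing 3-byte UTF-8 characters.
ascii_line = b'def ' + b'x' * 60 + b'():\n'
utf8_line = ('def f_' + 'あ' * 20 + '():\n').encode('utf-8')

short = truncate_func_name(ascii_line)  # truncated to 41 bytes
kept = truncate_func_name(utf8_line)    # left at full length

# Slicing the non-ASCII line blindly at 41 bytes cuts a character in half.
blind = (b' ' + utf8_line.rstrip())[:41]
try:
    blind.decode('utf-8')
    blind_is_valid = True
except UnicodeDecodeError:
    blind_is_valid = False
```

The old expression truncated the line to 40 bytes and then prepended the space; slicing the prefixed value at 41 bytes gives the same result for ASCII input, which is why the limit changes from 40 to 41.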
author | Yuya Nishihara <yuya@tcha.org> |
---|---|
date | Fri, 23 Feb 2018 23:09:58 +0900 |
parents | 29dd37a418aa |
children | c6061cadb400 |
--- a/mercurial/mdiff.py	Sun Feb 25 11:20:35 2018 +0900
+++ b/mercurial/mdiff.py	Fri Feb 23 23:09:58 2018 +0900
@@ -13,6 +13,7 @@
 from .i18n import _
 
 from . import (
+    encoding,
     error,
     policy,
     pycompat,
@@ -348,7 +349,11 @@
             # alphanumeric char.
             for i in xrange(astart - 1, lastpos - 1, -1):
                 if l1[i][0:1].isalnum():
-                    func = ' ' + l1[i].rstrip()[:40]
+                    func = b' ' + l1[i].rstrip()
+                    # split long function name if ASCII. otherwise we have no
+                    # idea where the multi-byte boundary is, so just leave it.
+                    if encoding.isasciistr(func):
+                        func = func[:41]
                     lastfunc[1] = func
                     break
             # by recording this hunk's starting point as the next place to