diff mercurial/mdiff.py @ 36444:44c4a38bf563
diff: do not split function name if character encoding is unknown

Only ASCII characters can be split reliably at any byte position, so let's just leave a long multi-byte sequence long. It's probably less bad than putting an invalid byte sequence into a diff.

This doesn't try to split the first ASCII slice from a multi-byte sequence because a combining character may follow.
author: Yuya Nishihara <yuya@tcha.org>
date: Fri, 23 Feb 2018 23:09:58 +0900
parents: 29dd37a418aa
children: c6061cadb400
--- a/mercurial/mdiff.py	Sun Feb 25 11:20:35 2018 +0900
+++ b/mercurial/mdiff.py	Fri Feb 23 23:09:58 2018 +0900
@@ -13,6 +13,7 @@
 
 from .i18n import _
 from . import (
+    encoding,
     error,
     policy,
     pycompat,
@@ -348,7 +349,11 @@
             # alphanumeric char.
             for i in xrange(astart - 1, lastpos - 1, -1):
                 if l1[i][0:1].isalnum():
-                    func = ' ' + l1[i].rstrip()[:40]
+                    func = b' ' + l1[i].rstrip()
+                    # split long function name if ASCII. otherwise we have no
+                    # idea where the multi-byte boundary is, so just leave it.
+                    if encoding.isasciistr(func):
+                        func = func[:41]
                     lastfunc[1] = func
                     break
             # by recording this hunk's starting point as the next place to
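
The behavior described in the commit message can be illustrated with a small standalone sketch. It is not Mercurial code: `is_ascii_bytes` is a hypothetical stand-in for `encoding.isasciistr`, `shorten_func_label` is an invented helper name, and the sample lines assume UTF-8 input purely for demonstration (the real code does not know the encoding, which is the point of the change).

```python
def is_ascii_bytes(s: bytes) -> bool:
    """Hypothetical stand-in for encoding.isasciistr: True if s is pure ASCII."""
    try:
        s.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


def shorten_func_label(line: bytes, limit: int = 40) -> bytes:
    """Build the hunk-header label the way the patched code does (sketch).

    The label is a leading space plus the right-stripped line. It is cut to
    limit + 1 bytes only when it is pure ASCII; otherwise it is left whole,
    because a byte-level cut could land inside a multi-byte character.
    """
    func = b" " + line.rstrip()
    if is_ascii_bytes(func):
        func = func[: limit + 1]
    return func


# A long ASCII line is truncated to 41 bytes (space + 40 characters).
ascii_line = b"def " + b"x" * 60 + b"():\n"
short = shorten_func_label(ascii_line)

# A long non-ASCII line (UTF-8 assumed here for illustration) is kept intact.
utf8_line = ("def f" + "あ" * 20 + "():\n").encode("utf-8")
kept = shorten_func_label(utf8_line)

# What a blind byte slice would have produced: the cut falls in the middle
# of a 3-byte character, so the result is no longer valid UTF-8.
naive = (b" " + utf8_line.rstrip())[:41]
try:
    naive.decode("utf-8")
    naive_is_valid = True
except UnicodeDecodeError:
    naive_is_valid = False
```

The trade-off is the one the commit message names: a non-ASCII function line may make the hunk header longer than usual, but the diff never contains a byte sequence that was broken by truncation.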