comparison mercurial/patch.py @ 38341:7b12a2d2eedc

py3: ditch email.parser.BytesParser which appears to be plain crap

As I said before, BytesParser is a thin wrapper over the unicode Parser, and it's too thin to return bytes back. Today, I found it does normalize newline characters to '\n's thanks to the careless use of TextIOWrapper.

So, this patch replaces BytesParser with Parser + TextIOWrapper, and fix newline handling. Since I don't know what's the least bad encoding strategy here, I just copied it from BytesParser.

I've moved new parse() function from pycompat, as it is no longer a trivial wrapper.

author Yuya Nishihara <yuya@tcha.org>
date Sat, 16 Jun 2018 19:31:07 +0900
parents f47608575c10
children da2a7d8354b2
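
The newline normalization mentioned in the change description can be reproduced with the standard library alone. The following is a minimal sketch, not code from this changeset; the sample message and the variable names are made up for illustration, and it assumes a Python 3 interpreter whose BytesParser.parse() still wraps the stream in a TextIOWrapper with default newline handling.

```python
import io
from email.parser import BytesParser

# A tiny message with CRLF line endings (contents are illustrative only).
raw = b"Subject: demo\r\n\r\nline one\r\nline two\r\n"

# BytesParser.parse() reads the binary stream through a TextIOWrapper in
# universal-newlines mode, so CRLF is translated to LF while reading.
msg = BytesParser().parse(io.BytesIO(raw))
payload = msg.get_payload()

# The carriage returns from the input no longer appear in the payload.
print(repr(payload))
```

For patch data this matters because a diff against a file with CRLF line endings would be altered silently before it is ever applied.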
comparison
--- mercurial/patch.py @ 38340:cf59de802883
+++ mercurial/patch.py @ 38341:7b12a2d2eedc
@@ -110,11 +110,11 @@
 
         for line in stream:
             cur.append(line)
         c = chunk(cur)
 
-        m = pycompat.emailparser().parse(c)
+        m = mail.parse(c)
         if not m.is_multipart():
             yield msgfp(m)
         else:
             ok_types = ('text/plain', 'text/x-diff', 'text/x-patch')
             for part in m.walk():
@@ -228,11 +228,11 @@
                         br'\*\*\*[ \t].*?^---[ \t])',
                         re.MULTILINE | re.DOTALL)
 
     data = {}
 
-    msg = pycompat.emailparser().parse(fileobj)
+    msg = mail.parse(fileobj)
 
     subject = msg[r'Subject'] and mail.headdecode(msg[r'Subject'])
     data['user'] = msg[r'From'] and mail.headdecode(msg[r'From'])
     if not subject and not data['user']:
         # Not an email, restore parsed headers if any
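
Both call sites now go through mail.parse(), whose body is not part of this file's comparison. As a rough sketch of the "Parser + TextIOWrapper" approach the change description names, a helper along these lines would keep newlines intact. The function name parse_preserving_newlines and the sample message are hypothetical, and the encoding arguments are assumed to mirror what BytesParser uses, as the description says they were copied from there.

```python
import email.parser
import io


def parse_preserving_newlines(fp):
    """Parse a binary file object into a Message without newline translation.

    Hypothetical stand-in for the helper described in the commit message.
    """
    parser = email.parser.Parser()
    # Same encoding strategy as BytesParser (ascii + surrogateescape), but
    # newline='\n' switches off universal-newlines translation on read.
    text = io.TextIOWrapper(fp, encoding='ascii',
                            errors='surrogateescape', newline='\n')
    try:
        return parser.parse(text)
    finally:
        # Detach so the caller's binary stream is not closed with the wrapper.
        text.detach()


# Illustrative input with CRLF line endings.
raw = b"Subject: demo\r\n\r\nline one\r\nline two\r\n"
msg = parse_preserving_newlines(io.BytesIO(raw))
body = msg.get_payload()
print(repr(body))
```

Detaching the wrapper in the finally block leaves the underlying stream open, which matters when the caller keeps reading from the same file object after the headers have been parsed.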