comparison mercurial/mail.py @ 38341:7b12a2d2eedc

py3: ditch email.parser.BytesParser which appears to be plain crap

As I said before, BytesParser is a thin wrapper over the unicode Parser, and it's too thin to return bytes back. Today, I found that it normalizes newline characters to '\n' thanks to the careless use of TextIOWrapper.

So, this patch replaces BytesParser with Parser + TextIOWrapper, and fixes newline handling. Since I don't know what the least bad encoding strategy is here, I just copied it from BytesParser.

I've moved the new parse() function from pycompat, as it is no longer a trivial wrapper.
author Yuya Nishihara <yuya@tcha.org>
date Sat, 16 Jun 2018 19:31:07 +0900
parents 7edf68862fe3
children 858fe9625dab
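
The newline problem described in the commit message can be reproduced with a short standalone sketch. It is not part of the patch: the helper names parse_universal and parse_binary_safe and the sample message are hypothetical, chosen only for illustration. The first helper wraps the byte stream with TextIOWrapper's default newline handling, which is assumed here to stand in for what BytesParser does internally; the second passes newline='\n' as the patch does, so CRLF sequences in the body should survive parsing.

```python
import email.parser
import io


def parse_universal(fp):
    # Hypothetical helper: default newline handling (newline=None) enables
    # "universal newlines", so '\r\n' and '\r' are translated to '\n' on read.
    wrapper = io.TextIOWrapper(fp, encoding='ascii', errors='surrogateescape')
    try:
        return email.parser.Parser().parse(wrapper)
    finally:
        # Detach so the caller's binary stream is not closed with the wrapper.
        wrapper.detach()


def parse_binary_safe(fp):
    # Hypothetical helper mirroring the patch: newline='\n' disables
    # translation, so line endings reach the parser untouched.
    wrapper = io.TextIOWrapper(fp, encoding='ascii', errors='surrogateescape',
                               newline='\n')
    try:
        return email.parser.Parser().parse(wrapper)
    finally:
        wrapper.detach()


# Sample message (made up): LF-terminated header, CRLF-terminated body.
raw = b'Subject: test\n\nline one\r\nline two\r\n'

lossy = parse_universal(io.BytesIO(raw)).get_payload()
exact = parse_binary_safe(io.BytesIO(raw)).get_payload()

print(repr(lossy))
print(repr(exact))
```

Comparing the two printed payloads shows whether the carriage returns were kept, which matters when the message body carries a patch whose line endings must be preserved.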
comparison
38340:cf59de802883 38341:7b12a2d2eedc
9 9
10 import email 10 import email
11 import email.charset 11 import email.charset
12 import email.header 12 import email.header
13 import email.message 13 import email.message
14 import email.parser
15 import io
14 import os 16 import os
15 import smtplib 17 import smtplib
16 import socket 18 import socket
17 import time 19 import time
18 20
320 cs = 'us-ascii' 322 cs = 'us-ascii'
321 if not display: 323 if not display:
322 s, cs = _encode(ui, s, charsets) 324 s, cs = _encode(ui, s, charsets)
323 return mimetextqp(s, 'plain', cs) 325 return mimetextqp(s, 'plain', cs)
324 326
327 if pycompat.ispy3:
328     def parse(fp):
329         ep = email.parser.Parser()
330         # disable the "universal newlines" mode, which isn't binary safe.
331         # I have no idea if ascii/surrogateescape is correct, but that's
332         # what the standard Python email parser does.
333         fp = io.TextIOWrapper(fp, encoding=r'ascii',
334                               errors=r'surrogateescape', newline=chr(10))
335         try:
336             return ep.parse(fp)
337         finally:
338             fp.detach()
339 else:
340     def parse(fp):
341         ep = email.parser.Parser()
342         return ep.parse(fp)
343
325 def headdecode(s): 344 def headdecode(s):
326 '''Decodes RFC-2047 header''' 345 '''Decodes RFC-2047 header'''
327 uparts = [] 346 uparts = []
328 for part, charset in email.header.decode_header(s): 347 for part, charset in email.header.decode_header(s):
329 if charset is not None: 348 if charset is not None: