comparison mercurial/pycompat.py @ 31820:45761ef1bc93
py3: have registrar process docstrings in bytes
Mixing bytes and unicode creates a mess. Do things in bytes as much as possible.
The new sysbytes() helper only handles ASCII characters correctly, but it avoids
raising a nasty unicode exception. This is the same design principle as sysstr().
author | Yuya Nishihara <yuya@tcha.org> |
---|---|
date | Wed, 05 Apr 2017 00:34:58 +0900 |
parents | 8181f378b073 |
children | c130d092042a |
31819:95a67508fd72 | 31820:45761ef1bc93 |
---|---|
140 | 140 |
141 def iterbytestr(s): | 141 def iterbytestr(s): |
142 """Iterate bytes as if it were a str object of Python 2""" | 142 """Iterate bytes as if it were a str object of Python 2""" |
143 return map(bytechr, s) | 143 return map(bytechr, s) |
144 | 144 |
145 def sysbytes(s): | |
146 """Convert an internal str (e.g. keyword, __doc__) back to bytes | |
147 | |
148 This never raises UnicodeEncodeError, but only ASCII characters | |
149 can be round-trip by sysstr(sysbytes(s)). | |
150 """ | |
151 return s.encode(u'utf-8') | |
152 | |
145 def sysstr(s): | 153 def sysstr(s): |
146 """Return a keyword str to be passed to Python functions such as | 154 """Return a keyword str to be passed to Python functions such as |
147 getattr() and str.encode() | 155 getattr() and str.encode() |
148 | 156 |
149 This never raises UnicodeDecodeError. Non-ascii characters are | 157 This never raises UnicodeDecodeError. Non-ascii characters are |
208 import cStringIO | 216 import cStringIO |
209 | 217 |
210 bytechr = chr | 218 bytechr = chr |
211 bytestr = str | 219 bytestr = str |
212 iterbytestr = iter | 220 iterbytestr = iter |
221 sysbytes = identity | |
213 sysstr = identity | 222 sysstr = identity |
214 | 223 |
215 # Partial backport from os.py in Python 3, which only accepts bytes. | 224 # Partial backport from os.py in Python 3, which only accepts bytes. |
216 # In Python 2, our paths should only ever be bytes, a unicode path | 225 # In Python 2, our paths should only ever be bytes, a unicode path |
217 # indicates a bug. | 226 # indicates a bug. |
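
The round-trip caveat in the sysbytes() docstring can be illustrated with a minimal Python 3 sketch. The sysbytes() body is taken from the diff above; the sysstr() body is not shown in this excerpt, so the Latin-1 decode used here is an assumption chosen only because it never raises UnicodeDecodeError, as the docstring requires.

```python
def sysbytes(s):
    """Convert an internal str back to bytes (body as in the changeset)."""
    return s.encode('utf-8')


def sysstr(s):
    """Return a native str for a bytes keyword.

    Assumption: the real body is not visible in the excerpt. Decoding as
    Latin-1 is used here because every byte value maps to a code point,
    so the call cannot raise UnicodeDecodeError.
    """
    if isinstance(s, str):
        return s
    return s.decode('latin-1')


# ASCII text survives the round trip unchanged.
ascii_doc = 'show help'
assert sysstr(sysbytes(ascii_doc)) == ascii_doc

# Non-ASCII text does not: UTF-8 encodes U+00E9 as two bytes, and the
# Latin-1 decode turns each of those bytes into a separate character.
non_ascii = 'caf\u00e9'
round_tripped = sysstr(sysbytes(non_ascii))
assert round_tripped != non_ascii
assert len(round_tripped) == len(non_ascii) + 1
```

On Python 2 both names are bound to identity in the diff, so the same calls would return their argument untouched.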