comparison mercurial/encoding.py @ 13940:b7b26e54e37a stable
encoding: avoid localstr when a string can be encoded losslessly (issue2763)
localstr's hash method exists to prevent bogus matching on lossy local
encodings. For instance, we don't want 'caf?' to match 'café' in an
ASCII locale.
But when café can be losslessly encoded in the local charset, we can
simply use a normal string and avoid the hashing trick.
This avoids using localstr's hash method, which would prevent a match between
author | Matt Mackall <mpm@selenic.com> |
---|---|
date | Fri, 15 Apr 2011 23:45:41 -0500 |
parents | 120eccaaa522 |
children | e38846a79a23 |
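
The round-trip test described in the commit message can be tried out with the minimal sketch below, written in Python 3 (the changeset itself is Python 2). `LocalStr` and `to_local` are hypothetical stand-ins, not Mercurial's actual `localstr` and `tolocal`; the `ISO-8859-1` fallback default is an assumption, and the `LookupError` handling shown in the diff is left out for brevity.

```python
class LocalStr(bytes):
    """Hypothetical stand-in for localstr: local-encoding bytes that
    remember, and hash by, their original UTF-8 form."""

    def __new__(cls, utf8, local):
        obj = bytes.__new__(cls, local)
        obj._utf8 = utf8
        return obj

    def __hash__(self):
        # Hash the original UTF-8 bytes, not the lossy local bytes.
        return hash(self._utf8)


def to_local(s, encoding, fallback="ISO-8859-1"):
    """Convert UTF-8 (or fallback-encoded) bytes to the local encoding."""
    for e in ("UTF-8", fallback):
        try:
            u = s.decode(e)  # attempt strict decoding
        except UnicodeDecodeError:
            continue
        r = u.encode(encoding, "replace")
        if u == r.decode(encoding):
            # r is a lossless encoding of s: plain bytes are enough.
            return r
        if e == "UTF-8":
            return LocalStr(s, r)
        return LocalStr(u.encode("UTF-8"), r)
    # Last ditch: nothing decoded strictly, so replace what we cannot read.
    return s.decode("utf-8", "replace").encode(encoding, "replace")


cafe = "café".encode("utf-8")
lossless = to_local(cafe, "latin-1")  # plain bytes, no hashing trick
lossy = to_local(cafe, "ascii")       # LocalStr wrapping b"caf?"
```

With a Latin-1 local encoding the result is an ordinary byte string that hashes and compares like any other. With an ASCII local encoding the result hashes by its original UTF-8 bytes, so a dictionary lookup keyed on a literal `caf?` should not find it.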
13937:5f126c01ebfa | 13940:b7b26e54e37a |
---|---|
93 """ | 93 """ |
94 | 94 |
95 for e in ('UTF-8', fallbackencoding): | 95 for e in ('UTF-8', fallbackencoding): |
96 try: | 96 try: |
97 u = s.decode(e) # attempt strict decoding | 97 u = s.decode(e) # attempt strict decoding |
98 if e == 'UTF-8': | 98 r = u.encode(encoding, "replace") |
99 return localstr(s, u.encode(encoding, "replace")) | 99 if u == r.decode(encoding): |
| | 100 # r is a safe, non-lossy encoding of s |
| | 101 return r |
| | 102 elif e == 'UTF-8': |
| | 103 return localstr(s, r) |
100 else: | 104 else: |
101 return localstr(u.encode('UTF-8'), | 105 return localstr(u.encode('UTF-8'), r) |
102 u.encode(encoding, "replace")) | 106 |
103 except LookupError, k: | 107 except LookupError, k: |
104 raise error.Abort("%s, please check your locale settings" % k) | 108 raise error.Abort("%s, please check your locale settings" % k) |
105 except UnicodeDecodeError: | 109 except UnicodeDecodeError: |
106 pass | 110 pass |
107 u = s.decode("utf-8", "replace") # last ditch | 111 u = s.decode("utf-8", "replace") # last ditch |