comparison mercurial/util.py @ 30395:10514a92860e
util: add iterfile to work around a fileobj.__iter__ issue with EINTR
The fileobj.__iter__ implementation in Python 2.7.12 (hg changeset
45d4cea97b04) is buggy: it cannot handle EINTR correctly.
In Objects/fileobject.c:
size_t Py_UniversalNewlineFread(....) {
....
if (!f->f_univ_newline)
return fread(buf, 1, n, stream);
....
}
According to the "fread" man page:
If an error occurs, or the end of the file is reached, the return value
is a short item count (or zero).
Therefore it's possible for "fread" (and "Py_UniversalNewlineFread") to
return a positive value while errno is set to EINTR and ferror(stream)
changes from zero to non-zero.
There are multiple callers of "Py_UniversalNewlineFread": "file_read", "file_readinto",
"file_readlines", and "readahead". While the first three have code to handle the
EINTR case, the last one, "readahead", doesn't:
static int readahead(PyFileObject *f, Py_ssize_t bufsize) {
....
chunksize = Py_UniversalNewlineFread(
f->f_buf, bufsize, f->f_fp, (PyObject *)f);
....
if (chunksize == 0) {
if (ferror(f->f_fp)) {
PyErr_SetFromErrno(PyExc_IOError);
....
}
}
....
}
This means "readahead" can ignore EINTR if "Py_UniversalNewlineFread"
returns a non-zero value. The next time "readahead" is executed, if
"Py_UniversalNewlineFread" returns 0, "readahead" raises a Python error
with an incorrect errno (possibly 0), hence "IOError: [Errno 0] Error".
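The failure sequence described above can be illustrated with a toy Python model. This is only a sketch: "FakeStream" and its fields are hypothetical stand-ins for a C "FILE *", its sticky error flag, and the global errno. It is not CPython code; only the zero-length check mirrors the quoted "readahead" logic.

```python
import errno


class FakeStream(object):
    """Hypothetical stand-in for a C FILE*; not a real Python or libc API."""

    def __init__(self, reads):
        # Each entry: (data returned, errno left by the call, whether it failed)
        self._reads = list(reads)
        self.error = False  # models the sticky ferror(stream) flag
        self.errno = 0      # models the global errno

    def fread(self):
        data, err, failed = self._reads.pop(0)
        self.errno = err
        if failed:
            self.error = True
        return data


def readahead(stream):
    """Model of the quoted check: errors are examined only on a zero-length read."""
    chunk = stream.fread()
    if len(chunk) == 0 and stream.error:
        raise IOError(stream.errno, "Error")
    return chunk


stream = FakeStream([
    (b"partial", errno.EINTR, True),  # short read interrupted by a signal
    (b"", 0, False),                  # later read: no data, errno is no longer EINTR
])

first = readahead(stream)  # EINTR goes unnoticed because some data came back
try:
    readahead(stream)
    second_errno = None
except IOError as exc:
    second_errno = exc.errno  # the stale error flag surfaces with a meaningless errno
```

In this model the first call hides the interruption, and the second call reports the leftover error flag with whatever errno happens to hold at that point.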
The only user of "readahead" is "readahead_get_line_skip".
The only user of "readahead_get_line_skip" is "file_iternext", aka.
"fileobj.__iter__", which should be avoided.
There are multiple places where the pattern "for x in fp" is used. This
patch adds an "iterfile" function in "util.py" so we can migrate our code from
"for x in fp" to "for x in util.iterfile(fp)".
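A minimal sketch of the migration, using a standalone copy of the helper from this patch and an in-memory text stream as an assumed stand-in for a real file object:

```python
import io


def iterfile(fp):
    """Iterate over lines by calling fp.readline() until it returns ''."""
    return iter(fp.readline, '')


fp = io.StringIO(u"first\nsecond\nlast without newline")

# Before: for line in fp: ...
# After: the same loop body, driven by readline() instead of fp.__iter__()
lines = [line for line in iterfile(fp)]
```

The two-argument form of "iter" calls "fp.readline" repeatedly and stops when the result equals the sentinel, so the "readahead" path is never entered. The '' sentinel fits Python 2 file objects and text streams; a file opened in binary mode on Python 3 returns b'' at end of file and would need that as the sentinel instead.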
author | Jun Wu <quark@fb.com> |
---|---|
date | Mon, 14 Nov 2016 23:32:54 +0000 |
parents | 673f0fdc1046 |
children | 854190becacb |
comparison
30394:046a7e828ea6 | 30395:10514a92860e |
---|---|
2188 wrapper = MBTextWrapper(width=width, | 2188 wrapper = MBTextWrapper(width=width, |
2189 initial_indent=initindent, | 2189 initial_indent=initindent, |
2190 subsequent_indent=hangindent) | 2190 subsequent_indent=hangindent) |
2191 return wrapper.fill(line).encode(encoding.encoding) | 2191 return wrapper.fill(line).encode(encoding.encoding) |
2192 | 2192 |
2193 def iterfile(fp): | |
2194 """like fp.__iter__ but does not have issues with EINTR. Python 2.7.12 is | |
2195 known to have such issues.""" | |
2196 return iter(fp.readline, '') | |
2197 | |
2193 def iterlines(iterator): | 2198 def iterlines(iterator): |
2194 for chunk in iterator: | 2199 for chunk in iterator: |
2195 for line in chunk.splitlines(): | 2200 for line in chunk.splitlines(): |
2196 yield line | 2201 yield line |
2197 | 2202 |