comparison mercurial/util.py @ 30395:10514a92860e

util: add iterfile to work around a fileobj.__iter__ issue with EINTR

The fileobj.__iter__ implementation in Python 2.7.12 (hg changeset 45d4cea97b04) is buggy: it cannot handle EINTR correctly.

In Objects/fileobject.c:

    size_t Py_UniversalNewlineFread(....) {
        ....
        if (!f->f_univ_newline)
            return fread(buf, 1, n, stream);
        ....
    }

According to the "fread" man page:

    If an error occurs, or the end of the file is reached, the return value is a short item count (or zero).

Therefore it's possible for "fread" (and "Py_UniversalNewlineFread") to return a positive value while errno is set to EINTR and ferror(stream) changes from zero to non-zero.

There are multiple callers of "Py_UniversalNewlineFread": "file_read", "file_readinto", "file_readlines", and "readahead". While the first three have code to handle the EINTR case, the last one, "readahead", doesn't:

    static int readahead(PyFileObject *f, Py_ssize_t bufsize) {
        ....
        chunksize = Py_UniversalNewlineFread(
            f->f_buf, bufsize, f->f_fp, (PyObject *)f);
        ....
        if (chunksize == 0) {
            if (ferror(f->f_fp)) {
                PyErr_SetFromErrno(PyExc_IOError);
                ....
            }
        }
        ....
    }

This means "readahead" can ignore EINTR when "Py_UniversalNewlineFread" returns a non-zero value. The next time "readahead" is executed, if "Py_UniversalNewlineFread" returns 0, "readahead" raises a Python error with an incorrect errno (it could be 0), hence "IOError: [Errno 0] Error".

The only user of "readahead" is "readahead_get_line_skip". The only user of "readahead_get_line_skip" is "file_iternext", a.k.a. "fileobj.__iter__", which should therefore be avoided.

There are multiple places where the pattern "for x in fp" is used. This patch adds an "iterfile" function in "util.py" so we can migrate our code from "for x in fp" to "for x in util.iterfile(fp)".
author Jun Wu <quark@fb.com>
date Mon, 14 Nov 2016 23:32:54 +0000
parents 673f0fdc1046
children 854190becacb
--- mercurial/util.py @ 30394:046a7e828ea6
+++ mercurial/util.py @ 30395:10514a92860e
@@ -2188,10 +2188,15 @@
     wrapper = MBTextWrapper(width=width,
                             initial_indent=initindent,
                             subsequent_indent=hangindent)
     return wrapper.fill(line).encode(encoding.encoding)
 
+def iterfile(fp):
+    """like fp.__iter__ but does not have issues with EINTR. Python 2.7.12 is
+    known to have such issues."""
+    return iter(fp.readline, '')
+
 def iterlines(iterator):
     for chunk in iterator:
         for line in chunk.splitlines():
             yield line
 
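The migration described in the commit message can be sketched as follows. This is a minimal illustration, not code from the patch: the body of iterfile is copied from the change above, while the io.StringIO object and its sample text are assumptions standing in for a real file. The two-argument form of iter() calls fp.readline repeatedly and stops when it returns the sentinel '', so the loop goes through readline rather than fileobj.__iter__.

```python
import io


def iterfile(fp):
    """Same body as the function added by this patch."""
    # iter(callable, sentinel): call fp.readline() until it returns ''.
    return iter(fp.readline, '')


# Hypothetical stand-in for a real file (assumption for this demo only).
fp = io.StringIO("first\n\nlast line without newline")

# Before the migration this loop would read: for line in fp
lines = []
for line in iterfile(fp):
    lines.append(line)

# An empty line in the file is read as "\n", not '', so it does not
# end the iteration early; only end-of-file produces the sentinel.
print(lines)
```

The '' sentinel matches what readline returns at end-of-file for Python 2 file objects, which is what the patch targets. On Python 3, a file opened in binary mode returns b'' at end-of-file, so a port would need a matching sentinel there.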