comparison mercurial/revlog.py @ 30302:ceddc3d94d74

revlog: inline start() and end() for perf reasons

When I implemented `hg perfrevlogchunks`, one of the things that stood out was that N * _chunk() calls were ~38x slower than 1 _chunks() call. Specifically, on the mozilla-unified repo:

N*_chunk:  0.528997s
1*_chunks: 0.013735s

This repo has 352,097 changesets. So the average time per changeset comes out to:

N*_chunk:  1.502us
1*_chunks: 0.039us

If you extrapolate these numbers to a repository with 1M changesets, that comes out to 1.502s versus 0.039s, which is significant.

At these latencies, Python attribute lookups and function calls matter. So, this patch inlines some code to cut down on that overhead.

The impact of this patch on N*_chunk() calls is clear:

! wall 0.528997 comb 0.520000 user 0.500000 sys 0.020000 (best of 19)
! wall 0.367723 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)

So, we go from ~38x slower to ~27x. A nice improvement. But there's still a long way to go.

It's worth noting that functionality like revsets performs changelog lookups one revision at a time. So this code path is worth optimizing.
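The per-changeset figures and slowdown ratios quoted in the commit message can be re-derived from the wall-clock totals. The sketch below uses only the numbers stated above; the helper names `per_item_us` and `slowdown` are made up for this illustration.

```python
# Figures quoted in the commit message (mozilla-unified repo).
N_CHANGESETS = 352097
T_CHUNK_BEFORE = 0.528997   # seconds, N * _chunk() before the patch
T_CHUNK_AFTER = 0.367723    # seconds, N * _chunk() after the patch
T_CHUNKS = 0.013735         # seconds, 1 * _chunks()


def per_item_us(total_seconds, count):
    """Average time per item, in microseconds."""
    return total_seconds / count * 1e6


def slowdown(slow_seconds, fast_seconds):
    """How many times slower the first timing is than the second."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    print("N*_chunk  per changeset (us):", round(per_item_us(T_CHUNK_BEFORE, N_CHANGESETS), 3))
    print("1*_chunks per changeset (us):", round(per_item_us(T_CHUNKS, N_CHANGESETS), 3))
    print("slowdown before patch:", round(slowdown(T_CHUNK_BEFORE, T_CHUNKS), 1))
    print("slowdown after patch: ", round(slowdown(T_CHUNK_AFTER, T_CHUNKS), 1))
```

Scaling the per-changeset averages to 1M changesets is then a multiplication by 1,000,000, which is where the 1.502s versus 0.039s comparison comes from.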
author Gregory Szorc <gregory.szorc@gmail.com>
date Sat, 22 Oct 2016 15:41:23 -0700
parents 0986f225c149
children 1f92056c4066
30301:0986f225c149 30302:ceddc3d94d74
@@ -1107,12 +1107,18 @@
         revlog and data is a str or buffer of the raw byte data.
 
         Callers will need to call ``self.start(rev)`` and ``self.length(rev)``
         to determine where each revision's data begins and ends.
         """
-        start = self.start(startrev)
-        end = self.end(endrev)
+        # Inlined self.start(startrev) & self.end(endrev) for perf reasons
+        # (functions are expensive).
+        index = self.index
+        istart = index[startrev]
+        iend = index[endrev]
+        start = int(istart[0] >> 16)
+        end = int(iend[0] >> 16) + iend[1]
+
         if self._inline:
             start += (startrev + 1) * self._io.size
             end += (endrev + 1) * self._io.size
         length = end - start
 
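A minimal sketch of the inlining technique in this change, assuming a toy index whose entries mirror what the patched code reads: field 0 holds the data offset shifted left by 16 bits (with flags in the low bits) and field 1 holds the stored length. `ToyRevlog`, `span_via_methods`, and `span_inlined` are hypothetical names for this illustration and are not part of Mercurial.

```python
import timeit


class ToyRevlog:
    """Hypothetical stand-in for a revlog; NOT Mercurial's real class."""

    def __init__(self, lengths):
        # Assumed entry layout: (offset << 16 | flags, length).
        self.index = []
        offset = 0
        for length in lengths:
            flags = 0
            self.index.append(((offset << 16) | flags, length))
            offset += length

    def start(self, rev):
        return int(self.index[rev][0] >> 16)

    def length(self, rev):
        return self.index[rev][1]

    def end(self, rev):
        return self.start(rev) + self.length(rev)

    def span_via_methods(self, startrev, endrev):
        # Shape of the code before the patch: nested method calls.
        return self.start(startrev), self.end(endrev)

    def span_inlined(self, startrev, endrev):
        # Shape of the code after the patch: one attribute lookup for the
        # index, then plain indexing and arithmetic.
        index = self.index
        istart = index[startrev]
        iend = index[endrev]
        start = int(istart[0] >> 16)
        end = int(iend[0] >> 16) + iend[1]
        return start, end


toy = ToyRevlog([10, 20, 30, 40])

if __name__ == "__main__":
    # Timings vary by machine and interpreter; no specific result is implied.
    for name in ("span_via_methods", "span_inlined"):
        func = getattr(toy, name)
        seconds = timeit.timeit(lambda: func(1, 2), number=100000)
        print(name, seconds)
```

Both methods return the same byte span. The inlined variant only avoids the extra method calls and repeated attribute lookups, which is the overhead the commit message targets.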