mercurial-scm/hg: mercurial/util.py comparison

comparison mercurial/util.py @ 26480:6ae14d1ca3aa

util.chunkbuffer: avoid extra mutations when reading partial chunks

Previously, a read(N) where N was less than the length of the first
available chunk would mutate the deque instance twice and allocate a new
str from the slice of the existing chunk. Profiling drew my attention to
these as a potential hot spot during changegroup reading.

This patch makes the code more complicated in order to avoid the
aforementioned 3 operations.

On a pre-generated mozilla-central gzip bundle, this series has the
following impact on `hg unbundle` performance on my MacBook Pro:

before: 358.21 real  317.69 user  38.49 sys
after:  301.57 real  262.69 user  37.11 sys
delta:  -56.64 real  -55.00 user  -1.38 sys

author	Gregory Szorc <gregory.szorc@gmail.com>
date	Mon, 05 Oct 2015 17:36:32 -0700
parents	46143f31290e
children	7d132557e44a
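
The cost named in the commit message can be illustrated with a small sketch. This is not code from the changeset: the two helper functions below are hypothetical, use Python 3 bytes rather than the Python 2 str of the original, and only handle the case where the request is smaller than what remains of the head chunk. The first helper re-queues the remainder (two deque mutations plus a new object for the tail), while the second leaves the deque untouched and tracks an offset instead.

```python
import collections


def read_partial_requeue(queue, n):
    """Hypothetical helper: pop the head chunk, slice it, push the tail back.

    Assumes 0 < n < len(queue[0]).
    """
    chunk = queue.popleft()      # deque mutation 1
    queue.appendleft(chunk[n:])  # deque mutation 2, plus a new object for the tail
    return chunk[:n]


def read_partial_offset(queue, offset, n):
    """Hypothetical helper: peek at the head chunk and advance an offset.

    Returns (data, new_offset). Assumes 0 < n < len(queue[0]) - offset.
    """
    chunk = queue[0]             # peek only; the deque is not mutated
    return chunk[offset:offset + n], offset + n


q1 = collections.deque([b"abcdefgh", b"ij"])
q2 = collections.deque([b"abcdefgh", b"ij"])

first_old = read_partial_requeue(q1, 3)
first_new, off = read_partial_offset(q2, 0, 3)
second_old = read_partial_requeue(q1, 2)
second_new, off = read_partial_offset(q2, off, 2)
```

Both strategies return the same data. The difference is that the offset variant still allocates the slice it returns, but never rewrites the head of the deque.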


-:46143f31290e
+:6ae14d1ca3aa
                         pos = end
                 else:
                     yield chunk
         self.iter = splitbig(in_iter)
         self._queue = collections.deque()
+        self._chunkoffset = 0
     def read(self, l=None):
         """Read L bytes of data from the iterator of chunks of data.
         Returns less than L bytes if the iterator runs dry.
                     if target <= 0:
                         break
                 if not queue:
                     break
+            # The easy way to do this would be to queue.popleft(), modify the
+            # chunk (if necessary), then queue.appendleft(). However, for cases
+            # where we read partial chunk content, this incurs 2 deque
+            # mutations and creates a new str for the remaining chunk in the
+            # queue. Our code below avoids this overhead.
             chunk = queue[0]
             chunkl = len(chunk)
+            offset = self._chunkoffset
             # Use full chunk.
-            if left >= chunkl:
+            if offset == 0 and left >= chunkl:
                 left -= chunkl
                 queue.popleft()
                 buf.append(chunk)
+                # self._chunkoffset remains at 0.
+                continue
+            chunkremaining = chunkl - offset
+            # Use all of unconsumed part of chunk.
+            if left >= chunkremaining:
+                left -= chunkremaining
+                queue.popleft()
+                # offset == 0 is enabled by block above, so this won't merely
+                # copy via ``chunk[0:]``.
+                buf.append(chunk[offset:])
+                self._chunkoffset = 0
             # Partial chunk needed.
             else:
-                left -= chunkl
-                queue.popleft()
-                queue.appendleft(chunk[left:])
-                buf.append(chunk[:left])
+                buf.append(chunk[offset:offset + left])
+                self._chunkoffset += left
+                left -= chunkremaining
         return ''.join(buf)
 def filechunkiter(f, size=65536, limit=None):
     """Create a generator that produces the data in the file size
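
To see the three branches of the patched loop working together, here is a minimal standalone sketch. It is an assumption-laden simplification, not the Mercurial implementation: the class name `ChunkBufferSketch` is hypothetical, the refill step pulls one chunk at a time instead of batching, `read()` requires an integer length, and it uses Python 3 bytes. The enclosing `while left > 0` loop is assumed from context, since the diff does not show it.

```python
import collections


class ChunkBufferSketch:
    """Hypothetical, simplified stand-in for util.chunkbuffer."""

    def __init__(self, in_iter):
        self.iter = iter(in_iter)
        self._queue = collections.deque()
        self._chunkoffset = 0  # bytes of queue[0] already handed out

    def read(self, l):
        left = l
        buf = []
        queue = self._queue
        while left > 0:
            if not queue:
                # Simplified refill (assumption): fetch a single chunk.
                chunk = next(self.iter, None)
                if chunk is None:
                    break
                queue.append(chunk)

            chunk = queue[0]
            chunkl = len(chunk)
            offset = self._chunkoffset

            # Untouched chunk that fits entirely: no slicing needed.
            if offset == 0 and left >= chunkl:
                left -= chunkl
                queue.popleft()
                buf.append(chunk)
                continue

            chunkremaining = chunkl - offset

            # Request covers everything that is left of the head chunk.
            if left >= chunkremaining:
                left -= chunkremaining
                queue.popleft()
                buf.append(chunk[offset:])
                self._chunkoffset = 0
            # Request ends inside the head chunk: advance the offset only.
            else:
                buf.append(chunk[offset:offset + left])
                self._chunkoffset += left
                left -= chunkremaining  # goes negative, so the loop ends
        return b"".join(buf)


cb = ChunkBufferSketch([b"abcdef", b"ghij", b"kl"])
r1 = cb.read(2)   # partial read of the first chunk
r2 = cb.read(4)   # remainder of the first chunk
r3 = cb.read(5)   # one full chunk plus a partial read of the next
r4 = cb.read(10)  # iterator runs dry, so fewer bytes come back
r5 = cb.read(1)   # nothing left
```

In the partial branch, `left` is deliberately driven below zero, which is what stops the loop without an explicit `break`.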
