comparison mercurial/revlog.py @ 20179:5bb3826bdac4
revlog: read/cache chunks in fixed windows of 64 KB
When reading a revlog chunk, instead of reading up to 64 KB ahead of the
request offset and caching that, this change caches a fixed window before
and after the requested data that falls on 64 KB boundaries. This increases
cache hits when reading revlogs backwards.
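
The window arithmetic described above can be sketched in isolation. This is a minimal illustration, not the revlog code itself: `aligned_window` and `slice_request` are hypothetical helper names, and the 64 KB constant is taken from the diff.

```python
WINDOW = 65536  # 64 KB, the window size used in the change


def aligned_window(offset, length, window=WINDOW):
    """Return (realoffset, reallength) of the aligned read covering a request.

    The start is rounded down to a window boundary; the end is rounded to
    the next boundary past offset + length. `window` must be a power of two.
    """
    mask = ~(window - 1)
    realoffset = offset & mask
    reallength = ((offset + length + window) & mask) - realoffset
    return realoffset, reallength


def slice_request(data, offset, length, realoffset):
    """Cut the originally requested bytes out of the larger aligned read."""
    start = offset - realoffset
    return data[start:start + length]


if __name__ == "__main__":
    # A 100-byte request at offset 70000 maps onto the second 64 KB window.
    print(aligned_window(70000, 100))
    # A request straddling a boundary pulls in both neighbouring windows.
    print(aligned_window(65530, 20))
```

Because the start is rounded down, data located before the requested offset ends up in the cache too, which is what a backward walk needs.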
Running perfmoonwalk on the Mercurial repo (with almost 20,000 changesets) on
Mac OS X with an SSD, before this change:
$ hg perfmoonwalk
! wall 2.307994 comb 2.310000 user 2.120000 sys 0.190000 (best of 5)
(Each run has 10,668 cache hits and 9,304 misses.)
After this change:
$ hg perfmoonwalk
! wall 1.814117 comb 1.810000 user 1.810000 sys 0.000000 (best of 6)
(19,931 cache hits, 62 misses.)
On a busy NFS share, before this change:
$ hg perfmoonwalk
! wall 17.000034 comb 4.100000 user 3.270000 sys 0.830000 (best of 3)
After:
$ hg perfmoonwalk
! wall 1.746115 comb 1.670000 user 1.660000 sys 0.010000 (best of 5)
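
To see why the hit rate changes so sharply for backward reads, a toy simulation can compare the two caching strategies. This is a sketch under loud assumptions: a single cached byte range, fixed 1000-byte chunks, and a 200-chunk file, all invented for illustration. It does not model the real `_addchunk` logic, and its counts are not comparable with the perfmoonwalk figures above.

```python
WINDOW = 65536  # 64 KB


def readahead_range(offset, length, filesize):
    """Old strategy: cache from the requested offset forward."""
    return offset, min(offset + max(WINDOW, length), filesize)


def aligned_range(offset, length, filesize):
    """New strategy: cache whole 64 KB-aligned windows around the request."""
    start = offset & ~(WINDOW - 1)
    end = min((offset + length + WINDOW) & ~(WINDOW - 1), filesize)
    return start, end


def simulate(strategy, requests, filesize):
    """Count hits and misses against a single cached byte range."""
    cached = (0, 0)  # empty cache
    hits = misses = 0
    for offset, length in requests:
        if cached[0] <= offset and offset + length <= cached[1]:
            hits += 1
        else:
            misses += 1
            cached = strategy(offset, length, filesize)
    return hits, misses


# Hypothetical workload: read 200 chunks of 1000 bytes from last to first.
CHUNK = 1000
FILESIZE = 200 * CHUNK
backward = [(o, CHUNK) for o in range(FILESIZE - CHUNK, -1, -CHUNK)]

if __name__ == "__main__":
    print("read-ahead (hits, misses):", simulate(readahead_range, backward, FILESIZE))
    print("aligned    (hits, misses):", simulate(aligned_range, backward, FILESIZE))
```

In this toy setup every backward read misses under read-ahead, because the cached range never extends below the last requested offset. With aligned windows, a miss occurs only when the walk crosses into a window that is not yet cached.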
author | Brodie Rao <brodie@sf.io> |
---|---|
date | Sun, 17 Nov 2013 18:04:28 -0500 |
parents | 5fc2ae1c631b |
children | 969148b49fc6 |
20178:74aea4be8e78 | 20179:5bb3826bdac4 |
---|---|
840 if self._inline: | 840 if self._inline: |
841 df = self.opener(self.indexfile) | 841 df = self.opener(self.indexfile) |
842 else: | 842 else: |
843 df = self.opener(self.datafile) | 843 df = self.opener(self.datafile) |
844 | 844 |
845 readahead = max(65536, length) | 845 # Cache data both forward and backward around the requested |
846 df.seek(offset) | 846 # data, in a fixed size window. This helps speed up operations |
847 d = df.read(readahead) | 847 # involving reading the revlog backwards. |
 | 848 realoffset = offset & ~65535 |
 | 849 reallength = ((offset + length + 65536) & ~65535) - realoffset |
 | 850 df.seek(realoffset) |
 | 851 d = df.read(reallength) |
848 df.close() | 852 df.close() |
849 self._addchunk(offset, d) | 853 self._addchunk(realoffset, d) |
850 if readahead > length: | 854 if offset != realoffset or reallength != length: |
851 return util.buffer(d, 0, length) | 855 return util.buffer(d, offset - realoffset, length) |
852 return d | 856 return d |
853 | 857 |
854 def _getchunk(self, offset, length): | 858 def _getchunk(self, offset, length): |
855 o, d = self._chunkcache | 859 o, d = self._chunkcache |
856 l = len(d) | 860 l = len(d) |