comparison mercurial/revlog.py @ 38644:43d0619cec90
revlog: enforce chunk slicing down to a certain size
Limit the maximum chunk size to 4x the revision's final size when reading a
revision from a revlog. This logic only applies when the target size is known
from the revlog.
Ideally, a revlog's delta chains would be written in a way that does not trigger
this extra slicing often. However, having this second guarantee that we never
read an unexpectedly large amount of data into memory is important for the
future. Future delta chain building algorithms might have good reasons to
create delta chains with such characteristics.
Including this code in core as soon as possible will make Mercurial 4.7
forward-compatible with such improvements.
author | Boris Feld <boris.feld@octobus.net> |
---|---|
date | Tue, 10 Jul 2018 12:20:57 +0200 |
parents | 967fee55e8d9 |
children | cd1c484e31e8 |
comparison
38643:967fee55e8d9 | 38644:43d0619cec90 |
---|---|
1947 | 1947 |
1948 Returns a str holding uncompressed data for the requested revision. | 1948 Returns a str holding uncompressed data for the requested revision. |
1949 """ | 1949 """ |
1950 return self.decompress(self._getsegmentforrevs(rev, rev, df=df)[1]) | 1950 return self.decompress(self._getsegmentforrevs(rev, rev, df=df)[1]) |
1951 | 1951 |
1952 def _chunks(self, revs, df=None): | 1952 def _chunks(self, revs, df=None, targetsize=None): |
1953 """Obtain decompressed chunks for the specified revisions. | 1953 """Obtain decompressed chunks for the specified revisions. |
1954 | 1954 |
1955 Accepts an iterable of numeric revisions that are assumed to be in | 1955 Accepts an iterable of numeric revisions that are assumed to be in |
1956 ascending order. Also accepts an optional already-open file handle | 1956 ascending order. Also accepts an optional already-open file handle |
1957 to be used for reading. If used, the seek position of the file will | 1957 to be used for reading. If used, the seek position of the file will |
1974 ladd = l.append | 1974 ladd = l.append |
1975 | 1975 |
1976 if not self._withsparseread: | 1976 if not self._withsparseread: |
1977 slicedchunks = (revs,) | 1977 slicedchunks = (revs,) |
1978 else: | 1978 else: |
1979 slicedchunks = _slicechunk(self, revs) | 1979 slicedchunks = _slicechunk(self, revs, targetsize) |
1980 | 1980 |
1981 for revschunk in slicedchunks: | 1981 for revschunk in slicedchunks: |
1982 firstrev = revschunk[0] | 1982 firstrev = revschunk[0] |
1983 # Skip trailing revisions with empty diff | 1983 # Skip trailing revisions with empty diff |
1984 for lastrev in revschunk[::-1]: | 1984 for lastrev in revschunk[::-1]: |
2077 rawtext = self._cache[2] | 2077 rawtext = self._cache[2] |
2078 | 2078 |
2079 # drop cache to save memory | 2079 # drop cache to save memory |
2080 self._cache = None | 2080 self._cache = None |
2081 | 2081 |
2082 bins = self._chunks(chain, df=_df) | 2082 targetsize = None |
 | 2083 rawsize = self.index[rev][2] |
 | 2084 if 0 <= rawsize: |
 | 2085 targetsize = 4 * rawsize |
 | 2086 |
 | 2087 bins = self._chunks(chain, df=_df, targetsize=targetsize) |
2083 if rawtext is None: | 2088 if rawtext is None: |
2084 rawtext = bytes(bins[0]) | 2089 rawtext = bytes(bins[0]) |
2085 bins = bins[1:] | 2090 bins = bins[1:] |
2086 | 2091 |
2087 rawtext = mdiff.patches(rawtext, bins) | 2092 rawtext = mdiff.patches(rawtext, bins) |
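
The size cap introduced by this change can be illustrated with a small standalone sketch. It is not the `_slicechunk` implementation from `revlog.py`, which this comparison does not show. The helper names `target_size` and `slice_to_size`, and the `(offset, length)` span representation, are assumptions made purely for illustration. The sketch derives the read budget the way the diff does (4x the raw size, or no budget when the raw size is negative, i.e. unknown) and then greedily groups on-disk spans so that each grouped read stays within that budget where possible.

```python
def target_size(rawsize, factor=4):
    """Return the read budget for a revision, or None when its size is unknown.

    Mirrors the check in the diff: a negative raw size means "unknown",
    so no budget is enforced in that case.
    """
    if rawsize < 0:
        return None
    return factor * rawsize


def slice_to_size(spans, targetsize):
    """Greedily group ascending (offset, length) spans under a size budget.

    Hypothetical helper: each group is read in one go, from the start of its
    first span to the end of its last span. A single span larger than the
    budget cannot be split, so it ends up alone in its own group.
    """
    if targetsize is None:
        # No budget known: keep everything in a single read.
        return [list(spans)]

    groups = []
    current = []
    for offset, length in spans:
        if current:
            group_start = current[0][0]
            # Would adding this span push the read past the budget?
            if (offset + length) - group_start > targetsize:
                groups.append(current)
                current = []
        current.append((offset, length))
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    spans = [(0, 10), (10, 10), (20, 50), (70, 5)]
    for group in slice_to_size(spans, targetsize=25):
        print(group)
```

With a budget of 25 bytes, the first two spans are read together, while the oversized 50-byte span and the trailing span each form their own read. The real slicing logic also has to account for gaps between revisions and for sparse-read density, which this sketch ignores.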