diff mercurial/revlogutils/deltas.py @ 40991:42f59d3f714d
delta: exclude base candidate much smaller than the target
If a base candidate's full text is smaller than 1/50th (LIMIT_BASE2TEXT) of the
revision's full text, we no longer consider that candidate. The check only
applies to sparse revlogs.
This solves a pathological case we encountered on a very specific repository.
It contains a long series of changesets with a very small manifest (one file)
co-existing with other changesets using a very large manifest.
Without this filtering, we ended up considering a large number of tiny full
snapshots as potential bases. This resulted in very large deltas (the size of
the full text) and Mercurial spending 99% of its time compressing these
deltas.
The timing of a commit moved from about 400s to about 10s (still slow, but not
ridiculously slow).
author | Boris Feld <boris.feld@octobus.net> |
---|---|
date | Mon, 17 Dec 2018 10:42:19 +0100 |
parents | f960c51eebf3 |
children | ba09db267cb6 |
--- a/mercurial/revlogutils/deltas.py Mon Dec 17 10:37:22 2018 +0100
+++ b/mercurial/revlogutils/deltas.py Mon Dec 17 10:42:19 2018 +0100
@@ -601,6 +601,11 @@
     return True
+# If a revision's full text is that much bigger than a base candidate full
+# text's, it is very unlikely that it will produce a valid delta. We no longer
+# consider these candidates.
+LIMIT_BASE2TEXT = 50
+
 def _candidategroups(revlog, textlen, p1, p2, cachedelta):
     """Provides group of revision to be tested as delta base
@@ -614,6 +619,7 @@
     deltalength = revlog.length
     deltaparent = revlog.deltaparent
+    sparse = revlog._sparserevlog
     good = None
     deltas_limit = textlen * LIMIT_DELTA2TEXT
@@ -644,6 +650,8 @@
             # filter out delta base that will never produce good delta
             if deltas_limit < revlog.length(rev):
                 continue
+            if sparse and revlog.rawsize(rev) < (textlen // LIMIT_BASE2TEXT):
+                continue
             # no delta for rawtext-changing revs (see "candelta" for why)
             if revlog.flags(rev) & REVIDX_RAWTEXT_CHANGING_FLAGS:
                 continue
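
To illustrate the new size check outside of Mercurial, here is a minimal standalone sketch. It only models the condition added in the last hunk. The helper names `is_base_too_small` and `filter_candidates` are hypothetical and not part of Mercurial, and plain integers stand in for `revlog.rawsize(rev)` and `textlen`.

```python
# Illustrative sketch only: a standalone model of the size filter,
# not Mercurial's actual implementation.

LIMIT_BASE2TEXT = 50  # value taken from the patch


def is_base_too_small(base_rawsize, textlen, sparse=True):
    """Return True when a candidate base would be skipped.

    Mirrors the condition from the patch:
        sparse and rawsize(rev) < (textlen // LIMIT_BASE2TEXT)
    `base_rawsize` stands in for revlog.rawsize(rev) (hypothetical input).
    """
    return bool(sparse) and base_rawsize < (textlen // LIMIT_BASE2TEXT)


def filter_candidates(candidate_sizes, textlen, sparse=True):
    """Keep only the candidate sizes that pass the size check."""
    return [
        size
        for size in candidate_sizes
        if not is_base_too_small(size, textlen, sparse)
    ]


if __name__ == "__main__":
    # A 1,000,000-byte full text gives a cutoff of 1_000_000 // 50 = 20_000.
    sizes = [100, 19_999, 20_000, 500_000]
    print(filter_candidates(sizes, 1_000_000))
    # Without sparse-revlog, the check is disabled and nothing is dropped.
    print(filter_candidates(sizes, 1_000_000, sparse=False))
```

Because the comparison is a strict `<` against an integer division, a candidate exactly at the cutoff is still considered. In this model, tiny one-file manifests are dropped as bases for a very large manifest before any delta is computed or compressed.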