comparison mercurial/revlogutils/deltas.py @ 42479:a0b26fc8fbba

deltas: skip if projected delta size does not match text size constraint

Before computing any delta, we make a basic estimate of the delta size we can expect and of its size once compressed. We then check this projected size against the text size constraint. This lets us exclude potential base candidates before doing any expensive computation.

This only applies to the intermediate-snapshot case, since the constraint only applies to intermediate snapshots.

In practice we only perform this new check for the manifestlog. The manifest log combines two properties: it is likely to have delta chain issues, and its diffing/compression is fairly predictable.

Valentin Gatien-Baron is the initial author of this changeset, providing the initial idea and initial testing; Pierre-Yves David later consolidated the code in the right location and ran more extensive testing.
author Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
date Thu, 25 Apr 2019 22:30:14 +0200
parents 566daffc607d
children 66c27df1be84
--- a/mercurial/revlogutils/deltas.py (42478:bc4373babd04)
+++ b/mercurial/revlogutils/deltas.py (42479:a0b26fc8fbba)
@@ -677,10 +677,29 @@
             if revlog._maxchainlen and chainlen >= revlog._maxchainlen:
                 continue
             # if chain already have too much data, skip base
             if deltas_limit < chainsize:
                 continue
+            if sparse and revlog.upperboundcomp is not None:
+                maxcomp = revlog.upperboundcomp
+                basenotsnap = (p1, p2, nullrev)
+                if rev not in basenotsnap and revlog.issnapshot(rev):
+                    snapshotdepth = revlog.snapshotdepth(rev)
+                    # If text is significantly larger than the base, we can
+                    # expect the resulting delta to be proportional to the size
+                    # difference
+                    revsize = revlog.rawsize(rev)
+                    rawsizedistance = max(textlen - revsize, 0)
+                    # use an estimate of the compression upper bound.
+                    lowestrealisticdeltalen = rawsizedistance // maxcomp
+
+                    # check the absolute constraint on the delta size
+                    snapshotlimit = textlen >> snapshotdepth
+                    if snapshotlimit < lowestrealisticdeltalen:
+                        # delta lower bound is larger than accepted upper bound
+                        continue
+
             group.append(rev)
         if group:
             # XXX: in the sparse revlog case, group can become large,
             # impacting performances. Some bounding or slicing mecanism
             # would help to reduce this impact.
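
The check added in this changeset can be read in isolation as a small pure function. The sketch below is illustrative only and is not part of the changeset: the function name `can_skip_snapshot_base` and the sample values (including `maxcomp=10`) are hypothetical, and the arithmetic simply mirrors the inserted lines, with the revlog lookups replaced by plain integer arguments.

```python
def can_skip_snapshot_base(textlen, base_rawsize, snapshotdepth, maxcomp):
    """Return True if a snapshot base can be rejected before diffing.

    Hypothetical helper mirroring the arithmetic of the inserted lines;
    the real code reads these values from the revlog instead.
    """
    # A delta must at least carry the bytes by which the new text
    # exceeds the base (0 if the base is the larger of the two).
    rawsizedistance = max(textlen - base_rawsize, 0)
    # Optimistic compressed size: assume the best ratio we expect (maxcomp).
    lowestrealisticdeltalen = rawsizedistance // maxcomp
    # Accepted upper bound: the text size halved once per snapshot level.
    snapshotlimit = textlen >> snapshotdepth
    # Skip when even the optimistic delta would exceed the accepted bound.
    return snapshotlimit < lowestrealisticdeltalen


# Sample values are made up for illustration.
print(can_skip_snapshot_base(1_000_000, 100_000, 3, 10))  # limit 125000 vs 90000
print(can_skip_snapshot_base(1_000_000, 100_000, 4, 10))  # limit 62500 vs 90000
```

With these sample numbers the base at depth 3 is kept and the base at depth 4 is skipped. A base that is larger than the new text is never skipped by this check, since the size distance is clamped to zero.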