comparison mercurial/revlogutils/deltas.py @ 40991:42f59d3f714d
delta: exclude base candidate much smaller than the target
If a revision's full text is much bigger than a base candidate's full text
(more than 50 times, see LIMIT_BASE2TEXT), we no longer consider that candidate.
This solves a pathological case we encountered on a very specific repository.
It contains a long series of changesets with a very small manifest (one file)
co-existing with other changesets using a very large manifest.
Without this filtering, we ended up considering a large number of tiny full
snapshots as potential bases. This resulted in very large deltas (the size of
the full text) and Mercurial spending 99% of its time compressing these
deltas.
The timing of a commit moved from about 400s to about 10s (still slow, but not
ridiculously slow).
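The following is a minimal sketch of why a tiny base is a poor delta candidate for a large text. It is not Mercurial's delta algorithm: it uses the standard library's `difflib` on lines as a stand-in, and the helper `inserted_bytes` and the manifest-like sample data are made up for illustration.

```python
import difflib


def inserted_bytes(base: bytes, target: bytes) -> int:
    """Count the bytes of `target` that a line-based diff must carry verbatim.

    Hypothetical helper: a stand-in for a real delta encoder, used only to
    show the order of magnitude of the delta payload.
    """
    base_lines = base.splitlines(keepends=True)
    target_lines = target.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(
        None, base_lines, target_lines, autojunk=False
    )
    total = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" both introduce new target lines.
        if tag in ("replace", "insert"):
            total += sum(len(line) for line in target_lines[j1:j2])
    return total


# A one-file "manifest" next to a 2000-file one (made-up content).
tiny_base = b"only-file.txt 0123456789abcdef\n"
large_text = b"".join(
    b"dir/file-%05d.txt %040x\n" % (i, i) for i in range(2000)
)

payload = inserted_bytes(tiny_base, large_text)
# No line is shared, so the delta has to carry the whole large text.
print(len(tiny_base), len(large_text), payload)
```

In this sketch the payload equals the size of the large text, which matches the situation described above: the delta is as big as a full snapshot, so computing and compressing it is wasted work.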
author | Boris Feld <boris.feld@octobus.net> |
---|---|
date | Mon, 17 Dec 2018 10:42:19 +0100 |
parents | f960c51eebf3 |
children | ba09db267cb6 |
comparison
40990:21a9cace4bbf | 40991:42f59d3f714d |
---|---|
599 and revlog.length(deltainfo.base) < deltainfo.deltalen): | 599 and revlog.length(deltainfo.base) < deltainfo.deltalen): |
600 return False | 600 return False |
601 | 601 |
602 return True | 602 return True |
603 | 603 |
604 # If a revision's full text is that much bigger than a base candidate full | |
605 # text's, it is very unlikely that it will produce a valid delta. We no longer | |
606 # consider these candidates. | |
607 LIMIT_BASE2TEXT = 50 | |
608 | |
604 def _candidategroups(revlog, textlen, p1, p2, cachedelta): | 609 def _candidategroups(revlog, textlen, p1, p2, cachedelta): |
605 """Provides group of revision to be tested as delta base | 610 """Provides group of revision to be tested as delta base |
606 | 611 |
607 This top level function focus on emitting groups with unique and worthwhile | 612 This top level function focus on emitting groups with unique and worthwhile |
608 content. See _raw_candidate_groups for details about the group order. | 613 content. See _raw_candidate_groups for details about the group order. |
612 yield None | 617 yield None |
613 return | 618 return |
614 | 619 |
615 deltalength = revlog.length | 620 deltalength = revlog.length |
616 deltaparent = revlog.deltaparent | 621 deltaparent = revlog.deltaparent |
622 sparse = revlog._sparserevlog | |
617 good = None | 623 good = None |
618 | 624 |
619 deltas_limit = textlen * LIMIT_DELTA2TEXT | 625 deltas_limit = textlen * LIMIT_DELTA2TEXT |
620 | 626 |
621 tested = set([nullrev]) | 627 tested = set([nullrev]) |
641 if rev in tested: | 647 if rev in tested: |
642 continue | 648 continue |
643 tested.add(rev) | 649 tested.add(rev) |
644 # filter out delta base that will never produce good delta | 650 # filter out delta base that will never produce good delta |
645 if deltas_limit < revlog.length(rev): | 651 if deltas_limit < revlog.length(rev): |
652 continue | |
653 if sparse and revlog.rawsize(rev) < (textlen // LIMIT_BASE2TEXT): | |
646 continue | 654 continue |
647 # no delta for rawtext-changing revs (see "candelta" for why) | 655 # no delta for rawtext-changing revs (see "candelta" for why) |
648 if revlog.flags(rev) & REVIDX_RAWTEXT_CHANGING_FLAGS: | 656 if revlog.flags(rev) & REVIDX_RAWTEXT_CHANGING_FLAGS: |
649 continue | 657 continue |
650 group.append(rev) | 658 group.append(rev) |
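The two size checks in the right-hand column can be sketched in isolation as follows. This is a simplified illustration, not the actual `_candidategroups` code: the function `keep_candidate` and its parameters are hypothetical, plain integers replace the `revlog.length(rev)` and `revlog.rawsize(rev)` lookups, and the value used for `LIMIT_DELTA2TEXT` is an assumption since it is not shown in this comparison.

```python
LIMIT_DELTA2TEXT = 2   # assumed value, not shown in the diff above
LIMIT_BASE2TEXT = 50   # value introduced by this changeset


def keep_candidate(textlen, stored_len, raw_size, sparse=True):
    """Return True if a base candidate passes both size filters.

    textlen    -- full text size of the revision being added
    stored_len -- stored size of the candidate (revlog.length(rev) in the diff)
    raw_size   -- full text size of the candidate (revlog.rawsize(rev))
    sparse     -- whether sparse-revlog is enabled (revlog._sparserevlog)
    """
    deltas_limit = textlen * LIMIT_DELTA2TEXT
    # Pre-existing filter: the candidate's stored data is already too large.
    if deltas_limit < stored_len:
        return False
    # New filter: the candidate's full text is tiny compared to the target.
    if sparse and raw_size < (textlen // LIMIT_BASE2TEXT):
        return False
    return True


# For a 100000-byte text the cutoff is 100000 // 50 = 2000 bytes.
print(keep_candidate(100000, 500, 1000))                # False
print(keep_candidate(100000, 500, 2000))                # True
print(keep_candidate(100000, 500, 1000, sparse=False))  # True
```

Note that the comparison is strict, so a candidate exactly at the cutoff is still kept, and that the new filter only applies when sparse-revlog is in use.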