comparison mercurial/logcmdutil.py @ 49622:dcb2581e33be stable

memory-usage: fix `hg log --follow --rev R F` space complexity

When running `hg log --follow --rev REVS FILES`, the log code walks the history of all FILES starting from the file revisions that exist in each of REVS. Before doing so, it checks whether the files actually exist in the target revisions. To do so, it opens the manifest of each revision in REVS and looks up the items in FILES.

Before this changeset, this was done in a way that created a changectx for each target revision and kept all of them in memory while looking into each file. If the set of REVS is large, this means keeping the manifest for each entry in REVS in memory. That can be large: if REVS is in the form `::X`, this can quickly become huge and saturate the memory. We have seen usage allocating 2GB per second until memory runs out.

So this changeset inverts the two loops so that only one revision is kept in memory during the operation. This solves the memory explosion issue.
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Sat, 19 Nov 2022 01:35:01 +0100
parents 79b2c98ab7b4
children 204af2aa4931
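
The loop inversion described in the commit message can be illustrated with a small standalone sketch. This is not the Mercurial code: `FakeCtx`, `check_follow_targets` and `load_ctx` are hypothetical names introduced here, and `FakeCtx` only imitates the two operations the changeset relies on (`path in ctx` and `ctx.hasdir(path)`). The point is that the outer loop runs over revisions, so at most one context is alive at any time, and the search stops as soon as every path has been found.

```python
class FakeCtx:
    """Hypothetical stand-in for a changectx: knows its files and directories."""

    def __init__(self, files):
        self._files = set(files)
        self._dirs = set()
        for f in self._files:
            parts = f.split("/")[:-1]
            for i in range(1, len(parts) + 1):
                self._dirs.add("/".join(parts[:i]))

    def __contains__(self, path):
        return path in self._files

    def hasdir(self, path):
        return path in self._dirs


def check_follow_targets(load_ctx, revs, paths):
    """Return (missing_paths, slowpath), loading one context at a time.

    load_ctx is an assumed callable mapping a revision to a context object.
    """
    all_files = list(paths)
    missing = set(all_files)
    slowpath = False
    for r in revs:
        if not missing:
            # Every path was found already; no need to open more revisions.
            break
        ctx = load_ctx(r)  # the previous context can be freed at this point
        # Iterate over a copy because we remove entries while looping.
        for f in list(missing):
            if f in ctx:
                missing.discard(f)
            elif ctx.hasdir(f):
                # A directory match forces the slow path.
                missing.discard(f)
                slowpath = True
    # Report missing paths in the order the caller gave them.
    return [f for f in all_files if f in missing], slowpath


repo = {0: FakeCtx(["a.txt"]), 1: FakeCtx(["src/b.py"]), 2: FakeCtx(["c.txt"])}
missing, slowpath = check_follow_targets(
    repo.__getitem__, [0, 1, 2], ["a.txt", "src", "nope"]
)
```

With the pre-change loop order (files outside, revisions inside), all contexts for `revs` had to be built up front; here the peak is one context regardless of how many revisions are given.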
49621:55c6ebd11cb9 49622:dcb2581e33be
815 if not slowpath: 815 if not slowpath:
816 if wopts.follow and wopts.revspec: 816 if wopts.follow and wopts.revspec:
817 # There may be the case that a path doesn't exist in some (but 817 # There may be the case that a path doesn't exist in some (but
818 # not all) of the specified start revisions, but let's consider 818 # not all) of the specified start revisions, but let's consider
819 # the path is valid. Missing files will be warned by the matcher. 819 # the path is valid. Missing files will be warned by the matcher.
820 startctxs = [repo[r] for r in revs] 820 all_files = list(match.files())
821 for f in match.files(): 821 missing_files = set(all_files)
822 found = False 822 files = all_files
823 for c in startctxs: 823 for r in revs:
824 if f in c: 824 if not files:
825 found = True 825 # We don't have any file to check anymore.
826 elif c.hasdir(f): 826 break
827 ctx = repo[r]
828 for f in files:
829 if f in ctx:
830 missing_files.discard(f)
831 elif ctx.hasdir(f):
827 # If a directory exists in any of the start revisions, 832 # If a directory exists in any of the start revisions,
828 # take the slow path. 833 # take the slow path.
829 found = slowpath = True 834 missing_files.discard(f)
830 if not found: 835 slowpath = True
836 # we found on slow path, no need to search for more.
837 files = missing_files
838 for f in all_files:
839 if f in missing_files:
831 raise error.StateError( 840 raise error.StateError(
832 _( 841 _(
833 b'cannot follow file not in any of the specified ' 842 b'cannot follow file not in any of the specified '
834 b'revisions: "%s"' 843 b'revisions: "%s"'
835 ) 844 )