comparison mercurial/copies.py @ 26013:38f92d12357c

copy: add flag for disabling copy tracing

Copy tracing can be up to 80% of rebase time when rebasing stacks of commits in large repos (hundreds of thousands of files). This provides the option of turning off the majority of copy tracing. It does not turn off _forwardcopies() since that is used to carry copy information inside a commit across a rebase.

This will affect the situation where a user edits a file, then rebases on top of commits that have moved that file. The move will not be detected and the user will have to manually resolve the issue (possibly by redoing the rebase with this flag off).

The reason to have a flag instead of trying to fix the actual copy tracing performance is that copy tracing is fundamentally an O(number of files in the repo) operation. In order to know if file X in the rebase source was copied anywhere, we have to walk the filelog for every new file that exists in the rebase destination (i.e. a file in the destination that is not in the common ancestor). Without an index that lets us trace forward (i.e. from file Y in the common ancestor forward to the rebase destination), it will never be an O(number of changes in my branch) operation.

In mozilla-central, rebasing a 3 commit stack across 20,000 revs goes from 39s to 11s.
author Durham Goode <durham@fb.com>
date Tue, 27 Jan 2015 11:26:27 -0800
parents cfc24c22454e
children 3c6902ed9f07
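
The cost argument in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not Mercurial code: the function names, the `dest_copy_source` mapping (each destination file mapped to its immediate copy source, or `None`) and the `forward_index` are hypothetical, and real copy tracing walks each file's filelog rather than reading a single recorded source.

```python
def copies_of_backward(wanted, ancestor_files, dest_copy_source):
    """Toy backward trace: find destination files copied from `wanted`.

    Every file that is new in the destination (absent from the common
    ancestor) has to be examined, so the work grows with the number of
    new files rather than with the size of the branch being rebased.
    """
    examined = 0
    hits = []
    for path, source in dest_copy_source.items():
        if path in ancestor_files:
            continue  # already present in the ancestor, cannot be a new copy
        examined += 1
        if source == wanted:
            hits.append(path)
    return hits, examined


def copies_of_forward(wanted, forward_index):
    """Toy forward trace using a hypothetical source -> destinations index."""
    return forward_index.get(wanted, [])


# One real move plus 1000 unrelated new files in the destination.
ancestor_files = {"x.py"}
dest_copy_source = {"x.py": None, "moved/x.py": "x.py"}
for i in range(1000):
    dest_copy_source["new/file%d.py" % i] = None

hits, examined = copies_of_backward("x.py", ancestor_files, dest_copy_source)

# With a forward index (which the commit message says does not exist),
# the same answer would be a single lookup.
forward_index = {"x.py": ["moved/x.py"]}
```

In this model the backward trace inspects all 1001 new files to find one move, which is the behaviour the flag is meant to sidestep.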
--- mercurial/copies.py @ 26011:ce77436162a5
+++ mercurial/copies.py @ 26013:38f92d12357c
@@ -189,10 +189,13 @@
         cm = _chain(a, w, cm, _dirstatecopies(w))
 
     return cm
 
 def _backwardrenames(a, b):
+    if a._repo.ui.configbool('experimental', 'disablecopytrace'):
+        return {}
+
     # Even though we're not taking copies into account, 1:n rename situations
     # can still exist (e.g. hg cp a b; hg mv a c). In those cases we
     # arbitrarily pick one of the renames.
     f = _forwardcopies(b, a)
     r = {}
@@ -261,10 +264,16 @@
         return {}, {}, {}, {}
 
     # avoid silly behavior for parent -> working dir
     if c2.node() is None and c1.node() == repo.dirstate.p1():
         return repo.dirstate.copies(), {}, {}, {}
+
+    # Copy trace disabling is explicitly below the node == p1 logic above
+    # because the logic above is required for a simple copy to be kept across a
+    # rebase.
+    if repo.ui.configbool('experimental', 'disablecopytrace'):
+        return {}, {}, {}, {}
 
     limit = _findlimit(repo, c1.rev(), c2.rev())
     if limit is None:
         # no common ancestor, no copies
         return {}, {}, {}, {}
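
Why the flag check sits below the parent -> working dir shortcut can be shown with a small stand-alone model. This is a sketch under assumptions: `FakeUI` and `mergecopies_guards` are hypothetical stand-ins, not Mercurial APIs, and only the two early exits from the hunk above are modelled.

```python
class FakeUI(object):
    """Hypothetical stand-in for the repo's ui object; only configbool() is modelled."""

    def __init__(self, flags):
        self._flags = flags

    def configbool(self, section, name):
        return self._flags.get((section, name), False)


def mergecopies_guards(ui, parent_to_workingdir, dirstate_copies):
    """Return the early-exit result, or None if full copy tracing would run.

    `parent_to_workingdir` stands in for the
    `c2.node() is None and c1.node() == repo.dirstate.p1()` test.
    """
    # This shortcut comes first, so copies recorded in the dirstate are
    # kept even when tracing is disabled.
    if parent_to_workingdir:
        return dict(dirstate_copies), {}, {}, {}

    # Only after that shortcut does the flag suppress the expensive trace.
    if ui.configbool('experimental', 'disablecopytrace'):
        return {}, {}, {}, {}

    return None


flag_on = FakeUI({('experimental', 'disablecopytrace'): True})
flag_off = FakeUI({})

kept = mergecopies_guards(flag_on, True, {'b': 'a'})
skipped = mergecopies_guards(flag_on, False, {'b': 'a'})
traced = mergecopies_guards(flag_off, False, {'b': 'a'})
```

If the two checks were swapped, `kept` would come back empty and a simple copy would be lost across the rebase, which is the case the in-code comment warns about.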
@@ -511,11 +520,16 @@
     filter copy records. Any copies that occur between fromrev and
     skiprev will not be duplicated, even if they appear in the set of
     copies between fromrev and rev.
     '''
     exclude = {}
-    if skiprev is not None:
+    if (skiprev is not None and
+        not repo.ui.configbool('experimental', 'disablecopytrace')):
+        # disablecopytrace skips this line, but not the entire function because
+        # the line below is O(size of the repo) during a rebase, while the rest
+        # of the function is much faster (and is required for carrying copy
+        # metadata across the rebase anyway).
         exclude = pathcopies(repo[fromrev], repo[skiprev])
     for dst, src in pathcopies(repo[fromrev], repo[rev]).iteritems():
         # copies.pathcopies returns backward renames, so dst might not
         # actually be in the dirstate
         if dst in exclude:
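
The effect of the last hunk can be sketched in isolation. The model below is an assumption-laden simplification, not the real `duplicatecopies`: the function name and its parameters are hypothetical, the expensive `pathcopies(repo[fromrev], repo[skiprev])` call is replaced by a caller-supplied callable, and the result is returned as a dict instead of being written to the dirstate.

```python
def duplicate_copies_model(copies_to_rev, skip_copies_fn=None, trace_disabled=False):
    """Toy model of the exclude filtering in the hunk above.

    copies_to_rev  -- {dst: src} copies between fromrev and rev
    skip_copies_fn -- callable standing in for the expensive skiprev
                      computation, or None when no skiprev is given
    trace_disabled -- stands in for experimental.disablecopytrace
    """
    exclude = {}
    if skip_copies_fn is not None and not trace_disabled:
        # Only this step is skipped when the flag is set.
        exclude = skip_copies_fn()
    # The rest still runs, so copy metadata is carried across either way.
    return dict((dst, src) for dst, src in copies_to_rev.items()
                if dst not in exclude)


calls = []


def expensive_skip_copies():
    calls.append(1)  # record that the costly computation ran
    return {'b': 'a'}


copies = {'b': 'a', 'd': 'c'}
with_tracing = duplicate_copies_model(copies, expensive_skip_copies, False)
calls_after_tracing = len(calls)
without_tracing = duplicate_copies_model(copies, expensive_skip_copies, True)
calls_after_disabled = len(calls)
```

With the flag set, the costly callable is never invoked and no copies are filtered out, so the trade-off is speed against the skiprev-based exclusion.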