comparison mercurial/localrepo.py @ 42406:f385ba70e4af

changelog: optionally store added and removed files in changeset extras

As mentioned in an earlier patch, copies._chain() is used a lot in the changeset-centric version of pathcopies(). It is expensive because it needs to look at the manifest in order to filter out copies whose target file has since been removed. I want to store the sets of added and removed files in the changeset in order to speed that up. This patch does the writing part of that. It could easily be a separate config, but it's currently tied to experimental.copies.write-to since that's the only real use case (it will also make the {file_*} template keywords faster, but I doubt that anyone cares enough about those to write extra metadata for them).

The new information is stored in the changeset extras. Since the added and removed sets are always subsets of the changeset's "files" list, they're stored as indexes into that list. I've stored the indexes as stringified ints separated by NUL bytes. The size of 00changelog.d for the hg repo increased by 0.28% (compared to the size with only copy information in the changesets, which in turn is 0.17% larger than without copy information). We could store only the delta between the indexes and we could store them in binary, but the chosen format is more readable.

We could also have implemented this as a cache outside the changelog. One advantage of doing it that way is that we would also get the speedups for the {file_*} template keywords on old repos. Another advantage is that we could rewrite the cache if we found a bug in how we calculate the set of files. A disadvantage is that it would be more complex. Another is that it would surely use more space. We already write the copy information to the changeset extras, so it seems like a small step to also write these file sets.

Differential Revision: https://phab.mercurial-scm.org/D6416
author Martin von Zweigbergk <martinvonz@google.com>
date Tue, 14 May 2019 22:19:51 -0700
parents ffab9eed3921
children 381d8fa53f34
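
The storage format described in the commit message (indexes into the changeset's "files" list, written as stringified ints separated by NUL bytes) can be sketched as below. This is only an illustration of the format as the message describes it; the helper names encode_file_indices and decode_file_indices are hypothetical and are not taken from this patch.

```python
def encode_file_indices(files, subset):
    """Encode `subset` as NUL-separated decimal indexes into `files`.

    Hypothetical helper: `files` stands in for the changeset's "files"
    list and `subset` for the added (or removed) files.
    """
    wanted = set(subset)
    return b"\0".join(b"%d" % i for i, f in enumerate(files) if f in wanted)


def decode_file_indices(files, data):
    """Inverse of encode_file_indices: map stored indexes back to names."""
    if not data:
        # An empty value means no files were recorded.
        return []
    return [files[int(chunk)] for chunk in data.split(b"\0")]


# Example "files" list of a changeset.
files = [b"README", b"src/a.py", b"src/b.py", b"tests/test_a.py"]

encoded = encode_file_indices(files, [b"src/b.py", b"README"])
decoded = decode_file_indices(files, encoded)
```

Here encoded is b"0\x002" (indexes 0 and 2), and decoding it gives the two file names back in "files" order. Storing short decimal indexes instead of full paths is what keeps the extra data small while staying readable.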
42405:0c72eddb4be5 42406:f385ba70e4af
2587 p1, p2 = ctx.p1(), ctx.p2() 2587 p1, p2 = ctx.p1(), ctx.p2()
2588 user = ctx.user() 2588 user = ctx.user()
2589 2589
2590 writecopiesto = self.ui.config('experimental', 'copies.write-to') 2590 writecopiesto = self.ui.config('experimental', 'copies.write-to')
2591 writefilecopymeta = writecopiesto != 'changeset-only' 2591 writefilecopymeta = writecopiesto != 'changeset-only'
2592 writechangesetcopy = (writecopiesto in
2593 ('changeset-only', 'compatibility'))
2592 p1copies, p2copies = None, None 2594 p1copies, p2copies = None, None
2593 if writecopiesto in ('changeset-only', 'compatibility'): 2595 if writechangesetcopy:
2594 p1copies = ctx.p1copies() 2596 p1copies = ctx.p1copies()
2595 p2copies = ctx.p2copies() 2597 p2copies = ctx.p2copies()
2598 filesadded, filesremoved = None, None
2596 with self.lock(), self.transaction("commit") as tr: 2599 with self.lock(), self.transaction("commit") as tr:
2597 trp = weakref.proxy(tr) 2600 trp = weakref.proxy(tr)
2598 2601
2599 if ctx.manifestnode(): 2602 if ctx.manifestnode():
2600 # reuse an existing manifest revision 2603 # reuse an existing manifest revision
2601 self.ui.debug('reusing known manifest\n') 2604 self.ui.debug('reusing known manifest\n')
2602 mn = ctx.manifestnode() 2605 mn = ctx.manifestnode()
2603 files = ctx.files() 2606 files = ctx.files()
2607 if writechangesetcopy:
2608 filesadded = ctx.filesadded()
2609 filesremoved = ctx.filesremoved()
2604 elif ctx.files(): 2610 elif ctx.files():
2605 m1ctx = p1.manifestctx() 2611 m1ctx = p1.manifestctx()
2606 m2ctx = p2.manifestctx() 2612 m2ctx = p2.manifestctx()
2607 mctx = m1ctx.copy() 2613 mctx = m1ctx.copy()
2608 2614
2665 # case where the merge has files outside of the narrowspec, 2671 # case where the merge has files outside of the narrowspec,
2666 # so this is safe. 2672 # so this is safe.
2667 mn = mctx.write(trp, linkrev, 2673 mn = mctx.write(trp, linkrev,
2668 p1.manifestnode(), p2.manifestnode(), 2674 p1.manifestnode(), p2.manifestnode(),
2669 added, drop, match=self.narrowmatch()) 2675 added, drop, match=self.narrowmatch())
2676
2677 if writechangesetcopy:
2678 filesadded = [f for f in changed
2679 if not (f in m1 or f in m2)]
2680 filesremoved = removed
2670 else: 2681 else:
2671 self.ui.debug('reusing manifest from p1 (listed files ' 2682 self.ui.debug('reusing manifest from p1 (listed files '
2672 'actually unchanged)\n') 2683 'actually unchanged)\n')
2673 mn = p1.manifestnode() 2684 mn = p1.manifestnode()
2674 else: 2685 else:
2681 # no entry should be written. If writing to both, write an empty 2692 # no entry should be written. If writing to both, write an empty
2682 # entry to prevent the reader from falling back to reading 2693 # entry to prevent the reader from falling back to reading
2683 # filelogs. 2694 # filelogs.
2684 p1copies = p1copies or None 2695 p1copies = p1copies or None
2685 p2copies = p2copies or None 2696 p2copies = p2copies or None
2697 filesadded = filesadded or None
2698 filesremoved = filesremoved or None
2686 2699
2687 # update changelog 2700 # update changelog
2688 self.ui.note(_("committing changelog\n")) 2701 self.ui.note(_("committing changelog\n"))
2689 self.changelog.delayupdate(tr) 2702 self.changelog.delayupdate(tr)
2690 n = self.changelog.add(mn, files, ctx.description(), 2703 n = self.changelog.add(mn, files, ctx.description(),
2691 trp, p1.node(), p2.node(), 2704 trp, p1.node(), p2.node(),
2692 user, ctx.date(), ctx.extra().copy(), 2705 user, ctx.date(), ctx.extra().copy(),
2693 p1copies, p2copies) 2706 p1copies, p2copies, filesadded, filesremoved)
2694 xp1, xp2 = p1.hex(), p2 and p2.hex() or '' 2707 xp1, xp2 = p1.hex(), p2 and p2.hex() or ''
2695 self.hook('pretxncommit', throw=True, node=hex(n), parent1=xp1, 2708 self.hook('pretxncommit', throw=True, node=hex(n), parent1=xp1,
2696 parent2=xp2) 2709 parent2=xp2)
2697 # set the new commit is proper phase 2710 # set the new commit is proper phase
2698 targetphase = subrepoutil.newcommitphase(self.ui, ctx) 2711 targetphase = subrepoutil.newcommitphase(self.ui, ctx)
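
The new hunk at lines 2677-2680 treats a changed file as added when it is present in neither parent manifest, and lines 2697-2698 collapse an empty result to None with `or None`. A minimal sketch of those two steps follows, assuming plain Python sets as stand-ins for the manifests m1 and m2; the function name added_files is hypothetical.

```python
def added_files(changed, m1, m2):
    """Return the changed files that exist in neither parent manifest.

    `m1` and `m2` are plain sets here, standing in for the parent
    manifests used in the patch (an assumption for this sketch).
    """
    return [f for f in changed if not (f in m1 or f in m2)]


m1 = {"a.txt", "b.txt"}  # files tracked in the first parent
m2 = {"c.txt"}           # files tracked in the second parent

# "a.txt" and "c.txt" already exist in a parent, so only "new.txt" is added.
filesadded = added_files(["a.txt", "c.txt", "new.txt"], m1, m2)

# Nothing was added here, so the list is empty ...
nothing_added = added_files(["a.txt"], m1, m2)
# ... and `or None` turns the empty list into None, as in the patch.
normalized = nothing_added or None
```

In the patch, the removed set needs no such filtering because the list named removed is reused directly as filesremoved.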