annotate mercurial/filelog.py @ 37442:0596d27457c6

revlog: move parsemeta() and packmeta() from filelog (API) filelog.parsemeta() and filelog.packmeta() are used to decode and encode metadata for file copies and censor. An upcoming commit will move the core logic for censoring revlogs into revlog.py. This would create a cycle between revlog.py and filelog.py. So we move these metadata functions to revlog.py. .. api:: filelog.parsemeta() and filelog.packmeta() have been moved to the revlog module. Differential Revision: https://phab.mercurial-scm.org/D3150
author Gregory Szorc <gregory.szorc@gmail.com>
date Thu, 05 Apr 2018 18:22:35 -0700
parents a3202fa83aff
children 65250a66b55c
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
1089
142b5d5ec9cc Break apart hg.py
mpm@selenic.com
parents: 1072
diff changeset
1 # filelog.py - file history class for mercurial
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
2 #
4635
63b9d2deed48 Updated copyright notices and add "and others" to "hg version"
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4258
diff changeset
3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
4 #
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7634
diff changeset
5 # This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 8531
diff changeset
6 # GNU General Public License version 2 or any later version.
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
7
25948
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
8 from __future__ import absolute_import
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
9
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
10 import struct
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
11
37441
a3202fa83aff filelog: declare that filelog implements a storage interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 35567
diff changeset
12 from .thirdparty.zope import (
a3202fa83aff filelog: declare that filelog implements a storage interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 35567
diff changeset
13 interface as zi,
a3202fa83aff filelog: declare that filelog implements a storage interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 35567
diff changeset
14 )
25948
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
15 from . import (
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
16 error,
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
17 mdiff,
37441
a3202fa83aff filelog: declare that filelog implements a storage interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 35567
diff changeset
18 repository,
25948
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
19 revlog,
34bd1a5eef5b filelog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24255
diff changeset
20 )
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
21
22596
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
22 def _censoredtext(text):
37442
0596d27457c6 revlog: move parsemeta() and packmeta() from filelog (API)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37441
diff changeset
23 m, offs = revlog.parsemeta(text)
24117
9cfd7c4f22f5 filelog: allow censored files to contain padding data
Mike Edgar <adgar@google.com>
parents: 24003
diff changeset
24 return m and "censored" in m
22596
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
25
37441
a3202fa83aff filelog: declare that filelog implements a storage interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 35567
diff changeset
26 @zi.implementer(repository.ifilestorage)
7634
14a4337a9b9b revlog: kill from-style imports
Matt Mackall <mpm@selenic.com>
parents: 7622
diff changeset
27 class filelog(revlog.revlog):
4258
b11a2fb59cf5 revlog: simplify revlog version handling
Matt Mackall <mpm@selenic.com>
parents: 4257
diff changeset
28 def __init__(self, opener, path):
19148
3bda242bf244 filelog: use super() for calling base functions
Durham Goode <durham@fb.com>
parents: 14287
diff changeset
29 super(filelog, self).__init__(opener,
8531
810387f59696 filelog encoding: move the encoding/decoding into store
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8225
diff changeset
30 "/".join(("data", path + ".i")))
35567
07769a04bc66 filelog: add the ability to report the user facing name
Matt Harbison <matt_harbison@yahoo.com>
parents: 34041
diff changeset
31 # full name of the user visible file, relative to the repository root
07769a04bc66 filelog: add the ability to report the user facing name
Matt Harbison <matt_harbison@yahoo.com>
parents: 34041
diff changeset
32 self.filename = path
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
33
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
34 def read(self, node):
360
10519e4cbd02 filelog: add metadata support
mpm@selenic.com
parents: 358
diff changeset
35 t = self.revision(node)
686
d7d68d27ebe5 Reapply startswith() changes that got lost with stale edit
Matt Mackall <mpm@selenic.com>
parents: 681
diff changeset
36 if not t.startswith('\1\n'):
360
10519e4cbd02 filelog: add metadata support
mpm@selenic.com
parents: 358
diff changeset
37 return t
2579
0875cda033fd use __contains__, index or split instead of str.find
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 2470
diff changeset
38 s = t.index('\1\n', 2)
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 10263
diff changeset
39 return t[s + 2:]
360
10519e4cbd02 filelog: add metadata support
mpm@selenic.com
parents: 358
diff changeset
40
10519e4cbd02 filelog: add metadata support
mpm@selenic.com
parents: 358
diff changeset
41 def add(self, text, meta, transaction, link, p1=None, p2=None):
686
d7d68d27ebe5 Reapply startswith() changes that got lost with stale edit
Matt Mackall <mpm@selenic.com>
parents: 681
diff changeset
42 if meta or text.startswith('\1\n'):
37442
0596d27457c6 revlog: move parsemeta() and packmeta() from filelog (API)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37441
diff changeset
43 text = revlog.packmeta(meta, text)
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
44 return self.addrevision(text, transaction, link, p1, p2)
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
45
1116
0cdd73b0767c Add some rename debugging support
mpm@selenic.com
parents: 1089
diff changeset
46 def renamed(self, node):
7634
14a4337a9b9b revlog: kill from-style imports
Matt Mackall <mpm@selenic.com>
parents: 7622
diff changeset
47 if self.parents(node)[0] != revlog.nullid:
1116
0cdd73b0767c Add some rename debugging support
mpm@selenic.com
parents: 1089
diff changeset
48 return False
13240
e5060aa22043 filelog: move metadata parsing to a helper function
Matt Mackall <mpm@selenic.com>
parents: 11541
diff changeset
49 t = self.revision(node)
37442
0596d27457c6 revlog: move parsemeta() and packmeta() from filelog (API)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37441
diff changeset
50 m = revlog.parsemeta(t)[0]
5915
d0576d065993 Prefer i in d over d.has_key(i)
Christian Ebert <blacktrash@gmx.net>
parents: 4635
diff changeset
51 if m and "copy" in m:
7634
14a4337a9b9b revlog: kill from-style imports
Matt Mackall <mpm@selenic.com>
parents: 7622
diff changeset
52 return (m["copy"], revlog.bin(m["copyrev"]))
1116
0cdd73b0767c Add some rename debugging support
mpm@selenic.com
parents: 1089
diff changeset
53 return False
0cdd73b0767c Add some rename debugging support
mpm@selenic.com
parents: 1089
diff changeset
54
2898
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
55 def size(self, rev):
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
56 """return the size of a given revision"""
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
57
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
58 # for revisions with renames, we have to go the slow way
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
59 node = self.node(rev)
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
60 if self.renamed(node):
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
61 return len(self.read(node))
24118
76f6ae06ddf5 revlog: add "iscensored()" to revlog public API
Mike Edgar <adgar@google.com>
parents: 24117
diff changeset
62 if self.iscensored(rev):
22597
58ec36686f0e filelog: censored files compare against empty data, have 0 size
Mike Edgar <adgar@google.com>
parents: 22596
diff changeset
63 return 0
2898
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
64
11540
2370e270a29a filelog: test behaviour for data starting with "\1\n"
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11539
diff changeset
65 # XXX if self.read(node).startswith("\1\n"), this returns (size+4)
19148
3bda242bf244 filelog: use super() for calling base functions
Durham Goode <durham@fb.com>
parents: 14287
diff changeset
66 return super(filelog, self).size(rev)
2898
db397c38005d merge: use file size stored in revlog index
Matt Mackall <mpm@selenic.com>
parents: 2895
diff changeset
67
2887
05257fd28591 filelog: add hash-based comparisons
Matt Mackall <mpm@selenic.com>
parents: 2859
diff changeset
68 def cmp(self, node, text):
11539
a463e3c50212 cmp: document the fact that we return True if content is different
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 10706
diff changeset
69 """compare text with a given file revision
a463e3c50212 cmp: document the fact that we return True if content is different
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 10706
diff changeset
70
a463e3c50212 cmp: document the fact that we return True if content is different
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 10706
diff changeset
71 returns True if text is different than what is stored.
a463e3c50212 cmp: document the fact that we return True if content is different
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 10706
diff changeset
72 """
2887
05257fd28591 filelog: add hash-based comparisons
Matt Mackall <mpm@selenic.com>
parents: 2859
diff changeset
73
11541
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
74 t = text
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
75 if text.startswith('\1\n'):
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
76 t = '\1\n\1\n' + text
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
77
19148
3bda242bf244 filelog: use super() for calling base functions
Durham Goode <durham@fb.com>
parents: 14287
diff changeset
78 samehashes = not super(filelog, self).cmp(node, t)
11541
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
79 if samehashes:
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
80 return False
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
81
22597
58ec36686f0e filelog: censored files compare against empty data, have 0 size
Mike Edgar <adgar@google.com>
parents: 22596
diff changeset
82 # censored files compare against the empty file
24118
76f6ae06ddf5 revlog: add "iscensored()" to revlog public API
Mike Edgar <adgar@google.com>
parents: 24117
diff changeset
83 if self.iscensored(self.rev(node)):
22597
58ec36686f0e filelog: censored files compare against empty data, have 0 size
Mike Edgar <adgar@google.com>
parents: 22596
diff changeset
84 return text != ''
58ec36686f0e filelog: censored files compare against empty data, have 0 size
Mike Edgar <adgar@google.com>
parents: 22596
diff changeset
85
11541
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
86 # renaming a file produces a different hash, even if the data
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
87 # remains unchanged. Check if it's the case (slow):
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
88 if self.renamed(node):
2887
05257fd28591 filelog: add hash-based comparisons
Matt Mackall <mpm@selenic.com>
parents: 2859
diff changeset
89 t2 = self.read(node)
2895
21631c2c09a5 filelog.cmp: return 0 for equality
Matt Mackall <mpm@selenic.com>
parents: 2890
diff changeset
90 return t2 != text
2887
05257fd28591 filelog: add hash-based comparisons
Matt Mackall <mpm@selenic.com>
parents: 2859
diff changeset
91
11541
ab9fa7a85dd9 filelog: cmp: don't read data if hashes are identical (issue2273)
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 11540
diff changeset
92 return True
14287
7c231754a621 filelog: add file function to open other filelogs
Sune Foldager <cryo@cyanite.org>
parents: 14074
diff changeset
93
30589
be5b2098a817 revlog: merge hash checking subfunctions
Remi Chaintron <remi@fb.com>
parents: 25948
diff changeset
94 def checkhash(self, text, node, p1=None, p2=None, rev=None):
22596
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
95 try:
30589
be5b2098a817 revlog: merge hash checking subfunctions
Remi Chaintron <remi@fb.com>
parents: 25948
diff changeset
96 super(filelog, self).checkhash(text, node, p1=p1, p2=p2, rev=rev)
22596
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
97 except error.RevlogError:
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
98 if _censoredtext(text):
24190
903c7e8c97ad changegroup: emit full-replacement deltas if either revision is censored
Mike Edgar <adgar@google.com>
parents: 24118
diff changeset
99 raise error.CensoredNodeError(self.indexfile, node, text)
22596
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
100 raise
27e2317efe89 filelog: raise CensoredNodeError when hash checks fail with censor metadata
Mike Edgar <adgar@google.com>
parents: 22422
diff changeset
101
24118
76f6ae06ddf5 revlog: add "iscensored()" to revlog public API
Mike Edgar <adgar@google.com>
parents: 24117
diff changeset
102 def iscensored(self, rev):
22597
58ec36686f0e filelog: censored files compare against empty data, have 0 size
Mike Edgar <adgar@google.com>
parents: 22596
diff changeset
103 """Check if a file revision is censored."""
23858
22a979d1ae56 filelog: use censored revlog flag bit to quickly check if a node is censored
Mike Edgar <adgar@google.com>
parents: 22597
diff changeset
104 return self.flags(rev) & revlog.REVIDX_ISCENSORED
24255
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
105
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
106 def _peek_iscensored(self, baserev, delta, flush):
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
107 """Quickly check if a delta produces a censored revision."""
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
108 # Fragile heuristic: unless new file meta keys are added alphabetically
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
109 # preceding "censored", all censored revisions are prefixed by
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
110 # "\1\ncensored:". A delta producing such a censored revision must be a
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
111 # full-replacement delta, so we inspect the first and only patch in the
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
112 # delta for this prefix.
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
113 hlen = struct.calcsize(">lll")
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
114 if len(delta) <= hlen:
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
115 return False
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
116
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
117 oldlen = self.rawsize(baserev)
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
118 newlen = len(delta) - hlen
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
119 if delta[:hlen] != mdiff.replacediffheader(oldlen, newlen):
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
120 return False
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
121
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
122 add = "\1\ncensored:"
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
123 addlen = len(add)
4bfe9f2d9761 revlog: addgroup checks if incoming deltas add censored revs, sets flag bit
Mike Edgar <adgar@google.com>
parents: 24190
diff changeset
124 return newlen >= addlen and delta[hlen:hlen + addlen] == add