annotate mercurial/filelog.py @ 37443:65250a66b55c
revlog: move censor logic into main revlog class
Previously, the revlog class implemented dummy methods for
various censor-related functionality. Revision censoring was
(and will continue to be) only possible on filelog instances.
So filelog implemented these methods to perform something
reasonable.
A problem with implementing censoring on filelog is that
it assumes filelog is a revlog. Upcoming work to formalize
the filelog interface will make this not true.
Furthermore, the censoring logic is security-sensitive. I
think action-at-a-distance with custom implementation of core
revlog APIs in derived classes is a bit dangerous. I think at
a minimum the censor logic should live in revlog.py.
I was tempted to create a "censored revlog" class that
basically pulled these methods out of filelog. But, I wasn't
a huge fan of overriding core methods in child classes. A
reason to do that would be performance. However, the censoring
code only comes into play when:
* hash verification fails
* deltas are generated
* deltas from changegroups are applied
The new code is conditional on an instance attribute. So the
overhead for running the censored code when the revlog isn't
censorable is an attribute lookup. All of these operations are
at least an order of magnitude slower than a Python attribute lookup. So
there shouldn't be a performance concern.
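The attribute-gated design described above can be sketched as follows. This is a minimal illustration, not the code from revlog.py: the class names `ToyRevlog` and `ToyFilelog`, the `_censorable` attribute name, and the set-based bookkeeping are all assumptions made for the example. Only the `censorable=True` constructor argument and the `iscensored()` method name appear in the annotated source.

```python
class ToyRevlog:
    """Toy stand-in for a revlog (hypothetical; not Mercurial's class)."""

    def __init__(self, censorable=False):
        # Assumed attribute name; the flag is fixed when the instance is built.
        self._censorable = censorable
        # Assumed bookkeeping: revision numbers that have been censored.
        self._censored = set()

    def iscensored(self, rev):
        # For a non-censorable revlog the whole check costs one attribute
        # lookup, which is the overhead the commit message refers to.
        if not self._censorable:
            return False
        return rev in self._censored


class ToyFilelog(ToyRevlog):
    """Only the file-history subclass opts in to censoring."""

    def __init__(self):
        super().__init__(censorable=True)


plain = ToyRevlog()
files = ToyFilelog()
files._censored.add(3)

print(plain.iscensored(3))  # non-censorable: short-circuits
print(files.iscensored(3))  # censorable and marked censored
print(files.iscensored(4))  # censorable but not censored
```

With this shape the subclass no longer overrides any core method; it only passes a flag to the base class, which keeps the security-sensitive logic in one place.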
Differential Revision: https://phab.mercurial-scm.org/D3151
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Thu, 05 Apr 2018 16:31:45 -0700 |
parents | 0596d27457c6 |
children | 1541e1a8e87d |
rev | line source |
---|---|
1089 | 1 # filelog.py - file history class for mercurial |
0 | 2 # |
4635 | 3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com> |
0 | 4 # |
8225 | 5 # This software may be used and distributed according to the terms of the |
10263 | 6 # GNU General Public License version 2 or any later version. |
0 | 7 |
25948 | 8 from __future__ import absolute_import |
 | 9 |
37441 | 10 from .thirdparty.zope import ( |
 | 11     interface as zi, |
 | 12 ) |
25948 | 13 from . import ( |
37441 | 14     repository, |
25948 | 15     revlog, |
 | 16 ) |
0 | 17 |
37441 | 18 @zi.implementer(repository.ifilestorage) |
7634 | 19 class filelog(revlog.revlog): |
4258 | 20     def __init__(self, opener, path): |
19148 | 21         super(filelog, self).__init__(opener, |
37443 | 22                                       "/".join(("data", path + ".i")), |
 | 23                                       censorable=True) |
35567 | 24         # full name of the user visible file, relative to the repository root |
 | 25         self.filename = path |
0 | 26 |
 | 27     def read(self, node): |
360 | 28         t = self.revision(node) |
686 | 29         if not t.startswith('\1\n'): |
360 | 30             return t |
2579 | 31         s = t.index('\1\n', 2) |
10282 | 32         return t[s + 2:] |
360 | 33 |
 | 34     def add(self, text, meta, transaction, link, p1=None, p2=None): |
686 | 35         if meta or text.startswith('\1\n'): |
37442 | 36             text = revlog.packmeta(meta, text) |
0 | 37         return self.addrevision(text, transaction, link, p1, p2) |
 | 38 |
1116 | 39     def renamed(self, node): |
7634 | 40         if self.parents(node)[0] != revlog.nullid: |
1116 | 41             return False |
13240 | 42         t = self.revision(node) |
37442 | 43         m = revlog.parsemeta(t)[0] |
5915 | 44         if m and "copy" in m: |
7634 | 45             return (m["copy"], revlog.bin(m["copyrev"])) |
1116 | 46         return False |
 | 47 |
2898 | 48     def size(self, rev): |
 | 49         """return the size of a given revision""" |
 | 50 |
 | 51         # for revisions with renames, we have to go the slow way |
 | 52         node = self.node(rev) |
 | 53         if self.renamed(node): |
 | 54             return len(self.read(node)) |
24118 | 55         if self.iscensored(rev): |
22597 | 56             return 0 |
2898 | 57 |
11540 | 58         # XXX if self.read(node).startswith("\1\n"), this returns (size+4) |
19148 | 59         return super(filelog, self).size(rev) |
2898 | 60 |
2887 | 61     def cmp(self, node, text): |
11539 | 62         """compare text with a given file revision |
 | 63 |
 | 64         returns True if text is different than what is stored. |
 | 65         """ |
2887 | 66 |
11541 | 67         t = text |
 | 68         if text.startswith('\1\n'): |
 | 69             t = '\1\n\1\n' + text |
 | 70 |
19148 | 71         samehashes = not super(filelog, self).cmp(node, t) |
11541 | 72         if samehashes: |
 | 73             return False |
 | 74 |
22597 | 75         # censored files compare against the empty file |
24118 | 76         if self.iscensored(self.rev(node)): |
22597 | 77             return text != '' |
 | 78 |
11541 | 79         # renaming a file produces a different hash, even if the data |
 | 80         # remains unchanged. Check if it's the case (slow): |
 | 81         if self.renamed(node): |
2887 | 82             t2 = self.read(node) |
2895 | 83             return t2 != text |
2887 | 84 |
11541 | 85         return True |
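
The `'\1\n'` handling in `read()`, `add()`, `size()` and `cmp()` above implies a framing in which an optional metadata block sits between two `\1\n` markers ahead of the file text. The sketch below illustrates that framing under stated assumptions: it uses Python 3 `bytes`, and `pack_meta` and `strip_meta` are hypothetical helpers written for this example, not Mercurial's `revlog.packmeta()` or `revlog.parsemeta()`.

```python
META_MARKER = b"\x01\n"  # the '\1\n' delimiter seen in the source above


def pack_meta(meta, text):
    """Frame text as: marker, 'key: value' lines, marker, text (hypothetical helper)."""
    meta_text = b"".join(b"%s: %s\n" % (key, meta[key]) for key in sorted(meta))
    return META_MARKER + meta_text + META_MARKER + text


def strip_meta(stored):
    """Return the file text without its metadata block, mirroring read()."""
    if not stored.startswith(META_MARKER):
        return stored
    # Search from offset 2 so the opening marker itself is skipped.
    end = stored.index(META_MARKER, 2)
    return stored[end + 2:]


# A copy record is stored in front of the text and stripped on read.
stored = pack_meta({b"copy": b"old/name.txt"}, b"hello\n")
print(strip_meta(stored))

# Text that itself begins with the marker needs an empty metadata block,
# otherwise its first bytes would be mistaken for metadata.
tricky = b"\x01\nnot metadata"
escaped = pack_meta({}, tricky)
print(len(escaped) - len(tricky))
print(strip_meta(escaped) == tricky)
```

Under this framing, escaping marker-prefixed text costs two markers of two bytes each. That is consistent with the `XXX` comment in `size()` about a `size+4` result and with `cmp()` prepending `'\1\n\1\n'` before comparing hashes.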