Mercurial > public > mercurial-scm > hg-stable
annotate mercurial/py3kcompat.py @ 26117:4dc5b51f38fe
revlog: change generaldelta delta parent heuristic
The old generaldelta heuristic was "if p1 (or p2) was closer than the last full text,
use it, otherwise use prev". This was problematic when a repo contained multiple
branches that were very different. If commits to branch A were pushed, and the
last full text was branch B, it would generate a fulltext. Then if branch B was
pushed, it would generate another fulltext. The problem is that the last
fulltext (and delta'ing against `prev` in general) has no correlation with the
contents of the incoming revision, and therefore will always have degenerate
cases.
According to the blame, that algorithm was chosen to minimize the chain length.
Since there is already code that protects against that (the delta-vs-fulltext
code), and since it has been improved since the original generaldelta algorithm
went in (2011), I believe the chain length criteria will still be preserved.
The new algorithm always diffs against p1 (or p2 if it's closer), unless the
resulting delta will fail the delta-vs-fulltext check, in which case we delta
against prev.
Some before and after stats on manifest.d size.
internal large repo
old heuristic - 2.0 GB
new heuristic - 1.2 GB
mozilla-central
old heuristic - 242 MB
new heuristic - 261 MB
The regression in mozilla central is due to the new heuristic choosing p2r as
the delta when it's closer to the tip. Switching the algorithm to always prefer
p1r brings the size back down (242 MB). This is result of the way in which
mozilla does merges and pushes, and the result could easily swing the other
direction in other repos (depending on if they merge X into Y or Y into X), but
will never be as degenerate as before.
I future patch will address the regression by introducing an optional, even more
aggressive delta heuristic which will knock the mozilla manifest size down
dramatically.
author | Durham Goode <durham@fb.com> |
---|---|
date | Sun, 30 Aug 2015 13:58:11 -0700 |
parents | a7a9d84f5e4a |
children | 5bfd01a3c2a9 |
rev | line source |
---|---|
11748
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
1 # py3kcompat.py - compatibility definitions for running hg in py3k |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
2 # |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
3 # Copyright 2010 Renato Cunha <renatoc@gmail.com> |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
4 # |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
5 # This software may be used and distributed according to the terms of the |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
6 # GNU General Public License version 2 or any later version. |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
7 |
21292
a7a9d84f5e4a
py3kcompat: drop unused export
Matt Mackall <mpm@selenic.com>
parents:
21291
diff
changeset
|
8 import builtins |
11748
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
9 |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
10 from numbers import Number |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
11 |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
12 def bytesformatter(format, args): |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
13 '''Custom implementation of a formatter for bytestrings. |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
14 |
17424
e7cfe3587ea4
fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents:
11878
diff
changeset
|
15 This function currently relies on the string formatter to do the |
11748
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
16 formatting and always returns bytes objects. |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
17 |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
18 >>> bytesformatter(20, 10) |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
19 0 |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
20 >>> bytesformatter('unicode %s, %s!', ('string', 'foo')) |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
21 b'unicode string, foo!' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
22 >>> bytesformatter(b'test %s', 'me') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
23 b'test me' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
24 >>> bytesformatter('test %s', 'me') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
25 b'test me' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
26 >>> bytesformatter(b'test %s', b'me') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
27 b'test me' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
28 >>> bytesformatter('test %s', b'me') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
29 b'test me' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
30 >>> bytesformatter('test %d: %s', (1, b'result')) |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
31 b'test 1: result' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
32 ''' |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
33 # The current implementation just converts from bytes to unicode, do |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
34 # what's needed and then convert the results back to bytes. |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
35 # Another alternative is to use the Python C API implementation. |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
36 if isinstance(format, Number): |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
37 # If the fixer erroneously passes a number remainder operation to |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
38 # bytesformatter, we just return the correct operation |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
39 return format % args |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
40 if isinstance(format, bytes): |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
41 format = format.decode('utf-8', 'surrogateescape') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
42 if isinstance(args, bytes): |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
43 args = args.decode('utf-8', 'surrogateescape') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
44 if isinstance(args, tuple): |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
45 newargs = [] |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
46 for arg in args: |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
47 if isinstance(arg, bytes): |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
48 arg = arg.decode('utf-8', 'surrogateescape') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
49 newargs.append(arg) |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
50 args = tuple(newargs) |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
51 ret = format % args |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
52 return ret.encode('utf-8', 'surrogateescape') |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
53 builtins.bytesformatter = bytesformatter |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
54 |
11878
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
55 origord = builtins.ord |
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
56 def fakeord(char): |
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
57 if isinstance(char, int): |
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
58 return char |
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
59 return origord(char) |
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
60 builtins.ord = fakeord |
8bb1481cf08f
py3kcompat: added fake ord implementation for py3k
Renato Cunha <renatoc@gmail.com>
parents:
11748
diff
changeset
|
61 |
11748
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
62 if __name__ == '__main__': |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
63 import doctest |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
64 doctest.testmod() |
37a70a784397
py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents:
diff
changeset
|
65 |