annotate mercurial/pushkey.py @ 26117:4dc5b51f38fe
revlog: change generaldelta delta parent heuristic
The old generaldelta heuristic was "if p1 (or p2) was closer than the last full text,
use it, otherwise use prev". This was problematic when a repo contained multiple
branches that were very different. If commits to branch A were pushed, and the
last full text was branch B, it would generate a fulltext. Then if branch B was
pushed, it would generate another fulltext. The problem is that the last
fulltext (and delta'ing against `prev` in general) has no correlation with the
contents of the incoming revision, and therefore will always have degenerate
cases.
According to the blame, that algorithm was chosen to minimize the chain length.
Since there is already code that protects against that (the delta-vs-fulltext
code), and since it has been improved since the original generaldelta algorithm
went in (2011), I believe the chain length criterion will still be preserved.
The new algorithm always diffs against p1 (or p2 if it's closer), unless the
resulting delta will fail the delta-vs-fulltext check, in which case we delta
against prev.
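
The selection rule described above can be sketched roughly as follows. This is a toy illustration, not the revlog code: `choose_delta_base`, `delta_size`, `is_good_delta`, the `NULLREV` marker, and the size threshold are all hypothetical names and values, "closer" is assumed to mean the higher revision number, and the final fallback to a full text is assumed from the delta-vs-fulltext check mentioned earlier.

```python
NULLREV = -1  # hypothetical marker: "no parent" or "store a full text"


def delta_size(base, new):
    """Toy delta cost: bytes of the lines in `new` that `base` lacks."""
    base_lines = set(base.splitlines())
    return sum(len(l) + 1 for l in new.splitlines() if l not in base_lines)


def is_good_delta(size, new):
    """Stand-in for the delta-vs-fulltext check (threshold is assumed)."""
    return size * 2 <= len(new)


def choose_delta_base(texts, p1, p2, new):
    """Return the revision to delta against, or NULLREV for a full text.

    `texts` is the list of already stored revisions, so `prev` is simply
    the last index.
    """
    prev = len(texts) - 1
    # Prefer p1, or p2 when it is the closer (here: higher) revision.
    parent = max(p1, p2)
    # Try the parent first; only fall back to prev if that delta is rejected.
    for base in (parent, prev):
        if base != NULLREV and is_good_delta(delta_size(texts[base], new), new):
            return base
    return NULLREV
```

With two unrelated branches stored as revisions 0 and 1, a new revision whose parent is revision 0 deltas against revision 0 even though revision 1 is `prev`, which is the case the old heuristic handled badly.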
Some before and after stats on manifest.d size:

repo | old heuristic | new heuristic |
---|---|---|
internal large repo | 2.0 GB | 1.2 GB |
mozilla-central | 242 MB | 261 MB |

The regression in mozilla-central is due to the new heuristic choosing p2r as
the delta base when it's closer to the tip. Switching the algorithm to always prefer
p1r brings the size back down (242 MB). This is a result of the way in which
mozilla does merges and pushes, and the result could easily swing the other
direction in other repos (depending on whether they merge X into Y or Y into X), but
will never be as degenerate as before.
A future patch will address the regression by introducing an optional, even more
aggressive delta heuristic which will knock the mozilla manifest size down
dramatically.
author | Durham Goode <durham@fb.com> |
---|---|
date | Sun, 30 Aug 2015 13:58:11 -0700 |
parents | 7b200566e474 |
children | 57875cf423c9 |
rev | line | source |
---|---|---|
11367 | 1 | `# pushkey.py - dispatching for pushing and pulling keys` |
11367 | 2 | `#` |
11367 | 3 | `# Copyright 2010 Matt Mackall <mpm@selenic.com>` |
11367 | 4 | `#` |
11367 | 5 | `# This software may be used and distributed according to the terms of the` |
11367 | 6 | `# GNU General Public License version 2 or any later version.` |
11367 | 7 | |
25969 | 8 | `from __future__ import absolute_import` |
25969 | 9 | |
25969 | 10 | `from . import (` |
25969 | 11 | `    bookmarks,` |
25969 | 12 | `    encoding,` |
25969 | 13 | `    obsolete,` |
25969 | 14 | `    phases,` |
25969 | 15 | `)` |
13353 | 16 | |
11367 | 17 | `def _nslist(repo):` |
11367 | 18 | `    n = {}` |
11367 | 19 | `    for k in _namespaces:` |
11367 | 20 | `        n[k] = ""` |
22953 | 21 | `    if not obsolete.isenabled(repo, obsolete.exchangeopt):` |
17298 | 22 | `        n.pop('obsolete')` |
11367 | 23 | `    return n` |
11367 | 24 | |
13353 | 25 | `_namespaces = {"namespaces": (lambda *x: False, _nslist),` |
15648 | 26 | `               "bookmarks": (bookmarks.pushbookmark, bookmarks.listbookmarks),` |
15648 | 27 | `               "phases": (phases.pushphase, phases.listphases),` |
17075 | 28 | `               "obsolete": (obsolete.pushmarker, obsolete.listmarkers),` |
15648 | 29 | `              }` |
11367 | 30 | |
11367 | 31 | `def register(namespace, pushkey, listkeys):` |
11367 | 32 | `    _namespaces[namespace] = (pushkey, listkeys)` |
11367 | 33 | |
11367 | 34 | `def _get(namespace):` |
11367 | 35 | `    return _namespaces.get(namespace, (lambda *x: False, lambda *x: {}))` |
11367 | 36 | |
11367 | 37 | `def push(repo, namespace, key, old, new):` |
11367 | 38 | `    '''should succeed iff value was old'''` |
11367 | 39 | `    pk = _get(namespace)[0]` |
11367 | 40 | `    return pk(repo, key, old, new)` |
11367 | 41 | |
11367 | 42 | `def list(repo, namespace):` |
11367 | 43 | `    '''return a dict'''` |
11367 | 44 | `    lk = _get(namespace)[1]` |
11367 | 45 | `    return lk(repo)` |
11367 | 46 | |
21661 | 47 | `encode = encoding.fromlocal` |
21661 | 48 | |
21659 | 49 | `decode = encoding.tolocal` |
21659 | 50 | |
21650 | 51 | `def encodekeys(keys):` |
21650 | 52 | `    """encode the content of a pushkey namespace for exchange over the wire"""` |
21661 | 53 | `    return '\n'.join(['%s\t%s' % (encode(k), encode(v)) for k, v in keys])` |
21652 | 54 | |
21652 | 55 | `def decodekeys(data):` |
21652 | 56 | `    """decode the content of a pushkey namespace from exchange over the wire"""` |
21652 | 57 | `    result = {}` |
21652 | 58 | `    for l in data.splitlines():` |
21652 | 59 | `        k, v = l.split('\t')` |
21659 | 60 | `        result[decode(k)] = decode(v)` |
21652 | 61 | `    return result` |

Changesets referenced in the rev column:

rev | changeset | description | author | parents |
---|---|---|---|---|
13353 | 689bf32b3bbd | bookmarks: move pushkey functions into core | Matt Mackall <mpm@selenic.com> | 11367 |
15648 | 79cc89de5be1 | phases: add basic pushkey support | Pierre-Yves David <pierre-yves.david@logilab.fr> | 13353 |
17075 | 28ed1c4511ce | obsolete: exchange obsolete marker over pushkey | Pierre-Yves.David@ens-lyon.org | 15648 |
17298 | 59c14bf5a48c | pushkey: do not exchange obsole markers if feature is disabled | Pierre-Yves David <pierre-yves.david@ens-lyon.org> | 17075 |
21650 | a2c7ae21e8f4 | pushkey: introduce an ``encodekeys`` function | Pierre-Yves David <pierre-yves.david@fb.com> | 17298 |
21652 | ed6e61eaebc0 | pushkey: introduce an ``decodekeys`` function | Pierre-Yves David <pierre-yves.david@fb.com> | 21650 |
21659 | a319842539f5 | pushkey: add a ``decode`` function | Pierre-Yves David <pierre-yves.david@fb.com> | 21652 |
21661 | 2f52a16f2bee | pushkey: add an ``encode`` function | Pierre-Yves David <pierre-yves.david@fb.com> | 21659 |
22953 | b1d694d3975e | obsolete: add exchange option | Durham Goode <durham@fb.com> | 21661 |
25969 | 7b200566e474 | pushkey: use absolute_import | Gregory Szorc <gregory.szorc@gmail.com> | 22953 |
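
The registry and wire format in the `pushkey.py` source can be exercised in isolation. The sketch below is a standalone re-creation of the same pattern, not Mercurial's module: the "colors" namespace, `_pushcolor`, `_listcolors`, and the use of a plain dict as the repo are hypothetical. `list` is renamed `listkeys` to avoid shadowing the builtin, and the `encoding.fromlocal`/`encoding.tolocal` conversions are assumed away (keys and values are passed through unchanged).

```python
# Namespace name -> (pushkey function, listkeys function)
_namespaces = {}


def register(namespace, pushkey, listkeys):
    _namespaces[namespace] = (pushkey, listkeys)


def _get(namespace):
    # Unknown namespaces refuse every push and list no keys.
    return _namespaces.get(namespace, (lambda *x: False, lambda *x: {}))


def push(repo, namespace, key, old, new):
    """Should succeed iff the current value was `old`."""
    return _get(namespace)[0](repo, key, old, new)


def listkeys(repo, namespace):
    """Return a dict of the keys in `namespace`."""
    return _get(namespace)[1](repo)


def encodekeys(keys):
    """One key<TAB>value pair per line."""
    return "\n".join("%s\t%s" % (k, v) for k, v in keys)


def decodekeys(data):
    result = {}
    for line in data.splitlines():
        k, v = line.split("\t")
        result[k] = v
    return result


# Hypothetical namespace; the "repo" here is just a dict.
def _pushcolor(repo, key, old, new):
    if repo.get(key, "") != old:
        return False  # compare-and-swap failed: value was not `old`
    repo[key] = new
    return True


def _listcolors(repo):
    return dict(repo)


register("colors", _pushcolor, _listcolors)
```

A push behaves as a compare-and-swap: `push(repo, "colors", "sky", "", "blue")` succeeds on an empty dict, while a second push that still claims the old value is `""` is refused.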