annotate mercurial/setdiscovery.py @ 23807:e97e363a7000
setdiscovery: delay sample building calls to gather them in a single place
Some of the logic around sample building is duplicated in the sample builders.
Extracting it into the top-level function would clean things up, but this
requires all of the code to be in the same place.
This changeset mostly exists to make the next one clearer.
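As a rough sketch of the pattern this changeset sets up: each branch only records which sample builder to use and the target size, and the builder is then called once afterwards. The two builders below are hypothetical stand-ins, and both the `else` branch and the final call are assumptions about code that is not visible in this part of the annotated listing.

```python
def take_quick_sample(nodes, size):
    """Hypothetical stand-in for _takequicksample: the first `size` nodes."""
    return set(sorted(nodes)[:size])


def take_full_sample(nodes, size):
    """Hypothetical stand-in for _takefullsample: the last `size` nodes."""
    return set(sorted(nodes)[-size:])


def build_sample(undecided, full, hasbases,
                 initialsamplesize=100, fullsamplesize=200):
    # Each branch only selects the builder and its target size ...
    if full or hasbases:
        samplefunc = take_full_sample
        targetsize = fullsamplesize
    else:
        # assumed counterpart branch, not shown in the listing
        samplefunc = take_quick_sample
        targetsize = initialsamplesize
    # ... so the sample itself is built in a single place.
    return samplefunc(undecided, targetsize)
```

Gathering the call in one place is what would allow shared pre- and post-processing to be hoisted out of the individual builders later.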
author | Pierre-Yves David <pierre-yves.david@fb.com> |
---|---|
date | Wed, 07 Jan 2015 09:30:06 -0800 |
parents | d6cbbe3baef0 |
children | 07d0f59e0ba7 |
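The module docstring in the listing below describes discovery in terms of three sets (`common`, `missing`, `unknown`) and a single remote call, `known()`. The following is a minimal sketch of that loop on a toy DAG, not Mercurial's implementation: the parent map, the helper names, and the `remote_known` callback are all assumptions made for illustration, and samples are drawn at random rather than from heads.

```python
import random


def ancestors(parents, node):
    """Return `node` plus all of its ancestors in a node -> parents map."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def descendants(parents, node):
    """Return `node` plus every node that has it as an ancestor."""
    return {n for n in parents if node in ancestors(parents, n)}


def discover(parents, remote_known, samplesize=2, seed=0):
    """Classify every local node as common or missing (toy version)."""
    rng = random.Random(seed)
    common, missing, unknown = set(), set(), set(parents)
    roundtrips = 0
    while unknown:
        sample = rng.sample(sorted(unknown), min(samplesize, len(unknown)))
        roundtrips += 1  # one simulated known() call per iteration
        for node, yes in zip(sample, remote_known(sample)):
            if yes:
                # remote has it, so it has all of its ancestors too
                common |= ancestors(parents, node)
            else:
                # remote lacks it, so it lacks all of its descendants too
                missing |= descendants(parents, node)
        unknown -= common | missing
    return common, missing, roundtrips


# Toy history: a <- b <- c, with two local-only children d and e of c.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["c"]}
remote = {"a", "b", "c"}
common, missing, roundtrips = discover(
    parents, lambda sample: [n in remote for n in sample])
```

Because the simulated remote set is closed under ancestors, the final classification does not depend on which nodes happen to be sampled; only the number of round trips does.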
rev | line source |
---|---|
14164 cb98fed52495; discovery: add new set-based discovery; Peter Arrenbrecht <peter.arrenbrecht@gmail.com> | 1 # setdiscovery.py - improved discovery of common nodeset for mercurial |
14164 | 2 # |
14164 | 3 # Copyright 2010 Benoit Boissinot <bboissin@gmail.com> |
14164 | 4 # and Peter Arrenbrecht <peter@arrenbrecht.ch> |
14164 | 5 # |
14164 | 6 # This software may be used and distributed according to the terms of the |
14164 | 7 # GNU General Public License version 2 or any later version. |
20656 cdecbc5ab504; setdiscovery: document algorithms used; Olle Lundberg <geek@nerd.sh>; parents: 20034 | 8 """ |
20656 | 9 The algorithm works in the following way. You have two repositories: local and |
20656 | 10 remote. They both contain a DAG of changelists. |
20656 | 11 |
20656 | 12 The goal of the discovery protocol is to find one set of nodes, *common*, |
20656 | 13 the set of nodes shared by local and remote. |
20656 | 14 |
20656 | 15 One of the issues with the original protocol was latency: it could |
20656 | 16 potentially require lots of roundtrips to discover that the local repo was a |
20656 | 17 subset of remote (which is a very common case: you usually have few changes |
20656 | 18 compared to upstream, while upstream probably had lots of development). |
20656 | 19 |
20656 | 20 The new protocol only requires one interface for the remote repo: `known()`, |
20656 | 21 which, given a set of changelists, tells you if they are present in the DAG. |
20656 | 22 |
20656 | 23 The algorithm then works as follows: |
20656 | 24 |
20656 | 25  - We will be using three sets, `common`, `missing`, `unknown`. Originally |
20656 | 26  all nodes are in `unknown`. |
20656 | 27  - Take a sample from `unknown`, call `remote.known(sample)` |
20656 | 28    - For each node that remote knows, move it and all its ancestors to `common` |
20656 | 29    - For each node that remote doesn't know, move it and all its descendants |
20656 | 30    to `missing` |
20656 | 31  - Iterate until `unknown` is empty |
20656 | 32 |
20656 | 33 There are a couple of optimizations. First, instead of starting with a random |
20656 | 34 sample of `unknown`, start by sending all heads; in the case where the local |
20656 | 35 repo is a subset, you have computed the answer in one round trip. |
20656 | 36 |
20656 | 37 Then you can do something similar to the bisecting strategy used when |
20656 | 38 finding faulty changesets. Instead of random samples, you can try picking |
20656 | 39 nodes that will maximize the number of nodes that will be |
20656 | 40 classified with them (since all ancestors or descendants will be marked as well). |
20656 | 41 """ |
14164 | 42 |
23343 f8a2647fe020; setdiscovery: avoid a full changelog graph traversal; Siddharth Agarwal <sid0@fb.com>; parents: 23192 | 43 from node import nullid, nullrev |
14164 | 44 from i18n import _ |
20034 1e5b38a919dd; cleanup: move stdlib imports to their own import statement; Augie Fackler <raf@durin42.com>; parents: 17426 | 45 import random |
20034 | 46 import util, dagutil |
14164 | 47 |
14164 | 48 def _updatesample(dag, nodes, sample, always, quicksamplesize=0): |
14164 | 49     # if nodes is empty we scan the entire graph |
14164 | 50     if nodes: |
14164 | 51         heads = dag.headsetofconnecteds(nodes) |
14164 | 52     else: |
14164 | 53         heads = dag.heads() |
14164 | 54     dist = {} |
16834 cafd8a8fb713; util: subclass deque for Python 2.4 backwards compatibility; Bryan O'Sullivan <bryano@fb.com>; parents: 16683 | 55     visit = util.deque(heads) |
14164 | 56     seen = set() |
14164 | 57     factor = 1 |
14164 | 58     while visit: |
14164 | 59         curr = visit.popleft() |
14164 | 60         if curr in seen: |
14164 | 61             continue |
14164 | 62         d = dist.setdefault(curr, 1) |
14164 | 63         if d > factor: |
14164 | 64             factor *= 2 |
14164 | 65         if d == factor: |
14164 | 66             if curr not in always: # need this check for the early exit below |
14164 | 67                 sample.add(curr) |
14164 | 68                 if quicksamplesize and (len(sample) >= quicksamplesize): |
14164 | 69                     return |
14164 | 70         seen.add(curr) |
14164 | 71         for p in dag.parents(curr): |
14164 | 72             if not nodes or p in nodes: |
14164 | 73                 dist.setdefault(p, d + 1) |
14164 | 74                 visit.append(p) |
14164 | 75 |
14164 | 76 def _setupsample(dag, nodes, size): |
14164 | 77     if len(nodes) <= size: |
14164 | 78         return set(nodes), None, 0 |
15063 c20688b7c061; setdiscovery: fix hang when #heads>200 (issue2971); Peter Arrenbrecht <peter.arrenbrecht@gmail.com>; parents: 14981 | 79     always = dag.headsetofconnecteds(nodes) |
14164 | 80     desiredlen = size - len(always) |
14164 | 81     if desiredlen <= 0: |
14164 | 82         # This could be bad if there are very many heads, all unknown to the |
14164 | 83         # server. We're counting on long request support here. |
14164 | 84         return always, None, desiredlen |
14164 | 85     return always, set(), desiredlen |
14164 | 86 |
23806 d6cbbe3baef0; setdiscovery: drop unused 'initial' argument for '_takequicksample'; Pierre-Yves David <pierre-yves.david@fb.com>; parents: 23747 | 87 def _takequicksample(dag, nodes, size): |
14164 | 88     always, sample, desiredlen = _setupsample(dag, nodes, size) |
14164 | 89     if sample is None: |
14164 | 90         return always |
23806 | 91     _updatesample(dag, None, sample, always, quicksamplesize=desiredlen) |
14164 | 92     sample.update(always) |
14164 | 93     return sample |
14164 | 94 |
14164 | 95 def _takefullsample(dag, nodes, size): |
14164 | 96     always, sample, desiredlen = _setupsample(dag, nodes, size) |
14164 | 97     if sample is None: |
14164 | 98         return always |
14164 | 99     # update from heads |
14164 | 100     _updatesample(dag, nodes, sample, always) |
14164 | 101     # update from roots |
14164 | 102     _updatesample(dag.inverse(), nodes, sample, always) |
14164 | 103     assert sample |
23083 ee45f5c2ffcc; setdiscovery: extract sample limitation in a `_limitsample` function; Pierre-Yves David <pierre-yves.david@fb.com>; parents: 20656 | 104     sample = _limitsample(sample, desiredlen) |
23083 | 105     if len(sample) < desiredlen: |
14164 | 106         more = desiredlen - len(sample) |
14164 | 107         sample.update(random.sample(list(nodes - sample - always), more)) |
14164 | 108     sample.update(always) |
14164 | 109     return sample |
14164 | 110 |
23083 | 111 def _limitsample(sample, desiredlen): |
23083 | 112     """return a random subset of sample of at most desiredlen item""" |
23083 | 113     if len(sample) > desiredlen: |
23083 | 114         sample = set(random.sample(sample, desiredlen)) |
23083 | 115     return sample |
23083 | 116 |
14164 | 117 def findcommonheads(ui, local, remote, |
14164 | 118                     initialsamplesize=100, |
14164 | 119                     fullsamplesize=200, |
14164 | 120                     abortwhenunrelated=True): |
14206 2bf60f158ecb; setdiscovery: limit lines to 80 characters; Steven Brown <StevenGBrown@gmail.com>; parents: 14164 | 121     '''Return a tuple (common, anyincoming, remoteheads) used to identify |
14206 | 122     missing nodes from or in remote. |
14164 | 123     ''' |
14164 | 124     roundtrips = 0 |
14164 | 125     cl = local.changelog |
14164 | 126     dag = dagutil.revlogdag(cl) |
14164 | 127 |
14624 f03c82d1f50a; setdiscovery: batch heads and known(ownheads); Peter Arrenbrecht <peter.arrenbrecht@gmail.com>; parents: 14206 | 128     # early exit if we know all the specified remote heads already |
14164 | 129     ui.debug("query 1; heads\n") |
14164 | 130     roundtrips += 1 |
14624 | 131     ownheads = dag.heads() |
23084 3ef893520a85; setdiscovery: limit the size of the initial sample (issue4411); Pierre-Yves David <pierre-yves.david@fb.com>; parents: 23083 | 132     sample = _limitsample(ownheads, initialsamplesize) |
23192 73cfaa348650; discovery: indices between sample and yesno must match (issue4438); Mads Kiilerich <madski@unity3d.com>; parents: 23191 | 133     # indices between sample and externalized version must match |
23192 | 134     sample = list(sample) |
14624 | 135     if remote.local(): |
14624 | 136         # stopgap until we have a proper localpeer that supports batch() |
17204 4feb55e6931f; localpeer: return only visible heads and branchmap; Pierre-Yves David <pierre-yves.david@ens-lyon.org>; parents: 17191 | 137         srvheadhashes = remote.heads() |
14624 | 138         yesno = remote.known(dag.externalizeall(sample)) |
14624 | 139     elif remote.capable('batch'): |
14624 | 140         batch = remote.batch() |
14624 | 141         srvheadhashesref = batch.heads() |
14624 | 142         yesnoref = batch.known(dag.externalizeall(sample)) |
14624 | 143         batch.submit() |
14624 | 144         srvheadhashes = srvheadhashesref.value |
14624 | 145         yesno = yesnoref.value |
14624 | 146     else: |
17424 e7cfe3587ea4; fix trivial spelling errors; Mads Kiilerich <mads@kiilerich.com>; parents: 17204 | 147         # compatibility with pre-batch, but post-known remotes during 1.9 |
17424 | 148         # development |
14624 | 149         srvheadhashes = remote.heads() |
14624 | 150         sample = [] |
14164 | 151 |
14164 | 152     if cl.tip() == nullid: |
14164 | 153         if srvheadhashes != [nullid]: |
14164 | 154             return [nullid], True, srvheadhashes |
14164 | 155         return [nullid], False, [] |
14164 | 156 |
14206 | 157     # start actual discovery (we note this before the next "if" for |
14206 | 158     # compatibility reasons) |
14164 | 159     ui.status(_("searching for changes\n")) |
14164 | 160 |
14164 | 161     srvheads = dag.internalizeall(srvheadhashes, filterunknown=True) |
14164 | 162     if len(srvheads) == len(srvheadhashes): |
14833 308e1b5acc87; discovery: quiet note about heads; Matt Mackall <mpm@selenic.com>; parents: 14624 | 163         ui.debug("all remote heads known locally\n") |
14164 | 164         return (srvheadhashes, False, srvheadhashes,) |
14164 | 165 |
23191 86c35b7ae300; discovery: limit 'all local heads known remotely' to real 'all' (issue4438); Mads Kiilerich <madski@unity3d.com>; parents: 23130 | 166     if sample and len(ownheads) <= initialsamplesize and util.all(yesno): |
15497 9bea3aed6ee1; add missing localization markup; Mads Kiilerich <mads@kiilerich.com>; parents: 15063 | 167         ui.note(_("all local heads known remotely\n")) |
14624 | 168         ownheadhashes = dag.externalizeall(ownheads) |
14624 | 169         return (ownheadhashes, True, srvheadhashes,) |
14624 | 170 |
14164 | 171     # full blown discovery |
14164 | 172 |
16683 | 173     # own nodes I know we both know |
23343 | 174     # treat remote heads (and maybe own heads) as a first implicit sample |
23343 | 175     # response |
23343 | 176     common = cl.incrementalmissingrevs(srvheads) |
23343 | 177     commoninsample = set(n for i, n in enumerate(sample) if yesno[i]) |
23343 | 178     common.addbases(commoninsample) |
23746 4ef2f2fa8b8b; setdiscovery: drop shadowed 'undecided' assignment; Pierre-Yves David <pierre-yves.david@fb.com>; parents: 23343 | 179     # own nodes where I don't know if remote knows them |
23343 | 180     undecided = set(common.missingancestors(ownheads)) |
16683 | 181     # own nodes I know remote lacks |
16683 | 182     missing = set() |
16683 | 183 |
14624 | 184     full = False |
14624 | 185     while undecided: |
14164 | 186 |
14624 | 187         if sample: |
14624 | 188             missinginsample = [n for i, n in enumerate(sample) if not yesno[i]] |
14624 | 189             missing.update(dag.descendantset(missinginsample, missing)) |
14164 | 190 |
14624 | 191             undecided.difference_update(missing) |
14164 | 192 |
14164 | 193         if not undecided: |
14164 | 194             break |
14164 | 195 |
23747 f82173a90c2c; setdiscovery: factorize similar sampling code; Pierre-Yves David <pierre-yves.david@fb.com>; parents: 23746 | 196         if full or common.hasbases(): |
23747 | 197             if full: |
23747 | 198                 ui.note(_("sampling from both directions\n")) |
23747 | 199             else: |
23747 | 200                 ui.debug("taking initial sample\n") |
23807 e97e363a7000; setdiscovery: delay sample building calls to gather them in a single place; Pierre-Yves David <pierre-yves.david@fb.com>; parents: 23806 | 201             samplefunc = _takefullsample |
23130 ced632394371; setdiscovery: limit the size of all sample (issue4411); Pierre-Yves David <pierre-yves.david@fb.com>; parents: 23084 | 202             targetsize = fullsamplesize |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
203 else: |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
204 # use even cheaper initial sample |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
205 ui.debug("taking quick initial sample\n") |
23807
e97e363a7000
setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23806
diff
changeset
|
206 samplefunc = _takequicksample |
23130
ced632394371
setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23084
diff
changeset
|
207 targetsize = initialsamplesize |
23807
e97e363a7000
setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23806
diff
changeset
|
208 sample = samplefunc(dag, undecided, targetsize) |
23130
ced632394371
setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23084
diff
changeset
|
209 sample = _limitsample(sample, targetsize) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
210 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
211 roundtrips += 1 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
212 ui.progress(_('searching'), roundtrips, unit=_('queries')) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
213 ui.debug("query %i; still undecided: %i, sample size is: %i\n" |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
214 % (roundtrips, len(undecided), len(sample))) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
215 # indices between sample and externalized version must match |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
216 sample = list(sample) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
217 yesno = remote.known(dag.externalizeall(sample)) |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
218 full = True |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
219 |
23343
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
220 if sample: |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
221 commoninsample = set(n for i, n in enumerate(sample) if yesno[i]) |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
222 common.addbases(commoninsample) |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
223 common.removeancestorsfrom(undecided) |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
224 |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
225 # heads(common) == heads(common.bases) since common represents common.bases |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
226 # and all its ancestors |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
227 result = dag.headsetofconnecteds(common.bases) |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
228 # common.bases can include nullrev, but our contract requires us to not |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
229 # return any heads in that case, so discard that |
f8a2647fe020
setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents:
23192
diff
changeset
|
230 result.discard(nullrev) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
231 ui.progress(_('searching'), None) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
232 ui.debug("%d total queries\n" % roundtrips) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
233 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
234 if not result and srvheadhashes != [nullid]: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
235 if abortwhenunrelated: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
236 raise util.Abort(_("repository is unrelated")) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
237 else: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
238 ui.warn(_("warning: repository is unrelated\n")) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
239 return (set([nullid]), True, srvheadhashes,) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
240 |
14981
192e02680d09
setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents:
14833
diff
changeset
|
241 anyincoming = (srvheadhashes != [nullid]) |
192e02680d09
setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents:
14833
diff
changeset
|
242 return dag.externalizeall(result), anyincoming, srvheadhashes |
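
The sampling loop in the listing above (lines 185-223) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Mercurial's implementation: revisions are plain integers numbered so that parents always precede children, the remote is simulated by a local set, and uniform random sampling stands in for `_takequicksample` / `_takefullsample`. The helper names `ancestors`, `descendants`, `heads` and `discover` are hypothetical and exist only for this example.

```python
import random


def ancestors(parents, revs):
    """Return revs plus all of their ancestors."""
    seen = set()
    stack = list(revs)
    while stack:
        r = stack.pop()
        if r in seen:
            continue
        seen.add(r)
        stack.extend(parents[r])
    return seen


def descendants(parents, revs):
    """Return revs plus every rev that descends from one of them.

    Assumes parents always have smaller numbers than their children,
    so a single pass in increasing order is enough.
    """
    result = set(revs)
    for r in sorted(parents):
        if any(p in result for p in parents[r]):
            result.add(r)
    return result


def heads(parents, revs):
    """Return the members of revs that are not a parent of another member."""
    revs = set(revs)
    nonheads = set(p for r in revs for p in parents[r])
    return revs - nonheads


def discover(parents, remote_known, samplesize=2, seed=0):
    """Toy set discovery: returns (heads of the common set, roundtrips)."""
    rng = random.Random(seed)
    undecided = set(parents)  # own nodes whose remote status is unknown
    common = set()            # own nodes the remote is known to have
    missing = set()           # own nodes the remote is known to lack
    roundtrips = 0
    while undecided:
        size = min(samplesize, len(undecided))
        sample = rng.sample(sorted(undecided), size)
        roundtrips += 1
        yesno = remote_known(sample)  # one simulated network query
        for rev, known in zip(sample, yesno):
            if known:
                # the remote has rev, hence all of its ancestors too
                common |= ancestors(parents, [rev])
            else:
                # the remote lacks rev, hence all of its descendants too
                missing |= descendants(parents, [rev])
        undecided -= common
        undecided -= missing
    return heads(parents, common), roundtrips


# Local history: 0 - 1 - 2 - 3, with a side branch 2 - 4 - 5.
parents = {0: [], 1: [0], 2: [1], 3: [2], 4: [2], 5: [4]}
remote = {0, 1, 2, 3}  # the simulated remote lacks 4 and 5
commonheads, queries = discover(parents, lambda s: [r in remote for r in s])
```

Every sampled revision is classified on each round trip, and each answer also settles its ancestors (if known) or descendants (if unknown), which is why `undecided` usually shrinks much faster than one revision per query.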