annotate mercurial/ancestor.py @ 17970:0b03454abae7

ancestor: faster algorithm for difference of ancestor sets One of the major reasons rebase is slow in large repositories is the computation of the detach set: the set of ancestors of the changesets to rebase not in the destination parent. This is currently done via a revset that does two walks all the way to the root of the DAG. Instead of doing that, to find ancestors of a set <revs> not in another set <common> we walk up the tree in reverse revision number order, maintaining sets of nodes visited from <revs>, <common> or both. For the common case where the sets are close both topologically and in revision number (relative to repository size), this has been found to speed up rebase by around 15-20%. When the nodes are farther apart and the DAG is highly branching, it is harder to say which would win. Here's how long computing the detach set takes in a linear repository with over 400000 changesets, rebasing near tip: Rebasing across 4 changesets Revset method: 2.2s New algorithm: 0.00015s Rebasing across 250 changesets Revset method: 2.2s New algorithm: 0.00069s Rebasing across 10000 changesets Revset method: 2.4s New algorithm: 0.019s
author Siddharth Agarwal <sid0@fb.com>
date Mon, 26 Nov 2012 11:46:51 -0800
parents 1ffeeb91c55d
children 09d5681d5b72
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
1 # ancestor.py - generic DAG ancestor algorithm for mercurial
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
2 #
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
3 # Copyright 2006 Matt Mackall <mpm@selenic.com>
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
4 #
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7882
diff changeset
5 # This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 8465
diff changeset
6 # GNU General Public License version 2 or any later version.
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
7
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
8 import heapq
17970
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
9 from node import nullrev
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
10
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
11 def ancestor(a, b, pfunc):
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
12 """
13554
22565ddb28e7 ancestor: improve description
Matt Mackall <mpm@selenic.com>
parents: 12401
diff changeset
13 Returns the common ancestor of a and b that is furthest from a
22565ddb28e7 ancestor: improve description
Matt Mackall <mpm@selenic.com>
parents: 12401
diff changeset
14 root (as measured by longest path) or None if no ancestor is
22565ddb28e7 ancestor: improve description
Matt Mackall <mpm@selenic.com>
parents: 12401
diff changeset
15 found. If there are multiple common ancestors at the same
22565ddb28e7 ancestor: improve description
Matt Mackall <mpm@selenic.com>
parents: 12401
diff changeset
16 distance, the first one found is returned.
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
17
9915
806e6b6cb8d8 ancestor: improve docstring
Sune Foldager <cryo@cyanite.org>
parents: 8465
diff changeset
18 pfunc must return a list of parent vertices for a given vertex
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
19 """
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
20
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
21 if a == b:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
22 return a
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
23
11418
67bb9d78f05e merge: sort arguments to stabilize the ancestor search
Matt Mackall <mpm@selenic.com>
parents: 10264
diff changeset
24 a, b = sorted([a, b])
67bb9d78f05e merge: sort arguments to stabilize the ancestor search
Matt Mackall <mpm@selenic.com>
parents: 10264
diff changeset
25
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
26 # find depth from root of all ancestors
13554
22565ddb28e7 ancestor: improve description
Matt Mackall <mpm@selenic.com>
parents: 12401
diff changeset
27 # depth is stored as a negative for heapq
7882
8d78fc991b71 ancestor: caching the parent list to improve performance
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 6429
diff changeset
28 parentcache = {}
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
29 visit = [a, b]
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
30 depth = {}
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
31 while visit:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
32 vertex = visit[-1]
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
33 pl = pfunc(vertex)
7882
8d78fc991b71 ancestor: caching the parent list to improve performance
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 6429
diff changeset
34 parentcache[vertex] = pl
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
35 if not pl:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
36 depth[vertex] = 0
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
37 visit.pop()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
38 else:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
39 for p in pl:
12401
4cdaf1adafc8 backout most of 4f8067c94729
Matt Mackall <mpm@selenic.com>
parents: 12387
diff changeset
40 if p == a or p == b: # did we find a or b as a parent?
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
41 return p # we're done
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
42 if p not in depth:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
43 visit.append(p)
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
44 if visit[-1] == vertex:
13554
22565ddb28e7 ancestor: improve description
Matt Mackall <mpm@selenic.com>
parents: 12401
diff changeset
45 # -(maximum distance of parents + 1)
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
46 depth[vertex] = min([depth[p] for p in pl]) - 1
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
47 visit.pop()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
48
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
49 # traverse ancestors in order of decreasing distance from root
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
50 def ancestors(vertex):
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
51 h = [(depth[vertex], vertex)]
8465
23429ebd3f9d ancestor: use set instead of dict
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8225
diff changeset
52 seen = set()
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
53 while h:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
54 d, n = heapq.heappop(h)
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
55 if n not in seen:
8465
23429ebd3f9d ancestor: use set instead of dict
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8225
diff changeset
56 seen.add(n)
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
57 yield (d, n)
7882
8d78fc991b71 ancestor: caching the parent list to improve performance
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 6429
diff changeset
58 for p in parentcache[n]:
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
59 heapq.heappush(h, (depth[p], p))
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
60
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
61 def generations(vertex):
8465
23429ebd3f9d ancestor: use set instead of dict
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8225
diff changeset
62 sg, s = None, set()
3673
eb0b4a2d70a9 white space and line break cleanups
Thomas Arendsen Hein <thomas@intevation.de>
parents: 3135
diff changeset
63 for g, v in ancestors(vertex):
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
64 if g != sg:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
65 if sg:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
66 yield sg, s
8465
23429ebd3f9d ancestor: use set instead of dict
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8225
diff changeset
67 sg, s = g, set((v,))
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
68 else:
8465
23429ebd3f9d ancestor: use set instead of dict
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8225
diff changeset
69 s.add(v)
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
70 yield sg, s
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
71
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
72 x = generations(a)
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
73 y = generations(b)
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
74 gx = x.next()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
75 gy = y.next()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
76
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
77 # increment each ancestor list until it is closer to root than
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
78 # the other, or they match
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
79 try:
14494
1ffeeb91c55d check-code: flag 0/1 used as constant Boolean expression
Martin Geisler <mg@lazybytes.net>
parents: 13554
diff changeset
80 while True:
3135
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
81 if gx[0] == gy[0]:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
82 for v in gx[1]:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
83 if v in gy[1]:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
84 return v
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
85 gy = y.next()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
86 gx = x.next()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
87 elif gx[0] > gy[0]:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
88 gy = y.next()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
89 else:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
90 gx = x.next()
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
91 except StopIteration:
b1db258e875c Abstract ancestor algorithm into generic function
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
92 return None
17970
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
93
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
94 def missingancestors(revs, bases, pfunc):
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
95 """Return all the ancestors of revs that are not ancestors of bases.
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
96
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
97 This may include elements from revs.
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
98
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
99 Equivalent to the revset (::revs - ::bases). Revs are returned in
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
100 revision number order, which is a topological order.
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
101
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
102 revs and bases should both be iterables. pfunc must return a list of
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
103 parent revs for a given revs.
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
104
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
105 graph is a dict of child->parent adjacency lists for this graph:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
106 o 13
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
107 |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
108 | o 12
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
109 | |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
110 | | o 11
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
111 | | |\
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
112 | | | | o 10
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
113 | | | | |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
114 | o---+ | 9
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
115 | | | | |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
116 o | | | | 8
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
117 / / / /
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
118 | | o | 7
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
119 | | | |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
120 o---+ | 6
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
121 / / /
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
122 | | o 5
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
123 | |/
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
124 | o 4
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
125 | |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
126 o | 3
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
127 | |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
128 | o 2
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
129 |/
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
130 o 1
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
131 |
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
132 o 0
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
133 >>> graph = {0: [-1], 1: [0], 2: [1], 3: [1], 4: [2], 5: [4], 6: [4],
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
134 ... 7: [4], 8: [-1], 9: [6, 7], 10: [5], 11: [3, 7], 12: [9],
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
135 ... 13: [8]}
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
136 >>> pfunc = graph.get
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
137
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
138 Empty revs
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
139 >>> missingancestors([], [1], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
140 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
141 >>> missingancestors([], [], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
142 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
143
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
144 If bases is empty, it's the same as if it were [nullrev]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
145 >>> missingancestors([12], [], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
146 [0, 1, 2, 4, 6, 7, 9, 12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
147
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
148 Trivial case: revs == bases
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
149 >>> missingancestors([0], [0], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
150 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
151 >>> missingancestors([4, 5, 6], [6, 5, 4], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
152 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
153
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
154 With nullrev
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
155 >>> missingancestors([-1], [12], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
156 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
157 >>> missingancestors([12], [-1], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
158 [0, 1, 2, 4, 6, 7, 9, 12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
159
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
160 9 is a parent of 12. 7 is a parent of 9, so an ancestor of 12. 6 is an
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
161 ancestor of 12 but not of 7.
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
162 >>> missingancestors([12], [9], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
163 [12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
164 >>> missingancestors([9], [12], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
165 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
166 >>> missingancestors([12, 9], [7], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
167 [6, 9, 12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
168 >>> missingancestors([7, 6], [12], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
169 []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
170
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
171 More complex cases
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
172 >>> missingancestors([10], [11, 12], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
173 [5, 10]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
174 >>> missingancestors([11], [10], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
175 [3, 7, 11]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
176 >>> missingancestors([11], [10, 12], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
177 [3, 11]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
178 >>> missingancestors([12], [10], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
179 [6, 7, 9, 12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
180 >>> missingancestors([12], [11], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
181 [6, 9, 12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
182 >>> missingancestors([10, 11, 12], [13], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
183 [0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
184 >>> missingancestors([13], [10, 11, 12], pfunc)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
185 [8, 13]
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
186 """
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
187
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
188 revsvisit = set(revs)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
189 basesvisit = set(bases)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
190 if not revsvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
191 return []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
192 if not basesvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
193 basesvisit.add(nullrev)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
194 start = max(max(revsvisit), max(basesvisit))
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
195 bothvisit = revsvisit.intersection(basesvisit)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
196 revsvisit.difference_update(bothvisit)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
197 basesvisit.difference_update(bothvisit)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
198 # At this point, we hold the invariants that:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
199 # - revsvisit is the set of nodes we know are an ancestor of at least one
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
200 # of the nodes in revs
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
201 # - basesvisit is the same for bases
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
202 # - bothvisit is the set of nodes we know are ancestors of at least one of
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
203 # the nodes in revs and one of the nodes in bases
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
204 # - a node may be in none or one, but not more, of revsvisit, basesvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
205 # and bothvisit at any given time
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
206 # Now we walk down in reverse topo order, adding parents of nodes already
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
207 # visited to the sets while maintaining the invariants. When a node is
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
208 # found in both revsvisit and basesvisit, it is removed from them and
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
209 # added to bothvisit instead. When revsvisit becomes empty, there are no
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
210 # more ancestors of revs that aren't also ancestors of bases, so exit.
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
211
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
212 missing = []
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
213 for curr in xrange(start, nullrev, -1):
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
214 if not revsvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
215 break
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
216
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
217 if curr in bothvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
218 bothvisit.remove(curr)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
219 # curr's parents might have made it into revsvisit or basesvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
220 # through another path
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
221 for p in pfunc(curr):
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
222 revsvisit.discard(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
223 basesvisit.discard(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
224 bothvisit.add(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
225 continue
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
226
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
227 # curr will never be in both revsvisit and basesvisit, since if it
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
228 # were it'd have been pushed to bothvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
229 if curr in revsvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
230 missing.append(curr)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
231 thisvisit = revsvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
232 othervisit = basesvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
233 elif curr in basesvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
234 thisvisit = basesvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
235 othervisit = revsvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
236 else:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
237 # not an ancestor of a or b: ignore
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
238 continue
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
239
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
240 thisvisit.remove(curr)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
241 for p in pfunc(curr):
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
242 if p == nullrev:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
243 pass
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
244 elif p in othervisit or p in bothvisit:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
245 # p is implicitly in thisvisit. This means p is or should be
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
246 # in bothvisit
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
247 revsvisit.discard(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
248 basesvisit.discard(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
249 bothvisit.add(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
250 else:
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
251 # visit later
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
252 thisvisit.add(p)
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
253
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
254 missing.reverse()
0b03454abae7 ancestor: faster algorithm for difference of ancestor sets
Siddharth Agarwal <sid0@fb.com>
parents: 14494
diff changeset
255 return missing