comparison mercurial/graphmod.py @ 23570:3f86fe9bcef0

graphmod: attempt to clarify documentation of groupbranchiter() Thanks to Pierre-Yves for checking my cleanups here and helping me understand the algorithm well enough to help document it.
author Augie Fackler <raf@durin42.com>
date Tue, 09 Dec 2014 09:35:04 -0500
parents 3ecbcffdeb0c
children d8e0c591781c
comparison
equal deleted inserted replaced
23569:3ecbcffdeb0c 23570:3f86fe9bcef0
23 import heapq 23 import heapq
24 24
25 CHANGESET = 'C' 25 CHANGESET = 'C'
26 26
27 def groupbranchiter(revs, parentsfunc, firstbranch=()): 27 def groupbranchiter(revs, parentsfunc, firstbranch=()):
28 """yield revision from heads to roots one (topo) branch after the other. 28 """Yield revisions from heads to roots one (topo) branch at a time.
29 29
30 This function aims to be used by a graph generator that wishes to minimize 30 This function aims to be used by a graph generator that wishes to minimize
31 the amount of parallel branches and their interleaving. 31 the number of parallel branches and their interleaving.
32 32
33 Example iteration order: 33 Example iteration order (numbers show the "true" order in a changelog):
34 34
35 o 4 35 o 4
36 | 36 |
37 o 1 37 o 1
38 | 38 |
40 | | 40 | |
41 | o 2 41 | o 2
42 |/ 42 |/
43 o 0 43 o 0
44 44
45 Currently consider every changeset under a merge to be on the same branch 45 Note that the ancestors of merges are understood by the current
46 using revision number to sort them. 46 algorithm to be on the same branch. This means no reordering will
47 occur behind a merge.
47 """ 48 """
48 49
49 ### Quick summary of the algorithm 50 ### Quick summary of the algorithm
50 # 51 #
51 # This function is based around a "retention" principle. We keep revisions 52 # This function is based around a "retention" principle. We keep revisions
52 # in memory until we are ready to emit a whole branch that immediately 53 # in memory until we are ready to emit a whole branch that immediately
53 # "merge" into an existing one. This reduce the number of branch "ongoing" 54 # "merges" into an existing one. This reduces the number of parallel
54 # at the same time. 55 # branches with interleaved revisions.
55 # 56 #
56 # During iteration revs are split into two groups: 57 # During iteration revs are split into two groups:
57 # A) revision already emitted 58 # A) revision already emitted
58 # B) revision in "retention". They are stored as different subgroups. 59 # B) revision in "retention". They are stored as different subgroups.
59 # 60 #
60 # for each REV, we do the follow logic: 61 # for each REV, we do the following logic:
61 # 62 #
62 # a) if REV is a parent of (A), we will emit it. But before emitting it, 63 # 1) if REV is a parent of (A), we will emit it. If there is a
63 # we'll "free" all the revs from subgroup in (B) that were waiting for 64 # retention group ((B) above) that is blocked on REV being
64 # REV to be available. So we emit all revision of such subgroup before 65 # available, we emit all the revisions out of that retention
65 # emitting REV 66 # group first.
66 # 67 #
67 # b) else, we'll search for a subgroup in (B) awaiting for REV to be 68 # 2) else, we'll search for a subgroup in (B) awaiting for REV to be
68 # available, if such subgroup exist, we add REV to it and the subgroup is 69 # available, if such subgroup exist, we add REV to it and the subgroup is
69 # now awaiting for REV.parents() to be available. 70 # now awaiting for REV.parents() to be available.
70 # 71 #
71 # c) finally if no such group existed in (B), we create a new subgroup. 72 # 3) finally if no such group existed in (B), we create a new subgroup.
72 # 73 #
73 # 74 #
74 # To bootstrap the algorithm, we emit the tipmost revision. 75 # To bootstrap the algorithm, we emit the tipmost revision (which
76 # puts it in group (A) from above).
75 77
76 revs.sort(reverse=True) 78 revs.sort(reverse=True)
77 79
78 # Set of parents of revision that have been yield. They can be considered 80 # Set of parents of revision that have been emitted. They can be considered
79 # unblocked as the graph generator is already aware of them so there is no 81 # unblocked as the graph generator is already aware of them so there is no
80 # need to delay the one that reference them. 82 # need to delay the revisions that reference them.
81 # 83 #
82 # If someone wants to prioritize a branch over the others, pre-filling this 84 # If someone wants to prioritize a branch over the others, pre-filling this
83 # set will force all other branches to wait until this branch is ready to be 85 # set will force all other branches to wait until this branch is ready to be
84 # outputed. 86 # emitted.
85 unblocked = set(firstbranch) 87 unblocked = set(firstbranch)
86 88
87 # list of group waiting to be displayed, each group is defined by: 89 # list of groups waiting to be displayed, each group is defined by:
88 # 90 #
89 # (revs: lists of revs waiting to be displayed, 91 # (revs: lists of revs waiting to be displayed,
90 # blocked: set of that cannot be displayed before those in 'revs') 92 # blocked: set of that cannot be displayed before those in 'revs')
91 # 93 #
92 # The second value ('blocked') correspond to parents of any revision in the 94 # The second value ('blocked') correspond to parents of any revision in the
93 # group ('revs') that is not itself contained in the group. The main idea 95 # group ('revs') that is not itself contained in the group. The main idea
94 # of this algorithm is to delay as much as possible the emission of any 96 # of this algorithm is to delay as much as possible the emission of any
95 # revision. This means waiting for the moment we are about to display 97 # revision. This means waiting for the moment we are about to display
96 # theses parents to display the revs in a group. 98 # these parents to display the revs in a group.
97 # 99 #
98 # This first implementation is smart until it meet a merge: it will emit 100 # This first implementation is smart until it encounters a merge: it will
99 # revs as soon as any parents is about to be emitted and can grow an 101 # emit revs as soon as any parent is about to be emitted and can grow an
100 # arbitrary number of revs in 'blocked'. In practice this mean we properly 102 # arbitrary number of revs in 'blocked'. In practice this mean we properly
101 # retains new branches but give up on any special ordering for ancestors of 103 # retains new branches but gives up on any special ordering for ancestors
102 # merges. The implementation can be improved to handle this better. 104 # of merges. The implementation can be improved to handle this better.
103 # 105 #
104 # The first subgroup is special. It correspond to all the revision that 106 # The first subgroup is special. It corresponds to all the revision that
105 # were already emitted. The 'revs' lists is expected to be empty and the 107 # were already emitted. The 'revs' lists is expected to be empty and the
106 # 'blocked' set contains the parents revisions of already emitted revision. 108 # 'blocked' set contains the parents revisions of already emitted revision.
107 # 109 #
108 # You could pre-seed the <parents> set of groups[0] to a specific 110 # You could pre-seed the <parents> set of groups[0] to a specific
109 # changesets to select what the first emitted branch should be. 111 # changesets to select what the first emitted branch should be.
128 130
129 # Seek for a subgroup blocked, waiting for the current revision. 131 # Seek for a subgroup blocked, waiting for the current revision.
130 matching = [i for i, g in enumerate(groups) if rev in g[1]] 132 matching = [i for i, g in enumerate(groups) if rev in g[1]]
131 133
132 if matching: 134 if matching:
133 # The main idea is to gather together all sets that await on 135 # The main idea is to gather together all sets that are blocked
134 # the same revision. 136 # on the same revision.
135 # 137 #
136 # This merging is done at the time we are about to add this 138 # Groups are merged when a common blocking ancestor is
137 # common awaited to the subgroup for simplicity purpose. Such 139 # observed. For example, given two groups:
138 # merge could happen sooner when we update the "blocked" set of 140 #
139 # revision. 141 # revs [5, 4] waiting for 1
142 # revs [3, 2] waiting for 1
143 #
144 # These two groups will be merged when we process
145 # 1. In theory, we could have merged the groups when
146 # we added 2 to the group it is now in (we could have
147 # noticed the groups were both blocked on 1 then), but
148 # the way it works now makes the algorithm simpler.
140 # 149 #
141 # We also always keep the oldest subgroup first. We can 150 # We also always keep the oldest subgroup first. We can
142 # probably improve the behavior by having the longuest set 151 # probably improve the behavior by having the longest set
143 # first. That way, graph algorythms could minimise the length 152 # first. That way, graph algorithms could minimise the length
144 # of parallele lines their draw. This is currently not done. 153 # of parallel lines their drawing. This is currently not done.
145 targetidx = matching.pop(0) 154 targetidx = matching.pop(0)
146 trevs, tparents = groups[targetidx] 155 trevs, tparents = groups[targetidx]
147 for i in matching: 156 for i in matching:
148 gr = groups[i] 157 gr = groups[i]
149 trevs.extend(gr[0]) 158 trevs.extend(gr[0])
150 tparents |= gr[1] 159 tparents |= gr[1]
151 # delete all merged subgroups (but the one we keep) (starting 160 # delete all merged subgroups (except the one we kept)
152 # from the last subgroup for performance and sanity reason) 161 # (starting from the last subgroup for performance and
162 # sanity reasons)
153 for i in reversed(matching): 163 for i in reversed(matching):
154 del groups[i] 164 del groups[i]
155 else: 165 else:
156 # This is a new head. We create a new subgroup for it. 166 # This is a new head. We create a new subgroup for it.
157 targetidx = len(groups) 167 targetidx = len(groups)
158 groups.append(([], set([rev]))) 168 groups.append(([], set([rev])))
159 169
160 gr = groups[targetidx] 170 gr = groups[targetidx]
161 171
162 # We now adds the current nodes to this subgroups. This is done 172 # We now add the current nodes to this subgroups. This is done
163 # after the subgroup merging because all elements from a subgroup 173 # after the subgroup merging because all elements from a subgroup
164 # that relied on this rev must preceed it. 174 # that relied on this rev must precede it.
165 # 175 #
166 # we also update the <parents> set to includes the parents on the 176 # we also update the <parents> set to include the parents of the
167 # new nodes. 177 # new nodes.
168 if rev == currentrev: # only display stuff in rev 178 if rev == currentrev: # only display stuff in rev
169 gr[0].append(rev) 179 gr[0].append(rev)
170 gr[1].remove(rev) 180 gr[1].remove(rev)
171 parents = [p for p in parentsfunc(rev) if p > nullrev] 181 parents = [p for p in parentsfunc(rev) if p > nullrev]
175 pendingset.add(p) 185 pendingset.add(p)
176 heappush(pendingheap, -p) 186 heappush(pendingheap, -p)
177 187
178 # Look for a subgroup to display 188 # Look for a subgroup to display
179 # 189 #
180 # When unblocked is empty (if clause), We are not waiting over any 190 # When unblocked is empty (if clause), we were not waiting for any
181 # revision during the first iteration (if no priority was given) or 191 # revisions during the first iteration (if no priority was given) or
182 # if we outputed a whole disconnected sets of the graph (reached a 192 # if we emitted a whole disconnected set of the graph (reached a
183 # root). In that case we arbitrarily takes the oldest known 193 # root). In that case we arbitrarily take the oldest known
184 # subgroup. The heuristique could probably be better. 194 # subgroup. The heuristic could probably be better.
185 # 195 #
186 # Otherwise (elif clause) this mean we have some emitted revision. 196 # Otherwise (elif clause) if the subgroup is blocked on
187 # if the subgroup awaits on the same revision that the outputed 197 # a revision we just emitted, we can safely emit it as
188 # ones, we can safely output it. 198 # well.
189 if not unblocked: 199 if not unblocked:
190 if len(groups) > 1: # display other subset 200 if len(groups) > 1: # display other subset
191 targetidx = 1 201 targetidx = 1
192 gr = groups[1] 202 gr = groups[1]
193 elif not gr[1] & unblocked: 203 elif not gr[1] & unblocked:
204 # unless it is groups[0] in which case you just empty it. 214 # unless it is groups[0] in which case you just empty it.
205 if targetidx: 215 if targetidx:
206 del groups[targetidx] 216 del groups[targetidx]
207 else: 217 else:
208 gr[0][:] = [] 218 gr[0][:] = []
209 # Check if we have some subgroup waiting for revision we are not going to 219 # Check if we have some subgroup waiting for revisions we are not going to
210 # iterate over 220 # iterate over
211 for g in groups: 221 for g in groups:
212 for r in g[0]: 222 for r in g[0]:
213 yield r 223 yield r
214 224