Mercurial > public > mercurial-scm > hg
comparison mercurial/graphmod.py @ 23570:3f86fe9bcef0
graphmod: attempt to clarify documentation of groupbranchiter()
Thanks to Pierre-Yves for checking my cleanups here and helping me
understand the algorithm well enough to help document it.
author | Augie Fackler <raf@durin42.com> |
---|---|
date | Tue, 09 Dec 2014 09:35:04 -0500 |
parents | 3ecbcffdeb0c |
children | d8e0c591781c |
comparison
equal
deleted
inserted
replaced
23569:3ecbcffdeb0c | 23570:3f86fe9bcef0 |
---|---|
23 import heapq | 23 import heapq |
24 | 24 |
25 CHANGESET = 'C' | 25 CHANGESET = 'C' |
26 | 26 |
27 def groupbranchiter(revs, parentsfunc, firstbranch=()): | 27 def groupbranchiter(revs, parentsfunc, firstbranch=()): |
28 """yield revision from heads to roots one (topo) branch after the other. | 28 """Yield revisions from heads to roots one (topo) branch at a time. |
29 | 29 |
30 This function aims to be used by a graph generator that wishes to minimize | 30 This function aims to be used by a graph generator that wishes to minimize |
31 the amount of parallel branches and their interleaving. | 31 the number of parallel branches and their interleaving. |
32 | 32 |
33 Example iteration order: | 33 Example iteration order (numbers show the "true" order in a changelog): |
34 | 34 |
35 o 4 | 35 o 4 |
36 | | 36 | |
37 o 1 | 37 o 1 |
38 | | 38 | |
40 | | | 40 | | |
41 | o 2 | 41 | o 2 |
42 |/ | 42 |/ |
43 o 0 | 43 o 0 |
44 | 44 |
45 Currently consider every changeset under a merge to be on the same branch | 45 Note that the ancestors of merges are understood by the current |
46 using revision number to sort them. | 46 algorithm to be on the same branch. This means no reordering will |
47 occur behind a merge. | |
47 """ | 48 """ |
48 | 49 |
49 ### Quick summary of the algorithm | 50 ### Quick summary of the algorithm |
50 # | 51 # |
51 # This function is based around a "retention" principle. We keep revisions | 52 # This function is based around a "retention" principle. We keep revisions |
52 # in memory until we are ready to emit a whole branch that immediately | 53 # in memory until we are ready to emit a whole branch that immediately |
53 # "merge" into an existing one. This reduce the number of branch "ongoing" | 54 # "merges" into an existing one. This reduces the number of parallel |
54 # at the same time. | 55 # branches with interleaved revisions. |
55 # | 56 # |
56 # During iteration revs are split into two groups: | 57 # During iteration revs are split into two groups: |
57 # A) revision already emitted | 58 # A) revision already emitted |
58 # B) revision in "retention". They are stored as different subgroups. | 59 # B) revision in "retention". They are stored as different subgroups. |
59 # | 60 # |
60 # for each REV, we do the follow logic: | 61 # for each REV, we do the following logic: |
61 # | 62 # |
62 # a) if REV is a parent of (A), we will emit it. But before emitting it, | 63 # 1) if REV is a parent of (A), we will emit it. If there is a |
63 # we'll "free" all the revs from subgroup in (B) that were waiting for | 64 # retention group ((B) above) that is blocked on REV being |
64 # REV to be available. So we emit all revision of such subgroup before | 65 # available, we emit all the revisions out of that retention |
65 # emitting REV | 66 # group first. |
66 # | 67 # |
67 # b) else, we'll search for a subgroup in (B) awaiting for REV to be | 68 # 2) else, we'll search for a subgroup in (B) awaiting for REV to be |
68 # available, if such subgroup exist, we add REV to it and the subgroup is | 69 # available, if such subgroup exist, we add REV to it and the subgroup is |
69 # now awaiting for REV.parents() to be available. | 70 # now awaiting for REV.parents() to be available. |
70 # | 71 # |
71 # c) finally if no such group existed in (B), we create a new subgroup. | 72 # 3) finally if no such group existed in (B), we create a new subgroup. |
72 # | 73 # |
73 # | 74 # |
74 # To bootstrap the algorithm, we emit the tipmost revision. | 75 # To bootstrap the algorithm, we emit the tipmost revision (which |
76 # puts it in group (A) from above). | |
75 | 77 |
76 revs.sort(reverse=True) | 78 revs.sort(reverse=True) |
77 | 79 |
78 # Set of parents of revision that have been yield. They can be considered | 80 # Set of parents of revision that have been emitted. They can be considered |
79 # unblocked as the graph generator is already aware of them so there is no | 81 # unblocked as the graph generator is already aware of them so there is no |
80 # need to delay the one that reference them. | 82 # need to delay the revisions that reference them. |
81 # | 83 # |
82 # If someone wants to prioritize a branch over the others, pre-filling this | 84 # If someone wants to prioritize a branch over the others, pre-filling this |
83 # set will force all other branches to wait until this branch is ready to be | 85 # set will force all other branches to wait until this branch is ready to be |
84 # outputed. | 86 # emitted. |
85 unblocked = set(firstbranch) | 87 unblocked = set(firstbranch) |
86 | 88 |
87 # list of group waiting to be displayed, each group is defined by: | 89 # list of groups waiting to be displayed, each group is defined by: |
88 # | 90 # |
89 # (revs: lists of revs waiting to be displayed, | 91 # (revs: lists of revs waiting to be displayed, |
90 # blocked: set of that cannot be displayed before those in 'revs') | 92 # blocked: set of that cannot be displayed before those in 'revs') |
91 # | 93 # |
92 # The second value ('blocked') correspond to parents of any revision in the | 94 # The second value ('blocked') correspond to parents of any revision in the |
93 # group ('revs') that is not itself contained in the group. The main idea | 95 # group ('revs') that is not itself contained in the group. The main idea |
94 # of this algorithm is to delay as much as possible the emission of any | 96 # of this algorithm is to delay as much as possible the emission of any |
95 # revision. This means waiting for the moment we are about to display | 97 # revision. This means waiting for the moment we are about to display |
96 # theses parents to display the revs in a group. | 98 # these parents to display the revs in a group. |
97 # | 99 # |
98 # This first implementation is smart until it meet a merge: it will emit | 100 # This first implementation is smart until it encounters a merge: it will |
99 # revs as soon as any parents is about to be emitted and can grow an | 101 # emit revs as soon as any parent is about to be emitted and can grow an |
100 # arbitrary number of revs in 'blocked'. In practice this mean we properly | 102 # arbitrary number of revs in 'blocked'. In practice this mean we properly |
101 # retains new branches but give up on any special ordering for ancestors of | 103 # retains new branches but gives up on any special ordering for ancestors |
102 # merges. The implementation can be improved to handle this better. | 104 # of merges. The implementation can be improved to handle this better. |
103 # | 105 # |
104 # The first subgroup is special. It correspond to all the revision that | 106 # The first subgroup is special. It corresponds to all the revision that |
105 # were already emitted. The 'revs' lists is expected to be empty and the | 107 # were already emitted. The 'revs' lists is expected to be empty and the |
106 # 'blocked' set contains the parents revisions of already emitted revision. | 108 # 'blocked' set contains the parents revisions of already emitted revision. |
107 # | 109 # |
108 # You could pre-seed the <parents> set of groups[0] to a specific | 110 # You could pre-seed the <parents> set of groups[0] to a specific |
109 # changesets to select what the first emitted branch should be. | 111 # changesets to select what the first emitted branch should be. |
128 | 130 |
129 # Seek for a subgroup blocked, waiting for the current revision. | 131 # Seek for a subgroup blocked, waiting for the current revision. |
130 matching = [i for i, g in enumerate(groups) if rev in g[1]] | 132 matching = [i for i, g in enumerate(groups) if rev in g[1]] |
131 | 133 |
132 if matching: | 134 if matching: |
133 # The main idea is to gather together all sets that await on | 135 # The main idea is to gather together all sets that are blocked |
134 # the same revision. | 136 # on the same revision. |
135 # | 137 # |
136 # This merging is done at the time we are about to add this | 138 # Groups are merged when a common blocking ancestor is |
137 # common awaited to the subgroup for simplicity purpose. Such | 139 # observed. For example, given two groups: |
138 # merge could happen sooner when we update the "blocked" set of | 140 # |
139 # revision. | 141 # revs [5, 4] waiting for 1 |
142 # revs [3, 2] waiting for 1 | |
143 # | |
144 # These two groups will be merged when we process | |
145 # 1. In theory, we could have merged the groups when | |
146 # we added 2 to the group it is now in (we could have | |
147 # noticed the groups were both blocked on 1 then), but | |
148 # the way it works now makes the algorithm simpler. | |
140 # | 149 # |
141 # We also always keep the oldest subgroup first. We can | 150 # We also always keep the oldest subgroup first. We can |
142 # probably improve the behavior by having the longuest set | 151 # probably improve the behavior by having the longest set |
143 # first. That way, graph algorythms could minimise the length | 152 # first. That way, graph algorithms could minimise the length |
144 # of parallele lines their draw. This is currently not done. | 153 # of parallel lines their drawing. This is currently not done. |
145 targetidx = matching.pop(0) | 154 targetidx = matching.pop(0) |
146 trevs, tparents = groups[targetidx] | 155 trevs, tparents = groups[targetidx] |
147 for i in matching: | 156 for i in matching: |
148 gr = groups[i] | 157 gr = groups[i] |
149 trevs.extend(gr[0]) | 158 trevs.extend(gr[0]) |
150 tparents |= gr[1] | 159 tparents |= gr[1] |
151 # delete all merged subgroups (but the one we keep) (starting | 160 # delete all merged subgroups (except the one we kept) |
152 # from the last subgroup for performance and sanity reason) | 161 # (starting from the last subgroup for performance and |
162 # sanity reasons) | |
153 for i in reversed(matching): | 163 for i in reversed(matching): |
154 del groups[i] | 164 del groups[i] |
155 else: | 165 else: |
156 # This is a new head. We create a new subgroup for it. | 166 # This is a new head. We create a new subgroup for it. |
157 targetidx = len(groups) | 167 targetidx = len(groups) |
158 groups.append(([], set([rev]))) | 168 groups.append(([], set([rev]))) |
159 | 169 |
160 gr = groups[targetidx] | 170 gr = groups[targetidx] |
161 | 171 |
162 # We now adds the current nodes to this subgroups. This is done | 172 # We now add the current nodes to this subgroups. This is done |
163 # after the subgroup merging because all elements from a subgroup | 173 # after the subgroup merging because all elements from a subgroup |
164 # that relied on this rev must preceed it. | 174 # that relied on this rev must precede it. |
165 # | 175 # |
166 # we also update the <parents> set to includes the parents on the | 176 # we also update the <parents> set to include the parents of the |
167 # new nodes. | 177 # new nodes. |
168 if rev == currentrev: # only display stuff in rev | 178 if rev == currentrev: # only display stuff in rev |
169 gr[0].append(rev) | 179 gr[0].append(rev) |
170 gr[1].remove(rev) | 180 gr[1].remove(rev) |
171 parents = [p for p in parentsfunc(rev) if p > nullrev] | 181 parents = [p for p in parentsfunc(rev) if p > nullrev] |
175 pendingset.add(p) | 185 pendingset.add(p) |
176 heappush(pendingheap, -p) | 186 heappush(pendingheap, -p) |
177 | 187 |
178 # Look for a subgroup to display | 188 # Look for a subgroup to display |
179 # | 189 # |
180 # When unblocked is empty (if clause), We are not waiting over any | 190 # When unblocked is empty (if clause), we were not waiting for any |
181 # revision during the first iteration (if no priority was given) or | 191 # revisions during the first iteration (if no priority was given) or |
182 # if we outputed a whole disconnected sets of the graph (reached a | 192 # if we emitted a whole disconnected set of the graph (reached a |
183 # root). In that case we arbitrarily takes the oldest known | 193 # root). In that case we arbitrarily take the oldest known |
184 # subgroup. The heuristique could probably be better. | 194 # subgroup. The heuristic could probably be better. |
185 # | 195 # |
186 # Otherwise (elif clause) this mean we have some emitted revision. | 196 # Otherwise (elif clause) if the subgroup is blocked on |
187 # if the subgroup awaits on the same revision that the outputed | 197 # a revision we just emitted, we can safely emit it as |
188 # ones, we can safely output it. | 198 # well. |
189 if not unblocked: | 199 if not unblocked: |
190 if len(groups) > 1: # display other subset | 200 if len(groups) > 1: # display other subset |
191 targetidx = 1 | 201 targetidx = 1 |
192 gr = groups[1] | 202 gr = groups[1] |
193 elif not gr[1] & unblocked: | 203 elif not gr[1] & unblocked: |
204 # unless it is groups[0] in which case you just empty it. | 214 # unless it is groups[0] in which case you just empty it. |
205 if targetidx: | 215 if targetidx: |
206 del groups[targetidx] | 216 del groups[targetidx] |
207 else: | 217 else: |
208 gr[0][:] = [] | 218 gr[0][:] = [] |
209 # Check if we have some subgroup waiting for revision we are not going to | 219 # Check if we have some subgroup waiting for revisions we are not going to |
210 # iterate over | 220 # iterate over |
211 for g in groups: | 221 for g in groups: |
212 for r in g[0]: | 222 for r in g[0]: |
213 yield r | 223 yield r |
214 | 224 |