comparison mercurial/wireprotoframing.py @ 37063:0a6c5cc09a88

wireproto: define human output side channel frame

Currently, the SSH protocol delivers output tailored for people over the stderr file descriptor. The HTTP protocol doesn't have this file descriptor (because it only has an input and output pipe). So it encodes textual output intended for humans within the protocol responses. Some response types have a facility for capturing output to be printed to users. Some don't. And sometimes the implementation of how that output is conveyed is super hacky.

On top of that, bundle2 has an "output" part that is used to store output that should be printed when this part is encountered. bundle2 also has the concept of "interrupt" chunks, which can be used to signal that the regular bundle2 stream is to be preempted by an out-of-band part that should be processed immediately. This "interrupt" part can be an "output" part and can be used to print data on the receiver.

The status quo is inconsistent and insane. We can do better.

This commit introduces a dedicated frame type on the frame-based protocol for denoting textual data that should be printed on the receiver. This frame type effectively constitutes a side-channel by which textual data can be printed on the receiver without interfering with other in-progress transmissions, such as the transmission of command responses.

But wait - there's more! Previous implementations that transferred textual data basically instructed the client to "print these bytes." This suffered from a few problems.

First, the text data that was transmitted and eventually printed originated from a server with a specific i18n configuration. This meant that clients would see text using whatever the i18n settings were on the server. Someone in France could connect to a server in Japan and see illegible Japanese glyphs - or maybe even mojibake.

Second, normalizing all text data on the server meant losing the ability to apply formatting to that data. Local Mercurial clients can apply specific formatting settings to individual atoms of text. For example, a revision can be colored differently from a commit message. With data over the wire, the potential for this rich formatting was lost. The best you could do (without parsing the text to be printed) was apply a universal label to it and e.g. color it specially.

The new mechanism for instructing the peer to print data does not have these limitations.

Frames instructing the peer to print text are composed of a formatting string plus arguments. This means receivers can plug the formatting string into the i18n database to see if a local translation is available. In addition, each atom being instructed to print has a series of "labels" associated with it. These labels can be mapped to the Mercurial UI's labels so locally configured coloring, styling, etc. settings can be applied.

What this all means is that textual messages originating on servers can be localized on the client and richly formatted, all while respecting the client's settings. This is slightly more complicated than "print these bytes." But it is vastly more user friendly.

FWIW, I'm not aware of other protocols that attempt to encode i18n and textual styling in this manner. You could argue that this feature is over-engineered. However, if I were to put myself in the shoes of a non-English speaker learning how to use version control, I think I would *love* this feature because it would enable me to see richly formatted text in my chosen locale.

Anyway, we only implement support for encoding frames of this type and basic tests for that encoding. We'll still need to hook up the server and its ui instance to emit these frames.

I recognize this feature may be a bit more controversial than other aspects of the wire protocol because it is a bit "radical." So I figured I'd start small to test the waters and see if others feel this feature is worthwhile.

Differential Revision: https://phab.mercurial-scm.org/D2872
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 14 Mar 2018 22:19:00 -0700
parents c5e9c3b47366
children 884a0c1604ad
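
The commit message describes receivers looking up the formatting string in a local i18n database before substituting arguments. This commit only implements encoding, so the following is merely a sketch of what that receiver-side step might look like. The names `CATALOG` and `render_atom` are hypothetical, and a plain dict stands in for a real translation database.

```python
# Hypothetical translation catalog; a real client would consult its own
# i18n database instead of this hard-coded dict.
CATALOG = {
    b'pushing to %s\n': b'envoi vers %s\n',
}


def render_atom(formatting, args, labels, catalog=CATALOG):
    """Localize one (formatting, args, labels) atom on the receiver.

    The formatting string is looked up *before* arguments are substituted,
    so the untranslated server string serves as the lookup key. Labels are
    passed through unchanged for the local UI to map to its own styles.
    """
    localized = catalog.get(formatting, formatting)
    text = localized % tuple(args)
    return text, list(labels)


text, labels = render_atom(
    b'pushing to %s\n', [b'https://example.com/repo'], [b'ui.status'])
```

Because the server sends the template and its arguments separately, a string with no local translation simply falls back to the server's wording.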
37062:bbea991635d0 37063:0a6c5cc09a88
25 FRAME_TYPE_COMMAND_NAME = 0x01 25 FRAME_TYPE_COMMAND_NAME = 0x01
26 FRAME_TYPE_COMMAND_ARGUMENT = 0x02 26 FRAME_TYPE_COMMAND_ARGUMENT = 0x02
27 FRAME_TYPE_COMMAND_DATA = 0x03 27 FRAME_TYPE_COMMAND_DATA = 0x03
28 FRAME_TYPE_BYTES_RESPONSE = 0x04 28 FRAME_TYPE_BYTES_RESPONSE = 0x04
29 FRAME_TYPE_ERROR_RESPONSE = 0x05 29 FRAME_TYPE_ERROR_RESPONSE = 0x05
30 FRAME_TYPE_TEXT_OUTPUT = 0x06
30 31
31 FRAME_TYPES = { 32 FRAME_TYPES = {
32 b'command-name': FRAME_TYPE_COMMAND_NAME, 33 b'command-name': FRAME_TYPE_COMMAND_NAME,
33 b'command-argument': FRAME_TYPE_COMMAND_ARGUMENT, 34 b'command-argument': FRAME_TYPE_COMMAND_ARGUMENT,
34 b'command-data': FRAME_TYPE_COMMAND_DATA, 35 b'command-data': FRAME_TYPE_COMMAND_DATA,
35 b'bytes-response': FRAME_TYPE_BYTES_RESPONSE, 36 b'bytes-response': FRAME_TYPE_BYTES_RESPONSE,
36 b'error-response': FRAME_TYPE_ERROR_RESPONSE, 37 b'error-response': FRAME_TYPE_ERROR_RESPONSE,
38 b'text-output': FRAME_TYPE_TEXT_OUTPUT,
37 } 39 }
38 40
39 FLAG_COMMAND_NAME_EOS = 0x01 41 FLAG_COMMAND_NAME_EOS = 0x01
40 FLAG_COMMAND_NAME_HAVE_ARGS = 0x02 42 FLAG_COMMAND_NAME_HAVE_ARGS = 0x02
41 FLAG_COMMAND_NAME_HAVE_DATA = 0x04 43 FLAG_COMMAND_NAME_HAVE_DATA = 0x04
83 FRAME_TYPE_COMMAND_NAME: FLAGS_COMMAND, 85 FRAME_TYPE_COMMAND_NAME: FLAGS_COMMAND,
84 FRAME_TYPE_COMMAND_ARGUMENT: FLAGS_COMMAND_ARGUMENT, 86 FRAME_TYPE_COMMAND_ARGUMENT: FLAGS_COMMAND_ARGUMENT,
85 FRAME_TYPE_COMMAND_DATA: FLAGS_COMMAND_DATA, 87 FRAME_TYPE_COMMAND_DATA: FLAGS_COMMAND_DATA,
86 FRAME_TYPE_BYTES_RESPONSE: FLAGS_BYTES_RESPONSE, 88 FRAME_TYPE_BYTES_RESPONSE: FLAGS_BYTES_RESPONSE,
87 FRAME_TYPE_ERROR_RESPONSE: FLAGS_ERROR_RESPONSE, 89 FRAME_TYPE_ERROR_RESPONSE: FLAGS_ERROR_RESPONSE,
90 FRAME_TYPE_TEXT_OUTPUT: {},
88 } 91 }
89 92
90 ARGUMENT_FRAME_HEADER = struct.Struct(r'<HH') 93 ARGUMENT_FRAME_HEADER = struct.Struct(r'<HH')
91 94
92 def makeframe(requestid, frametype, frameflags, payload): 95 def makeframe(requestid, frametype, frameflags, payload):
278 flags |= FLAG_ERROR_RESPONSE_PROTOCOL 281 flags |= FLAG_ERROR_RESPONSE_PROTOCOL
279 if application: 282 if application:
280 flags |= FLAG_ERROR_RESPONSE_APPLICATION 283 flags |= FLAG_ERROR_RESPONSE_APPLICATION
281 284
282 yield makeframe(requestid, FRAME_TYPE_ERROR_RESPONSE, flags, msg) 285 yield makeframe(requestid, FRAME_TYPE_ERROR_RESPONSE, flags, msg)

def createtextoutputframe(requestid, atoms):
    """Create a text output frame to render text to people.

    ``atoms`` is an iterable of 3-tuples of (formatting string, args, labels).

    The formatting string contains ``%s`` tokens to be replaced by the
    corresponding indexed entry in ``args``. ``labels`` is an iterable of
    formatters to be applied at rendering time. In terms of the ``ui``
    class, each atom corresponds to a ``ui.write()``.
    """
    bytesleft = DEFAULT_MAX_FRAME_SIZE
    atomchunks = []

    for (formatting, args, labels) in atoms:
        if len(args) > 255:
            raise ValueError('cannot use more than 255 formatting arguments')
        if len(labels) > 255:
            raise ValueError('cannot use more than 255 labels')

        # TODO look for localstr, other types here?

        if not isinstance(formatting, bytes):
            raise ValueError('must use bytes formatting strings')
        for arg in args:
            if not isinstance(arg, bytes):
                raise ValueError('must use bytes for arguments')
        for label in labels:
            if not isinstance(label, bytes):
                raise ValueError('must use bytes for labels')

        # Formatting string must be UTF-8.
        formatting = formatting.decode(r'utf-8', r'replace').encode(r'utf-8')

        # Arguments must be UTF-8.
        args = [a.decode(r'utf-8', r'replace').encode(r'utf-8') for a in args]

        # Labels must be ASCII.
        labels = [l.decode(r'ascii', r'strict').encode(r'ascii')
                  for l in labels]

        if len(formatting) > 65535:
            raise ValueError('formatting string cannot be longer than 64k')

        if any(len(a) > 65535 for a in args):
            raise ValueError('argument string cannot be longer than 64k')

        if any(len(l) > 255 for l in labels):
            raise ValueError('label string cannot be longer than 255 bytes')

        # Atom layout: formatting length, label and argument counts,
        # per-label lengths, per-argument lengths, then the raw data.
        chunks = [
            struct.pack(r'<H', len(formatting)),
            struct.pack(r'<BB', len(labels), len(args)),
            struct.pack(r'<' + r'B' * len(labels), *map(len, labels)),
            struct.pack(r'<' + r'H' * len(args), *map(len, args)),
        ]
        chunks.append(formatting)
        chunks.extend(labels)
        chunks.extend(args)

        atom = b''.join(chunks)
        atomchunks.append(atom)
        bytesleft -= len(atom)

    if bytesleft < 0:
        raise ValueError('cannot encode data in a single frame')

    yield makeframe(requestid, FRAME_TYPE_TEXT_OUTPUT, 0, b''.join(atomchunks))
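
This commit adds only the encoder, so as an illustration of the payload layout, here is a sketch of how a receiver might parse it back into atoms. `decode_atoms` is a hypothetical helper, not part of this changeset. `encode_atom` restates just the packing step of `createtextoutputframe()` so the example is self-contained, leaving out validation and the frame header.

```python
import struct


def encode_atom(formatting, args, labels):
    """Pack one atom using the same layout as createtextoutputframe()."""
    chunks = [
        struct.pack('<H', len(formatting)),
        struct.pack('<BB', len(labels), len(args)),
        struct.pack('<' + 'B' * len(labels), *map(len, labels)),
        struct.pack('<' + 'H' * len(args), *map(len, args)),
        formatting,
    ]
    chunks.extend(labels)
    chunks.extend(args)
    return b''.join(chunks)


def decode_atoms(payload):
    """Hypothetical inverse: yield (formatting, args, labels) per atom."""
    offset = 0
    while offset < len(payload):
        # Fixed-size header: formatting length, label count, arg count.
        fmtlen, labelcount, argcount = struct.unpack_from(
            '<HBB', payload, offset)
        offset += 4

        # Variable-size length tables: 1 byte per label, 2 bytes per arg.
        labellens = struct.unpack_from('<%dB' % labelcount, payload, offset)
        offset += labelcount
        arglens = struct.unpack_from('<%dH' % argcount, payload, offset)
        offset += 2 * argcount

        # Data follows in the order: formatting string, labels, arguments.
        formatting = payload[offset:offset + fmtlen]
        offset += fmtlen
        labels = []
        for size in labellens:
            labels.append(payload[offset:offset + size])
            offset += size
        args = []
        for size in arglens:
            args.append(payload[offset:offset + size])
            offset += size

        yield formatting, args, labels


atoms = [
    (b'%s: %s\n', [b'changeset', b'0a6c5cc09a88'], [b'log.changeset']),
    (b'done\n', [], []),
]
payload = b''.join(encode_atom(*atom) for atom in atoms)
decoded = list(decode_atoms(payload))
```

Putting all lengths ahead of the data means a parser can size every field of an atom before reading any string bytes.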
283 354
284 class serverreactor(object): 355 class serverreactor(object):
285 """Holds state of a server handling frame-based protocol requests. 356 """Holds state of a server handling frame-based protocol requests.
286 357
287 This class is the "brain" of the unified frame-based protocol server 358 This class is the "brain" of the unified frame-based protocol server