diff mercurial/help/internals/wireprotocolv2.txt @ 40022:33eb670e2834

wireprotov2: define semantics for content redirects

When I implemented the clonebundles feature and deployed it on hg.mozilla.org using Amazon S3 as a content server, server-side CPU and bandwidth usage dropped off a cliff and a ton of server scaling headaches went away pretty much the instant clients with support for clonebundles were rolled out to Firefox CI.

An obvious takeaway from that experience was that offloading server load to scalable file servers - potentially backed by a CDN - is a really good idea.

Another takeaway was that Mercurial's wire protocol wasn't in a good position to support data offload generally. In wire protocol version 1, there isn't a mechanism in the protocol to say "grab the data from over here instead." For HTTP, we could teach the client to follow HTTP redirects. Or we could invent a media type that encoded redirects inline. But for SSH, we were pretty much out of luck because that protocol wasn't very flexible.

Wire protocol version 2 offers the opportunity to do something better.

The recent generic server-side content caching layer in the wire protocol version 2 server demonstrated that it is possible to have drop-in caching of responses to command requests. This by itself adds tons of value and already makes the built-in server much more scalable. But I don't want to stop there.

The existing server-side caching implementation has a big weakness: it requires the server to send data to the client. This means that the Mercurial server is potentially sending gigabytes of data to thousands of clients. This is problematic because compared to scaling static file servers, scaling dynamic servers is *hard*.

A solution to this is to "offload" serving of content to something that isn't the Mercurial server. By offloading content serving, you turn the Mercurial server from a centralized monolithic service to a distributed mostly-indexing service. Assuming high rates of content offload, this should drastically reduce the total work performed by the Mercurial server, both in terms of CPU and data transfer. This will make Mercurial servers vastly easier to scale.

This commit defines the semantics for "content redirects" in wire protocol version 2. Essentially:

* Servers advertise the set of locations a response could be served from.
* When making requests, clients advertise the set of locations they are willing to fetch content from.
* Servers can then replace the inline response with one that says "get the response from over here instead."

This feature - when fully implemented - will allow extending the server-side caching layer to facilitate such things as integrating your server-side cache with a scalable blob store (such as S3 or a CDN) and offloading most data transfer to that external service.

This feature could also be leveraged for load balancing. e.g. requests could come into a central server and then get redirected to an available mirror depending on server availability or locality. There's tons of potential :)

Differential Revision: https://phab.mercurial-scm.org/D4774
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 26 Sep 2018 18:02:06 -0700
parents d3d333ab167a
children 9b19b8ce3804
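
The ``redirect`` capability documented by this change can be pictured with a small Python sketch. It builds a capability map using only the keys the help text defines (``targets``, ``name``, ``protocol``, ``uris``, ``snirequired``, ``tlsversions``, ``hashes``) and shows one way a client might pick the targets it is willing to fetch from. The target names, URIs, the ``usable_targets`` helper, and the filtering rules are all assumptions made for illustration; this is not Mercurial's implementation.

```python
# Hypothetical capability map; names and URIs are invented for illustration.
REDIRECT_CAPABILITY = {
    b"targets": [
        {
            b"name": b"cdn",
            b"protocol": b"https",
            b"uris": [b"https://cdn.example.com/hg/"],
            b"snirequired": True,
            b"tlsversions": [b"1.2", b"1.3"],
        },
        {
            b"name": b"mirror-ssh",
            b"protocol": b"ssh",
            b"uris": [b"ssh://mirror.example.com/"],
        },
    ],
    b"hashes": [b"sha256", b"sha1"],
}


def usable_targets(capability, protocols, have_sni, tls_versions):
    """Return the names of advertised targets a (hypothetical) client can use."""
    names = []
    for target in capability.get(b"targets", []):
        if target[b"protocol"] not in protocols:
            continue
        # snirequired is optional and defaults to False.
        if target.get(b"snirequired", False) and not have_sni:
            continue
        # tlsversions is optional; when present, require some overlap.
        advertised = target.get(b"tlsversions")
        if advertised is not None and not set(advertised) & set(tls_versions):
            continue
        names.append(target[b"name"])
    return names
```

A client following the semantics in the commit message would send the resulting list of names with its request, which lets the server know where it may redirect that client.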
--- a/mercurial/help/internals/wireprotocolv2.txt	Wed Sep 26 17:16:56 2018 -0700
+++ b/mercurial/help/internals/wireprotocolv2.txt	Wed Sep 26 18:02:06 2018 -0700
@@ -111,6 +111,38 @@
    requirements can be used to determine whether a client can read a
    *raw* copy of file data available.
 
+redirect
+   A map declaring potential *content redirects* that may be used by this
+   server. Contains the following bytestring keys:
+
+   targets
+      (array of maps) Potential redirect targets. Values are maps describing
+      this target in more detail. Each map has the following bytestring keys:
+
+      name
+         (bytestring) Identifier for this target. The identifier will be used
+         by clients to uniquely identify this target.
+
+      protocol
+         (bytestring) High-level network protocol. Values can be
+         ``http``, ``https``, ``ssh``, etc.
+
+      uris
+         (array of bytestrings) Representative URIs for this target.
+
+      snirequired (optional)
+         (boolean) Indicates whether Server Name Indication is required
+         to use this target. Defaults to False.
+
+      tlsversions (optional)
+         (array of bytestring) Indicates which TLS versions are supported by
+         this target. Values are ``1.1``, ``1.2``, ``1.3``, etc.
+
+   hashes
+      (array of bytestring) Indicates support for hashing algorithms that are
+      used to ensure content integrity. Values include ``sha1``, ``sha256``,
+      etc.
+
 changesetdata
 -------------