Mercurial > public > mercurial-scm > hg
diff hgext/clonebundles.py @ 37498:aacfca6f9767
wireproto: support for pullbundles
Pullbundles are similar to clonebundles, but served as normal inline
bundle streams. They are almost transparent to the client -- the only
visible effect is that the client might get less changes than what it
asked for, i.e. not all requested head revisions are provided.
The client announces support for the necessary retries with the
partial-pull capability. After receiving a partial bundle, it updates
the set of revisions shared with the server and drops all now-known
heads from the request list. It will then rerun getbundle until
no changes are received or all remote heads are present.
Extend badserverext to support per-socket limit, i.e. don't assume that
the same limits should be applied to all sockets.
Differential Revision: https://phab.mercurial-scm.org/D1856
author | Joerg Sonnenberger <joerg@bec.de> |
---|---|
date | Thu, 18 Jan 2018 12:54:01 +0100 |
parents | d25802b0eef5 |
children | b4d85bc122bd |
line wrap: on
line diff
--- a/hgext/clonebundles.py Fri Apr 06 22:39:58 2018 -0700 +++ b/hgext/clonebundles.py Thu Jan 18 12:54:01 2018 +0100 @@ -6,7 +6,8 @@ "clonebundles" is a server-side extension used to advertise the existence of pre-generated, externally hosted bundle files to clients that are cloning so that cloning can be faster, more reliable, and require less -resources on the server. +resources on the server. "pullbundles" is a related feature for sending +pre-generated bundle files to clients as part of pull operations. Cloning can be a CPU and I/O intensive operation on servers. Traditionally, the server, in response to a client's request to clone, dynamically generates @@ -16,8 +17,12 @@ servers with large repositories or with high clone volume, the load from clones can make scaling the server challenging and costly. -This extension provides server operators the ability to offload potentially -expensive clone load to an external service. Here's how it works. +This extension provides server operators the ability to offload +potentially expensive clone load to an external service. Pre-generated +bundles also allow using more CPU intensive compression, reducing the +effective bandwidth requirements. + +Here's how clone bundles work: 1. A server operator establishes a mechanism for making bundle files available on a hosting service where Mercurial clients can fetch them. @@ -33,7 +38,7 @@ 7. The client reconnects to the original server and performs the equivalent of :hg:`pull` to retrieve all repository data not in the bundle. (The repository could have been updated between when the bundle was created - and when the client started the clone.) + and when the client started the clone.) This may use "pullbundles". Instead of the server generating full repository bundles for every clone request, it generates full bundles once and they are subsequently reused to @@ -42,13 +47,27 @@ created. For large, established repositories, this can reduce server load for clones to less than 1% of original. +Here's how pullbundles work: + +1. A manifest file listing available bundles and describing the revisions + is added to the Mercurial repository on the server. +2. A new-enough client informs the server that it supports partial pulls + and initiates a pull. +3. If the server has pull bundles enabled and sees the client advertising + partial pulls, it checks for a matching pull bundle in the manifest. + A bundle matches if the format is supported by the client, the client + has the required revisions already and needs something from the bundle. +4. If there is at least one matching bundle, the server sends it to the client. +5. The client applies the bundle and notices that the server reply was + incomplete. It initiates another pull. + To work, this extension requires the following of server operators: * Generating bundle files of repository content (typically periodically, such as once per day). -* A file server that clients have network access to and that Python knows - how to talk to through its normal URL handling facility (typically an - HTTP server). +* Clone bundles: A file server that clients have network access to and that + Python knows how to talk to through its normal URL handling facility + (typically an HTTP/HTTPS server). * A process for keeping the bundles manifest in sync with available bundle files. @@ -61,7 +80,7 @@ :hg:`bundle --all` is used to produce a bundle of the entire repository. :hg:`debugcreatestreamclonebundle` can be used to produce a special -*streaming clone bundle*. These are bundle files that are extremely efficient +*streaming clonebundle*. These are bundle files that are extremely efficient to produce and consume (read: fast). However, they are larger than traditional bundle formats and require that clients support the exact set of repository data store formats in use by the repository that created them. @@ -73,7 +92,8 @@ A server operator is responsible for creating a ``.hg/clonebundles.manifest`` file containing the list of available bundle files suitable for seeding clones. If this file does not exist, the repository will not advertise the -existence of clone bundles when clients connect. +existence of clone bundles when clients connect. For pull bundles, +``.hg/pullbundles.manifest`` is used. The manifest file contains a newline (\\n) delimited list of entries. @@ -85,6 +105,9 @@ pairs describing additional properties of this bundle. Both keys and values are URI encoded. +For pull bundles, the URL is a path under the ``.hg`` directory of the +repository. + Keys in UPPERCASE are reserved for use by Mercurial and are defined below. All non-uppercase keys can be used by site installations. An example use for custom properties is to use the *datacenter* attribute to define which @@ -133,6 +156,15 @@ Value should be "true". +heads + Used for pull bundles. This contains the ``;`` separated changeset + hashes of the heads of the bundle content. + +bases + Used for pull bundles. This contains the ``;`` separated changeset + hashes of the roots of the bundle content. This can be skipped if + the bundle was created without ``--base``. + Manifests can contain multiple entries. Assuming metadata is defined, clients will filter entries from the manifest that they don't support. The remaining entries are optionally sorted by client preferences