mercurial-scm/hg: mercurial/namespaces.py@f81c17ec303c

hgdemandimport: apply lazy module loading to sys.meta_path finders Python's `sys.meta_path` finders are the primary objects whose job it is to find a module at import time. When `import` is called, Python iterates objects in this list and calls `o.find_spec(...)` to find a `ModuleSpec` (or None if the module couldn't be found by that finder). If no meta path finder can find a module, import fails. One of the default meta path finders is `PathFinder`. Its job is to import modules from the filesystem and is probably the most important importer. This finder looks at `sys.path` and `sys.path_hooks` to do its job. The `ModuleSpec` returned by `MetaPathImporter.find_spec()` has a `loader` attribute, which defines the concrete module loader to use. `sys.path_hooks` is a hook point for teaching `PathFinder` to instantiate custom loader types. Previously, we injected a custom `sys.path_hook` that told `PathFinder` to wrap the default loaders with a loader that creates a module object that is lazy. This approach worked. But its main limitation was that it only applied to the `PathFinder` meta path importer. There are other meta path importers that are registered. And in the case of PyOxidizer loading modules from memory, `PathFinder` doesn't come into play since PyOxidizer's own meta path importer was handling all imports. This commit changes our approach to lazy module loading by proxying all meta path importers. Specifically, we overload the `find_spec()` method to swap in a wrapped loader on the `ModuleSpec` before it is returned. The end result of this is all meta path importers should be lazy. As much as I would have loved to utilize .__class__ manipulation to achieve this, some meta path importers are implemented in C/Rust in such a way that they cannot be monkeypatched. This is why we use __getattribute__ to define a proxy. Also, this change could theoretically open us up to regressions in meta path importers whose loader is creating module objects which can't be monkeypatched. But I'm not aware of any of these in the wild. So I think we'll be safe. According to hyperfine, this change yields a decent startup time win of 5-6ms: ``` Benchmark #1: ~/.pyenv/versions/3.6.10/bin/python ./hg version Time (mean ? ?): 86.8 ms ? 0.5 ms [User: 78.0 ms, System: 8.7 ms] Range (min ? max): 86.0 ms ? 89.1 ms 50 runs Time (mean ? ?): 81.1 ms ? 2.7 ms [User: 74.5 ms, System: 6.5 ms] Range (min ? max): 77.8 ms ? 90.5 ms 50 runs Benchmark #2: ~/.pyenv/versions/3.7.6/bin/python ./hg version Time (mean ? ?): 78.9 ms ? 0.6 ms [User: 70.2 ms, System: 8.7 ms] Range (min ? max): 78.1 ms ? 81.2 ms 50 runs Time (mean ? ?): 73.4 ms ? 0.6 ms [User: 65.3 ms, System: 8.0 ms] Range (min ? max): 72.4 ms ? 75.7 ms 50 runs Benchmark #3: ~/.pyenv/versions/3.8.1/bin/python ./hg version Time (mean ? ?): 78.1 ms ? 0.6 ms [User: 70.2 ms, System: 7.9 ms] Range (min ? max): 77.4 ms ? 80.9 ms 50 runs Time (mean ? ?): 72.1 ms ? 0.4 ms [User: 64.4 ms, System: 7.6 ms] Range (min ? max): 71.4 ms ? 74.1 ms 50 runs ``` Differential Revision: https://phab.mercurial-scm.org/D7954


from __future__ import absolute_import

from .i18n import _
from . import (
    pycompat,
    registrar,
    templatekw,
    util,
)


def tolist(val):
    """
    a convenience method to return an empty list instead of None
    """
    if val is None:
        return []
    else:
        return [val]


class namespaces(object):
    """provides an interface to register and operate on multiple namespaces. See
    the namespace class below for details on the namespace object.

    """

    _names_version = 0

    def __init__(self):
        self._names = util.sortdict()
        columns = templatekw.getlogcolumns()

        # we need current mercurial named objects (bookmarks, tags, and
        # branches) to be initialized somewhere, so that place is here
        bmknames = lambda repo: repo._bookmarks.keys()
        bmknamemap = lambda repo, name: tolist(repo._bookmarks.get(name))
        bmknodemap = lambda repo, node: repo.nodebookmarks(node)
        n = namespace(
            b"bookmarks",
            templatename=b"bookmark",
            logfmt=columns[b'bookmark'],
            listnames=bmknames,
            namemap=bmknamemap,
            nodemap=bmknodemap,
            builtin=True,
        )
        self.addnamespace(n)

        tagnames = lambda repo: [t for t, n in repo.tagslist()]
        tagnamemap = lambda repo, name: tolist(repo._tagscache.tags.get(name))
        tagnodemap = lambda repo, node: repo.nodetags(node)
        n = namespace(
            b"tags",
            templatename=b"tag",
            logfmt=columns[b'tag'],
            listnames=tagnames,
            namemap=tagnamemap,
            nodemap=tagnodemap,
            deprecated={b'tip'},
            builtin=True,
        )
        self.addnamespace(n)

        bnames = lambda repo: repo.branchmap().keys()
        bnamemap = lambda repo, name: tolist(repo.branchtip(name, True))
        bnodemap = lambda repo, node: [repo[node].branch()]
        n = namespace(
            b"branches",
            templatename=b"branch",
            logfmt=columns[b'branch'],
            listnames=bnames,
            namemap=bnamemap,
            nodemap=bnodemap,
            builtin=True,
        )
        self.addnamespace(n)

    def __getitem__(self, namespace):
        """returns the namespace object"""
        return self._names[namespace]

    def __iter__(self):
        return self._names.__iter__()

    def items(self):
        return pycompat.iteritems(self._names)

    iteritems = items

    def addnamespace(self, namespace, order=None):
        """register a namespace

        namespace: the name to be registered (in plural form)
        order: optional argument to specify the order of namespaces
               (e.g. 'branches' should be listed before 'bookmarks')

        """
        if order is not None:
            self._names.insert(order, namespace.name, namespace)
        else:
            self._names[namespace.name] = namespace

        # we only generate a template keyword if one does not already exist
        if namespace.name not in templatekw.keywords:
            templatekeyword = registrar.templatekeyword(templatekw.keywords)

            @templatekeyword(namespace.name, requires={b'repo', b'ctx'})
            def generatekw(context, mapping):
                return templatekw.shownames(context, mapping, namespace.name)

    def singlenode(self, repo, name):
        """
        Return the 'best' node for the given name. What's best is defined
        by the namespace's singlenode() function. The first match returned by
        a namespace in the defined precedence order is used.

        Raises a KeyError if there is no such node.
        """
        for ns, v in pycompat.iteritems(self._names):
            n = v.singlenode(repo, name)
            if n:
                return n
        raise KeyError(_(b'no such name: %s') % name)


class namespace(object):
    """provides an interface to a namespace

    Namespaces are basically generic many-to-many mapping between some
    (namespaced) names and nodes. The goal here is to control the pollution of
    jamming things into tags or bookmarks (in extension-land) and to simplify
    internal bits of mercurial: log output, tab completion, etc.

    More precisely, we define a mapping of names to nodes, and a mapping from
    nodes to names. Each mapping returns a list.

    Furthermore, each name mapping will be passed a name to lookup which might
    not be in its domain. In this case, each method should return an empty list
    and not raise an error.

    This namespace object will define the properties we need:
      'name': the namespace (plural form)
      'templatename': name to use for templating (usually the singular form
                      of the plural namespace name)
      'listnames': list of all names in the namespace (usually the keys of a
                   dictionary)
      'namemap': function that takes a name and returns a list of nodes
      'nodemap': function that takes a node and returns a list of names
      'deprecated': set of names to be masked for ordinary use
      'builtin': bool indicating if this namespace is supported by core
                 Mercurial.
    """

    def __init__(
        self,
        name,
        templatename=None,
        logname=None,
        colorname=None,
        logfmt=None,
        listnames=None,
        namemap=None,
        nodemap=None,
        deprecated=None,
        builtin=False,
        singlenode=None,
    ):
        """create a namespace

        name: the namespace to be registered (in plural form)
        templatename: the name to use for templating
        logname: the name to use for log output; if not specified templatename
                 is used
        colorname: the name to use for colored log output; if not specified
                   logname is used
        logfmt: the format to use for (i18n-ed) log output; if not specified
                it is composed from logname
        listnames: function to list all names
        namemap: function that inputs a name, output node(s)
        nodemap: function that inputs a node, output name(s)
        deprecated: set of names to be masked for ordinary use
        builtin: whether namespace is implemented by core Mercurial
        singlenode: function that inputs a name, output best node (or None)
        """
        self.name = name
        self.templatename = templatename
        self.logname = logname
        self.colorname = colorname
        self.logfmt = logfmt
        self.listnames = listnames
        self.namemap = namemap
        self.nodemap = nodemap
        if singlenode:
            self.singlenode = singlenode

        # if logname is not specified, use the template name as backup
        if self.logname is None:
            self.logname = self.templatename

        # if colorname is not specified, just use the logname as a backup
        if self.colorname is None:
            self.colorname = self.logname

        # if logfmt is not specified, compose it from logname as backup
        if self.logfmt is None:
            # i18n: column positioning for "hg log"
            self.logfmt = (b"%s:" % self.logname).ljust(13) + b"%s\n"

        if deprecated is None:
            self.deprecated = set()
        else:
            self.deprecated = deprecated

        self.builtin = builtin

    def names(self, repo, node):
        """method that returns a (sorted) list of names in a namespace that
        match a given node"""
        return sorted(self.nodemap(repo, node))

    def nodes(self, repo, name):
        """method that returns a list of nodes in a namespace that
        match a given name.

        """
        return sorted(self.namemap(repo, name))

    def singlenode(self, repo, name):
        """returns the best node for the given name

        By default, the best node is the node from nodes() with the highest
        revision number. It can be overriden by the namespace."""
        n = self.namemap(repo, name)
        if n:
            # return max revision number
            if len(n) > 1:
                cl = repo.changelog
                maxrev = max(cl.rev(node) for node in n)
                return cl.node(maxrev)
            return n[0]
        return None

author	Gregory Szorc <gregory.szorc@gmail.com>
	Mon, 20 Jan 2020 23:51:25 -0800
changeset 44118	f81c17ec303c
parent 43106	d783f945a701
child 44728	59ad165f6cdb
permissions	-rw-r--r--