view rust/hg-core/src/lib.rs @ 49041:11c0411bf4e2
dirstate-tree: optimize HashMap lookups with raw_entry_mut
This switches to using `HashMap` from the hashbrown crate,
in order to use its `raw_entry_mut` method.
The standard library's `HashMap` is also based on this same crate,
but `raw_entry_mut` is not yet stable there:
https://github.com/rust-lang/rust/issues/56167
Using version 0.9 because 0.10 is yanked and 0.11 requires Rust 1.49.
In `DirstateMap::get_or_insert_node`, this replaces a call to
`HashMap<K, V>::entry` with `K = WithBasename<Cow<'on_disk, HgPath>>`.
`entry` takes and consumes an "owned" `key: K` parameter, in case a new entry
ends up inserted. This key is converted by `to_cow` from a value that borrows
the `'path` lifetime.
When this function is called by `Dirstate::new_v1`, `'path` is in fact
the same as `'on_disk` so `to_cow` can return an owned key that contains
`Cow::Borrowed`.
For other callers, `to_cow` needs to create a `Cow::Owned` and thus make
a costly heap memory allocation. This is wasteful if this key was already
present in the map. Even when inserting a new node this is typically the case
for its ancestor nodes (assuming most directories have numerous descendants).
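The allocation concern described above can be illustrated with a minimal sketch that uses only the standard library. It is not the code from this change: `Node`, the `&str` paths, and this free-standing `get_or_insert_node` are hypothetical stand-ins for the hg-core types. Stable `std` has no `raw_entry_mut`, so the sketch looks the key up by reference first and builds an owned key only on a miss. As a result it hashes the path more than once, which is the extra work hashbrown's `raw_entry_mut` is meant to avoid.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

/// Hypothetical stand-in for a dirstate tree node (not the hg-core type).
#[derive(Debug, Default, PartialEq)]
struct Node {
    descendants: u32,
}

/// Returns the node stored under `path`, inserting a default node if absent.
/// The boolean reports whether an owned key had to be allocated.
fn get_or_insert_node<'a>(
    map: &'a mut HashMap<Cow<'static, str>, Node>,
    path: &str,
) -> (&'a mut Node, bool) {
    // `Cow<str>` implements `Borrow<str>`, so the lookup takes a plain
    // `&str` and needs no allocation when the key is already present.
    let missing = !map.contains_key(path);
    if missing {
        // Only a miss pays for the heap allocation of an owned key.
        map.insert(Cow::Owned(path.to_owned()), Node::default());
    }
    // Assumption of this sketch: hashing `path` a second time is acceptable.
    let node = map.get_mut(path).expect("key is present or was just inserted");
    (node, missing)
}
```

By contrast, `HashMap::entry` would need the `Cow::Owned` key up front on every call, including calls for ancestor directories that are already in the map.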
Differential Revision: https://phab.mercurial-scm.org/D12317
author | Simon Sapin <simon.sapin@octobus.net> |
---|---|
date | Tue, 08 Feb 2022 15:51:52 +0100 |
parents | 791f5d5f7a96 |
children | ffd4b1f1c9cb |
// Copyright 2018-2020 Georges Racinet <georges.racinet@octobus.net>
//           and Mercurial contributors
//
// This software may be used and distributed according to the terms of the
// GNU General Public License version 2 or any later version.

mod ancestors;
pub mod dagops;
pub mod errors;
pub use ancestors::{AncestorsIterator, MissingAncestors};
pub mod dirstate;
pub mod dirstate_tree;
pub mod discovery;
pub mod exit_codes;
pub mod requirements;
pub mod testing; // unconditionally built, for use from integration tests
pub use dirstate::{
    dirs_multiset::{DirsMultiset, DirsMultisetIter},
    status::{
        BadMatch, BadType, DirstateStatus, HgPathCow, StatusError,
        StatusOptions,
    },
    DirstateEntry, DirstateParents, EntryState,
};
pub mod copy_tracing;
mod filepatterns;
pub mod matchers;
pub mod repo;
pub mod revlog;
pub use revlog::*;
pub mod config;
pub mod lock;
pub mod logging;
pub mod operations;
pub mod revset;
pub mod utils;
pub mod vfs;

use crate::utils::hg_path::{HgPathBuf, HgPathError};
pub use filepatterns::{
    parse_pattern_syntax, read_pattern_file, IgnorePattern,
    PatternFileWarning, PatternSyntax,
};
use std::collections::HashMap;
use std::fmt;
use twox_hash::RandomXxHashBuilder64;

/// This is a contract between the `micro-timer` crate and us, to expose
/// the `log` crate as `crate::log`.
use log;

pub type LineNumber = usize;

/// Rust's default hasher is too slow because it tries to prevent collision
/// attacks. We are not concerned about those: if an ill-minded person has
/// write access to your repository, you have other issues.
pub type FastHashMap<K, V> = HashMap<K, V, RandomXxHashBuilder64>;

// TODO: should this be the default `FastHashMap` for all of hg-core, not just
// dirstate_tree? How does XxHash compare with AHash, hashbrown’s default?
pub type FastHashbrownMap<K, V> =
    hashbrown::HashMap<K, V, RandomXxHashBuilder64>;

#[derive(Debug, PartialEq)]
pub enum DirstateMapError {
    PathNotFound(HgPathBuf),
    EmptyPath,
    InvalidPath(HgPathError),
}

impl fmt::Display for DirstateMapError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DirstateMapError::PathNotFound(_) => {
                f.write_str("expected a value, found none")
            }
            DirstateMapError::EmptyPath => {
                f.write_str("Overflow in dirstate.")
            }
            DirstateMapError::InvalidPath(path_error) => path_error.fmt(f),
        }
    }
}

#[derive(Debug, derive_more::From)]
pub enum DirstateError {
    Map(DirstateMapError),
    Common(errors::HgError),
}

impl fmt::Display for DirstateError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DirstateError::Map(error) => error.fmt(f),
            DirstateError::Common(error) => error.fmt(f),
        }
    }
}

#[derive(Debug, derive_more::From)]
pub enum PatternError {
    #[from]
    Path(HgPathError),
    UnsupportedSyntax(String),
    UnsupportedSyntaxInFile(String, String, usize),
    TooLong(usize),
    #[from]
    IO(std::io::Error),
    /// Needed a pattern that can be turned into a regex but got one that
    /// can't. This should only happen through programmer error.
    NonRegexPattern(IgnorePattern),
}

impl fmt::Display for PatternError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            PatternError::UnsupportedSyntax(syntax) => {
                write!(f, "Unsupported syntax {}", syntax)
            }
            PatternError::UnsupportedSyntaxInFile(syntax, file_path, line) => {
                write!(
                    f,
                    "{}:{}: unsupported syntax {}",
                    file_path, line, syntax
                )
            }
            PatternError::TooLong(size) => {
                write!(f, "matcher pattern is too long ({} bytes)", size)
            }
            PatternError::IO(error) => error.fmt(f),
            PatternError::Path(error) => error.fmt(f),
            PatternError::NonRegexPattern(pattern) => {
                write!(f, "'{:?}' cannot be turned into a regex", pattern)
            }
        }
    }
}