view rust/hg-core/src/fncache.rs @ 52316:f4aede0f01af

rust-manifest: use `memchr` crate for all byte-finding needs While writing a very dumb manifest diffing algorithm for a proof-of-concept I saw that `Manifest::find_by_path` was much slower than I was expecting. It turns out that the Rust stdlib uses slow (all is relative) code when searching for byte positions, for reasons ranging from portability and SIMD API stability to nobody doing the work. `memchr` is much faster for these purposes, so let's use it. I was measuring ~670ms of profile time in `find_by_path`; after this patch it went down to ~230ms.
author Raphaël Gomès <rgomes@octobus.net>
date Tue, 12 Nov 2024 23:20:04 +0100
parents 1a8466fd904a

use std::path::Path;

use dyn_clone::DynClone;

/// The FnCache stores the list of most files contained in the store and is
/// used for stream/copy clones.
///
/// It keeps track of the name of "all" indexes and data files for all revlogs.
/// The names are relative to the store roots and are stored before any
/// encoding or path compression.
///
/// Despite its name, the FnCache is *NOT* a cache: it keeps track of
/// information that is not easily available elsewhere. It has no mechanism
/// for detecting that it isn't up to date, and de-synchronization with the
/// actual contents of the repository will lead to a corrupted clone and
/// possibly other corruption during maintenance operations.
/// Strictly speaking, it could be recomputed by looking at the contents of all
/// manifests AND actual store files on disk, however that is a
/// prohibitively expensive operation.
pub trait FnCache: Sync + Send + DynClone {
    /// Whether the fncache was loaded from disk
    fn is_loaded(&self) -> bool;
    /// Add a path to be tracked in the fncache
    fn add(&self, path: &Path);
    // TODO add more methods once we start doing more with the FnCache
}
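
The trait above only declares the interface, so a minimal sketch of one possible implementor may help show how `is_loaded` and `add` fit together. Everything here is an assumption made for illustration and is not part of hg-core: the `InMemoryFnCache` type, its `loaded` flag, and its `len`/`contains` helpers are hypothetical, and the trait is redeclared locally without the `DynClone` bound so that the example builds with the standard library alone. Since `add` takes `&self`, the sketch uses interior mutability behind a `Mutex`, which also satisfies the `Sync + Send` bounds.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Local stand-in for the trait above. The `DynClone` bound is dropped
/// (assumption) so this sketch needs no external crate.
pub trait FnCache: Sync + Send {
    /// Whether the fncache was loaded from disk
    fn is_loaded(&self) -> bool;
    /// Add a path to be tracked in the fncache
    fn add(&self, path: &Path);
}

/// Hypothetical in-memory implementation, for illustration only.
/// Clones share the same underlying set through the `Arc`.
#[derive(Clone, Default)]
pub struct InMemoryFnCache {
    /// Always `false` by default: nothing is ever read from disk here.
    loaded: bool,
    /// Tracked paths, deduplicated and kept in sorted order.
    entries: Arc<Mutex<BTreeSet<PathBuf>>>,
}

impl InMemoryFnCache {
    /// Number of distinct paths tracked so far (hypothetical helper).
    pub fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }

    /// Whether `path` has been added (hypothetical helper).
    pub fn contains(&self, path: &Path) -> bool {
        self.entries.lock().unwrap().contains(path)
    }
}

impl FnCache for InMemoryFnCache {
    fn is_loaded(&self) -> bool {
        self.loaded
    }

    fn add(&self, path: &Path) {
        // `add` takes `&self`, so the set is mutated through the mutex.
        self.entries.lock().unwrap().insert(path.to_path_buf());
    }
}
```

With this sketch, adding the same path twice leaves a single entry, and the type can be used through a `&dyn FnCache` reference. A real implementation would also have to persist the entries and keep them in sync with the store, which this example does not attempt.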