changeset 53025:f212c1cf38b0 default
branching: merge with current default
Now that the merge is fixed, we can merge this back into default.
author    Pierre-Yves David <pierre-yves.david@octobus.net>
date      Fri, 28 Feb 2025 23:28:10 +0100
parents   877c20982972, 878846a50203
children  6e9e2891a2ad
files     contrib/heptapod-ci.yml
diffstat  288 files changed, 12157 insertions(+), 4786 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/.flake8	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,2 @@
+[flake8]
+max-line-length = 80
--- a/.hgignore	Fri Feb 28 23:25:42 2025 +0100
+++ b/.hgignore	Fri Feb 28 23:28:10 2025 +0100
@@ -52,8 +52,6 @@
 doc/build
 doc/html
 doc/man
-MANIFEST
-MANIFEST.in
 patches
 mercurial/__modulepolicy__.py
 mercurial/__version__.py
--- a/.hgsigs	Fri Feb 28 23:25:42 2025 +0100
+++ b/.hgsigs	Fri Feb 28 23:28:10 2025 +0100
@@ -270,3 +270,4 @@
 31d45a1cbc479ac73fc8b355abe99e090eb6c747 0 iQHNBAABCgA3FiEEH2b4zfZU6QXBHaBhoR4BzQ4F2VYFAmc2E+wZHGFscGhhcmVAcmFwaGFlbGdvbWVzLmRldgAKCRChHgHNDgXZVgOeC/9kMZuDpTdSdGj2Fd8mTK8BLA+7PRvoUM4rbHlBDZjtCCcLkKTC1sB0FJzlbfNEYbFxwqnzCTFzwNBYwWYWW5of20EoMxl7KGFJDY4hZPhAe9uN346lnp3GkaIe9kI4B++NUrLuc3SfbSFo3cAQyBAmgwK0fAYec6TF+ZdkGrswgu6CMplckW35FkI24sNzYrjV5w0wUMhGQo2uT1g2XZFd2NsMaMrvCZIho916VLDumNglHAaxhoDbj7A9nQDesSlckSPDSu9Axu0NLoFWUndSheZQamoOJpJZ5IsyymsbZYGrrZeZREG/TeSSHV0WrvIfcLQQlJSKYrrakUSiqfXalwXrUS3fDdVymyEBy0q+cXkWyNMEqIYjH3coOof6F/9/DuVCsxDHJMJm5Bs4rLy2mHcMGXPSkWf75TwPHqPIsQm4WgaAaJNvEtc6XHMtw8Xu4z9wPywNeLBJITAipxI32xHHFW0yj2F//ttG47yM4FWPZJXgNAZlVK1WBtGCY6k=
 b267c5764cc6b804c619a42067405f27e8705beb 0 iQHNBAABCgA3FiEEH2b4zfZU6QXBHaBhoR4BzQ4F2VYFAmc99H8ZHGFscGhhcmVAcmFwaGFlbGdvbWVzLmRldgAKCRChHgHNDgXZVlpkDACOfStBiT60lrkLPDKzwQH/vM8U26XIPkxQ5lypmyomeWS8ss/+dDEHVWdoBM1wAIf90sCEV4yxRuEcT00YNqvW0aI4R6If8VB1Xg1aJ7c3MLpIWWs9BFp1uoi2Fvpx9HJmY3mPyrS4uIxPWaG+QVYOcmx6CGru+7Yd6w5aUFhWBJ/8ZqR496so3Q59z3+MJjHOVx+3UruGEjqP8tfWgX2RgwLi+utckq2Z+pDzDz/hfBQMx6aFmZN9pHBtQDyDuZD30bBLQi6xiPb6ddOXd6h2OjEa+X2VNUW2adbTVU4LBXSe4uvLx8jXcVE5TSxmL1v7FuHJxPUHz5sRh7NiQoOceHO7DWZn8cO73jF+L6WI946bbEsSE+7JgIEpcshsS1njw6LcPGPqFFdqyJ+eEmJ4/Naqd52/j8yWOIKEkNzGLDl8AADzxXnjejCgW/L7+sqF60JRz7p0H4WaT40rALeVTxxL/UhlRaSNKPzGwkfIlhSyP6VuCVVpTg6EmEUDjKE=
 9751b9ccd74d8386687f88fbdfe280877840ec7d 0 iQHNBAABCgA3FiEEH2b4zfZU6QXBHaBhoR4BzQ4F2VYFAmeJLnYZHGFscGhhcmVAcmFwaGFlbGdvbWVzLmRldgAKCRChHgHNDgXZVtVZC/4tQcl/jGwcw8VQqg7l4gNNyk4GRvM+YsHQwfTtp8Xt2OnqwbI8sMuvEdXC3vmb9qfgKZX6qjwLe+9A8Jyz3jl0bIZSEHAiL5s9DZ/eKMKaxOn1DHrx3W/sFjd+GQOA/Xk6g4DmRSLB+zJTpgCz4rJjQzhOYczBpu+aTniAsb1X6OShz6ycKR90Cf3Sdp/evzL6MEaVFV/pg6e/jx+KxuMtlba3W/YuhvlvtzfeWWA2penmuQTSryhKGOTOCTrL9snmcLbvkHzfGRFHrtFCKdcBRAKGXCs+/W3HXvNVbtGSQbXbJueMmAg3vNdE4CkXJxyBD2bkBbvnnadjswAApBnIVEfB/FRtOFTx2qUnWpho1yxHk38eNOE0ytMHOxzlIyfjoVLsshxMDz1SM5YEBP0/cIeIDJzQjl63tfI5zm7BwORwYcWVcXOkiJtDBgNqktrsFClymH2MTO2C6nExAHyS4XYxURYJws0RKl+DWjcSwvHvbOocH3hcVIqAV3cky1M=
+b964f92261d4fbb64f19aa6af2b072f7730b913a 0 iQHNBAABCgA3FiEEH2b4zfZU6QXBHaBhoR4BzQ4F2VYFAme2VVUZHGFscGhhcmVAcmFwaGFlbGdvbWVzLmRldgAKCRChHgHNDgXZVn3aDACSMVaJexSgl1UfjBAKjwaF4t9Y2pBKnYibahXmddViwhhIISPzeVtvaM9y/4Cm4SP11S6PQ356aiZ3RjhtQbmRHQJe5cXGkBaxykIxLSC/KgDy9HXHDDATwvo+aF/QVBX8ig/cr0NdVpwtvQq7+rkDNfbObwu8pPIbZGqOoNM1ND2Kz6P+FqbNZfGPwLP/AaCtCl2dXcf/Z774JUsAEZ6InqvP1/m/atAG7phesXhem8cpPb6e8LohuiJpnbV2rUj7SEqk0eF2BRapSukSZC2vxdqsy4hcXO1uwJ3V3GPtegpdMG25OE3ALy/2WKoh4inJV+WfJy1+DEiSdP++Rpadv/By68WIBvWY/rKgWAYPqIE5IKH5CtcZkkFMtfoooFGiz7uvci5+ZaetZnHVPm9FZH3KZsNccsESDkT25I+rwynqt8LKt1qEA+Ur43U6ipG+LZxT7sOGGYYElU6cSoSIcrcMUfsbi0XhgpnZch4QwjoMyzWnXgcjnivnn3arMkw=
--- a/.hgtags	Fri Feb 28 23:25:42 2025 +0100
+++ b/.hgtags	Fri Feb 28 23:28:10 2025 +0100
@@ -286,3 +286,4 @@
 31d45a1cbc479ac73fc8b355abe99e090eb6c747 6.9rc1
 b267c5764cc6b804c619a42067405f27e8705beb 6.9
 9751b9ccd74d8386687f88fbdfe280877840ec7d 6.9.1
+b964f92261d4fbb64f19aa6af2b072f7730b913a 6.9.2
--- a/CONTRIBUTING	Fri Feb 28 23:25:42 2025 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,13 +0,0 @@
-Our full contribution guidelines are in our wiki, please see:
-
-https://www.mercurial-scm.org/wiki/ContributingChanges
-
-If you just want a checklist to follow, you can go straight to
-
-https://www.mercurial-scm.org/wiki/ContributingChanges#Submission_checklist
-
-If you can't run the entire testsuite for some reason (it can be
-difficult on Windows), please at least run `contrib/check-code.py` on
-any files you've modified and run `python contrib/check-commit` on any
-commits you've made (for example, `python contrib/check-commit
-273ce12ad8f1` will report some style violations on a very old commit).
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/CONTRIBUTING.md	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,45 @@
+# Mercurial's Contributing guidelines
+
+Our full contribution guidelines are in our wiki, please see:
+
+<https://www.mercurial-scm.org/wiki/ContributingChanges>
+
+If you just want a checklist to follow, you can go straight to
+
+<https://www.mercurial-scm.org/wiki/ContributingChanges#Submission_checklist>
+
+If you can't run the entire testsuite for some reason (it can be
+difficult on Windows), please at least run `contrib/check-code.py` on
+any files you've modified and run `python contrib/check-commit` on any
+commits you've made (for example, `python contrib/check-commit
+273ce12ad8f1` will report some style violations on a very old commit).
+
+## Development dependencies
+
+### Required dependencies
+
+- Python (see `project.requires-python` in `pyproject.toml`) with `venv` and `pip`
+
+- `make` with few other standard Unix tools (`diff`, `grep`, `unzip`, `gunzip`, `bunzip2` and `sed`)
+
+  For Windows, see `contrib/install-windows-dependencies.ps1`.
+
+### Optional dependencies
+
+- Mercurial contributors should install a quite recent Mercurial with the
+  extensions `evolve` and `topic` activated.
+
+- A C compiler and Python headers (typically Debian package `python3-dev`
+  or Microsoft Build Tools for Visual Studio on Windows)
+
+- `msgfmt` from the Debian package `gettext` (used to build the translations)
+
+- [Rust tools](https://www.rust-lang.org/tools/install) (see `rust/README.rst`)
+
+#### Note on installation
+
+Mercurial is a Python application that can be installed with
+[pipx](https://pipx.pypa.io) and `uv tool`. UV is a Rust application that can also be
+installed with pipx or with
+[its own installer](https://docs.astral.sh/uv/getting-started/installation/).
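The installation note above stays general on purpose. As a concrete sketch, assuming the PyPI package name "mercurial" and a working pipx or uv on PATH:

  pipx install mercurial    # isolated venv managed by pipx
  # or, with uv:
  uv tool install mercurial
  hg version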
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/MANIFEST.in	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,5 @@
+include mercurial/__version__.py
+include doc/*.html
+include doc/*.1 doc/*.5 doc/*.8
+include doc/html/*.html
+include doc/man/*.1
--- a/Makefile	Fri Feb 28 23:25:42 2025 +0100
+++ b/Makefile	Fri Feb 28 23:28:10 2025 +0100
@@ -20,30 +20,25 @@
 $(eval HGROOT := $(shell pwd))
 HGPYTHONS ?= $(HGROOT)/build/pythons
 PURE=
+PIP_OPTIONS_PURE=--config-settings --global-option="$(PURE)"
+PIP_OPTIONS_INSTALL=--no-deps --ignore-installed --no-build-isolation
+PIP_PREFIX=$(PREFIX)
 PYFILESCMD=find mercurial hgext doc -name '*.py'
 PYFILES:=$(shell $(PYFILESCMD))
 DOCFILES=mercurial/helptext/*.txt
 export LANGUAGE=C
 export LC_ALL=C
 TESTFLAGS ?= $(shell echo $$HGTESTFLAGS)
-OSXVERSIONFLAGS ?= $(shell echo $$OSXVERSIONFLAGS)
 CARGO = cargo
 
-# Set this to e.g. "mingw32" to use a non-default compiler.
-COMPILER=
-
-COMPILERFLAG_tmp_ =
-COMPILERFLAG_tmp_${COMPILER} ?= -c $(COMPILER)
-COMPILERFLAG=${COMPILERFLAG_tmp_${COMPILER}}
-
 VENV_NAME=$(shell $(PYTHON) -c "import sys; v = sys.version_info; print(f'.venv_{sys.implementation.name}{v.major}.{v.minor}')")
 PYBINDIRNAME=$(shell $(PYTHON) -c "import os; print('Scripts' if os.name == 'nt' else 'bin')")
 
+.PHONY: help
 help:
 	@echo 'Commonly used make targets:'
-	@echo '  all          - build program and documentation'
 	@echo '  install      - install program and man pages to $$PREFIX ($(PREFIX))'
-	@echo '  install-home - install with setup.py install --home=$$HOME ($(HOME))'
+	@echo '  install-home - install with pip install --user'
 	@echo '  local        - build for inplace usage'
 	@echo '  tests        - run all tests in the automatic test suite'
 	@echo '  test-foo     - run only specified tests (e.g. test-merge1.t)'
@@ -52,42 +47,53 @@
 	@echo '       (except installed files or dist source tarball)'
 	@echo '  update-pot   - update i18n/hg.pot'
 	@echo
+	@echo 'See CONTRIBUTING.md for the build and development dependencies.'
+	@echo
+	@echo 'Example for a system-wide installation under /usr/local for'
+	@echo 'downstream packaging (build and runtime deps have to be installed by hand)'
+	@echo '  su -c "make install" && hg version'
+	@echo
 	@echo 'Example for a system-wide installation under /usr/local:'
-	@echo '  make all && su -c "make install" && hg version'
+	@echo '  make doc'
+	@echo '  su -c "make install PIP_OPTIONS_INSTALL=" && hg version'
+	@echo
+	@echo 'On some Linux distributions, you might need to specify both'
+	@echo 'PREFIX and PIP_PREFIX (here to install everything in /data/local)'
+	@echo '  make install PREFIX=/data/local PIP_PREFIX=/data PIP_OPTIONS_INSTALL='
 	@echo
 	@echo 'Example for a local installation (usable in this directory):'
 	@echo '  make local && ./hg version'
 
-all: build doc
-
+.PHONY: local
 local:
 	$(PYTHON) -m venv $(VENV_NAME) --clear --upgrade-deps
-	MERCURIAL_SETUP_MAKE_LOCAL=1 $(VENV_NAME)/$(PYBINDIRNAME)/python -m \
-	  pip install -e . -v --config-settings --global-option=$(PURE)
+	$(VENV_NAME)/$(PYBINDIRNAME)/python -m \
+	  pip install -e . -v $(PIP_OPTIONS_PURE)
 	env HGRCPATH= $(VENV_NAME)/$(PYBINDIRNAME)/hg version
 
-build:
-	$(PYTHON) setup.py $(PURE) build $(COMPILERFLAG)
-
+.PHONY: build-chg
 build-chg:
 	make -C contrib/chg
 
+.PHONY: build-rhg
 build-rhg:
 	(cd rust/rhg; cargo build --release)
 
+.PHONY: wheel
 wheel:
-	$(PYTHON) setup.py $(PURE) bdist_wheel $(COMPILERFLAG)
-
+	$(PYTHON) -m build --config-setting=--global-option="$(PURE)"
 
+.PHONY: doc
 doc:
 	$(MAKE) -C doc
 
+.PHONY: cleanbutpackages
 cleanbutpackages:
 	rm -f hg.exe
-	-$(PYTHON) setup.py clean --all # ignore errors from this command
+	rm -rf mercurial.egg-info dist
 	find contrib doc hgext hgext3rd i18n mercurial tests hgdemandimport \
 		\( -name '*.py[cdo]' -o -name '*.so' \) -exec rm -f '{}' ';'
 	rm -rf .venv_*
-	rm -f MANIFEST MANIFEST.in hgext/__index__.py tests/*.err
+	rm -f hgext/__index__.py tests/*.err
 	rm -f mercurial/__modulepolicy__.py
 	if test -d .hg; then rm -f mercurial/__version__.py; fi
 	rm -rf build mercurial/locale
@@ -96,57 +102,64 @@
 	rm -rf rust/target
 	rm -f mercurial/rustext.so
 
+.PHONY: clean
 clean: cleanbutpackages
 	rm -rf packages
 
+.PHONY: install
 install: install-bin install-doc
 
-install-bin: build
-	$(PYTHON) setup.py $(PURE) install --root="$(DESTDIR)/" --prefix="$(PREFIX)" --force
+.PHONY: install-bin
+install-bin:
	$(PYTHON) -m pip install . --prefix="$(PIP_PREFIX)" --force -v $(PIP_OPTIONS_PURE) $(PIP_OPTIONS_INSTALL)
 
+.PHONY: install-chg
 install-chg: build-chg
 	make -C contrib/chg install PREFIX="$(PREFIX)"
 
-install-doc: doc
-	cd doc && $(MAKE) $(MFLAGS) install
+.PHONY: install-doc
+install-doc:
+	$(MAKE) -C doc $(MFLAGS) PREFIX="$(PREFIX)" install
 
+.PHONY: install-home
 install-home: install-home-bin install-home-doc
 
-install-home-bin: build
-	$(PYTHON) setup.py $(PURE) install --home="$(HOME)" --prefix="" --force
+.PHONY: install-home-bin
+install-home-bin:
+	$(PYTHON) -m pip install . --user --force -v $(PIP_OPTIONS_PURE) $(PIP_OPTIONS_INSTALL)
 
-install-home-doc: doc
-	cd doc && $(MAKE) $(MFLAGS) PREFIX="$(HOME)" install
+.PHONY: install-home-doc
+install-home-doc:
+	$(MAKE) -C doc $(MFLAGS) PREFIX="$(HOME)" install
 
+.PHONY: install-rhg
 install-rhg: build-rhg
 	install -m 755 rust/target/release/rhg "$(PREFIX)"/bin/
 
-MANIFEST-doc:
-	$(MAKE) -C doc MANIFEST
+.PHONY: dist
+dist: tests dist-notests
 
-MANIFEST.in: MANIFEST-doc
-	hg manifest | sed -e 's/^/include /' > MANIFEST.in
-	echo include mercurial/__version__.py >> MANIFEST.in
-	sed -e 's/^/include /' < doc/MANIFEST >> MANIFEST.in
+.PHONY: dist-notests
+dist-notests: doc
+	TAR_OPTIONS="--owner=root --group=root --mode=u+w,go-w,a+rX-s" $(PYTHON) -m build --sdist
 
-dist: tests dist-notests
-
-dist-notests: doc MANIFEST.in
-	TAR_OPTIONS="--owner=root --group=root --mode=u+w,go-w,a+rX-s" $(PYTHON) setup.py -q sdist
-
+.PHONY: check
 check: tests
 
+.PHONY: tests
 tests:
-	# Run Rust tests if cargo is installed
+	# Run Rust tests if cargo is installed
 	if command -v $(CARGO) >/dev/null 2>&1; then \
 		$(MAKE) rust-tests; \
 		$(MAKE) cargo-clippy; \
 	fi
 	cd tests && $(PYTHON) run-tests.py $(TESTFLAGS)
 
+.PHONY: test-%
 test-%:
 	cd tests && $(PYTHON) run-tests.py $(TESTFLAGS) $@
 
+.PHONY: testpy-%
 testpy-%:
 	@echo Looking for Python $* in $(HGPYTHONS)
 	[ -e $(HGPYTHONS)/$*/bin/python ] || ( \
@@ -154,22 +167,27 @@
 	$(MAKE) -f $(HGROOT)/contrib/Makefile.python PYTHONVER=$* PREFIX=$(HGPYTHONS)/$* python )
 	cd tests && $(HGPYTHONS)/$*/bin/python run-tests.py $(TESTFLAGS)
 
+.PHONY: rust-tests
 rust-tests:
 	cd $(HGROOT)/rust \
 		&& $(CARGO) test --quiet --all \
 		--features "$(HG_RUST_FEATURES)" --no-default-features
 
+.PHONY: cargo-clippy
 cargo-clippy:
 	cd $(HGROOT)/rust \
 		&& $(CARGO) clippy --all --features "$(HG_RUST_FEATURES)" -- -D warnings
 
+.PHONY: check-code
 check-code:
 	hg manifest | xargs python contrib/check-code.py
 
+.PHONY: format-c
 format-c:
 	clang-format --style file -i \
 	  `hg files 'set:(**.c or **.cc or **.h) and not "listfile:contrib/clang-format-ignorelist"'`
 
+.PHONY: update-pot
 update-pot: i18n/hg.pot
 
 i18n/hg.pot: $(PYFILES) $(DOCFILES) i18n/posplit i18n/hggettext
@@ -235,43 +253,19 @@
 	ppa
 
 # Forward packaging targets for convenience.
+.PHONY: $(packaging_targets)
 $(packaging_targets):
 	$(MAKE) -C contrib/packaging $(MAKEFLAGS) $@
 
-osx:
-	rm -rf build/mercurial
-	/usr/bin/python2.7 setup.py install --optimize=1 \
-	  --root=build/mercurial/ --prefix=/usr/local/ \
-	  --install-lib=/Library/Python/2.7/site-packages/
-	make -C doc all install DESTDIR="$(PWD)/build/mercurial/"
-	# Place a bogon .DS_Store file in the target dir so we can be
-	# sure it doesn't get included in the final package.
-	touch build/mercurial/.DS_Store
-	make -C contrib/chg \
-	  HGPATH=/usr/local/bin/hg \
-	  PYTHON=/usr/bin/python2.7 \
-	  DESTDIR=../../build/mercurial \
-	  PREFIX=/usr/local \
-	  clean install
-	mkdir -p $${OUTPUTDIR:-dist}
-	HGVER=$$(python contrib/genosxversion.py $(OSXVERSIONFLAGS) build/mercurial/Library/Python/2.7/site-packages/mercurial/__version__.py) && \
-	OSXVER=$$(sw_vers -productVersion | cut -d. -f1,2) && \
-	pkgbuild --filter \\.DS_Store --root build/mercurial/ \
-	  --identifier org.mercurial-scm.mercurial \
-	  --version "$${HGVER}" \
-	  build/mercurial.pkg && \
-	productbuild --distribution contrib/packaging/macosx/distribution.xml \
-	  --package-path build/ \
-	  --version "$${HGVER}" \
-	  --resources contrib/packaging/macosx/ \
-	  "$${OUTPUTDIR:-dist/}"/Mercurial-"$${HGVER}"-macosx"$${OSXVER}".pkg
 
+.PHONY: pyoxidizer
 pyoxidizer:
 	$(PYOXIDIZER) build --path ./rust/hgcli --release
 
 
 # a temporary target to setup all we need for run-tests.py --pyoxidizer
 # (should go away as the run-tests implementation improves
+.PHONY: pyoxidizer-windows-tests
 pyoxidizer-windows-tests: PYOX_DIR=build/pyoxidizer/x86_64-pc-windows-msvc/release/app
 pyoxidizer-windows-tests: pyoxidizer
 	rm -rf $(PYOX_DIR)/templates
@@ -288,6 +282,7 @@
 
 # a temporary target to setup all we need for run-tests.py --pyoxidizer
 # (should go away as the run-tests implementation improves
+.PHONY: pyoxidizer-macos-tests
 pyoxidizer-macos-tests: PYOX_DIR=build/pyoxidizer/x86_64-apple-darwin/release/app
 pyoxidizer-macos-tests: pyoxidizer
 	rm -rf $(PYOX_DIR)/templates
@@ -301,12 +296,6 @@
 	rm -rf $(PYOX_DIR)/doc
 	cp -a doc $(PYOX_DIR)/doc
 
+.PHONY: pytype-docker
 pytype-docker:
 	contrib/docker/pytype/recipe.sh
-
-.PHONY: help all local build doc cleanbutpackages clean install install-bin \
-	install-doc install-home install-home-bin install-home-doc \
-	dist dist-notests check tests rust-tests check-code format-c \
-	update-pot pyoxidizer pyoxidizer-windows-tests pyoxidizer-macos-tests \
-	$(packaging_targets) \
-	osx pytype-docker
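The rewritten help text doubles as a quick reference for the pip-based flow. Replayed as shell commands, a sketch assuming a /usr/local prefix with build and runtime dependencies already installed:

  make doc
  su -c 'make install PIP_OPTIONS_INSTALL=' && hg version
  # or, an in-place build usable from the checkout:
  make local && ./hg version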
--- a/contrib/build-one-linux-wheel.sh	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/build-one-linux-wheel.sh	Fri Feb 28 23:28:10 2025 +0100
@@ -5,17 +5,26 @@
 #
 set -eu
 
-# enforce that the translation are built
-export MERCURIAL_SETUP_FORCE_TRANSLATIONS=1
-
 if [ $# -lt 2 ]; then
-    echo "usage $0 PYTHONTAG DEST_DIR" >&2
+    echo "usage $0 PYTHONTAG DEST_DIR [FLAVOR]" >&2
     echo "" >&2
     echo 'PYTHONTAG should be of the form "cp310-cp310"' >&2
     exit 64
 fi
 py_tag=$1
 destination_directory=$2
+flavor=${3:-c}
+
+
+flavor_arg=""
+if [[ "${flavor}" == "c" ]]; then
+    :
+elif [[ "$flavor" == "rust" ]]; then
+    flavor_arg="--config-setting=--global-option=--rust"
+else
+    echo "unknown flavor: \"$flavor\""
+    exit 96
+fi
 
 tmp_wheel_dir=./tmp-wheelhouse
 
@@ -23,6 +32,6 @@
 if [ -e $tmp_wheel_dir ]; then
     rm -rf $tmp_wheel_dir
 fi
-/opt/python/$py_tag/bin/python setup.py bdist_wheel --dist-dir $tmp_wheel_dir
+/opt/python/$py_tag/bin/python -m build --outdir $tmp_wheel_dir $flavor_arg
 # adjust it to make it universal
 auditwheel repair $tmp_wheel_dir/*.whl -w $destination_directory
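A hypothetical invocation of the updated script using the new optional FLAVOR argument (the destination path is illustrative; the third argument defaults to "c" when omitted):

  contrib/build-one-linux-wheel.sh cp311-cp311 wheels/linux/rust/cp311-cp311 rust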
--- a/contrib/check-code.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/check-code.py	Fri Feb 28 23:28:10 2025 +0100
@@ -372,7 +372,7 @@
         r'\s(\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\S',
         "missing whitespace around operator",
     ),
-    (r'[^^+=*/!<>&| %-](\s=|=\s)[^= ]', "wrong whitespace around ="),
+    (r'[^^+=*/!<>&| %-:](\s=|=\s)[^= ]', "wrong whitespace around ="),
     (
         r'raise [^,(]+, (\([^\)]+\)|[^,\(\)]+)$',
         "don't use old-style two-argument raise, use Exception(message)",
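The ':' added to the exclusion class plausibly exists to stop flagging the walrus operator, whose '=' is directly preceded by ':'. A sketch of the effect (file name and snippet are illustrative, and the snippet is only linted, not executed):

  cat > /tmp/walrus.py <<'EOF'
  while (chunk := stream.read(8192)):
      process(chunk)
  EOF
  python contrib/check-code.py /tmp/walrus.py  # ':=' no longer reported as bad spacing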
--- a/contrib/check-pytype.sh	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/check-pytype.sh	Fri Feb 28 23:28:10 2025 +0100
@@ -42,7 +42,6 @@
 # hgext/remotefilelog/fileserverclient.py  # [attribute-error]
 # hgext/remotefilelog/shallowbundle.py  # [attribute-error]
 # hgext/remotefilelog/remotefilectx.py  # [module-attr] (This is an actual bug)
-# hgext/sqlitestore.py  # [attribute-error]
 # hgext/zeroconf/__init__.py  # bytes vs str; tests fail on macOS
 #
 # mercurial/context.py  # many [attribute-error]
@@ -102,7 +101,6 @@
     -x hgext/remotefilelog/fileserverclient.py \
     -x hgext/remotefilelog/remotefilectx.py \
     -x hgext/remotefilelog/shallowbundle.py \
-    -x hgext/sqlitestore.py \
     -x hgext/zeroconf/__init__.py \
     -x mercurial/context.py \
     -x mercurial/crecord.py \
--- a/contrib/docker/pytype/Dockerfile	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/docker/pytype/Dockerfile	Fri Feb 28 23:28:10 2025 +0100
@@ -2,6 +2,7 @@
 
 USER ci-runner
 
+RUN mkdir /home/ci-runner/.local/
 ENV PATH=/home/ci-runner/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
 ENV PYTHONPATH=/home/ci-runner/.local/lib/python3.11/site-packages
--- a/contrib/heptapod-ci.yml	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/heptapod-ci.yml	Fri Feb 28 23:28:10 2025 +0100
@@ -46,9 +46,6 @@
     # with shell runner, its content is not cleaned from one call to the next,
     # so plan for it.
     TMP_WORK_DIR: "${CI_PROJECT_DIR}/../.."
-    # we use CIBW_SKIP="pp*" to prevent the building of pypy wheel that are neither
-    # needed nor working.
-    CIBW_SKIP: "pp*"
 
 .all:
   # help changing all job at once when debugging
@@ -100,7 +97,6 @@
   variables:
     WHEEL_TYPE: ""
     FLAVOR: ""
-    MERCURIAL_SETUP_FORCE_TRANSLATIONS: "1"
     CI_CLEVER_CLOUD_FLAVOR: "XS"
   script:
     - PLATFORM=`/opt/python/cp313-cp313/bin/python -c 'import sys; print(sys.platform)'`
@@ -108,7 +104,7 @@
     - test -n "$WHEEL_TYPE"
     - echo $FLAVOR
     - mkdir -p wheels/$PLATFORM/$WHEEL_TYPE/$BUILD_PY_ID
-    - contrib/build-one-linux-wheel.sh $BUILD_PY_ID wheels/$PLATFORM/$WHEEL_TYPE/$BUILD_PY_ID
+    - contrib/build-one-linux-wheel.sh $BUILD_PY_ID wheels/$PLATFORM/$WHEEL_TYPE/$BUILD_PY_ID $WHEEL_TYPE
   artifacts:
     paths:
       - wheels/
@@ -123,13 +119,18 @@
   parallel:
     matrix:
       - BUILD_PY_ID:
+          - cp311-cp311
           - cp38-cp38
           - cp39-cp39
           - cp310-cp310
-          - cp311-cp311
           - cp312-cp312
           - cp313-cp313
 
+build-rust-wheel:
+  image: "registry.heptapod.net:443/mercurial/ci-images/core-wheel-x86_64-rust:v3.0"
+  extends: build-c-wheel
+  variables:
+    WHEEL_TYPE: "rust"
+
 .wheel-trigger:
   extends: .trigger
@@ -230,16 +231,20 @@
       test -n "$WHEEL";
       echo installing from $WHEEL;
       WHEEL_ARG="--hg-wheel $WHEEL";
-      echo disabling flavor as this is currently incompatible with '"--hg-wheel"';
-      FLAVOR="";
+      if [[ -n "$FLAVOR" ]] && [[ "$FLAVOR" != "--rust" ]]; then
+        echo disabling flavor '"'$FLAVOR'"' as this is currently incompatible with '"--hg-wheel"';
+        FLAVOR="";
+      fi
       else
       echo installing from source;
       fi;
-    - if [ -n "$CI_NODE_INDEX" ]; then
+    - if [ -n "$CI_NODE_INDEX" ]; then
       echo "Running the test in multiple shard - [$CI_NODE_INDEX/$CI_NODE_TOTAL]";
       SHARDING_ARGS="--shard-index $CI_NODE_INDEX --shard-total $CI_NODE_TOTAL";
      echo "sharding... $SHARDING_ARGS";
      fi
+    - echo HGTESTS_ALLOW_NETIO="$TEST_HGTESTS_ALLOW_NETIO" "$PYTHON" tests/run-tests.py
+      --color=always --tail-report $PORT_ARG $WHEEL_ARG $FLAVOR $SHARDING_ARGS $FILTER $RUNTEST_ARGS;
     - HGTESTS_ALLOW_NETIO="$TEST_HGTESTS_ALLOW_NETIO" "$PYTHON" tests/run-tests.py
       --color=always
@@ -295,12 +300,25 @@
   variables:
     FLAVOR: "--pure"
 
-test-rust:
+.test-rust:
   extends: .runtests-no-check
   variables:
     HGWITHRUSTEXT: "cpython"
     FLAVOR: "--rust"
 
+test-rust:
+  extends: .test-rust
+  variables:
+    HGWITHRUSTEXT: "cpython"
+    FLAVOR: "--rust"
+  needs:
+    - job: build-rust-wheel
+      parallel:
+        matrix:
+          - BUILD_PY_ID: "cp311-cp311"
+            variables:
+              WHEEL_TYPE: "rust"
+
 test-rhg:
   extends: .runtests-no-check
   variables:
@@ -357,7 +375,7 @@
       - BUILD_PY_ID: "cp312-cp312"
 
 test-3.12-rust:
-  extends: test-rust
+  extends: .test-rust
   stage: py-version-compat
   needs:
     - trigger-pycompat
@@ -376,7 +394,7 @@
       - BUILD_PY_ID: "cp313-cp313"
 
 test-3.13-rust:
-  extends: test-rust
+  extends: .test-rust
   stage: py-version-compat
   needs:
     - trigger-pycompat
@@ -384,7 +402,7 @@
     PYTHON: python3.13
 
 check-pytype:
-  extends: test-rust
+  extends: .test-rust
   stage: checks
   before_script:
     - export PATH="/home/ci-runner/vendor/pyenv/pyenv-2.4.7-adf3c2bccf09cdb81febcfd15b186711a33ac7a8/shims:/home/ci-runner/vendor/pyenv/pyenv-2.4.7-adf3c2bccf09cdb81febcfd15b186711a33ac7a8/bin:$PATH"
@@ -392,6 +410,9 @@
     - hg clone . "${TMP_WORK_DIR}"/mercurial-ci/ --noupdate --config phases.publish=no
     - hg -R "${TMP_WORK_DIR}"/mercurial-ci/ update `hg log --rev '.' --template '{node}'`
     - cd "${TMP_WORK_DIR}"/mercurial-ci/
+    - echo $HGWITHRUSTEXT
+    # We need to unset HGWITHRUSTEXT since editable install is broken with Rust
+    - unset HGWITHRUSTEXT
     - make local PYTHON=$PYTHON
     - ./contrib/setup-pytype.sh
   script:
@@ -439,8 +460,6 @@
     - if: $CI_COMMIT_BRANCH =~ $RE_TOPIC
   needs:
     - "trigger-wheel-windows"
-  variables:
-    MERCURIAL_SETUP_FORCE_TRANSLATIONS: "1"
   script:
     - echo "Entering script section"
     - echo "python used, $Env:PYTHON"
@@ -554,8 +573,6 @@
 # this is the only one we need to test. However testing that build work on all
 # version is useful and match what we do with Linux.
 #
-# CIBW_SKIP is set globally at the start of the file. See comment there.
-#
 # The weird directory structure match the one we use for Linux to deal with the
 # multiple jobs. (all this might be unnecessary)
 build-c-wheel-macos:
@@ -567,15 +584,17 @@
     when: manual  # avoid overloading the CI by default
     allow_failure: true
   stage: build
+  variables:
+    # TODO: drop this when CI system is updated to support arm64 builds
+    CIBW_ARCHS: "x86_64"
   tags:
     - macos
-  variables:
-    MERCURIAL_SETUP_FORCE_TRANSLATIONS: "1"
   script:
     - PLATFORM=`$PYTHON -c 'import sys; print(sys.platform)'`
     - rm -rf tmp-wheels
     - cibuildwheel --output-dir tmp-wheels/
-    - for py_version in cp38-cp38 cp39-cp39 cp310-cp310 cp311-cp311 cp312-cp312 cp313-cp313; do
+    - for py_version in $(cibuildwheel --print-build-identifiers | egrep -o 'cp[0-9]+' | sort | uniq); do
+      py_version="${py_version}-${py_version}";
       mkdir -p wheels/$PLATFORM/c/$py_version/;
       mv tmp-wheels/*$py_version*.whl wheels/$PLATFORM/c/$py_version/;
       done
--- a/contrib/hgperf	Fri Feb 28 23:25:42 2025 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,112 +0,0 @@
-#!/usr/bin/env python3
-#
-# hgperf - measure performance of Mercurial commands
-#
-# Copyright 2014 Olivia Mackall <olivia@selenic.com>
-#
-# This software may be used and distributed according to the terms of the
-# GNU General Public License version 2 or any later version.
-
-'''measure performance of Mercurial commands
-
-Using ``hgperf`` instead of ``hg`` measures performance of the target
-Mercurial command. For example, the execution below measures
-performance of :hg:`heads --topo`::
-
-    $ hgperf heads --topo
-
-All command output via ``ui`` is suppressed, and just measurement
-result is displayed: see also "perf" extension in "contrib".
-
-Costs of processing before dispatching to the command function like
-below are not measured::
-
-    - parsing command line (e.g. option validity check)
-    - reading configuration files in
-
-But ``pre-`` and ``post-`` hook invocation for the target command is
-measured, even though these are invoked before or after dispatching to
-the command function, because these may be required to repeat
-execution of the target command correctly.
-'''
-
-import os
-import sys
-
-libdir = '@LIBDIR@'
-
-if libdir != '@' 'LIBDIR' '@':
-    if not os.path.isabs(libdir):
-        libdir = os.path.join(
-            os.path.dirname(os.path.realpath(__file__)), libdir
-        )
-        libdir = os.path.abspath(libdir)
-    sys.path.insert(0, libdir)
-
-# enable importing on demand to reduce startup time
-try:
-    from mercurial import demandimport
-
-    demandimport.enable()
-except ImportError:
-    import sys
-
-    sys.stderr.write(
-        "abort: couldn't find mercurial libraries in [%s]\n"
-        % ' '.join(sys.path)
-    )
-    sys.stderr.write("(check your install and PYTHONPATH)\n")
-    sys.exit(-1)
-
-from mercurial import (
-    dispatch,
-    util,
-)
-
-
-def timer(func, title=None):
-    results = []
-    begin = util.timer()
-    count = 0
-    while True:
-        ostart = os.times()
-        cstart = util.timer()
-        r = func()
-        cstop = util.timer()
-        ostop = os.times()
-        count += 1
-        a, b = ostart, ostop
-        results.append((cstop - cstart, b[0] - a[0], b[1] - a[1]))
-        if cstop - begin > 3 and count >= 100:
-            break
-        if cstop - begin > 10 and count >= 3:
-            break
-    if title:
-        sys.stderr.write("! %s\n" % title)
-    if r:
-        sys.stderr.write("! result: %s\n" % r)
-    m = min(results)
-    sys.stderr.write(
-        "! wall %f comb %f user %f sys %f (best of %d)\n"
-        % (m[0], m[1] + m[2], m[1], m[2], count)
-    )
-
-
-orgruncommand = dispatch.runcommand
-
-
-def runcommand(lui, repo, cmd, fullargs, ui, options, d, cmdpats, cmdoptions):
-    ui.pushbuffer()
-    lui.pushbuffer()
-    timer(
-        lambda: orgruncommand(
-            lui, repo, cmd, fullargs, ui, options, d, cmdpats, cmdoptions
-        )
-    )
-    ui.popbuffer()
-    lui.popbuffer()
-
-
-dispatch.runcommand = runcommand
-
-dispatch.run()
--- a/contrib/import-checker.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/import-checker.py	Fri Feb 28 23:28:10 2025 +0100
@@ -26,6 +26,8 @@
     'mercurial.hgweb.request',
     'mercurial.i18n',
     'mercurial.interfaces',
+    'mercurial.interfaces._basetypes',
+    'mercurial.interfaces.types',
     'mercurial.node',
     'mercurial.pycompat',
     # for revlog to re-export constant to extensions
--- a/contrib/packaging/Makefile	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/Makefile	Fri Feb 28 23:28:10 2025 +0100
@@ -19,10 +19,10 @@
 	9
 
 # Build a Python for these RHEL (and derivatives) releases.
-RHEL_WITH_PYTHON_RELEASES :=
+RHEL_WITH_PYTHON_RELEASES := 8
 RHEL_WITH_NONVERSIONED_PYTHON :=
-RHEL_WITH_36_DOCUTILS := 7
 
+.PHONY: help
 help:
 	@echo 'Packaging Make Targets'
 	@echo ''
@@ -62,8 +62,6 @@
 	@echo 'fedora'
 	@echo '    Build an RPM for Fedora $(FEDORA_RELEASE) locally'
 
-.PHONY: help
-
 .PHONY: deb
 deb:
 	./builddeb
@@ -114,13 +112,13 @@
 .PHONY: rhel$(1)
 rhel$(1):
 	mkdir -p $$(HGROOT)/packages/rhel$(1)
-	./buildrpm $$(if $$(filter $(1),$$(RHEL_WITH_PYTHON_RELEASES)),--withpython,$$(if $$(filter $(1),$$(RHEL_WITH_NONVERSIONED_PYTHON)),--python python,))$$(if $$(filter $(1),$$(RHEL_WITH_36_DOCUTILS)), --docutilspackage python36-docutils,)
+	./buildrpm $$(if $$(filter $(1),$$(RHEL_WITH_PYTHON_RELEASES)),--withpython,$$(if $$(filter $(1),$$(RHEL_WITH_NONVERSIONED_PYTHON)),--python python,))
 	cp $$(HGROOT)/contrib/packaging/rpmbuild/RPMS/*/* $$(HGROOT)/packages/rhel$(1)
 	cp $$(HGROOT)/contrib/packaging/rpmbuild/SRPMS/* $$(HGROOT)/packages/rhel$(1)
 
 .PHONY: docker-rhel$(1)
 docker-rhel$(1):
-	./dockerrpm rhel$(1) $$(if $$(filter $(1),$$(RHEL_WITH_PYTHON_RELEASES)),--withpython,$$(if $$(filter $(1),$$(RHEL_WITH_NONVERSIONED_PYTHON)),--python python,))$$(if $$(filter $(1),$$(RHEL_WITH_36_DOCUTILS)), --docutilspackage python36-docutils,)
+	./dockerrpm rhel$(1) $$(if $$(filter $(1),$$(RHEL_WITH_PYTHON_RELEASES)),--withpython,$$(if $$(filter $(1),$$(RHEL_WITH_NONVERSIONED_PYTHON)),--python python,))
 
 endef
--- a/contrib/packaging/build-linux-wheels.sh	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/build-linux-wheels.sh	Fri Feb 28 23:28:10 2025 +0100
@@ -11,8 +11,6 @@
 
 PYTHON_TARGETS="cp38-cp38 cp39-cp39 cp310-cp310 cp311-cp311 cp312-cp312 cp313-cp313"
 
-export MERCURIAL_SETUP_FORCE_TRANSLATIONS=1
-
 # We need to copy the repository to ensure:
 # (1) we don't wrongly write roots files in the repository (or any other wrong
 #     users)
--- a/contrib/packaging/build-macos-wheels.sh	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/build-macos-wheels.sh	Fri Feb 28 23:28:10 2025 +0100
@@ -17,19 +17,11 @@
 
 set -e
 
-# Build translations; requires msgfmt on PATH.
-export MERCURIAL_SETUP_FORCE_TRANSLATIONS=1
-
 if ! which msgfmt 2>/dev/null 1>/dev/null; then
     echo "msgfmt executable not found" >&2
     exit 1
 fi
 
-# Prevent building pypy wheels, which is broken.
-export CIBW_SKIP=pp*
-
-export CIBW_ARCHS=universal2
-
 # TODO: purge the repo?
 
 cibuildwheel --output-dir dist/wheels
--- a/contrib/packaging/build-windows-wheels.bat	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/build-windows-wheels.bat	Fri Feb 28 23:28:10 2025 +0100
@@ -9,29 +9,6 @@
 REM - None of the variable set here live past this script exiting.
 setlocal
 
-REM - Build translations; requires msgfmt.exe on PATH.
-set MERCURIAL_SETUP_FORCE_TRANSLATIONS=1
-
-REM - Prevent building pypy wheels, which is broken.
-set CIBW_SKIP=pp*
-
-REM - Disable warning about not being able to test without an arm64 runner.
-set CIBW_TEST_SKIP=*-win_arm64
-
-
-REM - arm64 support starts with py39, but the first arm64 installer wasn't
-REM - available until py311, so skip arm64 on the older, EOL versions.
-set CIBW_ARCHS=x86 AMD64
-set CIBW_BUILD=cp38-* cp39-* cp310-*
-
 cibuildwheel --output-dir dist/wheels
 
 if %errorlevel% neq 0 exit /b %errorlevel%
-
-
-set CIBW_ARCHS=x86 AMD64 ARM64
-set CIBW_BUILD=cp311-* cp312-* cp313-*
-
-cibuildwheel --output-dir dist/wheels
-
-if %errorlevel% neq 0 exit /b %errorlevel%
--- a/contrib/packaging/buildrpm	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/buildrpm	Fri Feb 28 23:28:10 2025 +0100
@@ -7,7 +7,6 @@
 BUILD=1
 RPMBUILDDIR="$PWD/rpmbuild"
 PYTHONEXE=python3
-DOCUTILSPACKAGE=python3-docutils
 
 while [ "$1" ]; do
     case "$1" in
@@ -22,14 +21,9 @@
         ;;
     --withpython | --with-python)
         shift
-        PYTHONVER=2.7.16
-        PYTHONMD5=f1a2ace631068444831d01485466ece0
-        PYTHONEXE=python
-        ;;
-    --docutilspackage)
-        shift
-        DOCUTILSPACKAGE="$1"
-        shift
+        PYTHONVER=3.13.1
+        PYTHONMD5=6820ac52d77af870f795dabc64583234
+        PYTHONEXE=python3
         ;;
     --rpmbuilddir )
         shift
@@ -88,14 +82,6 @@
     fi
     ln -f $PYTHON_SRCFILE $RPMBUILDDIR/SOURCES/$PYTHON_SRCFILE
 
-    DOCUTILSVER=`sed -ne "s/^%global docutilsname docutils-//p" $specfile`
-    DOCUTILS_SRCFILE=docutils-$DOCUTILSVER.tar.gz
-    [ -f $DOCUTILS_SRCFILE ] || curl -Lo $DOCUTILS_SRCFILE http://downloads.sourceforge.net/project/docutils/docutils/$DOCUTILSVER/$DOCUTILS_SRCFILE
-    DOCUTILSMD5=`sed -ne "s/^%global docutilsmd5 //p" $specfile`
-    if [ "$DOCUTILSMD5" ]; then
-        echo "$DOCUTILSMD5 $DOCUTILS_SRCFILE" | md5sum -w -c
-    fi
-    ln -f $DOCUTILS_SRCFILE $RPMBUILDDIR/SOURCES/$DOCUTILS_SRCFILE
 )
 fi
 
@@ -155,9 +141,6 @@
 sed -i \
     -e "s/^%define withpython.*$/%define withpython $RPMPYTHONVER/" \
     $rpmspec
-sed -i \
-    -e "s/^%global pythondocutils.*$/%global pythondocutils $DOCUTILSPACKAGE/" \
-    $rpmspec
 
 if [ "$BUILD" ]; then
     rpmbuild --define "_topdir $RPMBUILDDIR" -ba $rpmspec --clean
--- a/contrib/packaging/docker/rhel8	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/docker/rhel8	Fri Feb 28 23:28:10 2025 +0100
@@ -4,12 +4,19 @@
 	useradd -u %UID% -g %GID% -s /bin/bash -d /build -m build
 
 RUN yum install -y \
+	bzip2-devel \
 	gcc \
 	gettext \
+	libffi-devel \
 	make \
+	ncurses-devel \
+	openssl-devel \
 	python3-devel \
 	python3-docutils \
-	rpm-build
+	readline-devel \
+	rpm-build \
+	sqlite-devel \
+	zlib-devel
 
 # For creating repo meta data
 RUN yum install -y createrepo
--- a/contrib/packaging/docker/rhel9	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/docker/rhel9	Fri Feb 28 23:28:10 2025 +0100
@@ -13,7 +13,8 @@
 	make \
 	python3-devel \
 	python3-docutils \
-	rpm-build
+	rpm-build \
+	which
 
 # For creating repo meta data
 RUN yum install -y createrepo
--- a/contrib/packaging/hg-docker	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/hg-docker	Fri Feb 28 23:28:10 2025 +0100
@@ -82,7 +82,7 @@
     p = subprocess.Popen(args, stdin=subprocess.PIPE)
     p.communicate(input=dockerfile)
     if p.returncode:
-        raise subprocess.CalledProcessException(
+        raise subprocess.CalledProcessError(
             p.returncode,
             'failed to build docker image: %s %s' % (p.stdout, p.stderr),
         )
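The old name does not exist in the subprocess module, so this error path raised AttributeError instead of reporting the build failure. Easy to confirm from a shell:

  python3 -c 'import subprocess; subprocess.CalledProcessError'      # ok
  python3 -c 'import subprocess; subprocess.CalledProcessException'  # AttributeError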
--- a/contrib/packaging/hgpackaging/util.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/hgpackaging/util.py	Fri Feb 28 23:28:10 2025 +0100
@@ -182,7 +182,7 @@
     p = source_dir / 'mercurial' / '__version__.py'
 
     with p.open('r', encoding='utf-8') as fh:
-        m = re.search('version = b"([^"]+)"', fh.read(), re.MULTILINE)
+        m = re.search("version = '([^']+)'", fh.read(), re.MULTILINE)
 
     if not m:
         raise Exception('could not parse %s' % p)
--- a/contrib/packaging/mercurial.spec	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/mercurial.spec	Fri Feb 28 23:28:10 2025 +0100
@@ -3,21 +3,16 @@
 
 %define withpython %{nil}
 %global pythonexe python3
-%global pythondocutils python3-docutils
 
 %if "%{?withpython}"
-
 %global pythonver %{withpython}
 %global pythonname Python-%{withpython}
-%global docutilsname docutils-0.14
-%global docutilsmd5 c53768d63db3873b7d452833553469de
 %global pythonhg python-hg
 %global hgpyprefix /opt/%{pythonhg}
 # byte compilation will fail on some some Python /test/ files
 %global _python_bytecompile_errors_terminate_build 0
 
 %else
-
 %global pythonver %(%{pythonexe} -c 'import sys;print(".".join(map(str, sys.version_info[:2])))')
 
 %endif
@@ -33,7 +28,6 @@
 Source0: %{name}-%{version}-%{release}.tar.gz
 %if "%{?withpython}"
 Source1: %{pythonname}.tgz
-Source2: %{docutilsname}.tar.gz
 %endif
 BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root
 
@@ -41,7 +35,7 @@
 %if "%{?withpython}"
 BuildRequires: readline-devel, openssl-devel, ncurses-devel, zlib-devel, bzip2-devel
 %else
-BuildRequires: %{pythonexe} >= %{pythonver}, %{pythonexe}-devel, %{pythondocutils}
+BuildRequires: %{pythonexe} >= %{pythonver}, %{pythonexe}-devel, python3-docutils
 Requires: %{pythonexe} >= %{pythonver}
 %endif
 # The hgk extension uses the wish tcl interpreter, but we don't enforce it
@@ -54,9 +48,7 @@
 
 %prep
 %if "%{?withpython}"
-%setup -q -n mercurial-%{version}-%{release} -a1 -a2
-# despite the comments in cgi.py, we do this to prevent rpmdeps from picking /usr/local/bin/python up
-sed -i '1c#! /usr/bin/env %{pythonexe}' %{pythonname}/Lib/cgi.py
+%setup -q -n mercurial-%{version}-%{release} -a1
 %else
 %setup -q -n mercurial-%{version}-%{release}
 %endif
@@ -68,30 +60,35 @@
 
 %if "%{?withpython}"
 PYPATH=$PWD/%{pythonname}
+PYTHON_FULLPATH=$PYPATH/python3
 cd $PYPATH
-./configure --prefix=%{hgpyprefix}
+./configure --prefix=%{hgpyprefix} --with-ensurepip=install --enable-optimizations --with-lto
 make all %{?_smp_mflags}
-cd -
-
-cd %{docutilsname}
-LD_LIBRARY_PATH=$PYPATH $PYPATH/python setup.py build
+# add a symlink and only refer to python3 from here on
+ln -s python python3
+# remove python reference
+sed -i 's|#!/usr/bin/env python|#!/usr/bin/env python3|' Lib/encodings/rot_13.py
+$PYTHON_FULLPATH -m ensurepip --default-pip
+$PYTHON_FULLPATH -m pip install setuptools setuptools-scm docutils
 cd -
 
 # verify Python environment
-LD_LIBRARY_PATH=$PYPATH PYTHONPATH=$PWD/%{docutilsname} $PYPATH/python -c 'import sys, zlib, bz2, ssl, curses, readline'
+LD_LIBRARY_PATH=$PYPATH $PYTHON_FULLPATH -c 'import sys, zlib, bz2, ssl, curses, readline'
+LD_LIBRARY_PATH=$PYPATH $PYTHON_FULLPATH -c "import ssl; print(ssl.HAS_TLSv1_2)"
+LD_LIBRARY_PATH=$PYPATH $PYTHON_FULLPATH -c "import docutils"
 
 # set environment for make
 export PATH=$PYPATH:$PATH
 export LD_LIBRARY_PATH=$PYPATH
 export CFLAGS="-L $PYPATH"
-export PYTHONPATH=$PWD/%{docutilsname}
-
+%else
+PYTHON_FULLPATH=$(which python3)
+$PYTHON_FULLPATH -m pip install pip setuptools setuptools-scm packaging --upgrade
 %endif
 
-make all PYTHON=%{pythonexe}
 make -C contrib/chg
 
-sed -i -e '1s|#!/usr/bin/env python$|#!/usr/bin/env %{pythonexe}|' contrib/hg-ssh
+sed -i -e '1s|#!/usr/bin/env python3$|#!/usr/bin/env %{pythonexe}|' contrib/hg-ssh
 
 %install
 rm -rf $RPM_BUILD_ROOT
@@ -101,24 +98,21 @@
 
 %if "%{?withpython}"
 PYPATH=$PWD/%{pythonname}
+PYTHON_FULLPATH=$PYPATH/python3
 cd $PYPATH
 make install DESTDIR=$RPM_BUILD_ROOT
 # these .a are not necessary and they are readonly and strip fails - kill them!
 rm -f %{buildroot}%{hgpyprefix}/lib/{,python2.*/config}/libpython2.*.a
 cd -
 
-cd %{docutilsname}
-LD_LIBRARY_PATH=$PYPATH $PYPATH/python setup.py install --root="$RPM_BUILD_ROOT"
-cd -
-
-PATH=$PYPATH:$PATH LD_LIBRARY_PATH=$PYPATH make install PYTHON=%{pythonexe} DESTDIR=$RPM_BUILD_ROOT PREFIX=%{hgpyprefix} MANDIR=%{_mandir} PURE="--rust"
+PATH=$PYPATH:$PATH LD_LIBRARY_PATH=$PYPATH make install PYTHON=$PYTHON_FULLPATH DESTDIR=$RPM_BUILD_ROOT PIP_PREFIX=$RPM_BUILD_ROOT/%{hgpyprefix} PREFIX=$RPM_BUILD_ROOT/%{hgpyprefix} MANDIR=%{_mandir} PURE="--rust"
 
 mkdir -p $RPM_BUILD_ROOT%{_bindir}
 ( cd $RPM_BUILD_ROOT%{_bindir}/ && ln -s ../..%{hgpyprefix}/bin/hg . )
 ( cd $RPM_BUILD_ROOT%{_bindir}/ && ln -s ../..%{hgpyprefix}/bin/python2.? %{pythonhg} )
 
 %else
-
-make install PYTHON=%{pythonexe} DESTDIR=$RPM_BUILD_ROOT PREFIX=%{_prefix} MANDIR=%{_mandir} PURE="--rust"
+PYTHON_FULLPATH=$(which python3)
+make install PYTHON=$PYTHON_FULLPATH DESTDIR=$RPM_BUILD_ROOT PREFIX=$RPM_BUILD_ROOT/%{_prefix} MANDIR=%{_mandir} PURE="--rust"
 
 %endif
@@ -140,12 +134,6 @@
 %doc CONTRIBUTORS COPYING doc/README doc/hg*.txt doc/hg*.html *.cgi contrib/*.fcgi contrib/*.wsgi
 %doc %attr(644,root,root) %{_mandir}/man?/hg*
 %doc %attr(644,root,root) contrib/*.svg
-%dir %{_datadir}/bash-completion/
-%dir %{_datadir}/bash-completion/completions
-%{_datadir}/bash-completion/completions/hg
-%dir %{_datadir}/zsh/
-%dir %{_datadir}/zsh/site-functions/
-%{_datadir}/zsh/site-functions/_hg
 %dir %{_datadir}/emacs/site-lisp/
 %{_datadir}/emacs/site-lisp/mercurial.el
 %{_datadir}/emacs/site-lisp/mq.el
@@ -159,9 +147,11 @@
 %{_bindir}/%{pythonhg}
 %{hgpyprefix}
 %else
-%{_libdir}/python%{pythonver}/site-packages/%{name}-*-py%{pythonver}.egg-info
+%{_libdir}/python%{pythonver}/site-packages/mercurial-*.dist-info/
 %{_libdir}/python%{pythonver}/site-packages/%{name}
 %{_libdir}/python%{pythonver}/site-packages/hgext
 %{_libdir}/python%{pythonver}/site-packages/hgext3rd
 %{_libdir}/python%{pythonver}/site-packages/hgdemandimport
+/usr/share/bash-completion/completions/hg
+/usr/share/zsh/site-functions/_hg
 %endif
--- a/contrib/packaging/packagelib.sh	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/packagelib.sh	Fri Feb 28 23:28:10 2025 +0100
@@ -8,6 +8,15 @@
 #
 # node: the node|short hg was built from, or empty if built from a tag
 gethgversion() {
+    # allow passing the version from an environment variable
+    # in case of builds on an older platform that cannot build hg
+    if [ ! -z "$USELOCALHG" ]; then
+        HG="$(which hg)"
+        version=$($HG log -r . --template "{latesttag}")
+        distance=$($HG log -r . --template "{latesttagdistance}")
+        node=$($HG log -r . --template "{node|short}")
+        return
+    fi
     if [ -z "${1+x}" ]; then
        python="python3"
     else
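A hypothetical use of the new escape hatch, assuming builddeb (which sources packagelib.sh) and an hg already available on $PATH; any non-empty value triggers the branch:

  cd contrib/packaging
  USELOCALHG=1 ./builddeb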
--- a/contrib/packaging/requirements.txt	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/requirements.txt	Fri Feb 28 23:28:10 2025 +0100
@@ -1,8 +1,8 @@
 #
-# This file is autogenerated by pip-compile with python 3.7
-# To update, run:
+# This file is autogenerated by pip-compile with Python 3.9
+# by the following command:
 #
-#    pip-compile --generate-hashes --no-reuse-hashes --output-file=contrib/packaging/requirements.txt contrib/packaging/requirements.txt.in
+#    pip-compile --allow-unsafe --generate-hashes --no-reuse-hashes --output-file=contrib/packaging/requirements.txt contrib/packaging/requirements.txt.in
 #
 docutils==0.16 \
     --hash=sha256:0c5b78adfbf7762415433f5515cd5c9e762339e23369dbe8000d84a4bf4ab3af \
@@ -66,3 +66,61 @@
     --hash=sha256:e8313f01ba26fbbe36c7be1966a7b7424942f670f38e666995b88d012765b9be \
     --hash=sha256:feb7b34d6325451ef96bc0e36e1a6c0c1c64bc1fbec4b854f4529e51887b1621
     # via jinja2
+packaging==24.2 \
+    --hash=sha256:09abb1bccd265c01f4a3aa3f7a7db064b36514d2cba19a2f694fe6150451a759 \
+    --hash=sha256:c228a6dc5e932d346bc5739379109d49e8853dd8223571c7c5b55260edc0b97f
+    # via setuptools-scm
+setuptools-scm==8.2.0 \
+    --hash=sha256:136e2b1d393d709d2bcf26f275b8dec06c48b811154167b0fd6bb002aad17d6d \
+    --hash=sha256:a18396a1bc0219c974d1a74612b11f9dce0d5bd8b1dc55c65f6ac7fd609e8c28
+    # via -r contrib/packaging/requirements.txt.in
+tomli==2.2.1 \
+    --hash=sha256:023aa114dd824ade0100497eb2318602af309e5a55595f76b626d6d9f3b7b0a6 \
+    --hash=sha256:02abe224de6ae62c19f090f68da4e27b10af2b93213d36cf44e6e1c5abd19fdd \
+    --hash=sha256:286f0ca2ffeeb5b9bd4fcc8d6c330534323ec51b2f52da063b11c502da16f30c \
+    --hash=sha256:2d0f2fdd22b02c6d81637a3c95f8cd77f995846af7414c5c4b8d0545afa1bc4b \
+    --hash=sha256:33580bccab0338d00994d7f16f4c4ec25b776af3ffaac1ed74e0b3fc95e885a8 \
+    --hash=sha256:400e720fe168c0f8521520190686ef8ef033fb19fc493da09779e592861b78c6 \
+    --hash=sha256:40741994320b232529c802f8bc86da4e1aa9f413db394617b9a256ae0f9a7f77 \
+    --hash=sha256:465af0e0875402f1d226519c9904f37254b3045fc5084697cefb9bdde1ff99ff \
+    --hash=sha256:4a8f6e44de52d5e6c657c9fe83b562f5f4256d8ebbfe4ff922c495620a7f6cea \
+    --hash=sha256:4e340144ad7ae1533cb897d406382b4b6fede8890a03738ff1683af800d54192 \
+    --hash=sha256:678e4fa69e4575eb77d103de3df8a895e1591b48e740211bd1067378c69e8249 \
+    --hash=sha256:6972ca9c9cc9f0acaa56a8ca1ff51e7af152a9f87fb64623e31d5c83700080ee \
+    --hash=sha256:7fc04e92e1d624a4a63c76474610238576942d6b8950a2d7f908a340494e67e4 \
+    --hash=sha256:889f80ef92701b9dbb224e49ec87c645ce5df3fa2cc548664eb8a25e03127a98 \
+    --hash=sha256:8d57ca8095a641b8237d5b079147646153d22552f1c637fd3ba7f4b0b29167a8 \
+    --hash=sha256:8dd28b3e155b80f4d54beb40a441d366adcfe740969820caf156c019fb5c7ec4 \
+    --hash=sha256:9316dc65bed1684c9a98ee68759ceaed29d229e985297003e494aa825ebb0281 \
+    --hash=sha256:a198f10c4d1b1375d7687bc25294306e551bf1abfa4eace6650070a5c1ae2744 \
+    --hash=sha256:a38aa0308e754b0e3c67e344754dff64999ff9b513e691d0e786265c93583c69 \
+    --hash=sha256:a92ef1a44547e894e2a17d24e7557a5e85a9e1d0048b0b5e7541f76c5032cb13 \
+    --hash=sha256:ac065718db92ca818f8d6141b5f66369833d4a80a9d74435a268c52bdfa73140 \
+    --hash=sha256:b82ebccc8c8a36f2094e969560a1b836758481f3dc360ce9a3277c65f374285e \
+    --hash=sha256:c954d2250168d28797dd4e3ac5cf812a406cd5a92674ee4c8f123c889786aa8e \
+    --hash=sha256:cb55c73c5f4408779d0cf3eef9f762b9c9f147a77de7b258bef0a5628adc85cc \
+    --hash=sha256:cd45e1dc79c835ce60f7404ec8119f2eb06d38b1deba146f07ced3bbc44505ff \
+    --hash=sha256:d3f5614314d758649ab2ab3a62d4f2004c825922f9e370b29416484086b264ec \
+    --hash=sha256:d920f33822747519673ee656a4b6ac33e382eca9d331c87770faa3eef562aeb2 \
+    --hash=sha256:db2b95f9de79181805df90bedc5a5ab4c165e6ec3fe99f970d0e302f384ad222 \
+    --hash=sha256:e59e304978767a54663af13c07b3d1af22ddee3bb2fb0618ca1593e4f593a106 \
+    --hash=sha256:e85e99945e688e32d5a35c1ff38ed0b3f41f43fad8df0bdf79f72b2ba7bc5272 \
+    --hash=sha256:ece47d672db52ac607a3d9599a9d48dcb2f2f735c6c2d1f34130085bb12b112a \
+    --hash=sha256:f4039b9cbc3048b2416cc57ab3bda989a6fcf9b36cf8937f01a6e731b64f80d7
+    # via setuptools-scm
+typing-extensions==4.12.2 \
+    --hash=sha256:04e5ca0351e0f3f85c6853954072df659d0d13fac324d0072316b67d7794700d \
+    --hash=sha256:1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8
+    # via setuptools-scm
+wheel==0.45.1 \
+    --hash=sha256:661e1abd9198507b1409a20c02106d9670b2576e916d58f520316666abca6729 \
+    --hash=sha256:708e7481cc80179af0e556bbf0cc00b8444c7321e2700b8d8580231d13017248
+    # via -r contrib/packaging/requirements.txt.in
+
+# The following packages are considered to be unsafe in a requirements file:
+setuptools==75.8.0 \
+    --hash=sha256:c5afc8f407c626b8313a86e10311dd3f661c6cd9c09d4bf8c15c0e11f9f2b0e6 \
+    --hash=sha256:e3982f444617239225d675215d51f6ba05f845d4eec313da4418fdbb56fb27e3
+    # via
+    #   -r contrib/packaging/requirements.txt.in
+    #   setuptools-scm
--- a/contrib/packaging/requirements.txt.in	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/packaging/requirements.txt.in	Fri Feb 28 23:28:10 2025 +0100
@@ -3,3 +3,8 @@
 
 docutils
 jinja2
+
+# Keep these in sync with pyproject.toml
+setuptools >= 64
+setuptools-scm >= 8.1.0
+wheel
--- a/contrib/perf.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/perf.py	Fri Feb 28 23:28:10 2025 +0100
@@ -2179,7 +2179,21 @@
 
 @command(
     b'perf::stream-consume',
-    formatteropts,
+    [
+        (
+            b'',
+            b'in-memory-bundle',
+            False,
+            b'load the full bundle in userspace memory before proceeding',
+        ),
+        (
+            b'',
+            b'unbundle-progress',
+            False,
+            b"compute and display progress during stream processing",
+        ),
+    ]
+    + formatteropts,
 )
 def perf_stream_clone_consume(ui, repo, filename, **opts):
     """benchmark the full application of a stream clone
@@ -2217,7 +2231,7 @@
     if not (os.path.isfile(filename) and os.access(filename, os.R_OK)):
         raise error.Abort("not a readable file: %s" % filename)
 
-    run_variables = [None, None]
+    run_variables = [None, None, None]
 
     # we create the new repository next to the other one for two reasons:
     # - this way we use the same file system, which are relevant for benchmark
@@ -2226,21 +2240,30 @@
 
     @contextlib.contextmanager
     def context():
-        with open(filename, mode='rb') as bundle:
+        with open(filename, mode='rb', buffering=0) as bundle:
+            bundle_name = bundle.name
+            if opts.get(b'in_memory_bundle'):
+                # you hate memory, don't you?
+                import io
+
+                bundle = io.BytesIO(bundle.read())
             with tempfile.TemporaryDirectory(
                 prefix=b'hg-perf-stream-consume-',
                 dir=source_repo_dir,
             ) as tmp_dir:
                 tmp_dir = fsencode(tmp_dir)
                 run_variables[0] = bundle
-                run_variables[1] = tmp_dir
+                run_variables[1] = bundle_name
+                run_variables[2] = tmp_dir
                 yield
                 run_variables[0] = None
                 run_variables[1] = None
+                run_variables[2] = None
 
     def runone():
         bundle = run_variables[0]
-        tmp_dir = run_variables[1]
+        bundle_name = run_variables[1]
+        tmp_dir = run_variables[2]
 
         # we actually wants to copy all config to ensure the repo config is
         # taken in account during the benchmark
@@ -2250,7 +2273,12 @@
             new_ui, tmp_dir, requirements=repo.requirements
         )
         target = hg.repository(new_ui, tmp_dir)
-        gen = exchange.readbundle(target.ui, bundle, bundle.name)
+        # we don't need to use a config override here because this is a
+        # dedicated UI object for the disposable repository create for the
+        # benchmark.
+        show_progress = bool(opts.get("show_progress"))
+        target.ui.setconfig(b"progress", b"disable", not show_progress)
+        gen = exchange.readbundle(target.ui, bundle, bundle_name)
         # stream v1
         if util.safehasattr(gen, 'apply'):
             gen.apply(target)
@@ -3285,6 +3313,14 @@
     if parse_index_v1 is None:
         parse_index_v1 = mercurial.revlog.revlogio().parseindex
 
+    uses_generaldelta = "uses_generaldelta" in getargspec(parse_index_v1).args
+    if uses_generaldelta is not None:
+        # Mercurial 7.0 and above
+        # This test isn't affected by generaldelta at all, so just pass `False`
+        parse_index_v1 = functools.partial(
+            parse_index_v1, uses_generaldelta=False
+        )
+
     rllen = len(rl)
 
     node0 = rl.node(0)
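Hypothetical benchmark runs with the new flags; stream.hg stands in for a previously generated stream bundle, and perf is enabled ad hoc from contrib rather than installed:

  hg --config extensions.perf=$PWD/contrib/perf.py perf::stream-consume stream.hg
  hg --config extensions.perf=$PWD/contrib/perf.py perf::stream-consume --in-memory-bundle stream.hg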
--- a/contrib/python-zstandard/setup.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/python-zstandard/setup.py	Fri Feb 28 23:28:10 2025 +0100
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 # Copyright (c) 2016-present, Gregory Szorc
 # All rights reserved.
 #
--- a/contrib/win32/ReadMe.html	Fri Feb 28 23:25:42 2025 +0100
+++ b/contrib/win32/ReadMe.html	Fri Feb 28 23:28:10 2025 +0100
@@ -124,7 +124,7 @@
 
     <p>
       If you are IRC-savvy, that's usually the fastest way to get
-      help. Go to <tt>#mercurial</tt> on <tt>irc.freenode.net</tt>.
+      help. Go to <tt>#mercurial</tt> on <tt>libera.chat</tt>.
     </p>
 
     <h1>Author and copyright information</h1>
--- a/doc/Makefile	Fri Feb 28 23:25:42 2025 +0100
+++ b/doc/Makefile	Fri Feb 28 23:28:10 2025 +0100
@@ -27,7 +27,8 @@
 
 export HGENCODING=UTF-8
 
-.PHONY: all man html install clean knownrefs
+.PHONY: all
+all: man html
 
 # Generate a list of hg commands and extensions.
 commandlist.txt: $(GENDOC)
@@ -43,6 +44,7 @@
 	mv $@.tmp $@
 
 # Build target for running runrst more easily by hand
+.PHONY: knownrefs
 knownrefs: commandlist.txt topiclist.txt extensionlist.txt
 
 BUILDFILES=commandlist.txt topiclist.txt extensionlist.txt
@@ -56,16 +58,19 @@
 
 define RuleAllCommandsTemplate
 HG_COMMANDS=$(1)
+.PHONY: all-commands
 all-commands: $$(HG_COMMANDS:%=$$(BUILDDIR)/hg-%.gendoc.txt)
 endef
 
 define RuleAllTopicsTemplate
 HG_TOPICS=$(1)
+.PHONY: all-topics
 all-topics: $$(HG_TOPICS:%=$$(BUILDDIR)/%.gendoc.txt)
 endef
 
 define RuleAllExtensionsTemplate
 HG_EXTENSIONS=$(1)
+.PHONY: all-extensions
 all-extensions: $$(HG_EXTENSIONS:%=$$(BUILDDIR)/%.gendoc.txt)
 endef
 
@@ -135,7 +140,7 @@
 # - help topics: topic-foo (html), hgfoo (man)
 # - extensions: ext-foo (html), hgext-foo (man)
 #
-# Man pages for commands are in section 1 (user commands), topics and
+# Man pages for commands are in section 1 (user commands), topics and
 # extensions are in section 7 (miscellanea)
 #
 # NOTE: topics and extension are temporarily disabled for man pages because
@@ -158,11 +163,10 @@
 # Also add the HTML index page
 HTML+=$(HTMLOUT)/index.html
 
-
-all: man html
-
+.PHONY: man
 man: $(MAN)
 
+.PHONY: html
 html: $(HTML)
 
 # This logic is duplicated in setup.py:hgbuilddoc()
@@ -206,14 +210,8 @@
 	$(PYTHON) runrst html --hg-individual-pages $(RSTARGS) --halt warning \
 	  --link-stylesheet --stylesheet-path style.css $(BUILDDIR)/$*.gendoc.txt $@
 
-MANIFEST: man html
-# tracked files are already in the main MANIFEST
-	$(RM) $@
-	for i in $(MAN) $(HTML); do \
-	  echo "doc/$$i" >> $@ ; \
-	done
-
-install: man
+.PHONY: install
+install: man html
 	for i in $(MAN) ; do \
 	  subdir=`echo $$i | sed -n 's/^.*\.\([0-9]\)$$/man\1/p'` ; \
 	  mkdir -p "$(DESTDIR)$(MANDIR)"/$$subdir ; \
@@ -224,5 +222,6 @@
 # know anything about all the command/topic/extension targets and files.
 # $(HTML) only has the basic topics, so we need to delete $(HTMLOUT)/*.html and
 # other similar files "by hand" here.
+.PHONY: clean
 clean:
-	$(RM) $(MAN) $(HTML) common.txt $(SOURCES) MANIFEST *.gendoc.txt $(BUILDFILES) $(BUILDDIR)/*.gendoc.* $(HTMLOUT)/*.html
+	$(RM) $(MAN) $(HTML) common.txt $(SOURCES) *.gendoc.txt $(BUILDFILES) $(BUILDDIR)/*.gendoc.* $(HTMLOUT)/*.html
--- a/hg	Fri Feb 28 23:25:42 2025 +0100
+++ b/hg	Fri Feb 28 23:28:10 2025 +0100
@@ -12,16 +12,6 @@
 import os
 import sys
 
-libdir = '@LIBDIR@'
-
-if libdir != '@' 'LIBDIR' '@':
-    if not os.path.isabs(libdir):
-        libdir = os.path.join(
-            os.path.dirname(os.path.realpath(__file__)), libdir
-        )
-        libdir = os.path.abspath(libdir)
-    sys.path.insert(0, libdir)
-
 # Make `pip install --user ...` packages available to the official Windows
 # build. Most py2 packaging installs directly into the system python
 # environment, so no changes are necessary for other platforms. The Windows
--- a/hgext/clonebundles.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/hgext/clonebundles.py	Fri Feb 28 23:28:10 2025 +0100
@@ -773,7 +773,8 @@
     inline = repo.ui.config(b'clone-bundles', b'auto-generate.serve-inline')
     basename = repo.vfs.basename(bundle.filepath)
     if inline:
-        dest_dir = repo.vfs.join(bundlecaches.BUNDLE_CACHE_DIR)
+        bundle_cache_root = repo.ui.config(b'server', b'peer-bundle-cache-root')
+        dest_dir = repo.vfs.join(bundle_cache_root)
         repo.vfs.makedirs(dest_dir)
         dest = repo.vfs.join(dest_dir, basename)
         util.copyfiles(bundle.filepath, dest, hardlink=True)
@@ -815,10 +816,8 @@
         repo.ui.debug(msg)
 
     if inline:
-        inline_path = repo.vfs.join(
-            bundlecaches.BUNDLE_CACHE_DIR,
-            bundle.basename,
-        )
+        bundle_cache_root = repo.ui.config(b'server', b'peer-bundle-cache-root')
+        inline_path = repo.vfs.join(bundle_cache_root, bundle.basename)
         util.tryunlink(inline_path)
     else:
         cmd = repo.ui.config(b'clone-bundles', b'delete-command')
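A sketch of the server-side configuration this change starts honoring instead of the hardcoded cache directory (the value shown is illustrative, not the default):

  cat >> .hg/hgrc <<'EOF'
  [server]
  peer-bundle-cache-root = bundle-cache
  EOF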
--- a/hgext/fix.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/hgext/fix.py	Fri Feb 28 23:28:10 2025 +0100
@@ -120,6 +120,9 @@
 mapping fixer tool names to lists of metadata values returned from executions
 that modified a file. This aggregates the same metadata previously passed to
 the "postfixfile" hook.
+
+You can specify a list of directories to search the tool command in using the
+`fix.extra-bin-paths` configuration.
 """
 
 from __future__ import annotations
@@ -129,6 +132,7 @@
 import os
 import re
 import subprocess
+import sys
 
 from mercurial.i18n import _
 from mercurial.node import (
@@ -143,6 +147,7 @@
     cmdutil,
     context,
     copies,
+    encoding,
     error,
     logcmdutil,
     match as matchmod,
@@ -193,6 +198,8 @@
 # problem.
 configitem(b'fix', b'failure', default=b'continue')
 
+configitem(b'fix', b'extra-bin-paths', default=list)
+
 
 def checktoolfailureaction(ui, message, hint=None):
     """Abort with 'message' if fix.failure=abort"""
@@ -678,6 +685,28 @@
     )
 
 
+def _augmented_env(wvfs, extra_paths):
+    if os.supports_bytes_environ:
+        env = encoding.environ.copy()
+        raw_path = env.get(b'PATH', b'')
+        extra_paths = [wvfs.join(i) for i in extra_paths]
+        path_items = extra_paths + raw_path.split(pycompat.ospathsep)
+        env[b'PATH'] = pycompat.ospathsep.join(path_items)
+    else:
+        path_encoding = sys.getfilesystemencoding() or sys.getdefaultencoding()
+        extra_str = [p.decode(path_encoding, 'ignore') for p in extra_paths]
+        base = wvfs.base.decode(path_encoding, 'ignore')
+        extra_str = [os.path.join(base, i) for i in extra_str]
+
+        # use (os).xxx to bypass checkcode complains. We are doing this
+        # unicode access on purpose.
+        env = (os).environ.copy()
+        raw_path = env.get('PATH', '')
+        path_items = extra_str + raw_path.split((os).pathsep)
+        env['PATH'] = (os).pathsep.join(path_items)
+    return env
+
+
 def fixfile(ui, repo, opts, fixers, fixctx, path, basepaths, basectxs):
     """Run any configured fixers that should affect the file in this context
 
@@ -692,6 +721,11 @@
     """
     metadata = {}
     newdata = fixctx[path].data()
+
+    env = None
+    extra_paths = ui.configlist(b'fix', b'extra-bin-paths')
+    if extra_paths:
+        env = _augmented_env(repo.wvfs, extra_paths)
     for fixername, fixer in fixers.items():
         if fixer.affects(opts, fixctx, path):
             ranges = lineranges(
@@ -711,6 +745,7 @@
                 stdin=subprocess.PIPE,
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
+                env=env,
             )
             stdout, stderr = proc.communicate(newdata)
             if stderr:
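A hypothetical repo-local configuration matching the new docstring; paths are resolved relative to the working directory and prepended to PATH for fixer subprocesses only:

  cat >> .hg/hgrc <<'EOF'
  [fix]
  extra-bin-paths = tools/bin
  EOF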
--- a/hgext/git/dirstate.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/hgext/git/dirstate.py	Fri Feb 28 23:28:10 2025 +0100
@@ -13,6 +13,10 @@
     Tuple,
 )
 
+from mercurial.interfaces.types import (
+    MatcherT,
+    TransactionT,
+)
 from mercurial.node import sha1nodeconstants
 from mercurial import (
     dirstatemap,
@@ -163,7 +167,7 @@
 
     def status(
         self,
-        match: matchmod.basematcher,
+        match: MatcherT,
         subrepos: bool,
         ignored: bool,
         clean: bool,
@@ -316,7 +320,7 @@
     ) -> None:
         raise NotImplementedError
 
-    def write(self, tr: Optional[intdirstate.TransactionT]) -> None:
+    def write(self, tr: Optional[TransactionT]) -> None:
         # TODO: call parent change callbacks
 
         if tr:
@@ -336,7 +340,7 @@
             r = util.pathto(self._root, cwd, f)
         return r
 
-    def matches(self, match: matchmod.basematcher) -> Iterable[bytes]:
+    def matches(self, match: MatcherT) -> Iterable[bytes]:
         for x in self.git.index:
             p = pycompat.fsencode(x.path)
             if match(p):
@@ -354,7 +358,7 @@
 
     def walk(
         self,
-        match: matchmod.basematcher,
+        match: MatcherT,
         subrepos: Any,
         unknown: bool,
         ignored: bool,
@@ -455,7 +459,7 @@
             self._plchangecallbacks[category] = callback
 
     def setbranch(
-        self, branch: bytes, transaction: Optional[intdirstate.TransactionT]
+        self, branch: bytes, transaction: Optional[TransactionT]
     ) -> None:
         raise error.Abort(
             b'git repos do not support branches. try using bookmarks'
--- a/hgext/git/gitlog.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/hgext/git/gitlog.py	Fri Feb 28 23:28:10 2025 +0100
@@ -149,7 +149,7 @@
             'SELECT node FROM changelog WHERE rev = ?', (r,)
         ).fetchone()
         if t is None:
-            raise error.LookupError(r, b'00changelog.i', _(b'no rev'))
+            raise error.LookupError(b'%d' % r, b'00changelog.i', _(b'no rev'))
         return bin(t[0])
 
     def synthetic(self, n):
@@ -487,7 +487,7 @@
             'SELECT p1, p2 FROM changelog WHERE rev = ?', (rev,)
         ).fetchone()
         if t is None:
-            raise error.LookupError(rev, b'00changelog.i', _(b'no rev'))
+            raise error.LookupError(b'%d' % rev, b'00changelog.i', _(b'no rev'))
         return self.rev(bin(t[0])), self.rev(bin(t[1]))
 
     # Private method is used at least by the tags code.
--- a/hgext/git/manifest.py Fri Feb 28 23:25:42 2025 +0100 +++ b/hgext/git/manifest.py Fri Feb 28 23:28:10 2025 +0100 @@ -11,6 +11,7 @@ ) from mercurial.node import sha1nodeconstants +from mercurial.interfaces.types import MatcherT from mercurial import ( match as matchmod, @@ -277,7 +278,7 @@ ) -> tuple[ByteString, ByteString]: raise NotImplementedError # TODO: implement this - def _walkonetree(self, tree, match, subdir): + def _walkonetree(self, tree, match, subdir) -> Iterator[bytes]: for te in tree: # TODO: can we prune dir walks with the matcher? realname = subdir + pycompat.fsencode(te.name) @@ -288,7 +289,7 @@ elif match(realname): yield pycompat.fsencode(realname) - def walk(self, match: matchmod.basematcher) -> Iterator[bytes]: + def walk(self, match: MatcherT) -> Iterator[bytes]: # TODO: this is a very lazy way to merge in the pending # changes. There is absolutely room for optimization here by # being clever about walking over the sets...
--- a/hgext/largefiles/overrides.py Fri Feb 28 23:25:42 2025 +0100 +++ b/hgext/largefiles/overrides.py Fri Feb 28 23:28:10 2025 +0100 @@ -19,6 +19,8 @@ from mercurial.i18n import _ +from mercurial.interfaces.types import MatcherT + from mercurial.hgweb import webcommands from mercurial import ( @@ -1232,7 +1234,7 @@ node, kind, decode=True, - match: Optional[matchmod.basematcher] = None, + match: Optional[MatcherT] = None, prefix=b'', mtime=None, subrepos=None, @@ -1336,9 +1338,9 @@ # allow only hgsubrepos to set this, instead of the current scheme # where the parent sets this for the child. with ( - hasattr(sub, '_repo') - and lfstatus(sub._repo) - or util.nullcontextmanager() + lfstatus(sub._repo) + if hasattr(sub, '_repo') + else util.nullcontextmanager() ): sub.archive(opencallback, subprefix, submatch) @@ -1347,9 +1349,7 @@ @eh.wrapfunction(subrepo.hgsubrepo, 'archive') -def hgsubrepoarchive( - orig, repo, opener, prefix, match: matchmod.basematcher, decode=True -): +def hgsubrepoarchive(orig, repo, opener, prefix, match: MatcherT, decode=True): lfenabled = hasattr(repo._repo, '_largefilesenabled') if not lfenabled or not repo._repo.lfstatus: return orig(repo, opener, prefix, match, decode) @@ -1410,9 +1410,9 @@ # would allow only hgsubrepos to set this, instead of the current scheme # where the parent sets this for the child. with ( - hasattr(sub, '_repo') - and lfstatus(sub._repo) - or util.nullcontextmanager() + lfstatus(sub._repo) + if hasattr(sub, '_repo') + else util.nullcontextmanager() ): sub.archive(opener, subprefix, submatch, decode)
--- a/hgext/remotefilelog/remotefilectx.py Fri Feb 28 23:25:42 2025 +0100 +++ b/hgext/remotefilelog/remotefilectx.py Fri Feb 28 23:28:10 2025 +0100 @@ -9,6 +9,7 @@ import collections import time +import typing from mercurial.node import bin, hex, nullrev from mercurial import ( @@ -20,6 +21,11 @@ ) from . import shallowutil +if typing.TYPE_CHECKING: + from typing import ( + Iterator, + ) + propertycache = util.propertycache FASTLOG_TIMEOUT_IN_SECS = 0.5 @@ -379,7 +385,7 @@ # the correct linknode. return False - def ancestors(self, followfirst=False): + def ancestors(self, followfirst=False) -> Iterator[remotefilectx]: ancestors = [] queue = collections.deque((self,)) seen = set()
--- a/hgext/sqlitestore.py Fri Feb 28 23:25:42 2025 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1355 +0,0 @@ -# sqlitestore.py - Storage backend that uses SQLite -# -# Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com> -# -# This software may be used and distributed according to the terms of the -# GNU General Public License version 2 or any later version. - -"""store repository data in SQLite (EXPERIMENTAL) - -The sqlitestore extension enables the storage of repository data in SQLite. - -This extension is HIGHLY EXPERIMENTAL. There are NO BACKWARDS COMPATIBILITY -GUARANTEES. This means that repositories created with this extension may -only be usable with the exact version of this extension/Mercurial that was -used. The extension attempts to enforce this in order to prevent repository -corruption. - -In addition, several features are not yet supported or have known bugs: - -* Only some data is stored in SQLite. Changeset, manifest, and other repository - data is not yet stored in SQLite. -* Transactions are not robust. If the process is aborted at the right time - during transaction close/rollback, the repository could be in an inconsistent - state. This problem will diminish once all repository data is tracked by - SQLite. -* Bundle repositories do not work (the ability to use e.g. - `hg -R <bundle-file> log` to automatically overlay a bundle on top of the - existing repository). -* Various other features don't work. - -This extension should work for basic clone/pull, update, and commit workflows. -Some history rewriting operations may fail due to lack of support for bundle -repositories. - -To use, activate the extension and set the ``storage.new-repo-backend`` config -option to ``sqlite`` to enable new repositories to use SQLite for storage. -""" - -# To run the test suite with repos using SQLite by default, execute the -# following: -# -# HGREPOFEATURES="sqlitestore" run-tests.py \ -# --extra-config-opt extensions.sqlitestore= \ -# --extra-config-opt storage.new-repo-backend=sqlite - -from __future__ import annotations - -import sqlite3 -import struct -import threading -import typing -import zlib - -from typing import ( - Iterable, - Iterator, - Optional, -) - -from mercurial.i18n import _ -from mercurial.node import ( - nullrev, - sha1nodeconstants, - short, -) -from mercurial.thirdparty import attr - -# Force pytype to use the non-vendored package -if typing.TYPE_CHECKING: - # noinspection PyPackageRequirements - import attr - -from mercurial import ( - ancestor, - dagop, - encoding, - error, - extensions, - localrepo, - mdiff, - pycompat, - registrar, - requirements, - util, - verify, -) -from mercurial.interfaces import ( - repository, -) -from mercurial.utils import ( - hashutil, - storageutil, -) - -try: - from mercurial import zstd # pytype: disable=import-error - - zstd.__version__ -except ImportError: - zstd = None - -configtable = {} -configitem = registrar.configitem(configtable) - -# experimental config: storage.sqlite.compression -configitem( - b'storage', - b'sqlite.compression', - default=b'zstd' if zstd else b'zlib', - experimental=True, -) - -# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for -# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should -# be specifying the version(s) of Mercurial they are tested with, or -# leave the attribute unspecified. 
-testedwith = b'ships-with-hg-core' - -REQUIREMENT = b'exp-sqlite-001' -REQUIREMENT_ZSTD = b'exp-sqlite-comp-001=zstd' -REQUIREMENT_ZLIB = b'exp-sqlite-comp-001=zlib' -REQUIREMENT_NONE = b'exp-sqlite-comp-001=none' -REQUIREMENT_SHALLOW_FILES = b'exp-sqlite-shallow-files' - -CURRENT_SCHEMA_VERSION = 1 - -COMPRESSION_NONE = 1 -COMPRESSION_ZSTD = 2 -COMPRESSION_ZLIB = 3 - -FLAG_CENSORED = 1 -FLAG_MISSING_P1 = 2 -FLAG_MISSING_P2 = 4 - -CREATE_SCHEMA = [ - # Deltas are stored as content-indexed blobs. - # compression column holds COMPRESSION_* constant for how the - # delta is encoded. - 'CREATE TABLE delta (' - ' id INTEGER PRIMARY KEY, ' - ' compression INTEGER NOT NULL, ' - ' hash BLOB UNIQUE ON CONFLICT ABORT, ' - ' delta BLOB NOT NULL ' - ')', - # Tracked paths are denormalized to integers to avoid redundant - # storage of the path name. - 'CREATE TABLE filepath (' - ' id INTEGER PRIMARY KEY, ' - ' path BLOB NOT NULL ' - ')', - 'CREATE UNIQUE INDEX filepath_path ON filepath (path)', - # We have a single table for all file revision data. - # Each file revision is uniquely described by a (path, rev) and - # (path, node). - # - # Revision data is stored as a pointer to the delta producing this - # revision and the file revision whose delta should be applied before - # that one. One can reconstruct the delta chain by recursively following - # the delta base revision pointers until one encounters NULL. - # - # flags column holds bitwise integer flags controlling storage options. - # These flags are defined by the FLAG_* constants. - 'CREATE TABLE fileindex (' - ' id INTEGER PRIMARY KEY, ' - ' pathid INTEGER REFERENCES filepath(id), ' - ' revnum INTEGER NOT NULL, ' - ' p1rev INTEGER NOT NULL, ' - ' p2rev INTEGER NOT NULL, ' - ' linkrev INTEGER NOT NULL, ' - ' flags INTEGER NOT NULL, ' - ' deltaid INTEGER REFERENCES delta(id), ' - ' deltabaseid INTEGER REFERENCES fileindex(id), ' - ' node BLOB NOT NULL ' - ')', - 'CREATE UNIQUE INDEX fileindex_pathrevnum ' - ' ON fileindex (pathid, revnum)', - 'CREATE UNIQUE INDEX fileindex_pathnode ON fileindex (pathid, node)', - # Provide a view over all file data for convenience. - 'CREATE VIEW filedata AS ' - 'SELECT ' - ' fileindex.id AS id, ' - ' filepath.id AS pathid, ' - ' filepath.path AS path, ' - ' fileindex.revnum AS revnum, ' - ' fileindex.node AS node, ' - ' fileindex.p1rev AS p1rev, ' - ' fileindex.p2rev AS p2rev, ' - ' fileindex.linkrev AS linkrev, ' - ' fileindex.flags AS flags, ' - ' fileindex.deltaid AS deltaid, ' - ' fileindex.deltabaseid AS deltabaseid ' - 'FROM filepath, fileindex ' - 'WHERE fileindex.pathid=filepath.id', - 'PRAGMA user_version=%d' % CURRENT_SCHEMA_VERSION, -] - - -def resolvedeltachain(db, pathid, node, revisioncache, stoprids, zstddctx=None): - """Resolve a delta chain for a file node.""" - - # TODO the "not in ({stops})" here is possibly slowing down the query - # because it needs to perform the lookup on every recursive invocation. - # This could possibly be faster if we created a temporary query with - # baseid "poisoned" to null and limited the recursive filter to - # "is not null". - res = db.execute( - 'WITH RECURSIVE ' - ' deltachain(deltaid, baseid) AS (' - ' SELECT deltaid, deltabaseid FROM fileindex ' - ' WHERE pathid=? AND node=? 
' - ' UNION ALL ' - ' SELECT fileindex.deltaid, deltabaseid ' - ' FROM fileindex, deltachain ' - ' WHERE ' - ' fileindex.id=deltachain.baseid ' - ' AND deltachain.baseid IS NOT NULL ' - ' AND fileindex.id NOT IN ({stops}) ' - ' ) ' - 'SELECT deltachain.baseid, compression, delta ' - 'FROM deltachain, delta ' - 'WHERE delta.id=deltachain.deltaid'.format( - stops=','.join(['?'] * len(stoprids)) - ), - tuple([pathid, node] + list(stoprids.keys())), - ) - - deltas = [] - lastdeltabaseid = None - - for deltabaseid, compression, delta in res: - lastdeltabaseid = deltabaseid - - if compression == COMPRESSION_ZSTD: - delta = zstddctx.decompress(delta) - elif compression == COMPRESSION_NONE: - delta = delta - elif compression == COMPRESSION_ZLIB: - delta = zlib.decompress(delta) - else: - raise SQLiteStoreError( - b'unhandled compression type: %d' % compression - ) - - deltas.append(delta) - - if lastdeltabaseid in stoprids: - basetext = revisioncache[stoprids[lastdeltabaseid]] - else: - basetext = deltas.pop() - - deltas.reverse() - fulltext = mdiff.patches(basetext, deltas) - - # SQLite returns buffer instances for blob columns on Python 2. This - # type can propagate through the delta application layer. Because - # downstream callers assume revisions are bytes, cast as needed. - if not isinstance(fulltext, bytes): - fulltext = bytes(delta) - - return fulltext - - -def insertdelta(db, compression, hash, delta): - try: - return db.execute( - 'INSERT INTO delta (compression, hash, delta) VALUES (?, ?, ?)', - (compression, hash, delta), - ).lastrowid - except sqlite3.IntegrityError: - return db.execute( - 'SELECT id FROM delta WHERE hash=?', (hash,) - ).fetchone()[0] - - -class SQLiteStoreError(error.StorageError): - pass - - -@attr.s -class revisionentry: - rid = attr.ib() - rev = attr.ib() - node = attr.ib() - p1rev = attr.ib() - p2rev = attr.ib() - p1node = attr.ib() - p2node = attr.ib() - linkrev = attr.ib() - flags = attr.ib() - - -@attr.s(slots=True) -class sqliterevisiondelta(repository.irevisiondelta): - node = attr.ib(type=bytes) - p1node = attr.ib(type=bytes) - p2node = attr.ib(type=bytes) - basenode = attr.ib(type=bytes) - flags = attr.ib(type=int) - baserevisionsize = attr.ib(type=Optional[int]) - revision = attr.ib(type=Optional[bytes]) - delta = attr.ib(type=Optional[bytes]) - sidedata = attr.ib(type=Optional[bytes]) - protocol_flags = attr.ib(type=int) - linknode = attr.ib(default=None, type=Optional[bytes]) - - -@attr.s(frozen=True) -class sqliteproblem(repository.iverifyproblem): - warning = attr.ib(default=None, type=Optional[bytes]) - error = attr.ib(default=None, type=Optional[bytes]) - node = attr.ib(default=None, type=Optional[bytes]) - - -class sqlitefilestore(repository.ifilestorage): - """Implements storage for an individual tracked path.""" - - def __init__(self, db, path, compression): - self.nullid = sha1nodeconstants.nullid - self._db = db - self._path = path - - self._pathid = None - - # revnum -> node - self._revtonode = {} - # node -> revnum - self._nodetorev = {} - # node -> data structure - self._revisions = {} - - self._revisioncache = util.lrucachedict(10) - - self._compengine = compression - - if compression == b'zstd': - self._cctx = zstd.ZstdCompressor(level=3) - self._dctx = zstd.ZstdDecompressor() - else: - self._cctx = None - self._dctx = None - - self._refreshindex() - - def _refreshindex(self): - self._revtonode = {} - self._nodetorev = {} - self._revisions = {} - - res = list( - self._db.execute( - 'SELECT id FROM filepath WHERE path=?', (self._path,) - ) 
- ) - - if not res: - self._pathid = None - return - - self._pathid = res[0][0] - - res = self._db.execute( - 'SELECT id, revnum, node, p1rev, p2rev, linkrev, flags ' - 'FROM fileindex ' - 'WHERE pathid=? ' - 'ORDER BY revnum ASC', - (self._pathid,), - ) - - for i, row in enumerate(res): - rid, rev, node, p1rev, p2rev, linkrev, flags = row - - if i != rev: - raise SQLiteStoreError( - _(b'sqlite database has inconsistent revision numbers') - ) - - if p1rev == nullrev: - p1node = sha1nodeconstants.nullid - else: - p1node = self._revtonode[p1rev] - - if p2rev == nullrev: - p2node = sha1nodeconstants.nullid - else: - p2node = self._revtonode[p2rev] - - entry = revisionentry( - rid=rid, - rev=rev, - node=node, - p1rev=p1rev, - p2rev=p2rev, - p1node=p1node, - p2node=p2node, - linkrev=linkrev, - flags=flags, - ) - - self._revtonode[rev] = node - self._nodetorev[node] = rev - self._revisions[node] = entry - - # Start of ifileindex interface. - - def __len__(self) -> int: - return len(self._revisions) - - def __iter__(self) -> Iterator[int]: - return iter(range(len(self._revisions))) - - def hasnode(self, node): - if node == sha1nodeconstants.nullid: - return False - - return node in self._nodetorev - - def revs(self, start=0, stop=None): - return storageutil.iterrevs( - len(self._revisions), start=start, stop=stop - ) - - def parents(self, node): - if node == sha1nodeconstants.nullid: - return sha1nodeconstants.nullid, sha1nodeconstants.nullid - - if node not in self._revisions: - raise error.LookupError(node, self._path, _(b'no node')) - - entry = self._revisions[node] - return entry.p1node, entry.p2node - - def parentrevs(self, rev): - if rev == nullrev: - return nullrev, nullrev - - if rev not in self._revtonode: - raise IndexError(rev) - - entry = self._revisions[self._revtonode[rev]] - return entry.p1rev, entry.p2rev - - def ancestors(self, revs, stoprev=0, inclusive=False): - """Generate the ancestors of 'revs' in reverse revision order. - Does not generate revs lower than stoprev. - - See the documentation for ancestor.lazyancestors for more details.""" - - # first, make sure start revisions aren't filtered - revs = list(revs) - checkrev = self.node - for r in revs: - checkrev(r) - - return ancestor.lazyancestors( - self.parentrevs, - revs, - stoprev=stoprev, - inclusive=inclusive, - ) - - def rev(self, node): - if node == sha1nodeconstants.nullid: - return nullrev - - if node not in self._nodetorev: - raise error.LookupError(node, self._path, _(b'no node')) - - return self._nodetorev[node] - - def node(self, rev): - if rev == nullrev: - return sha1nodeconstants.nullid - - if rev not in self._revtonode: - raise IndexError(rev) - - return self._revtonode[rev] - - def lookup(self, node): - return storageutil.fileidlookup(self, node, self._path) - - def linkrev(self, rev): - if rev == nullrev: - return nullrev - - if rev not in self._revtonode: - raise IndexError(rev) - - entry = self._revisions[self._revtonode[rev]] - return entry.linkrev - - def iscensored(self, rev): - if rev == nullrev: - return False - - if rev not in self._revtonode: - raise IndexError(rev) - - return self._revisions[self._revtonode[rev]].flags & FLAG_CENSORED - - def commonancestorsheads(self, node1, node2): - rev1 = self.rev(node1) - rev2 = self.rev(node2) - - ancestors = ancestor.commonancestorsheads(self.parentrevs, rev1, rev2) - return pycompat.maplist(self.node, ancestors) - - def descendants(self, revs): - # TODO we could implement this using a recursive SQL query, which - # might be faster. 
- return dagop.descendantrevs(revs, self.revs, self.parentrevs) - - def heads(self, start=None, stop=None): - if start is None and stop is None: - if not len(self): - return [sha1nodeconstants.nullid] - - startrev = self.rev(start) if start is not None else nullrev - stoprevs = {self.rev(n) for n in stop or []} - - revs = dagop.headrevssubset( - self.revs, self.parentrevs, startrev=startrev, stoprevs=stoprevs - ) - - return [self.node(rev) for rev in revs] - - def children(self, node): - rev = self.rev(node) - - res = self._db.execute( - 'SELECT' - ' node ' - ' FROM filedata ' - ' WHERE path=? AND (p1rev=? OR p2rev=?) ' - ' ORDER BY revnum ASC', - (self._path, rev, rev), - ) - - return [row[0] for row in res] - - # End of ifileindex interface. - - # Start of ifiledata interface. - - def size(self, rev): - if rev == nullrev: - return 0 - - if rev not in self._revtonode: - raise IndexError(rev) - - node = self._revtonode[rev] - - if self.renamed(node): - return len(self.read(node)) - - return len(self.revision(node)) - - def revision(self, node, raw=False, _verifyhash=True): - if node in (sha1nodeconstants.nullid, nullrev): - return b'' - - if isinstance(node, int): - node = self.node(node) - - if node not in self._nodetorev: - raise error.LookupError(node, self._path, _(b'no node')) - - if node in self._revisioncache: - return self._revisioncache[node] - - # Because we have a fulltext revision cache, we are able to - # short-circuit delta chain traversal and decompression as soon as - # we encounter a revision in the cache. - - stoprids = {self._revisions[n].rid: n for n in self._revisioncache} - - if not stoprids: - stoprids[-1] = None - - fulltext = resolvedeltachain( - self._db, - self._pathid, - node, - self._revisioncache, - stoprids, - zstddctx=self._dctx, - ) - - # Don't verify hashes if parent nodes were rewritten, as the hash - # wouldn't verify. - if self._revisions[node].flags & (FLAG_MISSING_P1 | FLAG_MISSING_P2): - _verifyhash = False - - if _verifyhash: - self._checkhash(fulltext, node) - self._revisioncache[node] = fulltext - - return fulltext - - def rawdata(self, *args, **kwargs): - return self.revision(*args, **kwargs) - - def read(self, node): - return storageutil.filtermetadata(self.revision(node)) - - def renamed(self, node): - return storageutil.filerevisioncopied(self, node) - - def cmp(self, node, fulltext): - return not storageutil.filedataequivalent(self, node, fulltext) - - def emitrevisions( - self, - nodes, - nodesorder=None, - revisiondata=False, - assumehaveparentrevisions=False, - deltamode=repository.CG_DELTAMODE_STD, - sidedata_helpers=None, - debug_info=None, - ): - if nodesorder not in (b'nodes', b'storage', b'linear', None): - raise error.ProgrammingError( - b'unhandled value for nodesorder: %s' % nodesorder - ) - - nodes = [n for n in nodes if n != sha1nodeconstants.nullid] - - if not nodes: - return - - # TODO perform in a single query. - res = self._db.execute( - 'SELECT revnum, deltaid FROM fileindex ' - 'WHERE pathid=? ' - ' AND node in (%s)' % (','.join(['?'] * len(nodes))), - tuple([self._pathid] + nodes), - ) - - deltabases = {} - - for rev, deltaid in res: - res = self._db.execute( - 'SELECT revnum from fileindex WHERE pathid=? AND deltaid=?', - (self._pathid, deltaid), - ) - deltabases[rev] = res.fetchone()[0] - - # TODO define revdifffn so we can use delta from storage. 
- yield from storageutil.emitrevisions( - self, - nodes, - nodesorder, - sqliterevisiondelta, - deltaparentfn=deltabases.__getitem__, - revisiondata=revisiondata, - assumehaveparentrevisions=assumehaveparentrevisions, - deltamode=deltamode, - sidedata_helpers=sidedata_helpers, - ) - - # End of ifiledata interface. - - # Start of ifilemutation interface. - - def add(self, filedata, meta, transaction, linkrev, p1, p2): - if meta or filedata.startswith(b'\x01\n'): - filedata = storageutil.packmeta(meta, filedata) - - rev = self.addrevision(filedata, transaction, linkrev, p1, p2) - return self.node(rev) - - def addrevision( - self, - revisiondata, - transaction, - linkrev, - p1, - p2, - node=None, - flags=0, - cachedelta=None, - ): - if flags: - raise SQLiteStoreError(_(b'flags not supported on revisions')) - - validatehash = node is not None - node = node or storageutil.hashrevisionsha1(revisiondata, p1, p2) - - if validatehash: - self._checkhash(revisiondata, node, p1, p2) - - rev = self._nodetorev.get(node) - if rev is not None: - return rev - - rev = self._addrawrevision( - node, revisiondata, transaction, linkrev, p1, p2 - ) - - self._revisioncache[node] = revisiondata - return rev - - def addgroup( - self, - deltas, - linkmapper, - transaction, - addrevisioncb=None, - duplicaterevisioncb=None, - maybemissingparents=False, - ): - empty = True - - for ( - node, - p1, - p2, - linknode, - deltabase, - delta, - wireflags, - sidedata, - ) in deltas: - storeflags = 0 - - if wireflags & repository.REVISION_FLAG_CENSORED: - storeflags |= FLAG_CENSORED - - if wireflags & ~repository.REVISION_FLAG_CENSORED: - raise SQLiteStoreError(b'unhandled revision flag') - - if maybemissingparents: - if p1 != sha1nodeconstants.nullid and not self.hasnode(p1): - p1 = sha1nodeconstants.nullid - storeflags |= FLAG_MISSING_P1 - - if p2 != sha1nodeconstants.nullid and not self.hasnode(p2): - p2 = sha1nodeconstants.nullid - storeflags |= FLAG_MISSING_P2 - - baserev = self.rev(deltabase) - - # If base is censored, delta must be full replacement in a single - # patch operation. - if baserev != nullrev and self.iscensored(baserev): - hlen = struct.calcsize(b'>lll') - oldlen = len(self.rawdata(deltabase, _verifyhash=False)) - newlen = len(delta) - hlen - - if delta[:hlen] != mdiff.replacediffheader(oldlen, newlen): - raise error.CensoredBaseError(self._path, deltabase) - - if not (storeflags & FLAG_CENSORED) and storageutil.deltaiscensored( - delta, baserev, lambda x: len(self.rawdata(x)) - ): - storeflags |= FLAG_CENSORED - - linkrev = linkmapper(linknode) - - if node in self._revisions: - # Possibly reset parents to make them proper. - entry = self._revisions[node] - - if ( - entry.flags & FLAG_MISSING_P1 - and p1 != sha1nodeconstants.nullid - ): - entry.p1node = p1 - entry.p1rev = self._nodetorev[p1] - entry.flags &= ~FLAG_MISSING_P1 - - self._db.execute( - 'UPDATE fileindex SET p1rev=?, flags=? WHERE id=?', - (self._nodetorev[p1], entry.flags, entry.rid), - ) - - if ( - entry.flags & FLAG_MISSING_P2 - and p2 != sha1nodeconstants.nullid - ): - entry.p2node = p2 - entry.p2rev = self._nodetorev[p2] - entry.flags &= ~FLAG_MISSING_P2 - - self._db.execute( - 'UPDATE fileindex SET p2rev=?, flags=? 
WHERE id=?', - (self._nodetorev[p1], entry.flags, entry.rid), - ) - - if duplicaterevisioncb: - duplicaterevisioncb(self, self.rev(node)) - empty = False - continue - - if deltabase == sha1nodeconstants.nullid: - text = mdiff.patch(b'', delta) - storedelta = None - else: - text = None - storedelta = (deltabase, delta) - - rev = self._addrawrevision( - node, - text, - transaction, - linkrev, - p1, - p2, - storedelta=storedelta, - flags=storeflags, - ) - - if addrevisioncb: - addrevisioncb(self, rev) - empty = False - - return not empty - - def censorrevision(self, tr, censor_nodes, tombstone=b''): - for node in censor_nodes: - self._censor_one_revision(tr, node, tombstone=tombstone) - - def _censor_one_revision(self, tr, censornode, tombstone): - tombstone = storageutil.packmeta({b'censored': tombstone}, b'') - - # This restriction is cargo culted from revlogs and makes no sense for - # SQLite, since columns can be resized at will. - if len(tombstone) > len(self.rawdata(censornode)): - raise error.Abort( - _(b'censor tombstone must be no longer than censored data') - ) - - # We need to replace the censored revision's data with the tombstone. - # But replacing that data will have implications for delta chains that - # reference it. - # - # While "better," more complex strategies are possible, we do something - # simple: we find delta chain children of the censored revision and we - # replace those incremental deltas with fulltexts of their corresponding - # revision. Then we delete the now-unreferenced delta and original - # revision and insert a replacement. - - # Find the delta to be censored. - censoreddeltaid = self._db.execute( - 'SELECT deltaid FROM fileindex WHERE id=?', - (self._revisions[censornode].rid,), - ).fetchone()[0] - - # Find all its delta chain children. - # TODO once we support storing deltas for !files, we'll need to look - # for those delta chains too. - rows = list( - self._db.execute( - 'SELECT id, pathid, node FROM fileindex ' - 'WHERE deltabaseid=? OR deltaid=?', - (censoreddeltaid, censoreddeltaid), - ) - ) - - for row in rows: - rid, pathid, node = row - - fulltext = resolvedeltachain( - self._db, pathid, node, {}, {-1: None}, zstddctx=self._dctx - ) - - deltahash = hashutil.sha1(fulltext).digest() - - if self._compengine == b'zstd': - deltablob = self._cctx.compress(fulltext) - compression = COMPRESSION_ZSTD - elif self._compengine == b'zlib': - deltablob = zlib.compress(fulltext) - compression = COMPRESSION_ZLIB - elif self._compengine == b'none': - deltablob = fulltext - compression = COMPRESSION_NONE - else: - raise error.ProgrammingError( - b'unhandled compression engine: %s' % self._compengine - ) - - if len(deltablob) >= len(fulltext): - deltablob = fulltext - compression = COMPRESSION_NONE - - deltaid = insertdelta(self._db, compression, deltahash, deltablob) - - self._db.execute( - 'UPDATE fileindex SET deltaid=?, deltabaseid=NULL ' - 'WHERE id=?', - (deltaid, rid), - ) - - # Now create the tombstone delta and replace the delta on the censored - # node. - deltahash = hashutil.sha1(tombstone).digest() - tombstonedeltaid = insertdelta( - self._db, COMPRESSION_NONE, deltahash, tombstone - ) - - flags = self._revisions[censornode].flags - flags |= FLAG_CENSORED - - self._db.execute( - 'UPDATE fileindex SET flags=?, deltaid=?, deltabaseid=NULL ' - 'WHERE pathid=? 
AND node=?', - (flags, tombstonedeltaid, self._pathid, censornode), - ) - - self._db.execute('DELETE FROM delta WHERE id=?', (censoreddeltaid,)) - - self._refreshindex() - self._revisioncache.clear() - - def getstrippoint(self, minlink): - return storageutil.resolvestripinfo( - minlink, - len(self) - 1, - [self.rev(n) for n in self.heads()], - self.linkrev, - self.parentrevs, - ) - - def strip(self, minlink, transaction): - if not len(self): - return - - rev, _ignored = self.getstrippoint(minlink) - - if rev == len(self): - return - - for rev in self.revs(rev): - self._db.execute( - 'DELETE FROM fileindex WHERE pathid=? AND node=?', - (self._pathid, self.node(rev)), - ) - - # TODO how should we garbage collect data in delta table? - - self._refreshindex() - - # End of ifilemutation interface. - - # Start of ifilestorage interface. - - def files(self): - return [] - - def sidedata(self, nodeorrev, _df=None): - # Not supported for now - return {} - - def storageinfo( - self, - exclusivefiles=False, - sharedfiles=False, - revisionscount=False, - trackedsize=False, - storedsize=False, - ): - d = {} - - if exclusivefiles: - d[b'exclusivefiles'] = [] - - if sharedfiles: - # TODO list sqlite file(s) here. - d[b'sharedfiles'] = [] - - if revisionscount: - d[b'revisionscount'] = len(self) - - if trackedsize: - d[b'trackedsize'] = sum( - len(self.revision(node)) for node in self._nodetorev - ) - - if storedsize: - # TODO implement this? - d[b'storedsize'] = None - - return d - - def verifyintegrity(self, state) -> Iterable[repository.iverifyproblem]: - state[b'skipread'] = set() - - for rev in self: - node = self.node(rev) - - try: - self.revision(node) - except Exception as e: - yield sqliteproblem( - error=_(b'unpacking %s: %s') % (short(node), e), node=node - ) - - state[b'skipread'].add(node) - - # End of ifilestorage interface. - - def _checkhash(self, fulltext, node, p1=None, p2=None): - if p1 is None and p2 is None: - p1, p2 = self.parents(node) - - if node == storageutil.hashrevisionsha1(fulltext, p1, p2): - return - - try: - del self._revisioncache[node] - except KeyError: - pass - - if storageutil.iscensoredtext(fulltext): - raise error.CensoredNodeError(self._path, node, fulltext) - - raise SQLiteStoreError(_(b'integrity check failed on %s') % self._path) - - def _addrawrevision( - self, - node, - revisiondata, - transaction, - linkrev, - p1, - p2, - storedelta=None, - flags=0, - ): - if self._pathid is None: - res = self._db.execute( - 'INSERT INTO filepath (path) VALUES (?)', (self._path,) - ) - self._pathid = res.lastrowid - - # For simplicity, always store a delta against p1. - # TODO we need a lot more logic here to make behavior reasonable. - - if storedelta: - deltabase, delta = storedelta - - if isinstance(deltabase, int): - deltabase = self.node(deltabase) - - else: - assert revisiondata is not None - deltabase = p1 - - if deltabase == sha1nodeconstants.nullid: - delta = revisiondata - else: - delta = mdiff.textdiff( - self.revision(self.rev(deltabase)), revisiondata - ) - - # File index stores a pointer to its delta and the parent delta. - # The parent delta is stored via a pointer to the fileindex PK. - if deltabase == sha1nodeconstants.nullid: - baseid = None - else: - baseid = self._revisions[deltabase].rid - - # Deltas are stored with a hash of their content. This allows - # us to de-duplicate. The table is configured to ignore conflicts - # and it is faster to just insert and silently noop than to look - # first. 
- deltahash = hashutil.sha1(delta).digest() - - if self._compengine == b'zstd': - deltablob = self._cctx.compress(delta) - compression = COMPRESSION_ZSTD - elif self._compengine == b'zlib': - deltablob = zlib.compress(delta) - compression = COMPRESSION_ZLIB - elif self._compengine == b'none': - deltablob = delta - compression = COMPRESSION_NONE - else: - raise error.ProgrammingError( - b'unhandled compression engine: %s' % self._compengine - ) - - # Don't store compressed data if it isn't practical. - if len(deltablob) >= len(delta): - deltablob = delta - compression = COMPRESSION_NONE - - deltaid = insertdelta(self._db, compression, deltahash, deltablob) - - rev = len(self) - - if p1 == sha1nodeconstants.nullid: - p1rev = nullrev - else: - p1rev = self._nodetorev[p1] - - if p2 == sha1nodeconstants.nullid: - p2rev = nullrev - else: - p2rev = self._nodetorev[p2] - - rid = self._db.execute( - 'INSERT INTO fileindex (' - ' pathid, revnum, node, p1rev, p2rev, linkrev, flags, ' - ' deltaid, deltabaseid) ' - ' VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)', - ( - self._pathid, - rev, - node, - p1rev, - p2rev, - linkrev, - flags, - deltaid, - baseid, - ), - ).lastrowid - - entry = revisionentry( - rid=rid, - rev=rev, - node=node, - p1rev=p1rev, - p2rev=p2rev, - p1node=p1, - p2node=p2, - linkrev=linkrev, - flags=flags, - ) - - self._nodetorev[node] = rev - self._revtonode[rev] = node - self._revisions[node] = entry - - return rev - - -class sqliterepository(localrepo.localrepository): - def cancopy(self): - return False - - def transaction(self, *args, **kwargs): - current = self.currenttransaction() - - tr = super().transaction(*args, **kwargs) - - if current: - return tr - - self._dbconn.execute('BEGIN TRANSACTION') - - def committransaction(_): - self._dbconn.commit() - - tr.addfinalize(b'sqlitestore', committransaction) - - return tr - - @property - def _dbconn(self): - # SQLite connections can only be used on the thread that created - # them. In most cases, this "just works." However, hgweb uses - # multiple threads. - tid = threading.current_thread().ident - - if self._db: - if self._db[0] == tid: - return self._db[1] - - db = makedb(self.svfs.join(b'db.sqlite')) - self._db = (tid, db) - - return db - - -def makedb(path): - """Construct a database handle for a database at path.""" - - db = sqlite3.connect(encoding.strfromlocal(path)) - db.text_factory = bytes - - res = db.execute('PRAGMA user_version').fetchone()[0] - - # New database. - if res == 0: - for statement in CREATE_SCHEMA: - db.execute(statement) - - db.commit() - - elif res == CURRENT_SCHEMA_VERSION: - pass - - else: - raise error.Abort(_(b'sqlite database has unrecognized version')) - - db.execute('PRAGMA journal_mode=WAL') - - return db - - -def featuresetup(ui, supported): - supported.add(REQUIREMENT) - - if zstd: - supported.add(REQUIREMENT_ZSTD) - - supported.add(REQUIREMENT_ZLIB) - supported.add(REQUIREMENT_NONE) - supported.add(REQUIREMENT_SHALLOW_FILES) - supported.add(requirements.NARROW_REQUIREMENT) - - -def newreporequirements(orig, ui, createopts): - if createopts[b'backend'] != b'sqlite': - return orig(ui, createopts) - - # This restriction can be lifted once we have more confidence. - if b'sharedrepo' in createopts: - raise error.Abort( - _(b'shared repositories not supported with SQLite store') - ) - - # This filtering is out of an abundance of caution: we want to ensure - # we honor creation options and we do that by annotating exactly the - # creation options we recognize. 
- known = { - b'narrowfiles', - b'backend', - b'shallowfilestore', - } - - unsupported = set(createopts) - known - if unsupported: - raise error.Abort( - _(b'SQLite store does not support repo creation option: %s') - % b', '.join(sorted(unsupported)) - ) - - # Since we're a hybrid store that still relies on revlogs, we fall back - # to using the revlogv1 backend's storage requirements then adding our - # own requirement. - createopts[b'backend'] = b'revlogv1' - requirements = orig(ui, createopts) - requirements.add(REQUIREMENT) - - compression = ui.config(b'storage', b'sqlite.compression') - - if compression == b'zstd' and not zstd: - raise error.Abort( - _( - b'storage.sqlite.compression set to "zstd" but ' - b'zstandard compression not available to this ' - b'Mercurial install' - ) - ) - - if compression == b'zstd': - requirements.add(REQUIREMENT_ZSTD) - elif compression == b'zlib': - requirements.add(REQUIREMENT_ZLIB) - elif compression == b'none': - requirements.add(REQUIREMENT_NONE) - else: - raise error.Abort( - _( - b'unknown compression engine defined in ' - b'storage.sqlite.compression: %s' - ) - % compression - ) - - if createopts.get(b'shallowfilestore'): - requirements.add(REQUIREMENT_SHALLOW_FILES) - - return requirements - - -class sqlitefilestorage(repository.ilocalrepositoryfilestorage): - """Repository file storage backed by SQLite.""" - - def file(self, path): - if path[0] == b'/': - path = path[1:] - - if REQUIREMENT_ZSTD in self.requirements: - compression = b'zstd' - elif REQUIREMENT_ZLIB in self.requirements: - compression = b'zlib' - elif REQUIREMENT_NONE in self.requirements: - compression = b'none' - else: - raise error.Abort( - _( - b'unable to determine what compression engine ' - b'to use for SQLite storage' - ) - ) - - return sqlitefilestore(self._dbconn, path, compression) - - -def makefilestorage(orig, requirements, features, **kwargs): - """Produce a type conforming to ``ilocalrepositoryfilestorage``.""" - if REQUIREMENT in requirements: - if REQUIREMENT_SHALLOW_FILES in requirements: - features.add(repository.REPO_FEATURE_SHALLOW_FILE_STORAGE) - - return sqlitefilestorage - else: - return orig(requirements=requirements, features=features, **kwargs) - - -def makemain(orig, ui, requirements, **kwargs): - if REQUIREMENT in requirements: - if REQUIREMENT_ZSTD in requirements and not zstd: - raise error.Abort( - _( - b'repository uses zstandard compression, which ' - b'is not available to this Mercurial install' - ) - ) - - return sqliterepository - - return orig(requirements=requirements, **kwargs) - - -def verifierinit(orig, self, *args, **kwargs): - orig(self, *args, **kwargs) - - # We don't care that files in the store don't align with what is - # advertised. So suppress these warnings. - self.warnorphanstorefiles = False - - -def extsetup(ui): - localrepo.featuresetupfuncs.add(featuresetup) - extensions.wrapfunction( - localrepo, 'newreporequirements', newreporequirements - ) - extensions.wrapfunction(localrepo, 'makefilestorage', makefilestorage) - extensions.wrapfunction(localrepo, 'makemain', makemain) - extensions.wrapfunction(verify.verifier, '__init__', verifierinit) - - -def reposetup(ui, repo): - if isinstance(repo, sqliterepository): - repo._db = None - - # TODO check for bundlerepository?
--- a/mercurial/bundle2.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/bundle2.py	Fri Feb 28 23:28:10 2025 +0100
@@ -186,6 +186,7 @@
 if typing.TYPE_CHECKING:
     from typing import (
         Dict,
+        Iterator,
         List,
         Optional,
         Tuple,
@@ -739,7 +740,7 @@
         return part
 
     # methods used to generate the bundle2 stream
-    def getchunks(self):
+    def getchunks(self) -> Iterator[bytes]:
         if self.ui.debugflag:
             msg = [b'bundle2-output-bundle: "%s",' % self._magicstring]
             if self._params:
@@ -1466,6 +1467,11 @@
         # we read the data, tell it
         self._initialized = True
 
+    def __iter__(self):
+        for chunk in self._payloadstream:
+            self._pos += len(chunk)
+            yield chunk
+
     def _payloadchunks(self):
         """Generator of decoded chunks in the payload."""
         return decodepayloadchunks(self.ui, self._fp)
@@ -1501,6 +1507,10 @@
             self.consumed = True
         return data
 
+    def tell(self) -> int:
+        """the number of bytes read so far in the part"""
+        return self._payloadstream.tell()
+
 
 class seekableunbundlepart(unbundlepart):
     """A bundle2 part in a bundle that is seekable.
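The new `__iter__` and `tell()` let a consumer stream a part while tracking payload progress; a minimal sketch (assuming an already-initialized `part`, a `ui` object, and some writable `out`, none of which are defined in this change):

    # iterate a bundle2 part, reporting how much payload has been read
    for chunk in part:  # __iter__ advances the internal position
        out.write(chunk)
        ui.debug(b'part payload: %d bytes read\n' % part.tell())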
--- a/mercurial/bundlecaches.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/bundlecaches.py Fri Feb 28 23:28:10 2025 +0100 @@ -35,7 +35,6 @@ urlreq = util.urlreq -BUNDLE_CACHE_DIR = b'bundle-cache' CB_MANIFEST_FILE = b'clonebundles.manifest' CLONEBUNDLESCHEME = b"peer-bundle-cache://" @@ -318,10 +317,10 @@ % compression ) - # The specification for packed1 can optionally declare the data formats + # The specification for stream bundles can optionally declare the data formats # required to apply it. If we see this metadata, compare against what the # repo supports and error if the bundle isn't compatible. - if version == b'packed1' and b'requirements' in params: + if b'requirements' in params: requirements = set(cast(bytes, params[b'requirements']).split(b',')) missingreqs = requirements - requirementsmod.STREAM_FIXED_REQUIREMENTS if missingreqs:
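With the `packed1` restriction lifted, any bundle specification carrying a `requirements` parameter is now validated against `requirementsmod.STREAM_FIXED_REQUIREMENTS`. An illustrative (hypothetical) stream-v2 spec of that shape, with the comma-separated requirements the code above splits and checks:

    none-v2;stream=v2;requirements=generaldelta,revlogv1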
--- a/mercurial/bundlerepo.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/bundlerepo.py Fri Feb 28 23:28:10 2025 +0100 @@ -617,6 +617,36 @@ raise NotImplementedError +class getremotechanges_state_tracker: + def __init__(self, peer, incoming, common, rheads): + # bundle file to be deleted + self.bundle = None + # bundle repo to be closed + self.bundlerepo = None + # remote peer connection to be closed + self.peer = peer + # if peer is remote, `localrepo` will be equal to + # `bundlerepo` when bundle is created. + self.localrepo = peer.local() + + # `incoming` operation parameters: + # (these get mutated by _create_bundle) + self.incoming = incoming + self.common = common + self.rheads = rheads + + def cleanup(self): + try: + if self.bundlerepo: + self.bundlerepo.close() + finally: + try: + if self.bundle: + os.unlink(self.bundle) + finally: + self.peer.close() + + def getremotechanges( ui, repo, peer, onlyheads=None, bundlename=None, force=False ): @@ -652,101 +682,127 @@ commonset = set(common) rheads = [x for x in rheads if x not in commonset] - bundle = None - bundlerepo = None - localrepo = peer.local() - if bundlename or not localrepo: - # create a bundle (uncompressed if peer repo is not local) + state = getremotechanges_state_tracker(peer, incoming, common, rheads) + + try: + csets = _getremotechanges_slowpath( + state, ui, repo, bundlename=bundlename, onlyheads=onlyheads + ) + return (state.localrepo, csets, state.cleanup) + except: # re-raises + state.cleanup() + raise + + +def _create_bundle(state, ui, repo, bundlename, onlyheads): + # create a bundle (uncompressed if peer repo is not local) - # developer config: devel.legacy.exchange - legexc = ui.configlist(b'devel', b'legacy.exchange') - forcebundle1 = b'bundle2' not in legexc and b'bundle1' in legexc - canbundle2 = ( - not forcebundle1 - and peer.capable(b'getbundle') - and peer.capable(b'bundle2') - ) - if canbundle2: - with peer.commandexecutor() as e: - b2 = e.callcommand( + # developer config: devel.legacy.exchange + legexc = ui.configlist(b'devel', b'legacy.exchange') + forcebundle1 = b'bundle2' not in legexc and b'bundle1' in legexc + canbundle2 = ( + not forcebundle1 + and state.peer.capable(b'getbundle') + and state.peer.capable(b'bundle2') + ) + if canbundle2: + with state.peer.commandexecutor() as e: + b2 = e.callcommand( + b'getbundle', + { + b'source': b'incoming', + b'common': state.common, + b'heads': state.rheads, + b'bundlecaps': exchange.caps20to10(repo, role=b'client'), + b'cg': True, + }, + ).result() + + fname = state.bundle = changegroup.writechunks( + ui, b2._forwardchunks(), bundlename + ) + else: + if state.peer.capable(b'getbundle'): + with state.peer.commandexecutor() as e: + cg = e.callcommand( b'getbundle', { b'source': b'incoming', - b'common': common, - b'heads': rheads, - b'bundlecaps': exchange.caps20to10( - repo, role=b'client' - ), - b'cg': True, + b'common': state.common, + b'heads': state.rheads, + }, + ).result() + elif onlyheads is None and not state.peer.capable(b'changegroupsubset'): + # compat with older servers when pulling all remote heads + + with state.peer.commandexecutor() as e: + cg = e.callcommand( + b'changegroup', + { + b'nodes': state.incoming, + b'source': b'incoming', }, ).result() - fname = bundle = changegroup.writechunks( - ui, b2._forwardchunks(), bundlename - ) + state.rheads = None else: - if peer.capable(b'getbundle'): - with peer.commandexecutor() as e: - cg = e.callcommand( - b'getbundle', - { - b'source': b'incoming', - b'common': common, - b'heads': rheads, - 
}, - ).result() - elif onlyheads is None and not peer.capable(b'changegroupsubset'): - # compat with older servers when pulling all remote heads + with state.peer.commandexecutor() as e: + cg = e.callcommand( + b'changegroupsubset', + { + b'bases': state.incoming, + b'heads': state.rheads, + b'source': b'incoming', + }, + ).result() - with peer.commandexecutor() as e: - cg = e.callcommand( - b'changegroup', - { - b'nodes': incoming, - b'source': b'incoming', - }, - ).result() + if state.localrepo: + bundletype = b"HG10BZ" + else: + bundletype = b"HG10UN" + fname = state.bundle = bundle2.writebundle( + ui, cg, bundlename, bundletype + ) + # keep written bundle? + if bundlename: + state.bundle = None + + return fname + - rheads = None - else: - with peer.commandexecutor() as e: - cg = e.callcommand( - b'changegroupsubset', - { - b'bases': incoming, - b'heads': rheads, - b'source': b'incoming', - }, - ).result() - - if localrepo: - bundletype = b"HG10BZ" - else: - bundletype = b"HG10UN" - fname = bundle = bundle2.writebundle(ui, cg, bundlename, bundletype) - # keep written bundle? - if bundlename: - bundle = None - if not localrepo: +def _getremotechanges_slowpath( + state, ui, repo, bundlename=None, onlyheads=None +): + if bundlename or not state.localrepo: + fname = _create_bundle( + state, + ui, + repo, + bundlename=bundlename, + onlyheads=onlyheads, + ) + if not state.localrepo: # use the created uncompressed bundlerepo - localrepo = bundlerepo = makebundlerepository( + state.localrepo = state.bundlerepo = makebundlerepository( repo.baseui, repo.root, fname ) # this repo contains local and peer now, so filter out local again - common = repo.heads() - if localrepo: + state.common = repo.heads() + + if state.localrepo: # Part of common may be remotely filtered # So use an unfiltered version # The discovery process probably need cleanup to avoid that - localrepo = localrepo.unfiltered() + state.localrepo = state.localrepo.unfiltered() - csets = localrepo.changelog.findmissing(common, rheads) + csets = state.localrepo.changelog.findmissing(state.common, state.rheads) - if bundlerepo: + if state.bundlerepo: + bundlerepo = state.bundlerepo reponodes = [ctx.node() for ctx in bundlerepo[bundlerepo.firstnewrev :]] - with peer.commandexecutor() as e: + with state.peer.commandexecutor() as e: remotephases = e.callcommand( b'listkeys', { @@ -755,16 +811,9 @@ ).result() pullop = exchange.pulloperation( - bundlerepo, peer, path=None, heads=reponodes + bundlerepo, state.peer, path=None, heads=reponodes ) pullop.trmanager = bundletransactionmanager() exchange._pullapplyphases(pullop, remotephases) - def cleanup(): - if bundlerepo: - bundlerepo.close() - if bundle: - os.unlink(bundle) - peer.close() - - return (localrepo, csets, cleanup) + return csets
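The refactor above preserves the external contract of `getremotechanges`; a caller sketch (with hypothetical `ui`, `repo`, and `peer` objects) still looks like:

    other, csets, cleanup = bundlerepo.getremotechanges(ui, repo, peer)
    try:
        for node in csets:  # changesets present remotely but not locally
            ui.write(b'%s\n' % other[node].hex())
    finally:
        cleanup()  # closes the bundle repo and peer, unlinks the bundle file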
--- a/mercurial/cext/revlog.c Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/cext/revlog.c Fri Feb 28 23:28:10 2025 +0100 @@ -90,6 +90,7 @@ int ntlookups; /* # lookups */ int ntmisses; /* # lookups that miss the cache */ int inlined; + int uses_generaldelta; /* whether this index uses generaldelta */ long entry_size; /* size of index headers. Differs in v1 v.s. v2 format */ long rust_ext_compat; /* compatibility with being used in rust @@ -1724,14 +1725,14 @@ static PyObject *index_deltachain(indexObject *self, PyObject *args) { - int rev, generaldelta; + int rev; PyObject *stoparg; int stoprev, iterrev, baserev = -1; int stopped; PyObject *chain = NULL, *result = NULL; const Py_ssize_t length = index_length(self); - if (!PyArg_ParseTuple(args, "iOi", &rev, &stoparg, &generaldelta)) { + if (!PyArg_ParseTuple(args, "iO", &rev, &stoparg)) { return NULL; } @@ -1774,7 +1775,7 @@ goto bail; } - if (generaldelta) { + if (self->uses_generaldelta) { iterrev = baserev; } else { iterrev--; @@ -3206,10 +3207,11 @@ static int index_init(indexObject *self, PyObject *args, PyObject *kwargs) { - PyObject *data_obj, *inlined_obj; + PyObject *data_obj, *inlined_obj, *generaldelta_obj; Py_ssize_t size; - static char *kwlist[] = {"data", "inlined", "format", NULL}; + static char *kwlist[] = {"data", "inlined", "uses_generaldelta", + "format", NULL}; /* Initialize before argument-checking to avoid index_dealloc() crash. */ @@ -3225,12 +3227,13 @@ self->offsets = NULL; self->nodelen = 20; self->nullentry = NULL; + self->uses_generaldelta = 0; self->rust_ext_compat = 0; self->format_version = format_v1; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|l", kwlist, - &data_obj, &inlined_obj, - &(self->format_version))) + if (!PyArg_ParseTupleAndKeywords( + args, kwargs, "OOO|l", kwlist, &data_obj, &inlined_obj, + &generaldelta_obj, &(self->format_version))) return -1; if (!PyObject_CheckBuffer(data_obj)) { PyErr_SetString(PyExc_TypeError, @@ -3263,6 +3266,8 @@ size = self->buf.len; self->inlined = inlined_obj && PyObject_IsTrue(inlined_obj); + self->uses_generaldelta = + generaldelta_obj && PyObject_IsTrue(generaldelta_obj); self->data = data_obj; self->ntlookups = self->ntmisses = 0;
--- a/mercurial/chgserver.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/chgserver.py Fri Feb 28 23:28:10 2025 +0100 @@ -268,6 +268,7 @@ # command line args options = dispatch._earlyparseopts(newui, args) + dispatch._parse_config_files(newui, args, options[b'config_file']) dispatch._parseconfig(newui, options[b'config']) # stolen from tortoisehg.util.copydynamicconfig()
--- a/mercurial/commands.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/commands.py Fri Feb 28 23:28:10 2025 +0100 @@ -27,6 +27,7 @@ bundlecaches, changegroup, cmdutil, + context as contextmod, copies, debugcommands as debugcommandsmod, destutil, @@ -120,6 +121,13 @@ _(b'set/override config option (use \'section.name=value\')'), _(b'CONFIG'), ), + ( + b'', + b'config-file', + [], + _(b'load config file to set/override config options'), + _(b'HGRC'), + ), (b'', b'debug', None, _(b'enable debugging output')), (b'', b'debugger', None, _(b'start debugger')), ( @@ -2533,6 +2541,14 @@ (b'', b'from', b'', _(b'revision to diff from'), _(b'REV1')), (b'', b'to', b'', _(b'revision to diff to'), _(b'REV2')), (b'c', b'change', b'', _(b'change made by revision'), _(b'REV')), + ( + b'', + b'ignore-changes-from-ancestors', + False, + _( + b'only compare the change made by the selected revision (EXPERIMENTAL)' + ), + ), ] + diffopts + diffopts2 @@ -2614,6 +2630,7 @@ to_rev = opts.get(b'to') stat = opts.get(b'stat') reverse = opts.get(b'reverse') + patch_only = opts.get(b'ignore_changes_from_ancestors') cmdutil.check_incompatible_arguments(opts, b'from', [b'rev', b'change']) cmdutil.check_incompatible_arguments(opts, b'to', [b'rev', b'change']) @@ -2631,6 +2648,30 @@ repo = scmutil.unhidehashlikerevs(repo, revs, b'nowarn') ctx1, ctx2 = logcmdutil.revpair(repo, revs) + if patch_only and ctx1.p1() != ctx2.p1(): + old_base = ctx1.p1() + new_base = ctx2.p1() + new_ctx = contextmod.overlayworkingctx(repo) + new_ctx.setbase(ctx1) + configoverrides = { + (b'ui', b'forcemerge'): b'internal:merge3-lie-about-conflicts' + } + with ui.configoverride(configoverrides, b'obslog-diff'), ui.silent(): + mergemod._update( + repo, + new_base, + labels=[ + b'from', + b'parent-of-to', + b'parent-of-from', + ], + force=True, + branchmerge=True, + wc=new_ctx, + ancestor=old_base, + ) + ctx1 = new_ctx.tomemctx(text=ctx1.description()) + if reverse: ctxleft = ctx2 ctxright = ctx1
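Put together, the two new options could be exercised as follows (revision names and the config file name are hypothetical):

    # load an extra configuration file for a single invocation
    hg --config-file extra.rc log -l 3

    # compare what two (e.g. rebased) revisions each changed, ignoring
    # differences inherited from their parents (EXPERIMENTAL)
    hg diff --from REV1 --to REV2 --ignore-changes-from-ancestors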
--- a/mercurial/configitems.toml	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/configitems.toml	Fri Feb 28 23:28:10 2025 +0100
@@ -2093,6 +2093,11 @@
 default = 3
 
 [[items]]
+section = "server"
+name = "peer-bundle-cache-root"
+default = "bundle-cache"
+
+[[items]]
 section = "share"
 name = "pool"
 
@@ -2941,6 +2946,59 @@
 section = "worker"
 name = "numcpus"
 
+
+# experimental until we are happy with the implementation and some sanity
+# checking has been done.
+#
+# At the time of the 7.0 freeze, the threaded code is faster in some cases and
+# slower in others. So it is not ready to be enabled by default.
+#
+# The correct number of writers is hard to adjust as it strongly depends on the
+# OS, disk setup and number of CPUs, so more tuning is needed.
+[[items]]
+section = "worker"
+name = "parallel-stream-bundle-processing"
+default = false
+documentation="""
+Read, parse and write stream bundles in parallel, speeding up the process.
+
+Also see `worker.parallel-stream-bundle-processing.num-writer` to control the
+amount of concurrent writers.
+"""
+experimental = true
+
+[[items]]
+section = "worker"
+name = "parallel-stream-bundle-processing.num-writer"
+default = 0
+experimental = true
+documentation="""
+Control the number of files being written concurrently when applying a stream
+bundle.
+
+When set to 0, the default, the value used depends on the value of the
+`usage.resources.cpu` configuration:
+- low: 1 writer
+- medium: 2 writers (default)
+- high: 4 writers
+"""
+
+[[items]]
+section = "worker"
+name = "parallel-stream-bundle-processing.memory-target"
+default = 0
+experimental = true
+documentation="""
+Limit memory usage to around this value. This is not a hard target: Mercurial
+will use a bit more, but it reduces the amount of buffered stream content that
+we read ahead.
+
+If set to a negative value, the amount of memory used will not be restricted.
+
+When set to 0, the default value depends on the `usage.resources.memory` value:
+- low: 100 MB,
+- medium: 1 GB,
+- high: unrestricted memory usage.
+"""
+
+
 # Templates and template applications
 
 [[template-applications]]
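A sketch of what opting in to the experimental parallel stream-bundle code might look like in a configuration file (values illustrative; `memory-target` is assumed here to take a byte count, per the defaults described above):

    [worker]
    parallel-stream-bundle-processing = yes
    parallel-stream-bundle-processing.num-writer = 4
    parallel-stream-bundle-processing.memory-target = 1073741824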
--- a/mercurial/context.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/context.py Fri Feb 28 23:28:10 2025 +0100 @@ -1222,14 +1222,16 @@ # it is safe to use an unfiltered repository here because we are # walking ancestors only. cl = self._repo.unfiltered().changelog - if base.rev() is None: + # use self.rev(), not base.rev(), because if self is a merge we should still + # consider linkrevs in the other branch as ancestors. + if self.rev() is None: # wctx is not inclusive, but works because _ancestrycontext # is used to test filelog revisions ac = cl.ancestors( - [p.rev() for p in base.parents()], inclusive=True + [p.rev() for p in self.parents()], inclusive=True ) else: - ac = cl.ancestors([base.rev()], inclusive=True) + ac = cl.ancestors([self.rev()], inclusive=True) base._ancestrycontext = ac return dagop.annotate( @@ -2542,15 +2544,18 @@ files = self.files() def getfile(repo, memctx, path): - if self._cache[path][b'exists']: + hit = self._cache.get(path) + if hit is None: + return self.filectx(path) + elif hit[b'exists']: return memfilectx( repo, memctx, path, - self._cache[path][b'data'], - b'l' in self._cache[path][b'flags'], - b'x' in self._cache[path][b'flags'], - self._cache[path][b'copied'], + hit[b'data'], + b'l' in hit[b'flags'], + b'x' in hit[b'flags'], + hit[b'copied'], ) else: # Returning None, but including the path in `files`, is
--- a/mercurial/copies.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/copies.py Fri Feb 28 23:28:10 2025 +0100 @@ -28,7 +28,7 @@ sidedata as sidedatamod, ) -rustmod = policy.importrust("copy_tracing") +rustmod = policy.importrust("copy_tracing", pyo3=True) def _filter(src, dst, t):
--- a/mercurial/dirstate.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/dirstate.py Fri Feb 28 23:28:10 2025 +0100 @@ -24,6 +24,10 @@ ) from .i18n import _ +from .interfaces.types import ( + MatcherT, + TransactionT, +) from hgdemandimport import tracing @@ -50,7 +54,7 @@ ) parsers = policy.importmod('parsers') -rustmod = policy.importrust('dirstate') +rustmod = policy.importrust('dirstate', pyo3=True) HAS_FAST_DIRSTATE_V2 = rustmod is not None @@ -483,7 +487,7 @@ return self._map.hastrackeddir(d) @rootcache(b'.hgignore') - def _ignore(self) -> matchmod.basematcher: + def _ignore(self) -> MatcherT: files = self._ignorefiles() if not files: return matchmod.never() @@ -668,7 +672,7 @@ return self._map.setparents(p1, p2, fold_p2=fold_p2) def setbranch( - self, branch: bytes, transaction: Optional[intdirstate.TransactionT] + self, branch: bytes, transaction: Optional[TransactionT] ) -> None: self.__class__._branch.set(self, encoding.fromlocal(branch)) if transaction is not None: @@ -1101,7 +1105,7 @@ on_abort, ) - def write(self, tr: Optional[intdirstate.TransactionT]) -> None: + def write(self, tr: Optional[TransactionT]) -> None: if not self._dirty: return # make sure we don't request a write of invalidated content @@ -1359,7 +1363,7 @@ def walk( self, - match: matchmod.basematcher, + match: MatcherT, subrepos: Any, unknown: bool, ignored: bool, @@ -1639,7 +1643,7 @@ def status( self, - match: matchmod.basematcher, + match: MatcherT, subrepos: bool, ignored: bool, clean: bool, @@ -1796,7 +1800,7 @@ ) return (lookup, status, mtime_boundary) - def matches(self, match: matchmod.basematcher) -> Iterable[bytes]: + def matches(self, match: MatcherT) -> Iterable[bytes]: """ return files in the dirstate (in whatever state) filtered by match """
--- a/mercurial/dirstatemap.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/dirstatemap.py	Fri Feb 28 23:28:10 2025 +0100
@@ -10,6 +10,7 @@
 from typing import (
     Optional,
     TYPE_CHECKING,
+    Tuple,
 )
 
 from .i18n import _
@@ -35,7 +36,7 @@
 )
 
 parsers = policy.importmod('parsers')
-rustmod = policy.importrust('dirstate')
+rustmod = policy.importrust('dirstate', pyo3=True)
 
 propertycache = util.propertycache
 
@@ -109,9 +110,6 @@
         # for consistent view between _pl() and _read() invocations
         self._pendingmode = None
 
-    def _set_identity(self) -> None:
-        self.identity = self._get_current_identity()
-
     def _get_current_identity(self) -> Optional[typelib.CacheStat]:
         # TODO have a cleaner approach on httpstaticrepo side
         path = self._opener.join(self._filename)
@@ -172,25 +170,45 @@
             self._pendingmode = mode
         return fp
 
-    def _readdirstatefile(self, size: int = -1) -> bytes:
+    def _readdirstatefile(
+        self,
+        size: int = -1,
+    ) -> Tuple[Optional[typelib.CacheStat], bytes]:
+        """read the content of the file used as "entry point" for the dirstate
+
+        Return an (identity, data) tuple. The identity can be used for cache
+        validation and concurrent changes detection and must be set as
+        `self.identity` if `data` is preserved.
+        """
+        identity = self._get_current_identity()
+        # There is a race condition between fetching the identity and reading
+        # the file content. Another process might update the file after we get
+        # the identity information. However, this is fine for our purpose, as
+        # it will only create false positives for "data changed" and no false
+        # negatives.
+        #
+        # In addition, for the cases that matter the most (updating the
+        # dirstate semantic content and not just some cache information),
+        # this will not happen, as the lock should be held when changes to
+        # the dirstate content are made.
testing.wait_on_cfg(self._ui, b'dirstate.pre-read-file') try: with self._opendirstatefile() as fp: - return fp.read(size) + data = fp.read(size) except FileNotFoundError: # File doesn't exist, so the current state is empty - return b'' + data = b'' + testing.wait_on_cfg(self._ui, b'dirstate.post-docket-read-file') + return identity, data @property def docket(self) -> docketmod.DirstateDocket: - testing.wait_on_cfg(self._ui, b'dirstate.pre-read-file') if not self._docket: if not self._use_dirstate_v2: raise error.ProgrammingError( b'dirstate only has a docket in v2 format' ) - self._set_identity() - data = self._readdirstatefile() + self.identity, data = self._readdirstatefile() if data == b'' or data.startswith(docketmod.V2_FORMAT_MARKER): self._docket = docketmod.DirstateDocket.parse( data, self._nodeconstants @@ -280,7 +298,7 @@ def _v1_parents(self, from_v2_exception=None): read_len = self._nodelen * 2 - st = self._readdirstatefile(read_len) + _identity, st = self._readdirstatefile(read_len) l = len(st) if l == read_len: self._parents = ( @@ -395,22 +413,18 @@ ### disk interaction def read(self): - testing.wait_on_cfg(self._ui, b'dirstate.pre-read-file') if self._use_dirstate_v2: try: self.docket except error.CorruptedDirstate: # fall back to dirstate-v1 if we fail to read v2 - self._set_identity() - st = self._readdirstatefile() + self.identity, st = self._readdirstatefile() else: if not self.docket.uuid: return - testing.wait_on_cfg(self._ui, b'dirstate.post-docket-read-file') st = self._read_v2_data() else: - self._set_identity() - st = self._readdirstatefile() + self.identity, st = self._readdirstatefile() if not st: return @@ -473,7 +487,14 @@ self._dirtyparents = False @propertycache - def identity(self): + def identity(self) -> Optional[typelib.CacheStat]: + """A cache identifier for the state of the file as the data was read + + This must always be set with the object returned from + `self._readdirstatefile()`. Assigning another value later will break + some security mechanisms and can lead to misbehavior when concurrent + operations are run. + """ self._map return self.identity @@ -673,9 +694,7 @@ Fills the Dirstatemap when called. """ # ignore HG_PENDING because identity is used only for writing - self._set_identity() - testing.wait_on_cfg(self._ui, b'dirstate.pre-read-file') if self._use_dirstate_v2: try: self.docket @@ -685,9 +704,6 @@ else: parents = self.docket.parents identity = self._get_rust_identity() - testing.wait_on_cfg( - self._ui, b'dirstate.post-docket-read-file' - ) if not self.docket.uuid: data = b'' self._map = rustmod.DirstateMap.new_empty() @@ -713,7 +729,6 @@ return self._map def _get_rust_identity(self): - self._set_identity() identity = None if self.identity is not None and self.identity.stat is not None: stat_info = self.identity.stat @@ -733,11 +748,10 @@ return identity def _v1_map(self, from_v2_exception=None): - identity = self._get_rust_identity() try: - self._map, parents = rustmod.DirstateMap.new_v1( - self._readdirstatefile(), identity - ) + self.identity, data = self._readdirstatefile() + identity = self._get_rust_identity() + self._map, parents = rustmod.DirstateMap.new_v1(data, identity) except OSError as e: if from_v2_exception is not None: raise e from from_v2_exception
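The stat-before-read ordering described in the new `_readdirstatefile()` comments can be illustrated with a small standalone sketch; the helper names below are hypothetical, not Mercurial's API, and only mirror the pattern (a concurrent writer can make the cache look stale, never silently valid)::

    import os

    def read_with_identity(path):
        # Capture a cache identity *before* reading the data, as above.
        try:
            identity = os.stat(path)
        except FileNotFoundError:
            identity = None
        try:
            with open(path, 'rb') as fp:
                data = fp.read()
        except FileNotFoundError:
            # file doesn't exist, so the current state is empty
            data = b''
        return identity, data

    def changed_since(path, identity):
        # A write racing with the read above changes the fresh stat,
        # so at worst we report "changed" spuriously (false positive).
        try:
            st = os.stat(path)
        except FileNotFoundError:
            return identity is not None
        if identity is None:
            return True
        return (st.st_mtime, st.st_size, st.st_ino) != (
            identity.st_mtime, identity.st_size, identity.st_ino
        )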
--- a/mercurial/dispatch.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/dispatch.py Fri Feb 28 23:28:10 2025 +0100 @@ -17,6 +17,9 @@ import sys import traceback +from typing import ( + Iterable, +) from .i18n import _ @@ -26,6 +29,7 @@ cmdutil, color, commands, + config as configmod, demandimport, encoding, error, @@ -387,13 +391,22 @@ debugtrace = {b'pdb': pdb.set_trace} debugmortem = {b'pdb': pdb.post_mortem} - # read --config before doing anything else - # (e.g. to change trust settings for reading .hg/hgrc) + # read --config-file and --config before doing anything else + # (e.g. to change trust settings for reading .hg/hgrc). + + # cmdargs may not have been initialized here (in the case of an + # error), so use pycompat.sysargv instead. + file_cfgs = _parse_config_files( + req.ui, pycompat.sysargv, req.earlyoptions[b'config_file'] + ) cfgs = _parseconfig(req.ui, req.earlyoptions[b'config']) if req.repo: # copy configs that were passed on the cmdline (--config) to # the repo ui + for sec, name, val, source in file_cfgs: + req.repo.ui.setconfig(sec, name, val, source=source) + for sec, name, val in cfgs: req.repo.ui.setconfig( sec, name, val, source=b'--config' @@ -822,9 +835,20 @@ cmd = None c = [] + def global_opt_to_fancy_opt(opt_name): + # fancyopts() does this transform on `options`, but globalopts uses a + # '-', so that it is displayed in the help and accepted as input that + # way. + return opt_name.replace(b'-', b'_') + # combine global options into local for o in commands.globalopts: - c.append((o[0], o[1], options[o[1]], o[3])) + name = global_opt_to_fancy_opt(o[1]) + + # The fancyopts name is needed for `options`, but the original name + # needs to be used in the second element here, or the parsing for the + # command verb fails, saying the command has no such option. + c.append((o[0], o[1], options[name], o[3])) try: args = fancyopts.fancyopts(args, c, cmdoptions, gnu=True) @@ -833,7 +857,7 @@ # separate global options back out for o in commands.globalopts: - n = o[1] + n = global_opt_to_fancy_opt(o[1]) options[n] = cmdoptions[n] del cmdoptions[n] @@ -864,6 +888,48 @@ return configs +def _parse_config_files( + ui, cmdargs: list[bytes], config_files: Iterable[bytes] +) -> list[tuple[bytes, bytes, bytes, bytes]]: + """parse the --config-file options from the command line + + A list of tuples containing (section, name, value, source) is returned, + in the order they were read. 
+ """ + + configs: list[tuple[bytes, bytes, bytes, bytes]] = [] + + cfg = configmod.config() + + for file in config_files: + try: + cfg.read(file) + except error.ConfigError as e: + raise error.InputError( + _(b'invalid --config-file content at %s') % e.location, + hint=e.message, + ) + except FileNotFoundError: + hint = None + if b'--cwd' in cmdargs: + hint = _(b"this file is resolved before --cwd is processed") + + raise error.InputError( + _(b'missing file "%s" for --config-file') % file, hint=hint + ) + + for section in cfg.sections(): + for item in cfg.items(section): + name = item[0] + value = item[1] + src = cfg.source(section, name) + + ui.setconfig(section, name, value, src) + configs.append((section, name, value, src)) + + return configs + + def _earlyparseopts(ui, args): options = {} fancyopts.fancyopts( @@ -881,7 +947,13 @@ """Split args into a list of possible early options and remainder args""" shortoptions = b'R:' # TODO: perhaps 'debugger' should be included - longoptions = [b'cwd=', b'repository=', b'repo=', b'config='] + longoptions = [ + b'cwd=', + b'repository=', + b'repo=', + b'config=', + b'config-file=', + ] return fancyopts.earlygetopt( args, shortoptions, longoptions, gnu=True, keepsep=True ) @@ -1079,13 +1151,32 @@ encoding.fallbackencoding = fallback fullargs = args - cmd, func, args, options, cmdoptions = _parse(lui, args) + try: + cmd, func, args, options, cmdoptions = _parse(lui, args) + except error.CommandError as e: + cause = e.__context__ + if isinstance(cause, getopt.GetoptError): + if cause.opt and "config".startswith(cause.opt): + # pycompat._getoptbwrapper() decodes bytes with latin-1 + opt = cause.opt.encode('latin-1') + all_long = {o[1] for o in commands.globalopts} + possible = [o for o in all_long if o.startswith(opt)] + + if len(possible) != 1: + raise error.InputError( + _(b"option --config may not be abbreviated") + ) + raise # store the canonical command name in request object for later access req.canonical_command = cmd if options[b"config"] != req.earlyoptions[b"config"]: raise error.InputError(_(b"option --config may not be abbreviated")) + if options[b"config_file"] != req.earlyoptions[b"config_file"]: + raise error.InputError( + _(b"option --config-file may not be abbreviated") + ) if options[b"cwd"] != req.earlyoptions[b"cwd"]: raise error.InputError(_(b"option --cwd may not be abbreviated")) if options[b"repository"] != req.earlyoptions[b"repository"]:
--- a/mercurial/extensions.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/extensions.py Fri Feb 28 23:28:10 2025 +0100 @@ -620,11 +620,13 @@ class wrappedfunction: '''context manager for temporarily wrapping a function''' - def __init__(self, container, funcname, wrapper): + def __init__(self, container, funcname: str, wrapper): assert callable(wrapper) if not isinstance(funcname, str): - msg = b"wrappedfunction target name should be `str`, not `bytes`" - raise TypeError(msg) + # Keep this compat shim around for older/unmaintained extensions + msg = b"pass wrappedfunction target name as `str`, not `bytes`" + util.nouideprecwarn(msg, b"6.6", stacklevel=2) + funcname = pycompat.sysstr(funcname) self._container = container self._funcname = funcname self._wrapper = wrapper @@ -636,7 +638,7 @@ unwrapfunction(self._container, self._funcname, self._wrapper) -def wrapfunction(container, funcname, wrapper): +def wrapfunction(container, funcname: str, wrapper): """Wrap the function named funcname in container Replace the funcname member in the given container with the specified @@ -672,8 +674,10 @@ assert callable(wrapper) if not isinstance(funcname, str): - msg = b"wrapfunction target name should be `str`, not `bytes`" - raise TypeError(msg) + # Keep this compat shim around for older/unmaintained extensions + msg = b"pass wrapfunction target name as `str`, not `bytes`" + util.nouideprecwarn(msg, b"6.6", stacklevel=2) + funcname = pycompat.sysstr(funcname) origfn = getattr(container, funcname) assert callable(origfn)
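For illustration, a minimal extension sketch of the calling convention this shim keeps working: with the change above, passing the target name as bytes only emits a deprecation warning, while `str` is the supported form::

    from mercurial import commands, extensions

    def _wrapped_push(orig, ui, repo, *args, **opts):
        # the wrapper receives the original function as its first argument
        ui.status(b'about to push\n')
        return orig(ui, repo, *args, **opts)

    def uisetup(ui):
        # target name is a str; b'push' would now warn instead of raising
        extensions.wrapfunction(commands, 'push', _wrapped_push)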
--- a/mercurial/hg.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/hg.py Fri Feb 28 23:28:10 2025 +0100 @@ -13,7 +13,6 @@ import shutil import stat import typing -import weakref from .i18n import _ from .node import ( @@ -950,14 +949,16 @@ # important: # # We still need to release that lock at the end of the function - destpeer.local()._lockref = weakref.ref(destlock) - destpeer.local()._wlockref = weakref.ref(destwlock) - # dirstate also needs to be copied because `_wlockref` has a reference - # to it: this dirstate is saved to disk when the wlock is released - destpeer.local().dirstate = destrepo.dirstate + if destrepo.dirstate._dirty: + msg = "dirstate dirty after stream clone" + raise error.ProgrammingError(msg) + destwlock = destpeer.local().wlock(steal_from=destwlock) + destlock = destpeer.local().lock(steal_from=destlock) srcrepo.hook( - b'outgoing', source=b'clone', node=srcrepo.nodeconstants.nullhex + b'outgoing', + source=b'clone', + node=srcrepo.nodeconstants.nullhex, ) else: try: @@ -1121,8 +1122,10 @@ bookmarks.activate(destrepo, update) if destlock is not None: release(destlock) + destlock = None if destwlock is not None: - release(destlock) + release(destwlock) + destwlock = None # there is a tiny window where someone could end up writing the # repository before the caches are sure to be warm. This is "fine" # as the only "bad" outcome would be some slowness. That potential
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/interfaces/_basetypes.py Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,40 @@ +# mercurial/interfaces/_basetypes.py - internal base type aliases for interfaces +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. +# +# This module contains trivial type aliases that other interfaces might need, +# in a location designed to avoid import cycles. This is for internal usage +# by the modules in `mercurial.interfaces`, instead of importing the `types` +# module. +# +# For using type aliases outside `mercurial.interfaces`, look at the +# `mercurial.interfaces.types` module. + +from __future__ import annotations + +from typing import Any + +UserMsgT = bytes +"""Text (maybe) displayed to the user.""" + +HgPathT = bytes +"""A path usable with Mercurial's vfs.""" + +FsPathT = bytes +"""A path on disk (after vfs encoding).""" + +# TODO: create a Protocol class +RepoT = Any + +# TODO: create a Protocol class +UiT = Any + +# TODO: make a protocol class for this +VfsT = Any + +VfsKeyT = bytes +"""Vfs identifier, typically used in a VfsMap.""" + +CallbackCategoryT = bytes +"""Key identifying a callback category."""
--- a/mercurial/interfaces/dirstate.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/interfaces/dirstate.py Fri Feb 28 23:28:10 2025 +0100 @@ -20,13 +20,12 @@ if typing.TYPE_CHECKING: # Almost all mercurial modules are only imported in the type checking phase # to avoid circular imports - from .. import ( - match as matchmod, - transaction as txnmod, + from . import ( + matcher, + status as istatus, + transaction, ) - from . import status as istatus - # TODO: finish adding type hints AddParentChangeCallbackT = Callable[ ["idirstate", Tuple[Any, Any], Tuple[Any, Any]], Any @@ -53,8 +52,7 @@ StatusReturnT = Tuple[Any, istatus.Status, Any] """The return type of dirstate.status().""" - # TODO: probably doesn't belong here. - TransactionT = txnmod.transaction + TransactionT = transaction.ITransaction """The type for a transaction used with dirstate. This is meant to help callers avoid having to remember to delay the import @@ -95,7 +93,7 @@ # TODO: decorate with `@rootcache(b'.hgignore')` like dirstate class? @property - def _ignore(self) -> matchmod.basematcher: + def _ignore(self) -> matcher.IMatcher: """Matcher for ignored files.""" @property @@ -307,7 +305,7 @@ @abc.abstractmethod def walk( self, - match: matchmod.basematcher, + match: matcher.IMatcher, subrepos: Any, # TODO: figure out what this is unknown: bool, ignored: bool, @@ -327,7 +325,7 @@ @abc.abstractmethod def status( self, - match: matchmod.basematcher, + match: matcher.IMatcher, subrepos: bool, ignored: bool, clean: bool, @@ -352,7 +350,7 @@ # TODO: could return a list, except git.dirstate is a generator @abc.abstractmethod - def matches(self, match: matchmod.basematcher) -> Iterable[bytes]: + def matches(self, match: matcher.IMatcher) -> Iterable[bytes]: """ return files in the dirstate (in whatever state) filtered by match """
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/interfaces/matcher.py Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,145 @@ +# mercurial/interfaces/matcher - typing protocol for Matcher objects +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. + +from __future__ import annotations + +import abc + +from typing import ( + Callable, + List, + Optional, + Protocol, + Set, + Union, +) + +from ._basetypes import ( + HgPathT, + UserMsgT, +) + + +class IMatcher(Protocol): + """A protocol class that defines the common interface for all file matching + classes.""" + + @abc.abstractmethod + def was_tampered_with_nonrec(self) -> bool: + ... + + @abc.abstractmethod + def was_tampered_with(self) -> bool: + ... + + @abc.abstractmethod + def __call__(self, fn: HgPathT) -> bool: + ... + + # Callbacks related to how the matcher is used by dirstate.walk. + # Subscribers to these events must monkeypatch the matcher object. + @abc.abstractmethod + def bad(self, f: HgPathT, msg: Optional[UserMsgT]) -> None: + ... + + # If traversedir is set, it will be called when a directory discovered + # by recursive traversal is visited. + traversedir: Optional[Callable[[HgPathT], None]] = None + + @property + @abc.abstractmethod + def _files(self) -> List[HgPathT]: + ... + + @abc.abstractmethod + def files(self) -> List[HgPathT]: + ... + + @property + @abc.abstractmethod + def _fileset(self) -> Set[HgPathT]: + ... + + @abc.abstractmethod + def exact(self, f: HgPathT) -> bool: + """Returns True if f is in .files().""" + + @abc.abstractmethod + def matchfn(self, f: HgPathT) -> bool: + ... + + @abc.abstractmethod + def visitdir(self, dir: HgPathT) -> Union[bool, bytes]: + """Decides whether a directory should be visited based on whether it + has potential matches in it or one of its subdirectories. This is + based on the match's primary, included, and excluded patterns. + + Returns the string 'all' if the given directory and all subdirectories + should be visited. Otherwise returns True or False indicating whether + the given directory should be visited. + """ + + @abc.abstractmethod + def visitchildrenset(self, dir: HgPathT) -> Union[Set[HgPathT], bytes]: + """Decides whether a directory should be visited based on whether it + has potential matches in it or one of its subdirectories, and + potentially lists which subdirectories of that directory should be + visited. This is based on the match's primary, included, and excluded + patterns. + + This function is very similar to 'visitdir', and the following mapping + can be applied: + + visitdir | visitchildrenset + ----------+------------------- + False | set() + 'all' | 'all' + True | 'this' OR non-empty set of subdirs -or files- to visit + + Example: + Given matchers ['path:foo/bar', 'rootfilesin:qux'], we would return + the following values (assuming the implementation of visitchildrenset + is capable of recognizing this; some implementations are not). + + '' -> {'foo', 'qux'} + 'baz' -> set() + 'foo' -> {'bar'} + # Ideally this would be 'all', but since the prefix nature of matchers + # is applied to the entire matcher, we have to downgrade this to + # 'this' due to the non-prefix 'rootfilesin'-kind matcher being mixed + # in. + 'foo/bar' -> 'this' + 'qux' -> 'this' + + Important: + Most matchers do not know if they're representing files or + directories. 
They see ['path:dir/f'] and don't know whether 'f' is a + file or a directory, so visitchildrenset('dir') for most matchers will + return {'f'}, but if the matcher knows it's a file (like exactmatcher + does), it may return 'this'. Do not rely on the return being a set + indicating that there are no files in this dir to investigate (or + equivalently that if there are files to investigate in 'dir' that it + will always return 'this'). + """ + + @abc.abstractmethod + def always(self) -> bool: + """Matcher will match everything and .files() will be empty -- + optimization might be possible.""" + + @abc.abstractmethod + def isexact(self) -> bool: + """Matcher will match exactly the list of files in .files() -- + optimization might be possible.""" + + @abc.abstractmethod + def prefix(self) -> bool: + """Matcher will match the paths in .files() recursively -- + optimization might be possible.""" + + @abc.abstractmethod + def anypats(self) -> bool: + """None of .always(), .isexact(), and .prefix() is true -- + optimizations will be difficult."""
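As a rough illustration of the `visitchildrenset()` contract documented above, here is a standalone sketch (not one of Mercurial's matcher classes) covering a single 'path:' pattern::

    def visitchildrenset(pattern, dir):
        # `pattern` is the path of a single 'path:' pattern, e.g. b'foo/bar'
        if dir == b'':
            return {pattern.split(b'/')[0]}
        if pattern == dir or pattern.startswith(dir + b'/'):
            rest = pattern[len(dir) + 1:]
            # next component on the way to the pattern, or everything below it
            return {rest.split(b'/')[0]} if rest else b'all'
        if dir.startswith(pattern + b'/'):
            return b'all'   # already inside the matched tree
        return set()        # nothing to match below `dir`

    assert visitchildrenset(b'foo/bar', b'') == {b'foo'}
    assert visitchildrenset(b'foo/bar', b'foo') == {b'bar'}
    assert visitchildrenset(b'foo/bar', b'foo/bar') == b'all'
    assert visitchildrenset(b'foo/bar', b'baz') == set()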
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/interfaces/misc.py Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,149 @@ +# misc.py - Various interfaces that do not deserve a dedicated module (yet) +# +# Copyright 2025 Octobus, contact@octobus.net +from __future__ import annotations + +import abc + +from typing import ( + Callable, + Dict, + Iterator, + List, + Optional, + Protocol, + Tuple, +) + + +class IHooks(Protocol): + """A collection of hook functions that can be used to extend a + function's behavior. Hooks are called in lexicographic order, + based on the names of their sources.""" + + @abc.abstractmethod + def add(self, source: bytes, hook: Callable): + ... + + @abc.abstractmethod + def __call__(self, *args) -> List: + ... + + +class IDirs(Protocol): + '''a multiset of directory names from a set of file paths''' + + @abc.abstractmethod + def addpath(self, path: bytes) -> None: + ... + + @abc.abstractmethod + def delpath(self, path: bytes) -> None: + ... + + @abc.abstractmethod + def __iter__(self) -> Iterator[bytes]: + ... + + @abc.abstractmethod + def __contains__(self, d: bytes) -> bool: + ... + + +AuthInfoT = Tuple[ + bytes, + Optional[ + Tuple[ + None, + Tuple[bytes, bytes], + bytes, + bytes, + ] + ], +] + + +class IUrl(Protocol): + r"""Reliable URL parser. + + This parses URLs and provides attributes for the following + components: + + <scheme>://<user>:<passwd>@<host>:<port>/<path>?<query>#<fragment> + + Missing components are set to None. The only exception is + fragment, which is set to '' if present but empty. + + If parsefragment is False, fragment is included in query. If + parsequery is False, query is included in path. If both are + False, both fragment and query are included in path. + + See http://www.ietf.org/rfc/rfc2396.txt for more information. + """ + + path: Optional[bytes] + scheme: Optional[bytes] + user: Optional[bytes] + passwd: Optional[bytes] + host: Optional[bytes] + port: Optional[bytes] + query: Optional[bytes] + fragment: Optional[bytes] + + @abc.abstractmethod + def copy(self) -> IUrl: + ... + + @abc.abstractmethod + def authinfo(self) -> AuthInfoT: + ... + + @abc.abstractmethod + def isabs(self) -> bool: + ... + + @abc.abstractmethod + def localpath(self) -> bytes: + ... + + @abc.abstractmethod + def islocal(self) -> bool: + ... + + +class IPath(Protocol): + """Represents an individual path and its configuration.""" + + name: bytes + main_path: Optional[IPath] + url: IUrl + raw_url: IUrl + branch: bytes + rawloc: bytes + loc: bytes + + @abc.abstractmethod + def copy(self, new_raw_location: Optional[bytes] = None) -> IPath: + ... + + @property + @abc.abstractmethod + def is_push_variant(self) -> bool: + """is this a path variant to be used for pushing""" + + @abc.abstractmethod + def get_push_variant(self) -> IPath: + """get a "copy" of the path, but suitable for pushing + + This means using the value of the `pushurl` option (if any) as the url. + + The original path is available in the `main_path` attribute. + """ + + @property + @abc.abstractmethod + def suboptions(self) -> Dict[bytes, bytes]: + """Return sub-options and their values for this path. + + This is intended to be used for presentation purposes. + """
--- a/mercurial/interfaces/repository.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/interfaces/repository.py Fri Feb 28 23:28:10 2025 +0100 @@ -17,6 +17,7 @@ Iterable, Iterator, Mapping, + Optional, Protocol, Set, ) @@ -29,28 +30,20 @@ ByteString, # TODO: change to Buffer for 3.14 ) - # Almost all mercurial modules are only imported in the type checking phase - # to avoid circular imports - from .. import ( - match as matchmod, - pathutil, - util, + from ._basetypes import ( + UiT as Ui, + VfsT as Vfs, ) - from ..utils import ( - urlutil, + + from . import ( + dirstate as intdirstate, + matcher, + misc, ) - from . import dirstate as intdirstate - # TODO: make a protocol class for this NodeConstants = Any - # TODO: create a Protocol class, since importing uimod here causes a cycle - # that confuses pytype. - Ui = Any - - # TODO: make a protocol class for this - Vfs = Any # Local repository feature string. @@ -151,7 +144,7 @@ ui: Ui """ui.ui instance""" - path: urlutil.path | None + path: Optional[misc.IPath] """a urlutil.path instance or None""" @abc.abstractmethod @@ -456,13 +449,13 @@ """ limitedarguments: bool = False - path: urlutil.path | None + path: misc.IPath | None ui: Ui def __init__( self, ui: Ui, - path: urlutil.path | None = None, + path: misc.IPath | None = None, remotehidden: bool = False, ) -> None: self.ui = ui @@ -1176,7 +1169,7 @@ """ @abc.abstractmethod - def dirs(self) -> pathutil.dirs: + def dirs(self) -> misc.IDirs: """Returns an object implementing the ``idirs`` interface.""" @abc.abstractmethod @@ -1184,7 +1177,7 @@ """Returns a bool indicating if a directory is in this manifest.""" @abc.abstractmethod - def walk(self, match: matchmod.basematcher) -> Iterator[bytes]: + def walk(self, match: matcher.IMatcher) -> Iterator[bytes]: """Generator of paths in manifest satisfying a matcher. If the matcher has explicit files listed and they don't exist in @@ -1195,7 +1188,7 @@ def diff( self, other: Any, # TODO: 'manifestdict' or (better) equivalent interface - match: matchmod.basematcher | None = None, + match: matcher.IMatcher | None = None, clean: bool = False, ) -> dict[ bytes, @@ -2118,16 +2111,36 @@ pass @abc.abstractmethod - def lock(self, wait=True): - """Lock the repository store and return a lock instance.""" + def lock(self, wait=True, steal_from=None): + """Lock the repository store and return a lock instance. + + If another lock object is specified through the "steal_from" argument, + the new lock will reuse the on-disk lock of that "stolen" lock instead + of creating its own. The "stolen" lock is no longer usable for any + purpose and won't execute its release callback. + + That steal_from argument is used during local clone when reloading a + repository. If we could remove the need for this during copy clone, we + could remove this function. + """ @abc.abstractmethod def currentlock(self): """Return the lock if it's held or None.""" @abc.abstractmethod - def wlock(self, wait=True): - """Lock the non-store parts of the repository.""" + def wlock(self, wait=True, steal_from=None): + """Lock the non-store parts of the repository. + + If another lock object is specified through the "steal_from" argument, + the new lock will reuse the on-disk lock of that "stolen" lock instead + of creating its own. The "stolen" lock is no longer usable for any + purpose and won't execute its release callback. + + That steal_from argument is used during local clone when reloading a + repository. 
If we could remove the need for this during copy clone, we + could remove this function. + """ @abc.abstractmethod def currentwlock(self): @@ -2207,7 +2220,7 @@ def checkpush(self, pushop): pass - prepushoutgoinghooks: util.hooks + prepushoutgoinghooks: misc.IHooks """util.hooks instance.""" @abc.abstractmethod
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/interfaces/transaction.py Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,253 @@ +# transaction.py - simple journaling scheme for mercurial +# +# This transaction scheme is intended to gracefully handle program +# errors and interruptions. More serious failures like system crashes +# can be recovered with an fsck-like tool. As the whole repository is +# effectively log-structured, this should amount to simply truncating +# anything that isn't referenced in the changelog. +# +# Copyright 2005, 2006 Olivia Mackall <olivia@selenic.com> +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. + +from __future__ import annotations + +import abc + +from typing import ( + Callable, + Collection, + List, + Optional, + Protocol, + Tuple, + Union, +) + +from ._basetypes import ( + CallbackCategoryT, + HgPathT, + VfsKeyT, +) + +JournalEntryT = Tuple[HgPathT, int] + + +class ITransaction(Protocol): + @property + @abc.abstractmethod + def finalized(self) -> bool: + ... + + @abc.abstractmethod + def startgroup(self) -> None: + """delay registration of file entry + + This is used by strip to delay vision of strip offset. The transaction + sees either none or all of the strip actions to be done.""" + + @abc.abstractmethod + def endgroup(self) -> None: + """apply delayed registration of file entry. + + This is used by strip to delay vision of strip offset. The transaction + sees either none or all of the strip actions to be done.""" + + @abc.abstractmethod + def add(self, file: HgPathT, offset: int) -> None: + """record the state of an append-only file before update""" + + @abc.abstractmethod + def addbackup( + self, + file: HgPathT, + hardlink: bool = True, + location: VfsKeyT = b'', + for_offset: Union[bool, int] = False, + ) -> None: + """Adds a backup of the file to the transaction + + Calling addbackup() creates a hardlink backup of the specified file + that is used to recover the file in the event of the transaction + aborting. + + * `file`: the file path, relative to .hg/store + * `hardlink`: use a hardlink to quickly create the backup + + If `for_offset` is set, we expect an offset for this file to have been + previously recorded + """ + + @abc.abstractmethod + def registertmp(self, tmpfile: HgPathT, location: VfsKeyT = b'') -> None: + """register a temporary transaction file + + Such files will be deleted when the transaction exits (on both + failure and success). + """ + + @abc.abstractmethod + def addfilegenerator( + self, + genid: bytes, + filenames: Collection[HgPathT], + genfunc: Callable, + order: int = 0, + location: VfsKeyT = b'', + post_finalize: bool = False, + ) -> None: + """add a function that generates some files at transaction commit + + The `genfunc` argument is a function capable of generating proper + content for each entry in the `filenames` tuple. + + At transaction close time, `genfunc` will be called with one file + object argument per entry in `filenames`. + + The transaction itself is responsible for the backup, creation and + final write of such files. + + The `genid` argument is used to ensure the same set of files is only + generated once. A call to `addfilegenerator` for a `genid` already + present will overwrite the old entry. + + The `order` argument may be used to control the order in which multiple + generators will be executed. 
+ + The `location` argument may be used to indicate that the files are located + outside of the standard directory for the transaction. It should match + one of the keys of the `transaction.vfsmap` dictionary. + + The `post_finalize` argument can be set to `True` for file generation + that must be run after the transaction has been finalized. + """ + + @abc.abstractmethod + def removefilegenerator(self, genid: bytes) -> None: + """reverse of addfilegenerator, remove a file generator function""" + + @abc.abstractmethod + def findoffset(self, file: HgPathT) -> Optional[int]: + ... + + @abc.abstractmethod + def readjournal(self) -> List[JournalEntryT]: + ... + + @abc.abstractmethod + def replace(self, file: HgPathT, offset: int) -> None: + """ + replace can only replace already committed entries + that are not pending in the queue + """ + + @abc.abstractmethod + def nest(self, name: bytes = b'<unnamed>') -> ITransaction: + ... + + @abc.abstractmethod + def release(self) -> None: + ... + + @abc.abstractmethod + def running(self) -> bool: + ... + + @abc.abstractmethod + def addpending( + self, + category: CallbackCategoryT, + callback: Callable[[ITransaction], None], + ) -> None: + """add a callback to be called when the transaction is pending + + The transaction will be given as the callback's first argument. + + Category is a unique identifier to allow overwriting an old callback + with a newer callback. + """ + + @abc.abstractmethod + def writepending(self) -> bool: + """write pending file to temporary version + + This is used to allow hooks to view a transaction before commit""" + + @abc.abstractmethod + def hasfinalize(self, category: CallbackCategoryT) -> bool: + """check if a callback already exists for a category""" + + @abc.abstractmethod + def addfinalize( + self, + category: CallbackCategoryT, + callback: Callable[[ITransaction], None], + ) -> None: + """add a callback to be called when the transaction is closed + + The transaction will be given as the callback's first argument. + + Category is a unique identifier to allow overwriting old callbacks with + newer callbacks. + """ + + @abc.abstractmethod + def addpostclose( + self, + category: CallbackCategoryT, + callback: Callable[[ITransaction], None], + ) -> None: + """add or replace a callback to be called after the transaction has closed + + The transaction will be given as the callback's first argument. + + Category is a unique identifier to allow overwriting an old callback + with a newer callback. + """ + + @abc.abstractmethod + def getpostclose( + self, category: CallbackCategoryT + ) -> Optional[Callable[[ITransaction], None]]: + """return a postclose callback added before, or None""" + + @abc.abstractmethod + def addabort( + self, + category: CallbackCategoryT, + callback: Callable[[ITransaction], None], + ) -> None: + """add a callback to be called when the transaction is aborted. + + The transaction will be given as the first argument to the callback. + + Category is a unique identifier to allow overwriting an old callback + with a newer callback. + """ + + @abc.abstractmethod + def addvalidator( + self, + category: CallbackCategoryT, + callback: Callable[[ITransaction], None], + ) -> None: + """adds a callback to be called when validating the transaction. + + The transaction will be given as the first argument to the callback. 
+ + The callback should raise an exception to abort the transaction.""" + + @abc.abstractmethod + def close(self) -> None: + '''commit the transaction''' + + @abc.abstractmethod + def abort(self) -> None: + """abort the transaction (generally called on error, or when the + transaction is not explicitly committed before going out of + scope)""" + + @abc.abstractmethod + def add_journal(self, vfs_id: VfsKeyT, path: HgPathT) -> None: + ...
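A hypothetical usage sketch of `addfilegenerator()` as specified above; the genid, file name, and `location` value are made up for illustration, and `b'plain'` assumes the transaction's vfsmap has such a key::

    def register_cache_generator(tr):
        # `tr` is any ITransaction implementation
        def write_cache(fp):
            # called at close time with one file object per entry in
            # `filenames`; the transaction handles backup and final write
            fp.write(b'cached-data\n')

        tr.addfilegenerator(
            b'my-cache',            # genid: re-registering replaces this entry
            (b'cache/my-cache',),   # filenames, relative to `location`'s vfs
            write_cache,
            location=b'plain',      # a key of the transaction's vfsmap
        )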
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/interfaces/types.py Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,31 @@ +# mercurial/interfaces/types.py - type aliases for interfaces +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. +# +# This is the main entry point for Mercurial code writing type annotations. +# +# The general principle can be summarized when dealing with a <FooBar> object: +# - to type your code: use FooBarT from `mercurial.interfaces.types` +# - to subclass <FooBar>: use IFooBar from `mercurial.interfaces.foo_bar` + +from __future__ import annotations + +from ._basetypes import ( # noqa: F401 (ignore imported but not used) + CallbackCategoryT, + FsPathT, + HgPathT, + RepoT, + UiT, + UserMsgT, + VfsKeyT, + VfsT, +) + +from . import ( + matcher, + transaction, +) + +MatcherT = matcher.IMatcher +TransactionT = transaction.ITransaction
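The intended consumption pattern, mirroring the annotation changes elsewhere in this changeset (a sketch assuming the module layout introduced here)::

    from __future__ import annotations

    from typing import Optional

    from mercurial.interfaces.types import MatcherT, TransactionT

    def write(tr: Optional[TransactionT]) -> None:
        # annotate with the *T aliases; subclass the I* protocols instead
        ...

    def matched(match: MatcherT, files: list[bytes]) -> list[bytes]:
        return [f for f in files if match(f)]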
--- a/mercurial/localrepo.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/localrepo.py Fri Feb 28 23:28:10 2025 +0100 @@ -303,7 +303,7 @@ self._closed = True -class localpeer(repository.peer): # (repository.ipeercommands) +class localpeer(repository.peer, repository.ipeercommands): '''peer for a local repo; reflects only the most recent API''' def __init__(self, repo, caps=None, path=None, remotehidden=False): @@ -459,7 +459,7 @@ # End of peer interface. -class locallegacypeer(localpeer): # (repository.ipeerlegacycommands) +class locallegacypeer(localpeer, repository.ipeerlegacycommands): """peer extension which implements legacy methods too; used for tests with restricted capabilities""" @@ -3071,6 +3071,7 @@ releasefn, acquirefn, desc, + steal_from=None, ): timeout = 0 warntimeout = 0 @@ -3083,18 +3084,31 @@ if not sync_file: sync_file = None - l = lockmod.trylock( - self.ui, - vfs, - lockname, - timeout, - warntimeout, - releasefn=releasefn, - acquirefn=acquirefn, - desc=desc, - signalsafe=signalsafe, - devel_wait_sync_file=sync_file, - ) + if steal_from is None: + l = lockmod.trylock( + self.ui, + vfs, + lockname, + timeout, + warntimeout, + releasefn=releasefn, + acquirefn=acquirefn, + desc=desc, + signalsafe=signalsafe, + devel_wait_sync_file=sync_file, + ) + else: + l = lockmod.steal_lock( + self.ui, + vfs, + lockname, + steal_from, + releasefn=releasefn, + acquirefn=acquirefn, + desc=desc, + signalsafe=signalsafe, + ) + return l def _afterlock(self, callback): @@ -3110,19 +3124,29 @@ else: # no lock has been found. callback(True) - def lock(self, wait=True): + def lock(self, wait=True, steal_from=None): """Lock the repository store (.hg/store) and return a weak reference to the lock. Use this before modifying the store (e.g. committing or stripping). If you are opening a transaction, get a lock as well. If both 'lock' and 'wlock' must be acquired, ensure you always acquire - 'wlock' first to avoid a dead-lock hazard.""" + 'wlock' first to avoid a dead-lock hazard. + + The steal_from argument is used during local clone when reloading a + repository. If we could remove the need for this during copy clone, we + could remove this function. + """ l = self._currentlock(self._lockref) if l is not None: + if steal_from is not None: + msg = "cannot steal lock if already locked" + raise error.ProgrammingError(msg) l.lock() return l - self.hook(b'prelock', throw=True) + if steal_from is None: + self.hook(b'prelock', throw=True) l = self._lock( vfs=self.svfs, lockname=b"lock", @@ -3130,24 +3154,34 @@ releasefn=None, acquirefn=self.invalidate, desc=_(b'repository %s') % self.origroot, + steal_from=steal_from, ) self._lockref = weakref.ref(l) return l - def wlock(self, wait=True): + def wlock(self, wait=True, steal_from=None): """Lock the non-store parts of the repository (everything under .hg except .hg/store) and return a weak reference to the lock. Use this before modifying files in .hg. If both 'lock' and 'wlock' must be acquired, ensure you always acquire - 'wlock' first to avoid a dead-lock hazard.""" - l = self._wlockref() if self._wlockref else None - if l is not None and l.held: + 'wlock' first to avoid a dead-lock hazard. + + The steal_from argument is used during local clone when reloading a + repository. If we could remove the need for this during copy clone, we + could remove this function. 
+ """ + l = self._currentlock(self._wlockref) + if l is not None: + if steal_from is not None: + msg = "cannot steal wlock if already locked" + raise error.ProgrammingError(msg) l.lock() return l - self.hook(b'prewlock', throw=True) + if steal_from is None: + self.hook(b'prewlock', throw=True) # We do not need to check for non-waiting lock acquisition. Such # acquisition would not cause dead-lock as they would just fail. if wait and ( @@ -3173,12 +3207,13 @@ del unfi.__dict__['dirstate'] l = self._lock( - self.vfs, - b"wlock", - wait, - unlock, - self.invalidatedirstate, - _(b'working directory of %s') % self.origroot, + vfs=self.vfs, + lockname=b"wlock", + wait=wait, + releasefn=unlock, + acquirefn=self.invalidatedirstate, + desc=_(b'working directory of %s') % self.origroot, + steal_from=steal_from, ) self._wlockref = weakref.ref(l) return l @@ -3769,7 +3804,9 @@ requirements.add(requirementsmod.BOOKMARKS_IN_STORE_REQUIREMENT) # The feature is disabled unless a fast implementation is available. - persistent_nodemap_default = policy.importrust('revlog') is not None + persistent_nodemap_default = ( + policy.importrust('revlog', pyo3=True) is not None + ) if ui.configbool( b'format', b'use-persistent-nodemap', persistent_nodemap_default ):
--- a/mercurial/lock.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/lock.py Fri Feb 28 23:28:10 2025 +0100 @@ -111,6 +111,24 @@ raiseinterrupt(assertedsigs[0]) +def steal_lock(ui, vfs, lockname, stolen_lock, *args, **kwargs) -> lock: + """return a new lock that "steals" the lock held by a source lock + + This is used during local clone when reloading a repository. If we could + remove the need for this during copy clone, we could remove this function. + """ + new_lock = lock(vfs, lockname, 0, *args, dolock=False, **kwargs) + + assert stolen_lock.f == new_lock.f + assert stolen_lock.held > 0 + assert new_lock.held == 0 + new_lock.held += 1 + stolen_lock.held = None + if new_lock.acquirefn is not None: + new_lock.acquirefn() + return new_lock + + def trylock(ui, vfs, lockname, timeout, warntimeout, *args, **kwargs) -> lock: """return an acquired lock or raise a LockHeld exception @@ -240,7 +258,11 @@ self.release(success=success) def __del__(self): - if self.held: + if self.held is None: + # lock has been stolen (during a local clone) and should never be + # touched again. + return + if self.held > 0: warnings.warn( "use lock.release instead of del lock", category=DeprecationWarning, @@ -274,7 +296,10 @@ ) def _trylock(self) -> None: - if self.held: + if self.held is None: + msg = "cannot acquire a lock after it was stolen" + raise error.ProgrammingError(msg) + if self.held > 0: self.held += 1 return if lock._host is None: @@ -380,6 +405,9 @@ If the lock has been acquired multiple times, the actual release is delayed to the last release call.""" + if self.held is None: + msg = "cannot release a lock after it was stolen" + raise error.ProgrammingError(msg) if self.held > 1: self.held -= 1 elif self.held == 1:
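The held-count hand-off encoded above reduces to a toy sketch (this is illustrative, not `mercurial.lock`): `held` is an integer refcount while the lock is usable, and None once stolen::

    class ToyLock:
        def __init__(self):
            self.held = 1                 # acquired on creation

        def steal(self):
            new = ToyLock.__new__(ToyLock)
            new.held = 0
            assert self.held > 0 and new.held == 0
            new.held += 1                 # the new object takes over
            self.held = None              # old object must never be used again
            return new

        def release(self):
            if self.held is None:
                raise RuntimeError('cannot release a lock after it was stolen')
            self.held -= 1

    old = ToyLock()
    new = old.steal()
    new.release()                         # the stealing side releases normally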
--- a/mercurial/logcmdutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/logcmdutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -22,6 +22,9 @@ ) from .i18n import _ +from .interfaces.types import ( + MatcherT, +) from .node import wdirrev from .thirdparty import attr @@ -1083,9 +1086,7 @@ def makewalker( repo: Any, wopts: walkopts, -) -> Tuple[ - smartset.abstractsmartset, Optional[Callable[[Any], matchmod.basematcher]] -]: +) -> Tuple[smartset.abstractsmartset, Optional[Callable[[Any], MatcherT]]]: """Build (revs, makefilematcher) to scan revision/file history - revs is the smartset to be traversed.
--- a/mercurial/manifest.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/manifest.py Fri Feb 28 23:28:10 2025 +0100 @@ -28,6 +28,9 @@ ) from .i18n import _ +from .interfaces.types import ( + MatcherT, +) from .node import ( bin, hex, @@ -570,7 +573,7 @@ def hasdir(self, dir: bytes) -> bool: return dir in self._dirs - def _filesfastpath(self, match: matchmod.basematcher) -> bool: + def _filesfastpath(self, match: MatcherT) -> bool: """Checks whether we can correctly and quickly iterate over matcher files instead of over manifest files.""" files = match.files() @@ -579,7 +582,7 @@ or (match.prefix() and all(fn in self for fn in files)) ) - def walk(self, match: matchmod.basematcher) -> Iterator[bytes]: + def walk(self, match: MatcherT) -> Iterator[bytes]: """Generates matching file names. Equivalent to manifest.matches(match).iterkeys(), but without creating @@ -615,7 +618,7 @@ if not self.hasdir(fn): match.bad(fn, None) - def _matches(self, match: matchmod.basematcher) -> manifestdict: + def _matches(self, match: MatcherT) -> manifestdict: '''generate a new manifest filtered by the match argument''' if match.always(): return self.copy() @@ -635,7 +638,7 @@ def diff( self, m2: manifestdict, - match: Optional[matchmod.basematcher] = None, + match: Optional[MatcherT] = None, clean: bool = False, ) -> Dict[ bytes, @@ -847,7 +850,7 @@ _noop = lambda s: None -class treemanifest: # (repository.imanifestdict) +class treemanifest(repository.imanifestdict): _dir: bytes _dirs: Dict[bytes, treemanifest] _dirty: bool @@ -1202,7 +1205,7 @@ return copy def filesnotin( - self, m2: treemanifest, match: Optional[matchmod.basematcher] = None + self, m2: treemanifest, match: Optional[MatcherT] = None ) -> Set[bytes]: '''Set of files in this manifest that are not in the other''' if match and not match.always(): @@ -1250,7 +1253,7 @@ dirslash = dir + b'/' return dirslash in self._dirs or dirslash in self._lazydirs - def walk(self, match: matchmod.basematcher) -> Iterator[bytes]: + def walk(self, match: MatcherT) -> Iterator[bytes]: """Generates matching file names. It also reports nonexistent files by marking them bad with match.bad(). 
@@ -1275,7 +1278,7 @@ if not self.hasdir(fn): match.bad(fn, None) - def _walk(self, match: matchmod.basematcher) -> Iterator[bytes]: + def _walk(self, match: MatcherT) -> Iterator[bytes]: '''Recursively generates matching file names for walk().''' visit = match.visitchildrenset(self._dir[:-1]) if not visit: @@ -1293,13 +1296,13 @@ if not visit or p[:-1] in visit: yield from self._dirs[p]._walk(match) - def _matches(self, match: matchmod.basematcher) -> treemanifest: + def _matches(self, match: MatcherT) -> treemanifest: """recursively generate a new manifest filtered by the match argument.""" if match.always(): return self.copy() return self._matches_inner(match) - def _matches_inner(self, match: matchmod.basematcher) -> treemanifest: + def _matches_inner(self, match: MatcherT) -> treemanifest: if match.always(): return self.copy() @@ -1342,13 +1345,13 @@ def fastdelta( self, base: ByteString, changes: Iterable[Tuple[bytes, bool]] - ) -> ByteString: + ) -> tuple[ByteString, ByteString]: raise FastdeltaUnavailable() def diff( self, m2: treemanifest, - match: Optional[matchmod.basematcher] = None, + match: Optional[MatcherT] = None, clean: bool = False, ) -> Dict[ bytes, @@ -1482,11 +1485,11 @@ Callable[[treemanifest], None], bytes, bytes, - matchmod.basematcher, + MatcherT, ], None, ], - match: matchmod.basematcher, + match: MatcherT, ) -> None: self._load() # for consistency; should never have any effect here m1._load() @@ -1516,7 +1519,7 @@ writesubtree(subm, subp1, subp2, match) def walksubtrees( - self, matcher: Optional[matchmod.basematcher] = None + self, matcher: Optional[MatcherT] = None ) -> Iterator[treemanifest]: """Returns an iterator of the subtrees of this manifest, including this manifest itself. @@ -1669,7 +1672,7 @@ """Exception raised when fastdelta isn't usable on a manifest.""" -class manifestrevlog: # (repository.imanifeststorage) +class manifestrevlog(repository.imanifeststorage): """A revlog that stores manifest texts. This is responsible for caching the full-text manifest contents. """ @@ -2152,7 +2155,7 @@ return self._rootstore._revlog.update_caches(transaction=transaction) -class memmanifestctx: # (repository.imanifestrevisionwritable) +class memmanifestctx(repository.imanifestrevisionwritable): _manifestdict: manifestdict def __init__(self, manifestlog):
--- a/mercurial/match.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/match.py Fri Feb 28 23:28:10 2025 +0100 @@ -24,6 +24,9 @@ ) from .i18n import _ +from .interfaces.types import ( + MatcherT, +) from . import ( encoding, error, @@ -34,7 +37,11 @@ ) from .utils import stringutil -rustmod = policy.importrust('dirstate') +from .interfaces import ( + matcher as int_matcher, +) + +rustmod = policy.importrust('dirstate', pyo3=True) allpatternkinds = ( b're', @@ -403,7 +410,7 @@ return kindpats -class basematcher: +class basematcher(int_matcher.IMatcher): def __init__(self, badfn=None): self._was_tampered_with = False if badfn is not None: @@ -1081,7 +1088,7 @@ sub/x.txt: No such file """ - def __init__(self, path: bytes, matcher: basematcher) -> None: + def __init__(self, path: bytes, matcher: MatcherT) -> None: super().__init__() self._path = path self._matcher = matcher
--- a/mercurial/merge.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/merge.py Fri Feb 28 23:28:10 2025 +0100 @@ -11,7 +11,7 @@ import os import struct import typing -from typing import Dict, Optional, Tuple +from typing import Dict, Iterable, Iterator, Optional, Tuple from .i18n import _ from .node import nullrev @@ -41,7 +41,19 @@ worker, ) -rust_update_mod = policy.importrust("update") +if typing.TYPE_CHECKING: + # TODO: figure out what exactly is in this tuple + MergeResultData = tuple + MergeResultAction = tuple[bytes, Optional[MergeResultData], bytes] + """The filename, data about the merge, and message about the merge.""" + + FileMappingValue = tuple[ + mergestatemod.MergeAction, Optional[MergeResultData], bytes + ] + """The merge action, data about the merge, and message about the merge, for + the keyed file.""" + +rust_update_mod = policy.importrust("update", pyo3=True) _pack = struct.pack _unpack = struct.unpack @@ -132,7 +144,9 @@ return None -def _checkunknownfiles(repo, wctx, mctx, force, mresult, mergeforce): +def _checkunknownfiles( + repo, wctx, mctx, force, mresult: mergeresult, mergeforce +): """ Considers any actions that care about the presence of conflicting unknown files. For some actions, the result is to abort; for others, it is to @@ -277,7 +291,7 @@ ) -def _forgetremoved(wctx, mctx, branchmerge, mresult): +def _forgetremoved(wctx, mctx, branchmerge, mresult: mergeresult) -> None: """ Forget removed files @@ -310,7 +324,7 @@ ) -def _checkcollision(repo, wmf, mresult): +def _checkcollision(repo, wmf, mresult: mergeresult | None) -> None: """ Check for case-folding collisions. """ @@ -378,7 +392,7 @@ lastfull = f -def _filesindirs(repo, manifest, dirs): +def _filesindirs(repo, manifest, dirs) -> Iterator[tuple[bytes, bytes]]: """ Generator that yields pairs of all the files in the manifest that are found inside the directories listed in dirs, and which directory they are found @@ -391,7 +405,7 @@ break -def checkpathconflicts(repo, wctx, mctx, mresult): +def checkpathconflicts(repo, wctx, mctx, mresult: mergeresult) -> None: """ Check if any actions introduce path conflicts in the repository, updating actions to record or handle the path conflict accordingly. @@ -492,7 +506,13 @@ ctxname = bytes(mctx).rstrip(b'+') for f, p in _filesindirs(repo, mf, remoteconflicts): if f not in deletedfiles: - m, args, msg = mresult.getfile(p) + mapping_value = mresult.getfile(p) + + # Help pytype: in theory, this could be None since no default + # value is passed to getfile() above. + assert mapping_value is not None + + m, args, msg = mapping_value pnew = util.safename(p, ctxname, wctx, set(mresult.files())) if m in ( mergestatemod.ACTION_DELETED_CHANGED, @@ -526,7 +546,9 @@ ) -def _filternarrowactions(narrowmatch, branchmerge, mresult): +def _filternarrowactions( + narrowmatch, branchmerge, mresult: mergeresult +) -> None: """ Filters out actions that can be ignored because the repo is narrowed. 
@@ -567,7 +589,12 @@ It has information about what actions need to be performed on dirstate mapping of divergent renames and other such cases.""" - def __init__(self): + _filemapping: dict[bytes, FileMappingValue] + _actionmapping: dict[ + mergestatemod.MergeAction, dict[bytes, tuple[MergeResultData, bytes]] + ] + + def __init__(self) -> None: """ filemapping: dict of filename as keys and action related info as values diverge: mapping of source name -> list of dest name for @@ -589,7 +616,13 @@ self._diverge = diverge self._renamedelete = renamedelete - def addfile(self, filename, action, data, message): + def addfile( + self, + filename: bytes, + action: mergestatemod.MergeAction, + data: MergeResultData | None, + message, + ) -> None: """adds a new file to the mergeresult object filename: file which we are adding @@ -606,7 +639,12 @@ self._filemapping[filename] = (action, data, message) self._actionmapping[action][filename] = (data, message) - def mapaction(self, actionfrom, actionto, transform): + def mapaction( + self, + actionfrom: mergestatemod.MergeAction, + actionto: mergestatemod.MergeAction, + transform, + ): """changes all occurrences of action `actionfrom` into `actionto`, transforming its args with the function `transform`. """ @@ -618,7 +656,9 @@ self._filemapping[f] = (actionto, data, msg) dest[f] = (data, msg) - def getfile(self, filename, default_return=None): + def getfile( + self, filename: bytes, default_return: FileMappingValue | None = None + ) -> FileMappingValue | None: """returns (action, args, msg) about this file returns default_return if the file is not present""" @@ -626,7 +666,9 @@ return self._filemapping[filename] return default_return - def files(self, actions=None): + def files( + self, actions: Iterable[mergestatemod.MergeAction] | None = None + ) -> Iterator[bytes]: """returns files on which the provided actions need to be performed If actions is None, all files are returned @@ -640,14 +682,16 @@ for a in actions: yield from self._actionmapping[a] - def removefile(self, filename): + def removefile(self, filename: bytes) -> None: """removes a file from the mergeresult object as the file might no longer be merged""" action, data, message = self._filemapping[filename] del self._filemapping[filename] del self._actionmapping[action][filename] - def getactions(self, actions, sort=False): + def getactions( + self, actions: Iterable[mergestatemod.MergeAction], sort: bool = False + ) -> Iterator[MergeResultAction]: """get list of files which are marked with these actions if sort is true, files for each action are sorted and then added @@ -662,7 +706,9 @@ for f, (args, msg) in self._actionmapping[a].items(): yield f, args, msg - def len(self, actions=None): + def len( + self, actions: Iterable[mergestatemod.MergeAction] | None = None + ) -> int: """returns the number of files which need actions if actions is passed, the total number of files in those actions @@ -673,13 +719,15 @@ return sum(len(self._actionmapping[a]) for a in actions) - def filemap(self, sort=False): + def filemap( + self, sort: bool = False + ) -> Iterator[tuple[bytes, MergeResultData]]: if sort: yield from sorted(self._filemapping.items()) else: yield from self._filemapping.items() - def addcommitinfo(self, filename, key, value): + def addcommitinfo(self, filename: bytes, key, value) -> None: """adds key-value information about filename which will be required while committing this merge""" self._commitinfo[filename][key] = value @@ -697,7 +745,9 @@ return self._commitinfo @property - def actionsdict(self): + 
def actionsdict( + self, + ) -> dict[mergestatemod.MergeAction, list[MergeResultAction]]: """returns a dictionary of actions to be performed with action as key and a list of files and related arguments as values""" res = collections.defaultdict(list) @@ -706,13 +756,13 @@ res[a].append((f, args, msg)) return res - def setactions(self, actions): + def setactions(self, actions) -> None: self._filemapping = actions self._actionmapping = collections.defaultdict(dict) for f, (act, data, msg) in self._filemapping.items(): self._actionmapping[act][f] = data, msg - def hasconflicts(self): + def hasconflicts(self) -> bool: """tells whether this merge resulted in some actions which can result in conflicts or not""" for a in self._actionmapping.keys(): @@ -743,7 +793,7 @@ acceptremote, followcopies, forcefulldiff=False, -): +) -> mergeresult: """ Merge wctx and p2 with ancestor pa and generate merge action list @@ -1117,7 +1167,7 @@ return mresult -def _resolvetrivial(repo, wctx, mctx, ancestor, mresult): +def _resolvetrivial(repo, wctx, mctx, ancestor, mresult: mergeresult) -> None: """Resolves false conflicts where the nodeid changed but the content remained the same.""" # We force a copy of actions.items() because we're going to mutate @@ -1146,7 +1196,7 @@ followcopies, matcher=None, mergeforce=False, -): +) -> mergeresult: """ Calculate the actions needed to merge mctx into wctx using ancestors @@ -1469,7 +1519,7 @@ yield True, filedata -def _prefetchfiles(repo, ctx, mresult): +def _prefetchfiles(repo, ctx, mresult: mergeresult) -> None: """Invoke ``scmutil.prefetchfiles()`` for the files relevant to the dict of merge actions. ``ctx`` is the context being merged in.""" @@ -1516,7 +1566,7 @@ def applyupdates( repo, - mresult, + mresult: mergeresult, wctx, mctx, overwrite, @@ -1794,7 +1844,7 @@ b'fsmonitor', b'warn_update_file_count' ) # avoid cycle dirstate -> sparse -> merge -> dirstate - dirstate_rustmod = policy.importrust("dirstate") + dirstate_rustmod = policy.importrust("dirstate", pyo3=True) if dirstate_rustmod is not None: # When using rust status, fsmonitor becomes necessary at higher sizes
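The invariant behind the two mappings typed above (every file appears once in `_filemapping` and once under its action in `_actionmapping`) can be sketched standalone; this toy class is illustrative, not Mercurial's `mergeresult`::

    import collections

    class ToyMergeResult:
        def __init__(self):
            self._filemapping = {}   # file -> (action, data, message)
            # action -> file -> (data, message)
            self._actionmapping = collections.defaultdict(dict)

        def addfile(self, filename, action, data, message):
            if filename in self._filemapping:    # keep both indexes in sync
                old_action, _d, _m = self._filemapping[filename]
                del self._actionmapping[old_action][filename]
            self._filemapping[filename] = (action, data, message)
            self._actionmapping[action][filename] = (data, message)

        def files(self, actions=None):
            if actions is None:
                yield from self._filemapping
            else:
                for a in actions:
                    yield from self._actionmapping[a]

    mr = ToyMergeResult()
    mr.addfile(b'f', b'g', None, b'remote created')
    assert list(mr.files([b'g'])) == [b'f']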
--- a/mercurial/narrowspec.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/narrowspec.py Fri Feb 28 23:28:10 2025 +0100 @@ -68,12 +68,27 @@ if _numlines(pat) > 1: raise error.Abort(_(b'newlines are not allowed in narrowspec paths')) + # patterns are stripped on load (see sparse.parseconfig), + # so a pattern ending in whitespace doesn't work correctly + if pat.strip() != pat: + raise error.Abort( + _( + b'leading or trailing whitespace is not allowed ' + b'in narrowspec paths' + ) + ) + components = pat.split(b'/') if b'.' in components or b'..' in components: raise error.Abort( _(b'"." and ".." are not allowed in narrowspec paths') ) + if pat != b'' and b'' in components: + raise error.Abort( + _(b'empty path components are not allowed in narrowspec paths') + ) + def normalizepattern(pattern, defaultkind=b'path'): """Returns the normalized version of a text-format pattern.
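The new checks can be restated as a standalone sketch (a simplified stand-in for the validation above, using a plain splitlines test rather than `_numlines`)::

    def validate_narrow_pattern(pat: bytes) -> None:
        if len(pat.splitlines()) > 1:
            raise ValueError('newlines are not allowed')
        if pat.strip() != pat:
            raise ValueError('leading or trailing whitespace is not allowed')
        components = pat.split(b'/')
        if b'.' in components or b'..' in components:
            raise ValueError('"." and ".." are not allowed')
        if pat != b'' and b'' in components:
            raise ValueError('empty path components are not allowed')

    validate_narrow_pattern(b'dir/subdir')            # accepted
    for bad in (b'dir ', b'dir//file', b'a/./b'):
        try:
            validate_narrow_pattern(bad)
        except ValueError:
            pass                                      # each one is rejected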
--- a/mercurial/obsutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/obsutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -417,7 +417,10 @@ This is a first and basic implementation, with many shortcomings. """ - diffopts = diffutil.diffallopts(leftctx.repo().ui, {b'git': True}) + diffopts = diffutil.diffallopts( + leftctx.repo().ui, + {b'git': True, b'unified': 1}, + ) # Leftctx or right ctx might be filtered, so we need to use the contexts # with an unfiltered repository to safely compute the diff
--- a/mercurial/patch.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/patch.py Fri Feb 28 23:28:10 2025 +0100 @@ -14,6 +14,7 @@ import os import re import shutil +import typing import zlib from .i18n import _ @@ -44,6 +45,14 @@ stringutil, ) +if typing.TYPE_CHECKING: + import email + + from typing import ( + Any, + Iterator, + ) + stringio = util.stringio gitre = re.compile(br'diff --git a/(.*) b/(.*)') @@ -272,12 +281,18 @@ diffs_seen = 0 ok_types = (b'text/plain', b'text/x-diff', b'text/x-patch') message = b'' + + part: email.message.Message for part in msg.walk(): content_type = pycompat.bytestr(part.get_content_type()) ui.debug(b'Content-Type: %s\n' % content_type) if content_type not in ok_types: continue + + # When decode=True, the only possible return types are bytes, or None + # for a multipart message. But it can't be multipart here, because the + # Content-Type was just checked. + payload = typing.cast(bytes, part.get_payload(decode=True)) m = diffre.search(payload) if m: hgpatch = False @@ -2792,7 +2807,9 @@ nextisnewline = True -# TODO: first tuple element is likely bytes, but was being detected as bytes|int +# so it needs investigation/more typing here. +def difflabel(func, *args, **kw) -> Iterator[tuple[Any, bytes]]: '''yields 2-tuples of (output, label) based on the output of func()''' if kw.get('opts') and kw['opts'].worddiff: dodiffhunk = diffsinglehunkinline
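The stdlib behavior this cast relies on can be checked directly: `email`'s `get_payload(decode=True)` returns bytes for leaf parts and None for multipart containers::

    import email

    msg = email.message_from_bytes(
        b'Content-Type: text/plain\r\n'
        b'\r\n'
        b'diff --git a/f b/f\n'
    )
    for part in msg.walk():
        payload = part.get_payload(decode=True)
        if part.is_multipart():
            assert payload is None       # containers have no decodable body
        else:
            assert isinstance(payload, bytes)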
--- a/mercurial/pathutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/pathutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -22,7 +22,9 @@ util, ) -rustdirs = policy.importrust('dirstate', 'Dirs') +from .interfaces import misc as int_misc + +rustdirs = policy.importrust('dirstate', 'Dirs', pyo3=True) parsers = policy.importmod('parsers') @@ -335,7 +337,7 @@ pos = path.find(pycompat.ossep, pos + 1) -class dirs: +class dirs(int_misc.IDirs): '''a multiset of directory names from a set of file paths''' def __init__(self, map, only_tracked=False):
--- a/mercurial/phases.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/phases.py Fri Feb 28 23:28:10 2025 +0100 @@ -173,10 +173,12 @@ phasenames[archived] = b'archived' phasenames[internal] = b'internal' # map phase name to phase number -phasenumber = {name: phase for phase, name in phasenames.items()} +phasenumber: dict[bytes, int] = { + name: phase for phase, name in phasenames.items() +} # like phasenumber, but also include maps for the numeric and binary # phase number to the phase number -phasenumber2 = phasenumber.copy() +phasenumber2: dict[bytes | int, int] = phasenumber.copy() phasenumber2.update({phase: phase for phase in phasenames}) phasenumber2.update({b'%i' % phase: phase for phase in phasenames}) # record phase property @@ -216,7 +218,7 @@ """ repo = repo.unfiltered() dirty = False - roots = {i: set() for i in allphases} + roots: Phaseroots = {i: set() for i in allphases} to_rev = repo.changelog.index.get_rev unknown_msg = b'removing unknown node %s from %i-phase boundary\n' try:
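To illustrate what the phasenumber2 annotation above captures: the map resolves a phase from its byte name, its integer value, or its stringified integer, hence the bytes | int key type. A small self-contained sketch of the same construction, using the well-known public/draft/secret values (0, 1, 2):

# build the same three-way mapping from a name table
phasenames = {0: b'public', 1: b'draft', 2: b'secret'}
phasenumber = {name: phase for phase, name in phasenames.items()}
phasenumber2: dict = phasenumber.copy()
phasenumber2.update({phase: phase for phase in phasenames})
phasenumber2.update({b'%i' % phase: phase for phase in phasenames})

# the same phase resolves from all three key forms
assert phasenumber2[b'draft'] == phasenumber2[1] == phasenumber2[b'1'] == 1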
--- a/mercurial/pure/parsers.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/pure/parsers.py Fri Feb 28 23:28:10 2025 +0100 @@ -736,7 +736,7 @@ class IndexObject(BaseIndexObject): - def __init__(self, data: ByteString): + def __init__(self, data: ByteString, uses_generaldelta=False): assert len(data) % self.entry_size == 0, ( len(data), self.entry_size, @@ -813,7 +813,7 @@ class InlinedIndexObject(BaseIndexObject): - def __init__(self, data, inline=0): + def __init__(self, data, inline=0, uses_generaldelta=False): self._data = data self._lgt = self._inline_scan(None) self._inline_scan(self._lgt) @@ -856,7 +856,10 @@ def parse_index2( - data: ByteString, inline, format=revlog_constants.REVLOGV1 + data: ByteString, + inline, + uses_generaldelta, + format=revlog_constants.REVLOGV1, ) -> tuple[IndexObject | InlinedIndexObject, tuple[int, ByteString] | None]: if format == revlog_constants.CHANGELOGV2: return parse_index_cl_v2(data) @@ -865,9 +868,9 @@ cls = IndexObject2 else: cls = IndexObject - return cls(data), None + return cls(data, uses_generaldelta), None cls = InlinedIndexObject - return cls(data, inline), (0, data) + return cls(data, inline, uses_generaldelta), (0, data) def parse_index_cl_v2(data): @@ -986,9 +989,9 @@ return self.index_format.pack(*data) -def parse_index_devel_nodemap(data, inline): - """like parse_index2, but alway return a PersistentNodeMapIndexObject""" - return PersistentNodeMapIndexObject(data), None +def parse_index_devel_nodemap(data, inline, uses_generaldelta): + """like parse_index2, but always return a PersistentNodeMapIndexObject""" + return PersistentNodeMapIndexObject(data, uses_generaldelta), None def parse_dirstate(dmap, copymap, st):
--- a/mercurial/pycompat.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/pycompat.py Fri Feb 28 23:28:10 2025 +0100 @@ -11,22 +11,22 @@ from __future__ import annotations import builtins -import codecs -import concurrent.futures as futures +import codecs # noqa: F401 (ignore imported but not used) +import concurrent.futures as futures # noqa: F401 (ignore imported but not used) import getopt -import http.client as httplib -import http.cookiejar as cookielib +import http.client as httplib # noqa: F401 (ignore imported but not used) +import http.cookiejar as cookielib # noqa: F401 (ignore imported but not used) import inspect -import io +import io # noqa: F401 (ignore imported but not used) import json import os -import queue +import queue # noqa: F401 (ignore imported but not used) import shlex -import socketserver +import socketserver # noqa: F401 (ignore imported but not used) import struct import sys import tempfile -import xmlrpc.client as xmlrpclib +import xmlrpc.client as xmlrpclib # noqa: F401 (ignore imported but not used) from typing import ( Any,
--- a/mercurial/revlog.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/revlog.py Fri Feb 28 23:28:10 2025 +0100 @@ -138,7 +138,7 @@ parsers = policy.importmod('parsers') rustancestor = policy.importrust('ancestor', pyo3=True) rustdagop = policy.importrust('dagop', pyo3=True) -rustrevlog = policy.importrust('revlog') +rustrevlog = policy.importrust('revlog', pyo3=True) # Aliased for performance. _zlibdecompress = zlib.decompress @@ -209,39 +209,40 @@ node = attr.ib(default=None, type=Optional[bytes]) -def parse_index_v1(data, inline): +def parse_index_v1(data, inline, uses_generaldelta): # call the C implementation to parse the index data - index, cache = parsers.parse_index2(data, inline) + index, cache = parsers.parse_index2(data, inline, uses_generaldelta) return index, cache -def parse_index_v2(data, inline): +def parse_index_v2(data, inline, uses_generaldelta): # call the C implementation to parse the index data - index, cache = parsers.parse_index2(data, inline, format=REVLOGV2) + index, cache = parsers.parse_index2( + data, inline, uses_generaldelta, format=REVLOGV2 + ) return index, cache -def parse_index_cl_v2(data, inline): +def parse_index_cl_v2(data, inline, uses_generaldelta): # call the C implementation to parse the index data - index, cache = parsers.parse_index2(data, inline, format=CHANGELOGV2) + index, cache = parsers.parse_index2( + data, inline, uses_generaldelta, format=CHANGELOGV2 + ) return index, cache if hasattr(parsers, 'parse_index_devel_nodemap'): - def parse_index_v1_nodemap(data, inline): - index, cache = parsers.parse_index_devel_nodemap(data, inline) + def parse_index_v1_nodemap(data, inline, uses_generaldelta): + index, cache = parsers.parse_index_devel_nodemap( + data, inline, uses_generaldelta + ) return index, cache else: parse_index_v1_nodemap = None -def parse_index_v1_rust(data, inline, default_header): - cache = (0, data) if inline else None - return rustrevlog.Index(data, default_header), cache - - # corresponds to uncompressed length of indexformatng (2 gigs, 4-byte # signed integer) _maxentrysize = 0x7FFFFFFF @@ -524,11 +525,10 @@ revs in ascending order and ``stopped`` is a bool indicating whether ``stoprev`` was hit. """ - generaldelta = self.delta_config.general_delta # Try C implementation. try: return self.index.deltachain( - rev, stoprev, generaldelta + rev, stoprev ) # pytype: disable=attribute-error except AttributeError: pass @@ -537,6 +537,7 @@ # Alias to prevent attribute lookup in tight loop. index = self.index + generaldelta = self.delta_config.general_delta iterrev = rev e = index[iterrev] @@ -1819,7 +1820,9 @@ self.uses_rust = True else: try: - d = self._parse_index(index_data, self._inline) + d = self._parse_index( + index_data, self._inline, self.delta_config.general_delta + ) index, chunkcache = d self._register_nodemap_info(index) except (ValueError, IndexError):
--- a/mercurial/revlogutils/revlogv0.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/revlogutils/revlogv0.py Fri Feb 28 23:28:10 2025 +0100 @@ -130,7 +130,7 @@ return [r for r, val in enumerate(ishead) if val] -def parse_index_v0(data, inline): +def parse_index_v0(data, inline, uses_generaldelta): s = INDEX_ENTRY_V0.size index = [] nodemap = nodemaputil.NodeMap({node.nullid: node.nullrev})
--- a/mercurial/scmutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/scmutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -83,7 +83,7 @@ ) parsers = policy.importmod('parsers') -rustrevlog = policy.importrust('revlog') +rustrevlog = policy.importrust('revlog', pyo3=True) termsize = scmplatform.termsize
--- a/mercurial/setdiscovery.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/setdiscovery.py Fri Feb 28 23:28:10 2025 +0100 @@ -276,7 +276,7 @@ pure_partialdiscovery = partialdiscovery partialdiscovery = policy.importrust( - 'discovery', member='PartialDiscovery', default=partialdiscovery + 'discovery', member='PartialDiscovery', default=partialdiscovery, pyo3=True )
--- a/mercurial/sparse.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/sparse.py Fri Feb 28 23:28:10 2025 +0100 @@ -548,7 +548,7 @@ elif (old and not new) or (not old and not new and file in dirstate): dropped.append(file) if file not in pending: - mresult.addfile(file, mergestatemod.ACTION_REMOVE, [], b'') + mresult.addfile(file, mergestatemod.ACTION_REMOVE, None, b'') # Verify there are no pending changes in newly included files abort = False
--- a/mercurial/sshpeer.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/sshpeer.py	Fri Feb 28 23:28:10 2025 +0100
@@ -11,6 +11,8 @@
 import typing
 import uuid
+from typing import Callable, Optional
+
 from .i18n import _
 from . import (
 error,
@@ -53,6 +55,31 @@
 display(_(b"remote: "), l, b'\n')
+def _write_all(
+ write_once: Callable[[bytes], Optional[int]],
+ data: bytes,
+) -> Optional[int]:
+ """write data with a non-blocking function
+
+ In case not all data were written, keep writing until everything is
+ written.
+ """
+ to_write = len(data)
+ written = write_once(data)
+ if written is None:
+ written = 0
+ if written < to_write:
+ data = memoryview(data)
+ while written < to_write:
+ wrote = write_once(data[written:])
+ # XXX if the number of written bytes is "None", the destination is
+ # full. Some `select` call would be better than the current active
+ # polling.
+ if wrote is not None:
+ written += wrote
+ return written
+
+
 class doublepipe:
 """Operate a side-channel pipe in addition of a main one
@@ -97,9 +124,14 @@
 act = fds
 return (self._main.fileno() in act, self._side.fileno() in act)
- def write(self, data):
+ def _write_once(self, data: bytes) -> Optional[int]:
+ """Write as much data as possible in a non-blocking way"""
 return self._call(b'write', data)
+ def write(self, data: bytes) -> Optional[int]:
+ """write all data in a blocking way"""
+ return _write_all(self._write_once, data)
+
 def read(self, size):
 r = self._call(b'read', size)
 if size != 0 and not r:
@@ -130,6 +162,8 @@
 # data can be '' or 0
 if (data is not None and not data) or self._main.closed:
 _forwardoutput(self._ui, self._side)
+ if methname == b'write':
+ return 0
 return b''
 while True:
 mainready, sideready = self._wait()
@@ -314,7 +348,7 @@
 ui.debug(b'sending hello command\n')
 ui.debug(b'sending between command\n')
- stdin.write(b''.join(handshake))
+ _write_all(stdin.write, b''.join(handshake))
 stdin.flush()
 except OSError:
 badresponse()
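The _write_all helper above exists because a non-blocking pipe may accept only part of a buffer (returning a short count) or none of it (returning None). A self-contained sketch of the same retry loop, exercised against a toy writer that deliberately accepts at most 4 bytes per call:

from typing import Callable, Optional

def write_all(write_once: Callable[[bytes], Optional[int]], data: bytes) -> int:
    """Keep calling a possibly-partial, possibly non-blocking writer."""
    written = write_once(data) or 0
    if written < len(data):
        view = memoryview(data)
        while written < len(data):
            wrote = write_once(view[written:])
            if wrote is not None:  # None means "destination full, try again"
                written += wrote
    return written

out = bytearray()

def slow_writer(buf) -> int:
    chunk = bytes(buf[:4])  # accept at most 4 bytes per call
    out.extend(chunk)
    return len(chunk)

assert write_all(slow_writer, b'hello, world') == 12
assert bytes(out) == b'hello, world'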
--- a/mercurial/statichttprepo.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/statichttprepo.py Fri Feb 28 23:28:10 2025 +0100 @@ -253,7 +253,7 @@ def peer(self, path=None, remotehidden=False): return statichttppeer(self, path=path, remotehidden=remotehidden) - def wlock(self, wait=True): + def wlock(self, wait=True, steal_from=None): raise error.LockUnavailable( 0, pycompat.sysstr(_(b'lock not available')), @@ -261,7 +261,7 @@ _(b'cannot lock static-http repository'), ) - def lock(self, wait=True): + def lock(self, wait=True, steal_from=None): raise error.LockUnavailable( 0, pycompat.sysstr(_(b'lock not available')),
--- a/mercurial/store.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/store.py Fri Feb 28 23:28:10 2025 +0100 @@ -16,6 +16,7 @@ from typing import ( Generator, + Iterator, List, Optional, ) @@ -116,7 +117,7 @@ ) -def _reserved(): +def _reserved() -> Iterator[int]: """characters that are problematic for filesystems * ascii escapes (0..31)
--- a/mercurial/streamclone.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/streamclone.py	Fri Feb 28 23:28:10 2025 +0100
@@ -7,12 +7,22 @@
 from __future__ import annotations
+import collections
 import contextlib
 import errno
 import os
 import struct
+import threading
-from typing import Optional
+from typing import (
+ Callable,
+ Iterable,
+ Iterator,
+ Optional,
+ Set,
+ Tuple,
+ Type,
+)
 from .i18n import _
 from .interfaces import repository
@@ -35,7 +45,29 @@
 )
-def new_stream_clone_requirements(default_requirements, streamed_requirements):
+# Numbers arbitrarily picked; feel free to change them (except the LOW one).
+#
+# update the configuration documentation if you touch this.
+DEFAULT_NUM_WRITER = {
+ scmutil.RESOURCE_LOW: 1,
+ scmutil.RESOURCE_MEDIUM: 4,
+ scmutil.RESOURCE_HIGH: 8,
+}
+
+
+# Numbers arbitrarily picked; feel free to adjust them. Do update the
+# documentation if you do so.
+DEFAULT_MEMORY_TARGET = {
+ scmutil.RESOURCE_LOW: 50 * (2**20), # 50 MB
+ scmutil.RESOURCE_MEDIUM: 500 * 2**20, # 500 MB
+ scmutil.RESOURCE_HIGH: 2 * 2**30, # 2 GB
+}
+
+
+def new_stream_clone_requirements(
+ default_requirements: Iterable[bytes],
+ streamed_requirements: Iterable[bytes],
+) -> Set[bytes]:
 """determine the final set of requirement for a new stream clone
 this method combine the "default" requirements that a new repository would
@@ -48,7 +80,7 @@
 return requirements
-def streamed_requirements(repo):
+def streamed_requirements(repo) -> Set[bytes]:
 """the set of requirement the new clone will have to support
 This is used for advertising the stream options and to generate the actual
@@ -59,7 +91,7 @@
 return requiredformats
-def canperformstreamclone(pullop, bundle2=False):
+def canperformstreamclone(pullop, bundle2: bool = False):
 """Whether it is possible to perform a streaming clone as part of pull.
 ``bundle2`` will cause the function to consider stream clone through
@@ -153,7 +185,7 @@
 return True, requirements
-def maybeperformlegacystreamclone(pullop):
+def maybeperformlegacystreamclone(pullop) -> None:
 """Possibly perform a legacy stream clone operation.
 Legacy stream clones are performed as part of pull but before all other
@@ -228,7 +260,7 @@
 repo.invalidate()
-def allowservergeneration(repo):
+def allowservergeneration(repo) -> bool:
 """Whether streaming clones are allowed from the server."""
 if repository.REPO_FEATURE_STREAM_CLONE not in repo.features:
 return False
@@ -246,11 +278,30 @@
 # This is its own function so extensions can override it.
-def _walkstreamfiles(repo, matcher=None, phase=False, obsolescence=False):
+def _walkstreamfiles(
+ repo, matcher=None, phase: bool = False, obsolescence: bool = False
+):
 return repo.store.walk(matcher, phase=phase, obsolescence=obsolescence)
-def generatev1(repo):
+def _report_transferred(
+ repo, start_time: float, file_count: int, byte_count: int
+):
+ """common utility to report the time it took to apply the stream bundle"""
+ elapsed = util.timer() - start_time
+ if elapsed <= 0:
+ elapsed = 0.001
+ m = _(b'stream-cloned %d files / %s in %.1f seconds (%s/sec)\n')
+ m %= (
+ file_count,
+ util.bytecount(byte_count),
+ elapsed,
+ util.bytecount(byte_count / elapsed),
+ )
+ repo.ui.status(m)
+
+
+def generatev1(repo) -> tuple[int, int, Iterator[bytes]]:
 """Emit content for version 1 of a streaming clone.
 This returns a 3-tuple of (file count, byte size, data iterator).
@@ -291,7 +342,7 @@
 svfs = repo.svfs
 debugflag = repo.ui.debugflag
- def emitrevlogdata():
+ def emitrevlogdata() -> Iterator[bytes]:
 for name, size in entries:
 if debugflag:
 repo.ui.debug(b'sending %s (%d bytes)\n' % (name, size))
@@ -308,7 +359,7 @@
 return len(entries), total_bytes, emitrevlogdata()
-def generatev1wireproto(repo):
+def generatev1wireproto(repo) -> Iterator[bytes]:
 """Emit content for version 1 of streaming clone suitable for the wire.
 This is the data output from ``generatev1()`` with 2 header lines. The
@@ -335,7 +386,9 @@
 yield from it
-def generatebundlev1(repo, compression=b'UN'):
+def generatebundlev1(
+ repo, compression: bytes = b'UN'
+) -> tuple[Set[bytes], Iterator[bytes]]:
 """Emit content for version 1 of a stream clone bundle.
 The first 4 bytes of the output ("HGS1") denote this as stream clone
@@ -358,12 +411,12 @@
 Returns a tuple of (requirements, data generator).
 """
 if compression != b'UN':
- raise ValueError(b'we do not support the compression argument yet')
+ raise ValueError('we do not support the compression argument yet')
 requirements = streamed_requirements(repo)
 requires = b','.join(sorted(requirements))
- def gen():
+ def gen() -> Iterator[bytes]:
 yield b'HGS1'
 yield compression
@@ -393,7 +446,7 @@
 return requirements, gen()
-def consumev1(repo, fp, filecount, bytecount):
+def consumev1(repo, fp, filecount: int, bytecount: int) -> None:
 """Apply the contents from version 1 of a streaming clone file handle.
 This takes the output from "stream_out" and applies it to the specified
@@ -427,6 +480,7 @@
 # nesting occurs also in ordinary case (e.g. enabling
 # clonebundles).
+ total_file_count = 0
 with repo.transaction(b'clone'):
 with repo.svfs.backgroundclosing(repo.ui, expectedcount=filecount):
 for i in range(filecount):
@@ -455,6 +509,7 @@
 # for backwards compat, name was partially encoded
 path = store.decodedir(name)
 with repo.svfs(path, b'w', backgroundclose=True) as ofp:
+ total_file_count += 1
 for chunk in util.filechunkiter(fp, limit=size):
 progress.increment(step=len(chunk))
 ofp.write(chunk)
@@ -463,21 +518,11 @@
 # streamclone-ed file at next access
 repo.invalidate(clearfilecache=True)
- elapsed = util.timer() - start
- if elapsed <= 0:
- elapsed = 0.001
 progress.complete()
- repo.ui.status(
- _(b'transferred %s in %.1f seconds (%s/sec)\n')
- % (
- util.bytecount(bytecount),
- elapsed,
- util.bytecount(bytecount / elapsed),
- )
- )
+ _report_transferred(repo, start, total_file_count, bytecount)
-def readbundle1header(fp):
+def readbundle1header(fp) -> tuple[int, int, Set[bytes]]:
 compression = fp.read(2)
 if compression != b'UN':
 raise error.Abort(
@@ -505,7 +550,7 @@
 return filecount, bytecount, requirements
-def applybundlev1(repo, fp):
+def applybundlev1(repo, fp) -> None:
 """Apply the content from a stream clone bundle version 1.
 We assume the 4 byte header has been read and validated and the file handle
@@ -535,10 +580,10 @@
 readers to perform bundle type-specific functionality.
 """
- def __init__(self, fh):
+ def __init__(self, fh) -> None:
 self._fh = fh
- def apply(self, repo):
+ def apply(self, repo) -> None:
 return applybundlev1(repo, self._fh)
@@ -552,7 +597,7 @@
 # This is its own function so extensions can override it.
-def _walkstreamfullstorefiles(repo):
+def _walkstreamfullstorefiles(repo) -> list[bytes]:
 """list snapshot files from the store"""
 fnames = []
 if not repo.publishing():
@@ -589,7 +634,7 @@
 # for flushing to disk in __call__().
MAX_OPEN = 2 if pycompat.iswindows else 100
- def __init__(self):
+ def __init__(self) -> None:
 self._counter = 0
 self._volatile_fps = None
 self._copies = None
@@ -619,7 +664,7 @@
 assert self._copies is None
 assert self._dst_dir is None
- def _init_tmp_copies(self):
+ def _init_tmp_copies(self) -> None:
 """prepare a temporary directory to save volatile files
 This will be used as backup if we have too many files open"""
@@ -629,7 +674,7 @@
 self._copies = {}
 self._dst_dir = pycompat.mkdtemp(prefix=b'hg-clone-')
- def _flush_some_on_disk(self):
+ def _flush_some_on_disk(self) -> None:
 """move some of the open files to temporary files on disk"""
 if self._copies is None:
 self._init_tmp_copies()
@@ -648,7 +693,7 @@
 del self._volatile_fps[src]
 fp.close()
- def _keep_one(self, src):
+ def _keep_one(self, src: bytes) -> int:
 """preserve an open file handle for a given path"""
 # store the file quickly to ensure we close it if any error happens
 _, fp = self._volatile_fps[src] = (None, open(src, 'rb'))
@@ -657,14 +702,14 @@
 self._volatile_fps[src] = (size, fp)
 return size
- def __call__(self, src):
+ def __call__(self, src: bytes) -> None:
 """preserve the volatile file at src"""
 assert 0 < self._counter
 if len(self._volatile_fps) >= (self.MAX_OPEN - 1):
 self._flush_some_on_disk()
 self._keep_one(src)
- def try_keep(self, src) -> Optional[int]:
+ def try_keep(self, src: bytes) -> Optional[int]:
 """record a volatile file and return its size
 return None if the file does not exist.
@@ -721,7 +766,7 @@
 # fine, while this is really not fine.
 if repo.vfs in vfsmap.values():
 raise error.ProgrammingError(
- b'repo.vfs must not be added to vfsmap for security reasons'
+ 'repo.vfs must not be added to vfsmap for security reasons'
 )
 # translate the vfs once
@@ -778,7 +823,7 @@
 raise error.Abort(msg % (bytecount, name, size))
-def _emit3(repo, entries):
+def _emit3(repo, entries) -> Iterator[bytes | None]:
 """actually emit the stream bundle (v3)"""
 vfsmap = _makemap(repo)
 # we keep repo.vfs out of the map on purpose, there are too many dangers
@@ -788,7 +833,7 @@
 # fine, while this is really not fine.
 if repo.vfs in vfsmap.values():
 raise error.ProgrammingError(
- b'repo.vfs must not be added to vfsmap for security reasons'
+ 'repo.vfs must not be added to vfsmap for security reasons'
 )
 # translate the vfs once
@@ -880,14 +925,14 @@
 BaseEntry base class, but the store one would be identical)
 """
- def __init__(self, entry_path):
+ def __init__(self, entry_path) -> None:
 super().__init__(
 entry_path,
 # we will directly deal with that in `setup_cache_file`
 is_volatile=True,
 )
- def preserve_volatiles(self, vfs, volatiles):
+ def preserve_volatiles(self, vfs, volatiles) -> None:
 self._file_size = volatiles.try_keep(vfs.join(self._entry_path))
 if self._file_size is None:
 self._files = []
@@ -901,7 +946,7 @@
 )
 ]
- def files(self):
+ def files(self) -> list[store.StoreFile]:
 if self._files is None:
 self._files = [
 CacheFile(
@@ -915,10 +960,10 @@
 class CacheFile(store.StoreFile):
 # inform the "copy/hardlink" version that this file might be missing
 # without consequences.
- optional = True
+ optional: bool = True
-def _entries_walk(repo, includes, excludes, includeobsmarkers):
+def _entries_walk(repo, includes, excludes, includeobsmarkers: bool):
 """emit a series of file information useful to clone a repo
 return (vfs-key, entry) iterator
@@ -950,7 +995,7 @@
 yield (_srccache, CacheEntry(entry_path=name))
-def generatev2(repo, includes, excludes, includeobsmarkers):
+def generatev2(repo, includes, excludes, includeobsmarkers: bool):
 """Emit content for version 2 of a streaming clone.
 the data stream consists of the following entries:
@@ -981,7 +1026,9 @@
 return file_count, total_file_size, chunks
-def generatev3(repo, includes, excludes, includeobsmarkers):
+def generatev3(
+ repo, includes, excludes, includeobsmarkers: bool
+) -> Iterator[bytes | None]:
 """Emit content for version 3 of a streaming clone.
 the data stream consists of the following:
@@ -1039,7 +1086,14 @@
 yield
-def consumev2(repo, fp, filecount, filesize):
+class V2Report:
+ """a small class to track the data we saw within the stream"""
+
+ def __init__(self):
+ self.byte_count = 0
+
+
+def consumev2(repo, fp, filecount: int, filesize: int) -> None:
 """Apply the contents from a version 2 streaming clone.
 Data is read from an object that only needs to provide a ``read(size)``
@@ -1050,12 +1104,13 @@
 _(b'%d files to transfer, %s of data\n')
 % (filecount, util.bytecount(filesize))
 )
-
+ progress = repo.ui.makeprogress(
+ _(b'clone'),
+ total=filesize,
+ unit=_(b'bytes'),
+ )
 start = util.timer()
- progress = repo.ui.makeprogress(
- _(b'clone'), total=filesize, unit=_(b'bytes')
- )
- progress.update(0)
+ report = V2Report()
 vfsmap = _makemap(repo)
 # we keep repo.vfs out of the map on purpose, there are too many dangers
@@ -1065,50 +1120,453 @@
 # is fine, while this is really not fine.
if repo.vfs in vfsmap.values():
 raise error.ProgrammingError(
- b'repo.vfs must not be added to vfsmap for security reasons'
+ 'repo.vfs must not be added to vfsmap for security reasons'
 )
+ cpu_profile = scmutil.get_resource_profile(repo.ui, b'cpu')
+ mem_profile = scmutil.get_resource_profile(repo.ui, b'memory')
+ threaded = repo.ui.configbool(
+ b"worker", b"parallel-stream-bundle-processing"
+ )
+ num_writer = repo.ui.configint(
+ b"worker",
+ b"parallel-stream-bundle-processing.num-writer",
+ )
+ if num_writer <= 0:
+ num_writer = DEFAULT_NUM_WRITER[cpu_profile]
+ memory_target = repo.ui.configbytes(
+ b"worker",
+ b"parallel-stream-bundle-processing.memory-target",
+ )
+ if memory_target < 0:
+ memory_target = None
+ elif memory_target == 0:
+ memory_target = DEFAULT_MEMORY_TARGET[mem_profile]
 with repo.transaction(b'clone'):
 ctxs = (vfs.backgroundclosing(repo.ui) for vfs in vfsmap.values())
 with nested(*ctxs):
- for i in range(filecount):
- src = util.readexactly(fp, 1)
- vfs = vfsmap[src]
- namelen = util.uvarintdecodestream(fp)
- datalen = util.uvarintdecodestream(fp)
-
- name = util.readexactly(fp, namelen)
+ workers = []
+ info_queue = None
+ data_queue = None
+ mark_used = None
+ try:
+ if not threaded:
+ fc = _FileChunker
+ raw_data = fp
+ else:
+ fc = _ThreadSafeFileChunker
+ data_queue = _DataQueue(memory_target=memory_target)
+ if memory_target is not None:
+ mark_used = data_queue.mark_used
+ raw_data = util.chunkbuffer(data_queue)
-
- if repo.ui.debugflag:
- repo.ui.debug(
- b'adding [%s] %s (%s)\n'
- % (src, name, util.bytecount(datalen))
+ w = threading.Thread(
+ target=data_queue.fill_from,
+ args=(fp,),
 )
+ workers.append(w)
+ w.start()
+ files = _v2_parse_files(
+ repo,
+ raw_data,
+ vfsmap,
+ filecount,
+ progress,
+ report,
+ file_chunker=fc,
+ mark_used=mark_used,
+ )
+ if not threaded:
+ _write_files(files)
+ else:
+ info_queue = _FileInfoQueue(files)
-
- with vfs(name, b'w') as ofp:
- for chunk in util.filechunkiter(fp, limit=datalen):
- progress.increment(step=len(chunk))
- ofp.write(chunk)
+ for __ in range(num_writer):
+ w = threading.Thread(
+ target=_write_files,
+ args=(info_queue,),
+ )
+ workers.append(w)
+ w.start()
+ info_queue.fill()
+ except: # re-raises
+ if data_queue is not None:
+ data_queue.abort()
+ raise
+ finally:
+ # shut down all the workers
+ if info_queue is not None:
+ # this is strictly speaking one too many workers for
+ # this queue, but closing too many is not a problem.
+ info_queue.close(len(workers))
+ for w in workers:
+ w.join()
 # force @filecache properties to be reloaded from
 # streamclone-ed file at next access
 repo.invalidate(clearfilecache=True)
- elapsed = util.timer() - start
- if elapsed <= 0:
- elapsed = 0.001
- repo.ui.status(
- _(b'transferred %s in %.1f seconds (%s/sec)\n')
- % (
- util.bytecount(progress.pos),
- elapsed,
- util.bytecount(progress.pos / elapsed),
- )
- )
 progress.complete()
+ # acknowledge the end of the bundle2 part, this helps align
+ # sequential and parallel behavior.
+ remains = fp.read(1)
+ assert not remains
+ _report_transferred(repo, start, filecount, report.byte_count)
+
+
+# iterator of chunks of bytes that constitute a file's content.
+FileChunksT = Iterator[bytes]
+# Contains the information necessary to write stream file on disk
+FileInfoT = Tuple[
+ bytes, # real fs path
+ Optional[int], # permission to give to chmod
+ FileChunksT, # content
+]
+
+
+class _Queue:
+ """a reimplementation of queue.Queue which doesn't use threading.Condition"""
+
+ def __init__(self):
+ self._queue = collections.deque()
+
+ # the "_lock" protects manipulation of the "_queue" deque
+ # the "_wait" is used to have the "get" thread wait for the
+ # "put" thread when the queue is empty.
+ #
+ # This is similar to the "threading.Condition", but without the absurd
+ # slowness of the stdlib implementation.
+ #
+ # the "_wait" is always released while holding the "_lock".
+ self._lock = threading.Lock()
+ self._wait = threading.Lock()
+
+ def put(self, item):
+ with self._lock:
+ self._queue.append(item)
+ # if anyone is waiting on an item, unblock it.
+ if self._wait.locked():
+ self._wait.release()
+
+ def get(self):
+ with self._lock:
+ while len(self._queue) == 0:
+ # "arm" the waiting lock
+ self._wait.acquire(blocking=False)
+ # release the lock to let others touch the queue
+ # (especially the put call we wait on)
+ self._lock.release()
+ # wait for a `put` call to release the lock
+ self._wait.acquire()
+ # grab the lock to look at a possible available value
+ self._lock.acquire()
+ # disarm the lock if necessary.
+ #
+ # If the queue only contains one item, keep the _wait lock
+ # armed, as there is no need to wake another waiter anyway.
+ if self._wait.locked() and len(self._queue) > 1:
+ self._wait.release()
+ return self._queue.popleft()
+
+
+class _DataQueue:
+ """A queue passing data from the bundle stream to other threads
+
+ It has a "memory_target" optional parameter to avoid buffering too much
+ information. The implementation is not exact and the memory target might be
+ exceeded for a time in some situations.
+ """
+
+ def __init__(self, memory_target=None):
+ self._q = _Queue()
+ self._abort = False
+ self._memory_target = memory_target
+ if self._memory_target is not None and self._memory_target <= 0:
+ raise error.ProgrammingError("memory target should be > 0")
+
+ # the "_lock" protects manipulation of the "_current_used" variable
+ # the "_wait" is used to have the "reading" thread wait for the
+ # "using" thread when the buffer is full.
+ #
+ # This is similar to the "threading.Condition", but without the absurd
+ # slowness of the stdlib implementation.
+ #
+ # the "_wait" is always released while holding the "_lock".
+ self._lock = threading.Lock()
+ self._wait = threading.Lock()
+ # only the stream reader touches this; it is fine to read without the lock
+ self._current_read = 0
+ # do not touch this without the lock
+ self._current_used = 0
+
+ def _has_free_space(self):
+ """True if more data can be read without further exceeding the memory target
+
+ Must be called under the lock.
+ """
+ if self._memory_target is None:
+ # Ideally we should not even get into the locking business in that
+ # case, but we keep the implementation simple for now.
+ return True
+ return (self._current_read - self._current_used) < self._memory_target
+
+ def mark_used(self, offset):
+ """Notify we have used the buffer up to "offset"
+
+ This is meant to be used from another thread than the one filling the queue.
+ """
+ if self._memory_target is not None:
+ with self._lock:
+ if offset > self._current_used:
+ self._current_used = offset
+ # If the reader is waiting for room, unblock it.
+ if self._wait.locked() and self._has_free_space():
+ self._wait.release()
+
+ def fill_from(self, data):
+ """fill the data queue from a bundle2 part object
+
+ This is meant to be called by the data reading thread
+ """
+ q = self._q
+ try:
+ for item in data:
+ self._current_read += len(item)
+ q.put(item)
+ if self._abort:
+ break
+ if self._memory_target is not None:
+ with self._lock:
+ while not self._has_free_space():
+ # make sure the _wait lock is locked
+ # this is done under lock, so there can be no race with the release logic
+ self._wait.acquire(blocking=False)
+ self._lock.release()
+ # acquiring the lock will block until some other thread releases it.
+ self._wait.acquire()
+ # let's dive into the locked section again
+ self._lock.acquire()
+ # make sure we release the lock we just grabbed if
+ # needed.
+ if self._wait.locked():
+ self._wait.release()
+ finally:
+ q.put(None)
+
+ def __iter__(self):
+ """Iterate over the bundle chunks
+
+ This is meant to be called by the data parsing thread."""
+ q = self._q
+ while (i := q.get()) is not None:
+ yield i
+ if self._abort:
+ break
+
+ def abort(self):
+ """stop the data-reading thread and interrupt the consuming iteration
+
+ This is meant to be called on errors.
+ """
+ self._abort = True
+ self._q.put(None)
+ if self._memory_target is not None:
+ with self._lock:
+ # make sure we unstick the reader thread.
+ if self._wait.locked():
+ self._wait.release()
-def consumev3(repo, fp):
+class _FileInfoQueue:
+ """A thread-safe queue to pass parsed file information to the writers"""
+
+ def __init__(self, info: Iterable[FileInfoT]):
+ self._info = info
+ self._q = _Queue()
+
+ def fill(self):
+ """iterate over the parsed information to fill the queue
+
+ This is meant to be called from the thread parsing the stream information.
+ """
+ q = self._q
+ for i in self._info:
+ q.put(i)
+
+ def close(self, number_worker):
+ """signal all the workers that we no longer have any file info coming
+
+ Called from the thread parsing the stream information (and/or the main
+ thread if different).
+ """
+ for __ in range(number_worker):
+ self._q.put(None)
+
+ def __iter__(self):
+ """iterate over the available file info
+
+ This is meant to be called from the writer threads.
+ """
+ q = self._q
+ while (i := q.get()) is not None:
+ yield i
+
+
+class _FileChunker:
+ """yield the chunks that constitute a file
+
+ This class exists as the counterpart of the threaded version and
+ would not be very useful on its own.
+ """
+
+ def __init__(
+ self,
+ fp: bundle2mod.unbundlepart,
+ data_len: int,
+ progress: scmutil.progress,
+ report: V2Report,
+ mark_used: Optional[Callable[[int], None]] = None,
+ ):
+ self.report = report
+ self.progress = progress
+ self._chunks = util.filechunkiter(fp, limit=data_len)
+
+ def fill(self) -> None:
+ """Do nothing in non-threading context"""
+
+ def __iter__(self) -> FileChunksT:
+ for chunk in self._chunks:
+ self.report.byte_count += len(chunk)
+ self.progress.increment(step=len(chunk))
+ yield chunk
+
+
+class _ThreadSafeFileChunker(_FileChunker):
+ """yield the chunks that constitute a file
+
+ Make sure you call the "fill" function in the main thread to read the
+ right data at the right time.
+ """ + + def __init__( + self, + fp: bundle2mod.unbundlepart, + data_len: int, + progress: scmutil.progress, + report: V2Report, + mark_used: Optional[Callable[[int], None]] = None, + ): + super().__init__(fp, data_len, progress, report) + self._fp = fp + self._queue = _Queue() + self._mark_used = mark_used + + def fill(self) -> None: + """fill the file chunker queue with data read from the stream + + This is meant to be called from the thread parsing information (and + consuming the stream data). + """ + try: + for chunk in super().__iter__(): + offset = self._fp.tell() + self._queue.put((chunk, offset)) + finally: + self._queue.put(None) + + def __iter__(self) -> FileChunksT: + """Iterate over all the file chunk + + This is meant to be called from the writer threads. + """ + while (info := self._queue.get()) is not None: + chunk, offset = info + if self._mark_used is not None: + self._mark_used(offset) + yield chunk + + +def _trivial_file( + chunk: bytes, + mark_used: Optional[Callable[[int], None]], + offset: int, +) -> FileChunksT: + """used for single chunk file,""" + if mark_used is not None: + mark_used(offset) + yield chunk + + +def _v2_parse_files( + repo, + fp: bundle2mod.unbundlepart, + vfs_map, + file_count: int, + progress: scmutil.progress, + report: V2Report, + file_chunker: Type[_FileChunker] = _FileChunker, + mark_used: Optional[Callable[[int], None]] = None, +) -> Iterator[FileInfoT]: + """do the "stream-parsing" part of stream v2 + + The parsed information are yield result for consumption by the "writer" + """ + known_dirs = set() # set of directory that we know to exists + progress.update(0) + for i in range(file_count): + src = util.readexactly(fp, 1) + namelen = util.uvarintdecodestream(fp) + datalen = util.uvarintdecodestream(fp) + + name = util.readexactly(fp, namelen) + + if repo.ui.debugflag: + repo.ui.debug( + b'adding [%s] %s (%s)\n' % (src, name, util.bytecount(datalen)) + ) + vfs = vfs_map[src] + path, mode = vfs.prepare_streamed_file(name, known_dirs) + if datalen <= util.DEFAULT_FILE_CHUNK: + c = fp.read(datalen) + offset = fp.tell() + report.byte_count += len(c) + progress.increment(step=len(c)) + chunks = _trivial_file(c, mark_used, offset) + yield (path, mode, iter(chunks)) + else: + chunks = file_chunker( + fp, + datalen, + progress, + report, + mark_used=mark_used, + ) + yield (path, mode, iter(chunks)) + # make sure we read all the chunk before moving to the next file + chunks.fill() + + +def _write_files(info: Iterable[FileInfoT]): + """write files from parsed data""" + io_flags = os.O_WRONLY | os.O_CREAT + if pycompat.iswindows: + io_flags |= os.O_BINARY + for path, mode, data in info: + if mode is None: + fd = os.open(path, io_flags) + else: + fd = os.open(path, io_flags, mode=mode) + try: + for chunk in data: + written = os.write(fd, chunk) + # write missing pieces if the write was interrupted + while written < len(chunk): + written = os.write(fd, chunk[written:]) + finally: + os.close(fd) + + +def consumev3(repo, fp) -> None: """Apply the contents from a version 3 streaming clone. Data is read from an object that only needs to provide a ``read(size)`` @@ -1136,9 +1594,9 @@ # is fine, while this is really not fine. 
if repo.vfs in vfsmap.values(): raise error.ProgrammingError( - b'repo.vfs must not be added to vfsmap for security reasons' + 'repo.vfs must not be added to vfsmap for security reasons' ) - + total_file_count = 0 with repo.transaction(b'clone'): ctxs = (vfs.backgroundclosing(repo.ui) for vfs in vfsmap.values()) with nested(*ctxs): @@ -1147,6 +1605,7 @@ if filecount == 0: if repo.ui.debugflag: repo.ui.debug(b'entry with no files [%d]\n' % (i)) + total_file_count += filecount for i in range(filecount): src = util.readexactly(fp, 1) vfs = vfsmap[src] @@ -1170,18 +1629,13 @@ # streamclone-ed file at next access repo.invalidate(clearfilecache=True) - elapsed = util.timer() - start - if elapsed <= 0: - elapsed = 0.001 - msg = _(b'transferred %s in %.1f seconds (%s/sec)\n') - byte_count = util.bytecount(bytes_transferred) - bytes_sec = util.bytecount(bytes_transferred / elapsed) - msg %= (byte_count, elapsed, bytes_sec) - repo.ui.status(msg) progress.complete() + _report_transferred(repo, start, total_file_count, bytes_transferred) -def applybundlev2(repo, fp, filecount, filesize, requirements): +def applybundlev2( + repo, fp, filecount: int, filesize: int, requirements: Iterable[bytes] +) -> None: from . import localrepo missingreqs = [r for r in requirements if r not in repo.supported] @@ -1191,7 +1645,8 @@ % b', '.join(sorted(missingreqs)) ) - consumev2(repo, fp, filecount, filesize) + with util.nogc(): + consumev2(repo, fp, filecount, filesize) repo.requirements = new_stream_clone_requirements( repo.requirements, @@ -1204,7 +1659,7 @@ nodemap.post_stream_cleanup(repo) -def applybundlev3(repo, fp, requirements): +def applybundlev3(repo, fp, requirements: Iterable[bytes]) -> None: from . import localrepo missingreqs = [r for r in requirements if r not in repo.supported] @@ -1226,7 +1681,7 @@ nodemap.post_stream_cleanup(repo) -def _copy_files(src_vfs_map, dst_vfs_map, entries, progress): +def _copy_files(src_vfs_map, dst_vfs_map, entries, progress) -> bool: hardlink = [True] def copy_used(): @@ -1260,7 +1715,7 @@ return hardlink[0] -def local_copy(src_repo, dest_repo): +def local_copy(src_repo, dest_repo) -> None: """copy all content from one local repository to another This is useful for local clone"""
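The parallel consumev2 path above is built around the hand-rolled _Queue: one lock guards the deque while a second lock serves purely as a wakeup signal, avoiding threading.Condition overhead. A minimal single-producer/single-consumer sketch of that lock-pair pattern (hypothetical class name, with None as the end-of-stream sentinel, as in _FileInfoQueue.close):

import collections
import threading

class TwoLockQueue:
    def __init__(self):
        self._queue = collections.deque()
        self._lock = threading.Lock()  # guards the deque
        self._wait = threading.Lock()  # wakeup signal, released under _lock

    def put(self, item):
        with self._lock:
            self._queue.append(item)
            if self._wait.locked():  # a get() is sleeping: wake it
                self._wait.release()

    def get(self):
        with self._lock:
            while not self._queue:
                self._wait.acquire(blocking=False)  # arm the signal
                self._lock.release()
                self._wait.acquire()  # sleep until put() releases it
                self._lock.acquire()
            if self._wait.locked() and len(self._queue) > 1:
                self._wait.release()
            return self._queue.popleft()

q = TwoLockQueue()
consumed = []

def consumer():
    while (item := q.get()) is not None:
        consumed.append(item)

t = threading.Thread(target=consumer)
t.start()
for i in range(5):
    q.put(i)
q.put(None)  # sentinel, as the real writer queues use
t.join()
assert consumed == [0, 1, 2, 3, 4]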
--- a/mercurial/subrepo.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/subrepo.py Fri Feb 28 23:28:10 2025 +0100 @@ -18,6 +18,9 @@ import xml.dom.minidom from .i18n import _ +from .interfaces.types import ( + MatcherT, +) from .node import ( bin, hex, @@ -367,7 +370,7 @@ """handle the files command for this subrepo""" return 1 - def archive(self, opener, prefix, match: matchmod.basematcher, decode=True): + def archive(self, opener, prefix, match: MatcherT, decode=True): files = [f for f in self.files() if match(f)] total = len(files) relpath = subrelpath(self) @@ -656,7 +659,7 @@ ) @annotatesubrepoerror - def archive(self, opener, prefix, match: matchmod.basematcher, decode=True): + def archive(self, opener, prefix, match: MatcherT, decode=True): self._get(self._state + (b'hg',)) files = [f for f in self.files() if match(f)] rev = self._state[1] @@ -1913,7 +1916,7 @@ else: self.wvfs.unlink(f) - def archive(self, opener, prefix, match: matchmod.basematcher, decode=True): + def archive(self, opener, prefix, match: MatcherT, decode=True): total = 0 source, revision = self._state if not revision:
--- a/mercurial/subrepoutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/subrepoutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -57,6 +57,9 @@ ) from .interfaces import status as istatus + from .interfaces.types import ( + MatcherT, + ) # keeps pyflakes happy assert [ @@ -335,7 +338,7 @@ ui: uimod.ui, wctx: context.workingcommitctx, status: istatus.Status, - match: matchmod.basematcher, + match: MatcherT, force: bool = False, ) -> Tuple[List[bytes], Set[bytes], Substate]: """Calculate .hgsubstate changes that should be applied before committing
--- a/mercurial/testing/revlog.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/testing/revlog.py Fri Feb 28 23:28:10 2025 +0100 @@ -44,9 +44,22 @@ rust_revlog = None +try: + from ..pyo3_rustext import ( # pytype: disable=import-error + revlog as pyo3_revlog, + ) + + pyo3_revlog.__name__ # force actual import +except ImportError: + pyo3_revlog = None + + @unittest.skipIf( cparsers is None, - 'The C version of the "parsers" module is not available. It is needed for this test.', + ( + 'The C version of the "parsers" module is not available. ' + 'It is needed for this test.' + ), ) class RevlogBasedTestBase(unittest.TestCase): def parseindex(self, data=None): @@ -65,13 +78,21 @@ revlog_delta_config = revlog.DeltaConfig() revlog_feature_config = revlog.FeatureConfig() + @classmethod + def irl_class(cls): + return rust_revlog.InnerRevlog + + @classmethod + def nodetree(cls, idx): + return rust_revlog.NodeTree(idx) + def make_inner_revlog( self, data=None, vfs_is_readonly=True, kind=KIND_CHANGELOG ): if data is None: data = data_non_inlined - return rust_revlog.InnerRevlog( + return self.irl_class()( vfs_base=b"Just a path", fncache=None, # might be enough for now vfs_is_readonly=vfs_is_readonly, @@ -91,3 +112,17 @@ def parserustindex(self, data=None): return revlog.RustIndexProxy(self.make_inner_revlog(data=data)) + + +@unittest.skipIf( + pyo3_revlog is None, + 'The Rust PyO3 revlog module is not available. It is needed for this test.', +) +class PyO3RevlogBasedTestBase(RustRevlogBasedTestBase): + @classmethod + def irl_class(cls): + return pyo3_revlog.InnerRevlog + + @classmethod + def nodetree(cls, idx): + return pyo3_revlog.NodeTree(idx)
--- a/mercurial/transaction.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/transaction.py	Fri Feb 28 23:28:10 2025 +0100
@@ -16,7 +16,21 @@
 import errno
 import os
+from typing import (
+ Callable,
+ Collection,
+ List,
+ Optional,
+ Union,
+)
+
 from .i18n import _
+from .interfaces.types import (
+ CallbackCategoryT,
+ HgPathT,
+ TransactionT,
+ VfsKeyT,
+)
 from . import (
 encoding,
 error,
@@ -24,6 +38,7 @@
 util,
 )
 from .utils import stringutil
+from .interfaces import transaction as itxn
 version = 2
@@ -224,7 +239,7 @@
 pass
-class transaction(util.transactional):
+class transaction(util.transactional, itxn.ITransaction):
 def __init__(
 self,
 report,
@@ -336,11 +351,11 @@
 self._abort()
 @property
- def finalized(self):
+ def finalized(self) -> bool:
 return self._finalizecallback is None
 @active
- def startgroup(self):
+ def startgroup(self) -> None:
 """delay registration of file entry
 This is used by strip to delay vision of strip offset. The transaction
@@ -348,7 +363,7 @@
 self._queue.append([])
 @active
- def endgroup(self):
+ def endgroup(self) -> None:
 """apply delayed registration of file entry.
 This is used by strip to delay vision of strip offset. The transaction
@@ -358,7 +373,7 @@
 self._addentry(f, o)
 @active
- def add(self, file, offset):
+ def add(self, file: HgPathT, offset: int) -> None:
 """record the state of an append-only file before update"""
 if (
 file in self._newfiles
@@ -391,7 +406,13 @@
 self._file.flush()
 @active
- def addbackup(self, file, hardlink=True, location=b'', for_offset=False):
+ def addbackup(
+ self,
+ file: HgPathT,
+ hardlink: bool = True,
+ location: VfsKeyT = b'',
+ for_offset: Union[bool, int] = False,
+ ) -> None:
 """Adds a backup of the file to the transaction
 Calling addbackup() creates a hardlink backup of the specified file
@@ -445,7 +466,7 @@
 self._backupsfile.flush()
 @active
- def registertmp(self, tmpfile, location=b''):
+ def registertmp(self, tmpfile: HgPathT, location: VfsKeyT = b'') -> None:
 """register a temporary transaction file
 Such files will be deleted when the transaction exits (on both
@@ -457,13 +478,13 @@
 @active
 def addfilegenerator(
 self,
- genid,
- filenames,
- genfunc,
- order=0,
- location=b'',
- post_finalize=False,
- ):
+ genid: bytes,
+ filenames: Collection[HgPathT],
+ genfunc: Callable,
+ order: int = 0,
+ location: VfsKeyT = b'',
+ post_finalize: bool = False,
+ ) -> None:
 """add a function to generate some files at transaction commit
 The `genfunc` argument is a function capable of generating proper
@@ -495,7 +516,7 @@
 self._filegenerators[genid] = entry
 @active
- def removefilegenerator(self, genid):
+ def removefilegenerator(self, genid: bytes) -> None:
 """reverse of addfilegenerator, remove a file generator function"""
 if genid in self._filegenerators:
 del self._filegenerators[genid]
@@ -545,13 +566,13 @@
 return any
 @active
- def findoffset(self, file):
+ def findoffset(self, file: HgPathT) -> Optional[int]:
 if file in self._newfiles:
 return 0
 return self._offsetmap.get(file)
 @active
- def readjournal(self):
+ def readjournal(self) -> List[itxn.JournalEntryT]:
 self._file.seek(0)
 entries = []
 for l in self._file.readlines():
@@ -560,7 +581,7 @@
 return entries
 @active
- def replace(self, file, offset):
+ def replace(self, file: HgPathT, offset: int) -> None:
 """
 replace can only replace already committed entries
 that are not pending in the queue
@@ -582,13 +603,13 @@
 self._file.flush()
 @active
- def nest(self, name=b'<unnamed>'):
+ def nest(self, name: bytes = b'<unnamed>') -> TransactionT:
 self._count += 1
self._usages += 1
 self._names.append(name)
 return self
- def release(self):
+ def release(self) -> None:
 if self._count > 0:
 self._usages -= 1
 if self._names:
@@ -597,10 +618,14 @@
 if self._count > 0 and self._usages == 0:
 self._abort()
- def running(self):
+ def running(self) -> bool:
 return self._count > 0
- def addpending(self, category, callback):
+ def addpending(
+ self,
+ category: CallbackCategoryT,
+ callback: Callable[[TransactionT], None],
+ ) -> None:
 """add a callback to be called when the transaction is pending
 The transaction will be given as callback's first argument.
@@ -611,7 +636,7 @@
 self._pendingcallback[category] = callback
 @active
- def writepending(self):
+ def writepending(self) -> bool:
 """write pending file to temporary version
 This is used to allow hooks to view a transaction before commit"""
@@ -624,12 +649,16 @@
 return self._anypending
 @active
- def hasfinalize(self, category):
+ def hasfinalize(self, category: CallbackCategoryT) -> bool:
 """check if a callback already exists for a category"""
 return category in self._finalizecallback
 @active
- def addfinalize(self, category, callback):
+ def addfinalize(
+ self,
+ category: CallbackCategoryT,
+ callback: Callable[[TransactionT], None],
+ ) -> None:
 """add a callback to be called when the transaction is closed
 The transaction will be given as callback's first argument.
@@ -640,7 +669,11 @@
 self._finalizecallback[category] = callback
 @active
- def addpostclose(self, category, callback):
+ def addpostclose(
+ self,
+ category: CallbackCategoryT,
+ callback: Callable[[TransactionT], None],
+ ) -> None:
 """add or replace a callback to be called after the transaction closed
 The transaction will be given as callback's first argument.
@@ -651,12 +684,19 @@
 self._postclosecallback[category] = callback
 @active
- def getpostclose(self, category):
+ def getpostclose(
+ self,
+ category: CallbackCategoryT,
+ ) -> Optional[Callable[[TransactionT], None]]:
 """return a postclose callback added before, or None"""
 return self._postclosecallback.get(category, None)
 @active
- def addabort(self, category, callback):
+ def addabort(
+ self,
+ category: CallbackCategoryT,
+ callback: Callable[[TransactionT], None],
+ ) -> None:
 """add a callback to be called when the transaction is aborted.
 The transaction will be given as the first argument to the callback.
@@ -667,7 +707,11 @@
 self._abortcallback[category] = callback
 @active
- def addvalidator(self, category, callback):
+ def addvalidator(
+ self,
+ category: CallbackCategoryT,
+ callback: Callable[[TransactionT], None],
+ ) -> None:
 """adds a callback to be called when validating the transaction.
 The transaction will be given as the first argument to the callback.
@@ -676,7 +720,7 @@
 self._validatecallback[category] = callback
 @active
- def close(self):
+ def close(self) -> None:
 '''commit the transaction'''
 if self._count == 1:
 for category in sorted(self._validatecallback):
@@ -758,14 +802,14 @@
 self._postclosecallback = None
 @active
- def abort(self):
+ def abort(self) -> None:
 """abort the transaction (generally called on error, or when the
 transaction is not explicitly committed before going out of
 scope)"""
 self._abort()
 @active
- def add_journal(self, vfs_id, path):
+ def add_journal(self, vfs_id: VfsKeyT, path: HgPathT) -> None:
 self._journal_files.append((vfs_id, path))
 def _writeundo(self):
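The annotations above pin down the callback contract: every registered callback receives the transaction itself as its only argument, keyed by a category, and a later registration with the same category replaces the earlier one. A toy model of that registry, runnable on its own (MiniTransaction is invented for illustration; the sorted-by-category dispatch mirrors the loop visible in close() above):

class MiniTransaction:
    """Toy model of the category -> callback registry typed above."""

    def __init__(self):
        self._finalizecallback = {}

    def addfinalize(self, category: bytes, callback):
        # same category replaces the previous callback, as in the real API
        self._finalizecallback[category] = callback

    def close(self):
        # run callbacks in sorted category order, passing the transaction
        for category in sorted(self._finalizecallback):
            self._finalizecallback[category](self)

calls = []
tr = MiniTransaction()
tr.addfinalize(b'b-cache', lambda tr: calls.append('cache'))
tr.addfinalize(b'a-hooks', lambda tr: calls.append('hooks'))
tr.close()
assert calls == ['hooks', 'cache']  # lexicographic by category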
--- a/mercurial/upgrade_utils/actions.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/upgrade_utils/actions.py Fri Feb 28 23:28:10 2025 +0100 @@ -929,7 +929,7 @@ def has_upgrade_action(self, name): """Check whether the upgrade operation will perform this action""" - return name in self._upgrade_actions_names + return name in self.upgrade_actions_names def print_post_op_messages(self): """print post upgrade operation warning messages"""
--- a/mercurial/util.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/util.py Fri Feb 28 23:28:10 2025 +0100 @@ -26,7 +26,9 @@ import locale import mmap import os -import pickle # provides util.pickle symbol + +# provides util.pickle symbol +import pickle # noqa: F401 (ignore imported but not used) import re as remod import shutil import stat @@ -68,6 +70,7 @@ urllibcompat, ) from .interfaces import ( + misc as int_misc, modules as intmod, ) from .utils import ( @@ -86,6 +89,12 @@ Tuple, ] +if typing.TYPE_CHECKING: + from typing_extensions import ( + Self, + ) + + _Tcow = TypeVar('_Tcow', bound="cow") base85: intmod.Base85 = policy.importmod('base85') osutil = policy.importmod('osutil') @@ -1197,7 +1206,8 @@ try: from . import __version__ # pytype: disable=import-error - return __version__.version + # setuptools-scm uses py3 str + return __version__.version.encode() except ImportError: return b'unknown' @@ -1326,7 +1336,9 @@ Call preparewrite before doing any writes. """ - def preparewrite(self): + _copied: int # doesn't exist until first preparewrite() + + def preparewrite(self: _Tcow) -> _Tcow: """call this before writes, return self or a copied new object""" if getattr(self, '_copied', 0): self._copied -= 1 @@ -1334,7 +1346,7 @@ return self.__class__(self) # pytype: disable=wrong-arg-count return self - def copy(self): + def copy(self) -> Self: """always do a cheap copy""" self._copied = getattr(self, '_copied', 0) + 1 return self @@ -1413,26 +1425,24 @@ """ -class transactional: # pytype: disable=ignored-metaclass +class transactional(abc.ABC): """Base class for making a transactional type into a context manager.""" - __metaclass__ = abc.ABCMeta - @abc.abstractmethod - def close(self): + def close(self) -> None: """Successfully closes the transaction.""" @abc.abstractmethod - def release(self): + def release(self) -> None: """Marks the end of the transaction. If the transaction has not been closed, it will be aborted. """ - def __enter__(self): + def __enter__(self) -> Self: return self - def __exit__(self, exc_type, exc_val, exc_tb): + def __exit__(self, exc_type, exc_val, exc_tb) -> None: try: if exc_type is None: self.close() @@ -2786,6 +2796,25 @@ self.iter = splitbig(in_iter) self._queue = collections.deque() self._chunkoffset = 0 + self._absolute_offset = 0 + + def __iter__(self): + while self._queue: + chunk = self._queue.popleft() + if self._chunkoffset: + d = chunk[self._chunkoffset :] + else: + d = chunk + self._absolute_offset += len(d) + yield d + self._chunkoffset = 0 + for d in self.iter: + self._absolute_offset += len(d) + yield d + + def tell(self) -> int: + """tell how much data we have read so far""" + return self._absolute_offset def read(self, l=None): """Read L bytes of data from the iterator of chunks of data. @@ -2793,7 +2822,9 @@ If size parameter is omitted, read everything""" if l is None: - return b''.join(self.iter) + d = b''.join(self.iter) + self._absolute_offset += len(d) + return d left = l buf = [] @@ -2845,10 +2876,15 @@ self._chunkoffset += left left -= chunkremaining - return b''.join(buf) - - -def filechunkiter(f, size=131072, limit=None): + d = b''.join(buf) + self._absolute_offset += len(d) + return d + + +DEFAULT_FILE_CHUNK = 128 * (2**10) + + +def filechunkiter(f, size=DEFAULT_FILE_CHUNK, limit=None): """Create a generator that produces the data in the file size (default 131072) bytes at a time, up to optional limit (default is to read all data). 
Chunks may be less than size bytes if the @@ -3175,7 +3211,7 @@ raise error.ParseError(_(b"couldn't parse size: %s") % s) -class hooks: +class hooks(int_misc.IHooks): """A collection of hook functions that can be used to extend a function's behavior. Hooks are called in lexicographic order, based on the names of their sources.""" @@ -3183,10 +3219,10 @@ def __init__(self): self._hooks = [] - def add(self, source, hook): + def add(self, source: bytes, hook: Callable) -> None: self._hooks.append((source, hook)) - def __call__(self, *args): + def __call__(self, *args) -> List: self._hooks.sort(key=lambda x: x[0]) results = [] for source, hook in self._hooks:
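A small usage sketch of the hooks helper typed above: hooks are invoked in lexicographic order of their source name, and (assuming __call__ returns the collected results list, as the loop above suggests) the call gathers each hook's return value:

from mercurial import util

h = util.hooks()
h.add(b'20-double', lambda x: x * 2)
h.add(b'10-increment', lambda x: x + 1)
# sorted by source name: 10-increment runs first, then 20-double
assert h(10) == [11, 20]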
--- a/mercurial/utils/cborutil.py	Fri Feb 28 23:25:42 2025 +0100
+++ b/mercurial/utils/cborutil.py	Fri Feb 28 23:28:10 2025 +0100
@@ -8,7 +8,13 @@
 from __future__ import annotations
 import struct
+import typing
+if typing.TYPE_CHECKING:
+ from typing import (
+ Iterable,
+ Iterator,
+ )
 # Very short version of RFC 7049...
 #
@@ -64,7 +70,7 @@
 BREAK_INT = 255
-def encodelength(majortype, length):
+def encodelength(majortype: int, length: int) -> bytes:
 """Obtain a value encoding the major type and its length."""
 if length < 24:
 return ENCODED_LENGTH_1.pack(majortype << 5 | length)
@@ -78,12 +84,12 @@
 return ENCODED_LENGTH_5.pack(majortype << 5 | 27, length)
-def streamencodebytestring(v):
+def streamencodebytestring(v: bytes) -> Iterator[bytes]:
 yield encodelength(MAJOR_TYPE_BYTESTRING, len(v))
 yield v
-def streamencodebytestringfromiter(it):
+def streamencodebytestringfromiter(it: Iterable[bytes]) -> Iterator[bytes]:
 """Convert an iterator of chunks to an indefinite bytestring.
 Given an input that is iterable and each element in the iterator is
@@ -98,7 +104,9 @@
 yield BREAK
-def streamencodeindefinitebytestring(source, chunksize=65536):
+def streamencodeindefinitebytestring(
+ source, chunksize: int = 65536
+) -> Iterator[bytes]:
 """Given a large source buffer, emit as an indefinite length bytestring.
 This is a generator of chunks constituting the encoded CBOR data.
@@ -121,9 +129,9 @@
 yield BREAK
-def streamencodeint(v):
+def streamencodeint(v: int) -> Iterator[bytes]:
 if v >= 18446744073709551616 or v < -18446744073709551616:
- raise ValueError(b'big integers not supported')
+ raise ValueError('big integers not supported')
 if v >= 0:
 yield encodelength(MAJOR_TYPE_UINT, v)
@@ -131,7 +139,7 @@
 yield encodelength(MAJOR_TYPE_NEGINT, abs(v) - 1)
-def streamencodearray(l):
+def streamencodearray(l) -> Iterator[bytes]:
 """Encode a known size iterable to an array."""
 yield encodelength(MAJOR_TYPE_ARRAY, len(l))
@@ -140,7 +148,7 @@
 yield from streamencode(i)
-def streamencodearrayfromiter(it):
+def streamencodearrayfromiter(it) -> Iterator[bytes]:
 """Encode an iterator of items to an indefinite length array."""
 yield BEGIN_INDEFINITE_ARRAY
@@ -155,7 +163,7 @@
 return type(v).__name__, v
-def streamencodeset(s):
+def streamencodeset(s) -> Iterator[bytes]:
 # https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml defines
 # semantic tag 258 for finite sets.
 yield encodelength(MAJOR_TYPE_SEMANTIC, SEMANTIC_TAG_FINITE_SET)
@@ -163,7 +171,7 @@
 yield from streamencodearray(sorted(s, key=_mixedtypesortkey))
-def streamencodemap(d):
+def streamencodemap(d: dict) -> Iterator[bytes]:
 """Encode dictionary to a generator.
 Does not support indefinite length dictionaries.
@@ -175,7 +183,7 @@
 yield from streamencode(value)
-def streamencodemapfromiter(it):
+def streamencodemapfromiter(it: Iterable) -> Iterator[bytes]:
 """Given an iterable of (key, value), encode to an indefinite length map."""
 yield BEGIN_INDEFINITE_MAP
@@ -186,12 +194,12 @@
 yield BREAK
-def streamencodebool(b):
+def streamencodebool(b: bool) -> Iterator[bytes]:
 # major type 7, simple value 20 and 21.
 yield b'\xf5' if b else b'\xf4'
-def streamencodenone(v):
+def streamencodenone(v: None) -> Iterator[bytes]:
 # major type 7, simple value 22.
 yield b'\xf6'
@@ -208,7 +216,7 @@
 }
-def streamencode(v):
+def streamencode(v) -> Iterator[bytes]:
 """Encode a value in a streaming manner.
 Given an input object, encode it to CBOR recursively.
@@ -238,7 +246,7 @@ """Represents an error decoding CBOR.""" -def _elementtointeger(b, i): +def _elementtointeger(b, i: int) -> int: return b[i] @@ -255,7 +263,7 @@ SPECIAL_INDEFINITE_BREAK = 5 -def decodeitem(b, offset=0): +def decodeitem(b, offset: int = 0): """Decode a new CBOR value from a buffer at offset. This function attempts to decode up to one complete CBOR value @@ -301,6 +309,7 @@ complete, value, readcount = decodeuint(subtype, b, offset) if complete: + assert value is not None # help pytype return True, -value - 1, readcount + 1, SPECIAL_NONE else: return False, None, readcount, SPECIAL_NONE @@ -335,7 +344,7 @@ return True, None, 1, SPECIAL_START_INDEFINITE_BYTESTRING elif majortype == MAJOR_TYPE_STRING: - raise CBORDecodeError(b'string major type not supported') + raise CBORDecodeError('string major type not supported') elif majortype == MAJOR_TYPE_ARRAY: # Beginning of arrays are treated as uints in order to decode their @@ -384,13 +393,13 @@ if special != SPECIAL_START_ARRAY: raise CBORDecodeError( - b'expected array after finite set semantic tag' + 'expected array after finite set semantic tag' ) return True, size, readcount + readcount2 + 1, SPECIAL_START_SET else: - raise CBORDecodeError(b'semantic tag %d not allowed' % tagvalue) + raise CBORDecodeError('semantic tag %d not allowed' % tagvalue) elif majortype == MAJOR_TYPE_SPECIAL: # Only specific values for the information field are allowed. @@ -404,12 +413,14 @@ return True, None, 1, SPECIAL_INDEFINITE_BREAK # If value is 24, subtype is in next byte. else: - raise CBORDecodeError(b'special type %d not allowed' % subtype) + raise CBORDecodeError('special type %d not allowed' % subtype) else: assert False -def decodeuint(subtype, b, offset=0, allowindefinite=False): +def decodeuint( + subtype: int, b: bytes, offset: int = 0, allowindefinite: bool = False +): """Decode an unsigned integer. ``subtype`` is the lower 5 bits from the initial byte CBOR item @@ -437,10 +448,10 @@ if allowindefinite: return True, None, 0 else: - raise CBORDecodeError(b'indefinite length uint not allowed here') + raise CBORDecodeError('indefinite length uint not allowed here') elif subtype >= 28: raise CBORDecodeError( - b'unsupported subtype on integer type: %d' % subtype + 'unsupported subtype on integer type: %d' % subtype ) if subtype == 24: @@ -452,7 +463,7 @@ elif subtype == 27: s = STRUCT_BIG_ULONGLONG else: - raise CBORDecodeError(b'bounds condition checking violation') + raise CBORDecodeError('bounds condition checking violation') if len(b) - offset >= s.size: return True, s.unpack_from(b, offset)[0], s.size @@ -468,7 +479,10 @@ or last in an indefinite length bytestring. 
""" - def __new__(cls, v, first=False, last=False): + isfirst: bool + islast: bool + + def __new__(cls, v, first: bool = False, last: bool = False): self = bytes.__new__(cls, v) self.isfirst = first self.islast = last @@ -541,7 +555,7 @@ _STATE_WANT_BYTESTRING_CHUNK_FIRST = 5 _STATE_WANT_BYTESTRING_CHUNK_SUBSEQUENT = 6 - def __init__(self): + def __init__(self) -> None: # TODO add support for limiting size of bytestrings # TODO add support for limiting number of keys / values in collections # TODO add support for limiting size of buffered partial values @@ -561,11 +575,11 @@ self._decodedvalues = [] @property - def inprogress(self): + def inprogress(self) -> bool: """Whether the decoder has partially decoded a value.""" return self._state != self._STATE_NONE - def decode(self, b, offset=0): + def decode(self, b, offset: int = 0) -> tuple[bool, int, int]: """Attempt to decode bytes from an input buffer. ``b`` is a collection of bytes and ``offset`` is the byte @@ -651,7 +665,7 @@ else: raise CBORDecodeError( - b'unhandled special state: %d' % special + 'unhandled special state: %d' % special ) # This value becomes an element of the current array. @@ -713,14 +727,14 @@ elif special == SPECIAL_START_INDEFINITE_BYTESTRING: raise CBORDecodeError( - b'indefinite length bytestrings ' - b'not allowed as array values' + 'indefinite length bytestrings ' + 'not allowed as array values' ) else: raise CBORDecodeError( - b'unhandled special item when ' - b'expecting array value: %d' % special + 'unhandled special item when ' + 'expecting array value: %d' % special ) # This value becomes the key of the current map instance. @@ -731,8 +745,8 @@ elif special == SPECIAL_START_INDEFINITE_BYTESTRING: raise CBORDecodeError( - b'indefinite length bytestrings ' - b'not allowed as map keys' + 'indefinite length bytestrings ' + 'not allowed as map keys' ) elif special in ( @@ -741,14 +755,14 @@ SPECIAL_START_SET, ): raise CBORDecodeError( - b'collections not supported as map keys' + 'collections not supported as map keys' ) # We do not allow special values to be used as map keys. else: raise CBORDecodeError( - b'unhandled special item when ' - b'expecting map key: %d' % special + 'unhandled special item when ' + 'expecting map key: %d' % special ) # This value becomes the value of the current map key. @@ -814,14 +828,14 @@ elif special == SPECIAL_START_INDEFINITE_BYTESTRING: raise CBORDecodeError( - b'indefinite length bytestrings not ' - b'allowed as map values' + 'indefinite length bytestrings not ' + 'allowed as map values' ) else: raise CBORDecodeError( - b'unhandled special item when ' - b'expecting map value: %d' % special + 'unhandled special item when ' + 'expecting map value: %d' % special ) self._currentmapkey = None @@ -835,8 +849,8 @@ elif special == SPECIAL_START_INDEFINITE_BYTESTRING: raise CBORDecodeError( - b'indefinite length bytestrings not ' - b'allowed as set values' + 'indefinite length bytestrings not ' + 'allowed as set values' ) elif special in ( @@ -845,14 +859,14 @@ SPECIAL_START_SET, ): raise CBORDecodeError( - b'collections not allowed as set values' + 'collections not allowed as set values' ) # We don't allow non-trivial types to exist as set values. 
else: raise CBORDecodeError( - b'unhandled special item when ' - b'expecting set value: %d' % special + 'unhandled special item when ' + 'expecting set value: %d' % special ) # This value represents the first chunk in an indefinite length @@ -883,8 +897,8 @@ else: raise CBORDecodeError( - b'unexpected special value when ' - b'expecting bytestring chunk: %d' % special + 'unexpected special value when ' + 'expecting bytestring chunk: %d' % special ) # This value represents the non-initial chunk in an indefinite @@ -905,13 +919,13 @@ else: raise CBORDecodeError( - b'unexpected special value when ' - b'expecting bytestring chunk: %d' % special + 'unexpected special value when ' + 'expecting bytestring chunk: %d' % special ) else: raise CBORDecodeError( - b'unhandled decoder state: %d' % self._state + 'unhandled decoder state: %d' % self._state ) # We could have just added the final value in a collection. End @@ -980,12 +994,16 @@ be buffered. """ - def __init__(self): + _decoder: sansiodecoder + _chunks: list + _wanted: int + + def __init__(self) -> None: self._decoder = sansiodecoder() self._chunks = [] self._wanted = 0 - def decode(self, b): + def decode(self, b) -> tuple[bool, int, int]: """Attempt to decode bytes to CBOR values. Returns a tuple with the following fields: @@ -1057,9 +1075,9 @@ havevalues, readcount, wantbytes = decoder.decode(b) if readcount != len(b): - raise CBORDecodeError(b'input data not fully consumed') + raise CBORDecodeError('input data not fully consumed') if decoder.inprogress: - raise CBORDecodeError(b'input data not complete') + raise CBORDecodeError('input data not complete') return decoder.getavailable()
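A round-trip sketch for the `decodeall` entry point shown above::

    from mercurial.utils import cborutil

    data = b''.join(cborutil.streamencode({b'key': [1, 2]}))
    # decodeall() insists on a buffer holding complete values and
    # nothing else; anything less raises CBORDecodeError.
    assert cborutil.decodeall(data) == [{b'key': [1, 2]}]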
--- a/mercurial/utils/stringutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/utils/stringutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -17,6 +17,7 @@ import typing from typing import ( + Iterator, Optional, overload, ) @@ -72,7 +73,9 @@ return b''.join(pprintgen(o, bprefix=bprefix, indent=indent, level=level)) -def pprintgen(o, bprefix: bool = False, indent: int = 0, level: int = 0): +def pprintgen( + o, bprefix: bool = False, indent: int = 0, level: int = 0 +) -> Iterator[bytes]: """Pretty print an object to a generator of atoms. ``bprefix`` is a flag influencing whether bytestrings are preferred with
--- a/mercurial/utils/urlutil.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/utils/urlutil.py Fri Feb 28 23:28:10 2025 +0100 @@ -14,6 +14,7 @@ from typing import ( Callable, Dict, + Optional, Tuple, Union, ) @@ -30,6 +31,10 @@ stringutil, ) +from ..interfaces import ( + misc as int_misc, +) + from ..revlogutils import ( constants as revlog_constants, ) @@ -60,7 +65,7 @@ ) -class url: +class url(int_misc.IUrl): r"""Reliable URL parser. This parses URLs and provides attributes for the following @@ -244,7 +249,7 @@ if v is not None: setattr(self, a, urlreq.unquote(v)) - def copy(self): + def copy(self) -> url: u = url(b'temporary useless value') u.path = self.path u.scheme = self.scheme @@ -361,7 +366,7 @@ __str__ = encoding.strmethod(__bytes__) - def authinfo(self): + def authinfo(self) -> int_misc.AuthInfoT: user, passwd = self.user, self.passwd try: self.user, self.passwd = None, None @@ -376,7 +381,7 @@ # a password. return (s, (None, (s, self.host), self.user, self.passwd or b'')) - def isabs(self): + def isabs(self) -> bool: if self.scheme and self.scheme != b'file': return True # remote URL if hasdriveletter(self.path): @@ -401,7 +406,7 @@ return path return self._origpath - def islocal(self): + def islocal(self) -> bool: '''whether localpath will return something that posixfile can open''' return ( not self.scheme @@ -821,7 +826,7 @@ return new_paths -class path: +class path(int_misc.IPath): """Represents an individual path and its configuration.""" def __init__( @@ -888,7 +893,7 @@ self.rawloc = rawloc self.loc = b'%s' % u - def copy(self, new_raw_location=None): + def copy(self, new_raw_location: Optional[bytes] = None) -> path: """make a copy of this path object When `new_raw_location` is set, the new path will point to it. @@ -905,11 +910,11 @@ return new @property - def is_push_variant(self): + def is_push_variant(self) -> bool: """is this a path variant to be used for pushing""" return self.main_path is not None - def get_push_variant(self): + def get_push_variant(self) -> path: """get a "copy" of the path, but suitable for pushing This means using the value of the `pushurl` option (if any) as the url. @@ -961,7 +966,7 @@ return False @property - def suboptions(self): + def suboptions(self) -> Dict[bytes, bytes]: """Return sub-options and their values for this path. This is intended to be used for presentation purposes.
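A hedged usage sketch for the `url` and `authinfo` APIs annotated above (the attribute values follow the parser's documented behavior, not re-verified against this exact revision)::

    from mercurial.utils import urlutil

    u = urlutil.url(b'https://alice:pw@example.com:8443/repo')
    assert u.scheme == b'https'
    assert u.user == b'alice' and u.passwd == b'pw'
    assert u.host == b'example.com' and u.port == b'8443'
    # authinfo() strips the credentials out of the printable URL and
    # returns them separately for the password database.
    printable, auth = u.authinfo()
    assert b'alice' not in printable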
--- a/mercurial/vfs.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/vfs.py Fri Feb 28 23:28:10 2025 +0100 @@ -25,6 +25,7 @@ List, MutableMapping, Optional, + Set, Tuple, Type, TypeVar, @@ -86,6 +87,11 @@ # by the Rust code rust_compatible = True + # createmode is always available on subclasses + createmode: int + + _chmod: bool + # TODO: type return, which is util.posixfile wrapped by a proxy @abc.abstractmethod def __call__(self, path: bytes, mode: bytes = b'rb', **kwargs) -> Any: @@ -343,7 +349,11 @@ """ if forcibly: - def onexc(function, path: bytes, excinfo): + def onexc(function: Callable, path: str, excinfo: Exception): + # Note: str is passed here even if bytes are passed to rmtree + # on platforms where `shutil._use_fd_functions == True`. It is + # bytes otherwise. Fortunately, the methods used here accept + # both. if function is not os.remove: raise # read-only files cannot be unlinked under Windows @@ -354,7 +364,10 @@ os.remove(path) else: - onexc = None + + def onexc(*args): + pass + try: # pytype: disable=wrong-keyword-args return shutil.rmtree( @@ -449,6 +462,32 @@ def register_file(self, path: bytes) -> None: """generic hook point to let fncache steer its stew""" + def prepare_streamed_file( + self, path: bytes, known_directories: Set[bytes] + ) -> Tuple[bytes, Optional[int]]: + """make sure we are ready to write a file from a stream clone + + The "known_directories" variable is here to avoid trying to create the + same directories over and over during a stream clone. It will be + updated by this function. + + return (path, mode):: + + <path> is the real file system path content should be written to, + <mode> is the file mode that needs to be set, if any. + """ + self._auditpath(path, b'wb') + self.register_file(path) + real_path = self.join(path) + dirname, basename = util.split(real_path) + if dirname not in known_directories: + util.makedirs(dirname, self.createmode, True) + known_directories.add(dirname) + mode = None + if self.createmode is not None: + mode = self.createmode & 0o666 + return real_path, mode + class vfs(abstractvfs): """Operate files relative to a base directory @@ -549,6 +588,7 @@ checkambig: bool = False, auditpath: bool = True, makeparentdirs: bool = True, + buffering: int = -1, ) -> Any: # TODO: should be BinaryIO if util.atomictempfile can be coerced """Open ``path`` file, which is relative to vfs root. @@ -619,7 +659,7 @@ self._trustnlink = nlink > 1 or util.checknlink(f) if nlink > 1 or not self._trustnlink: util.rename(util.mktempcopy(f), f) - fp = util.posixfile(f, mode) + fp = util.posixfile(f, mode, buffering=buffering) if nlink == 0: self._fixfilemode(f)
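A consumer-side sketch of the new `prepare_streamed_file` contract (the `entries` iterable and the surrounding loop are invented for illustration)::

    import os

    def write_streamed_files(vfs, entries):
        # entries yields (path, data) pairs from a stream clone
        known_directories = set()
        for path, data in entries:
            real_path, mode = vfs.prepare_streamed_file(
                path, known_directories
            )
            with open(real_path, 'wb') as fp:
                fp.write(data)
            if mode is not None:
                os.chmod(real_path, mode)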
--- a/mercurial/wireprotoframing.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/wireprotoframing.py Fri Feb 28 23:28:10 2025 +0100 @@ -11,10 +11,16 @@ from __future__ import annotations +import abc import collections import struct import typing +from typing import ( + Protocol, + Type, +) + from .i18n import _ from .thirdparty import attr @@ -35,6 +41,13 @@ stringutil, ) +if typing.TYPE_CHECKING: + from typing import ( + Iterator, + ) + + HandleSendFramesReturnT = tuple[bytes, dict[bytes, Iterator[bytearray]]] + FRAME_HEADER_SIZE = 8 DEFAULT_MAX_FRAME_SIZE = 32768 @@ -42,7 +55,7 @@ STREAM_FLAG_END_STREAM = 0x02 STREAM_FLAG_ENCODING_APPLIED = 0x04 -STREAM_FLAGS = { +STREAM_FLAGS: dict[bytes, int] = { b'stream-begin': STREAM_FLAG_BEGIN_STREAM, b'stream-end': STREAM_FLAG_END_STREAM, b'encoded': STREAM_FLAG_ENCODING_APPLIED, @@ -57,7 +70,7 @@ FRAME_TYPE_SENDER_PROTOCOL_SETTINGS = 0x08 FRAME_TYPE_STREAM_SETTINGS = 0x09 -FRAME_TYPES = { +FRAME_TYPES: dict[bytes, int] = { b'command-request': FRAME_TYPE_COMMAND_REQUEST, b'command-data': FRAME_TYPE_COMMAND_DATA, b'command-response': FRAME_TYPE_COMMAND_RESPONSE, @@ -73,7 +86,7 @@ FLAG_COMMAND_REQUEST_MORE_FRAMES = 0x04 FLAG_COMMAND_REQUEST_EXPECT_DATA = 0x08 -FLAGS_COMMAND_REQUEST = { +FLAGS_COMMAND_REQUEST: dict[bytes, int] = { b'new': FLAG_COMMAND_REQUEST_NEW, b'continuation': FLAG_COMMAND_REQUEST_CONTINUATION, b'more': FLAG_COMMAND_REQUEST_MORE_FRAMES, @@ -83,7 +96,7 @@ FLAG_COMMAND_DATA_CONTINUATION = 0x01 FLAG_COMMAND_DATA_EOS = 0x02 -FLAGS_COMMAND_DATA = { +FLAGS_COMMAND_DATA: dict[bytes, int] = { b'continuation': FLAG_COMMAND_DATA_CONTINUATION, b'eos': FLAG_COMMAND_DATA_EOS, } @@ -91,7 +104,7 @@ FLAG_COMMAND_RESPONSE_CONTINUATION = 0x01 FLAG_COMMAND_RESPONSE_EOS = 0x02 -FLAGS_COMMAND_RESPONSE = { +FLAGS_COMMAND_RESPONSE: dict[bytes, int] = { b'continuation': FLAG_COMMAND_RESPONSE_CONTINUATION, b'eos': FLAG_COMMAND_RESPONSE_EOS, } @@ -99,7 +112,7 @@ FLAG_SENDER_PROTOCOL_SETTINGS_CONTINUATION = 0x01 FLAG_SENDER_PROTOCOL_SETTINGS_EOS = 0x02 -FLAGS_SENDER_PROTOCOL_SETTINGS = { +FLAGS_SENDER_PROTOCOL_SETTINGS: dict[bytes, int] = { b'continuation': FLAG_SENDER_PROTOCOL_SETTINGS_CONTINUATION, b'eos': FLAG_SENDER_PROTOCOL_SETTINGS_EOS, } @@ -107,13 +120,13 @@ FLAG_STREAM_ENCODING_SETTINGS_CONTINUATION = 0x01 FLAG_STREAM_ENCODING_SETTINGS_EOS = 0x02 -FLAGS_STREAM_ENCODING_SETTINGS = { +FLAGS_STREAM_ENCODING_SETTINGS: dict[bytes, int] = { b'continuation': FLAG_STREAM_ENCODING_SETTINGS_CONTINUATION, b'eos': FLAG_STREAM_ENCODING_SETTINGS_EOS, } # Maps frame types to their available flags. 
-FRAME_TYPE_FLAGS = { +FRAME_TYPE_FLAGS: dict[int, dict[bytes, int]] = { FRAME_TYPE_COMMAND_REQUEST: FLAGS_COMMAND_REQUEST, FRAME_TYPE_COMMAND_DATA: FLAGS_COMMAND_DATA, FRAME_TYPE_COMMAND_RESPONSE: FLAGS_COMMAND_RESPONSE, @@ -127,7 +140,7 @@ ARGUMENT_RECORD_HEADER = struct.Struct('<HH') -def humanflags(mapping, value): +def humanflags(mapping: dict[bytes, int], value: int) -> bytes: """Convert a numeric flags value to a human value, using a mapping table.""" namemap = {v: k for k, v in mapping.items()} flags = [] @@ -144,24 +157,24 @@ class frameheader: """Represents the data in a frame header.""" - length = attr.ib() - requestid = attr.ib() - streamid = attr.ib() - streamflags = attr.ib() - typeid = attr.ib() - flags = attr.ib() + length = attr.ib(type=int) + requestid = attr.ib(type=int) + streamid = attr.ib(type=int) + streamflags = attr.ib(type=int) + typeid = attr.ib(type=int) + flags = attr.ib(type=int) @attr.s(slots=True, repr=False) class frame: """Represents a parsed frame.""" - requestid = attr.ib() - streamid = attr.ib() - streamflags = attr.ib() - typeid = attr.ib() - flags = attr.ib() - payload = attr.ib() + requestid = attr.ib(type=int) + streamid = attr.ib(type=int) + streamflags = attr.ib(type=int) + typeid = attr.ib(type=int) + flags = attr.ib(type=int) + payload = attr.ib(type=bytes) @encoding.strmethod def __repr__(self): @@ -185,7 +198,14 @@ ) -def makeframe(requestid, streamid, streamflags, typeid, flags, payload): +def makeframe( + requestid: int, + streamid: int, + streamflags: int, + typeid: int, + flags: int, + payload: bytes, +) -> bytearray: """Assemble a frame into a byte array.""" # TODO assert size of payload. frame = bytearray(FRAME_HEADER_SIZE + len(payload)) @@ -206,7 +226,7 @@ return frame -def makeframefromhumanstring(s): +def makeframefromhumanstring(s: bytes) -> bytearray: """Create a frame from a human readable string Strings have the form: @@ -272,7 +292,7 @@ ) -def parseheader(data): +def parseheader(data: bytes) -> frameheader: """Parse a unified framing protocol frame header from a buffer. The header is expected to be in the buffer at offset 0 and the @@ -297,7 +317,7 @@ ) -def readframe(fh): +def readframe(fh) -> frame | None: """Read a unified framing protocol frame from a file object. Returns a 3-tuple of (type, flags, payload) for the decoded frame or @@ -332,14 +352,14 @@ def createcommandframes( - stream, - requestid, + stream: stream, + requestid: int, cmd, args, datafh=None, - maxframesize=DEFAULT_MAX_FRAME_SIZE, + maxframesize: int = DEFAULT_MAX_FRAME_SIZE, redirect=None, -): +) -> Iterator[bytearray]: """Create frames necessary to transmit a request to run a command. This is a generator of bytearrays. 
Each item represents a frame @@ -408,7 +428,9 @@ break -def createcommandresponseokframe(stream, requestid): +def createcommandresponseokframe( + stream: outputstream, requestid: int +) -> bytearray | None: overall = b''.join(cborutil.streamencode({b'status': b'ok'})) if stream.streamsettingssent: @@ -430,8 +452,10 @@ def createcommandresponseeosframes( - stream, requestid, maxframesize=DEFAULT_MAX_FRAME_SIZE -): + stream: outputstream, + requestid: int, + maxframesize: int = DEFAULT_MAX_FRAME_SIZE, +) -> Iterator[bytearray]: """Create an empty payload frame representing command end-of-stream.""" payload = stream.flush() @@ -459,7 +483,9 @@ break -def createalternatelocationresponseframe(stream, requestid, location): +def createalternatelocationresponseframe( + stream: outputstream, requestid: int, location +) -> bytearray: data = { b'status': b'redirect', b'location': { @@ -498,7 +524,9 @@ ) -def createcommanderrorresponse(stream, requestid, message, args=None): +def createcommanderrorresponse( + stream: stream, requestid: int, message: bytes, args=None +) -> Iterator[bytearray]: # TODO should this be using a list of {'msg': ..., 'args': {}} so atom # formatting works consistently? m = { @@ -521,7 +549,9 @@ ) -def createerrorframe(stream, requestid, msg, errtype): +def createerrorframe( + stream: stream, requestid: int, msg: bytes, errtype: bytes +) -> Iterator[bytearray]: # TODO properly handle frame size limits. assert len(msg) <= DEFAULT_MAX_FRAME_SIZE @@ -543,8 +573,11 @@ def createtextoutputframe( - stream, requestid, atoms, maxframesize=DEFAULT_MAX_FRAME_SIZE -): + stream: stream, + requestid: int, + atoms, + maxframesize: int = DEFAULT_MAX_FRAME_SIZE, +) -> Iterator[bytearray]: """Create a text output frame to render text to people. ``atoms`` is a 3-tuple of (formatting string, args, labels). @@ -610,7 +643,9 @@ level. """ - def __init__(self, stream, requestid, maxframesize=DEFAULT_MAX_FRAME_SIZE): + def __init__( + self, stream, requestid: int, maxframesize: int = DEFAULT_MAX_FRAME_SIZE + ) -> None: self._stream = stream self._requestid = requestid self._maxsize = maxframesize @@ -705,58 +740,82 @@ # mechanism. 
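Before the encoder classes below, a hedged round-trip sketch of the module-level frame helpers annotated above::

    from mercurial import wireprotoframing as framing

    frame = framing.makeframe(
        requestid=1,
        streamid=2,
        streamflags=framing.STREAM_FLAG_BEGIN_STREAM,
        typeid=framing.FRAME_TYPE_COMMAND_REQUEST,
        flags=framing.FLAG_COMMAND_REQUEST_NEW,
        payload=b'',
    )
    # The first FRAME_HEADER_SIZE bytes round-trip through parseheader().
    header = framing.parseheader(bytes(frame))
    assert header.requestid == 1
    assert header.typeid == framing.FRAME_TYPE_COMMAND_REQUEST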
-class identityencoder: +class Encoder(Protocol): + """A protocol class for the various encoder implementations.""" + + @abc.abstractmethod + def encode(self, data) -> bytes: + raise NotImplementedError + + @abc.abstractmethod + def flush(self) -> bytes: + raise NotImplementedError + + @abc.abstractmethod + def finish(self) -> bytes: + raise NotImplementedError + + +class Decoder(Protocol): + """A protocol class for the various decoder implementations.""" + + @abc.abstractmethod + def decode(self, data) -> bytes: + raise NotImplementedError + + +class identityencoder(Encoder): """Encoder for the "identity" stream encoding profile.""" - def __init__(self, ui): + def __init__(self, ui) -> None: pass - def encode(self, data): + def encode(self, data) -> bytes: return data - def flush(self): + def flush(self) -> bytes: return b'' - def finish(self): + def finish(self) -> bytes: return b'' -class identitydecoder: +class identitydecoder(Decoder): """Decoder for the "identity" stream encoding profile.""" - def __init__(self, ui, extraobjs): + def __init__(self, ui, extraobjs) -> None: if extraobjs: raise error.Abort( _(b'identity decoder received unexpected additional values') ) - def decode(self, data): + def decode(self, data) -> bytes: return data -class zlibencoder: - def __init__(self, ui): +class zlibencoder(Encoder): + def __init__(self, ui) -> None: import zlib self._zlib = zlib self._compressor = zlib.compressobj() - def encode(self, data): + def encode(self, data) -> bytes: return self._compressor.compress(data) - def flush(self): + def flush(self) -> bytes: # Z_SYNC_FLUSH doesn't reset compression context, which is # what we want. return self._compressor.flush(self._zlib.Z_SYNC_FLUSH) - def finish(self): + def finish(self) -> bytes: res = self._compressor.flush(self._zlib.Z_FINISH) self._compressor = None return res -class zlibdecoder: - def __init__(self, ui, extraobjs): +class zlibdecoder(Decoder): + def __init__(self, ui, extraobjs) -> None: import zlib if extraobjs: @@ -766,51 +825,51 @@ self._decompressor = zlib.decompressobj() - def decode(self, data): + def decode(self, data) -> bytes: return self._decompressor.decompress(data) -class zstdbaseencoder: - def __init__(self, level): +class zstdbaseencoder(Encoder): + def __init__(self, level: int) -> None: from . import zstd # pytype: disable=import-error self._zstd = zstd cctx = zstd.ZstdCompressor(level=level) self._compressor = cctx.compressobj() - def encode(self, data): + def encode(self, data) -> bytes: return self._compressor.compress(data) - def flush(self): + def flush(self) -> bytes: # COMPRESSOBJ_FLUSH_BLOCK flushes all data previously fed into the # compressor and allows a decompressor to access all encoded data # up to this point. return self._compressor.flush(self._zstd.COMPRESSOBJ_FLUSH_BLOCK) - def finish(self): + def finish(self) -> bytes: res = self._compressor.flush(self._zstd.COMPRESSOBJ_FLUSH_FINISH) self._compressor = None return res class zstd8mbencoder(zstdbaseencoder): - def __init__(self, ui): + def __init__(self, ui) -> None: super().__init__(3) -class zstdbasedecoder: - def __init__(self, maxwindowsize): +class zstdbasedecoder(Decoder): + def __init__(self, maxwindowsize: int) -> None: from .
import zstd # pytype: disable=import-error dctx = zstd.ZstdDecompressor(max_window_size=maxwindowsize) self._decompressor = dctx.decompressobj() - def decode(self, data): + def decode(self, data) -> bytes: return self._decompressor.decompress(data) class zstd8mbdecoder(zstdbasedecoder): - def __init__(self, ui, extraobjs): + def __init__(self, ui, extraobjs) -> None: if extraobjs: raise error.Abort( _(b'zstd8mb decoder received unexpected additional values') @@ -819,13 +878,19 @@ super().__init__(maxwindowsize=8 * 1048576) +# TypeVar('EncoderT', bound=Encoder) was flagged as "not in scope" when used +# on the STREAM_ENCODERS dict below. +if typing.TYPE_CHECKING: + EncoderT = Type[identityencoder | zlibencoder | zstd8mbencoder] + DecoderT = Type[identitydecoder | zlibdecoder | zstd8mbdecoder] + # We lazily populate this to avoid excessive module imports when importing # this module. -STREAM_ENCODERS = {} -STREAM_ENCODERS_ORDER = [] - - -def populatestreamencoders(): +STREAM_ENCODERS: dict[bytes, tuple[EncoderT, DecoderT]] = {} +STREAM_ENCODERS_ORDER: list[bytes] = [] + + +def populatestreamencoders() -> None: if STREAM_ENCODERS: return @@ -851,11 +916,16 @@ class stream: """Represents a logical unidirectional series of frames.""" - def __init__(self, streamid, active=False): + streamid: int + _active: bool + + def __init__(self, streamid: int, active: bool = False) -> None: self.streamid = streamid self._active = active - def makeframe(self, requestid, typeid, flags, payload): + def makeframe( + self, requestid: int, typeid: int, flags: int, payload: bytes + ) -> bytearray: """Create a frame to be sent out over this stream. Only returns the frame instance. Does not actually send it. @@ -873,11 +943,13 @@ class inputstream(stream): """Represents a stream used for receiving data.""" - def __init__(self, streamid, active=False): + _decoder: Decoder | None + + def __init__(self, streamid: int, active: bool = False) -> None: super().__init__(streamid, active=active) self._decoder = None - def setdecoder(self, ui, name, extraobjs): + def setdecoder(self, ui, name: bytes, extraobjs) -> None: """Set the decoder for this stream. Receives the stream profile name and any additional CBOR objects @@ -888,7 +960,7 @@ self._decoder = STREAM_ENCODERS[name][1](ui, extraobjs) - def decode(self, data): + def decode(self, data) -> bytes: # Default is identity decoder. We don't bother instantiating one # because it is trivial. if not self._decoder: @@ -896,23 +968,29 @@ return self._decoder.decode(data) - def flush(self): + def flush(self) -> bytes: if not self._decoder: return b'' + # TODO: this looks like a bug- no decoder class defines flush(), so + # either no decoders are used, or no inputstream is flushed. return self._decoder.flush() class outputstream(stream): """Represents a stream used for sending data.""" - def __init__(self, streamid, active=False): + streamsettingssent: bool + _encoder: Encoder | None + _encodername: bytes | None + + def __init__(self, streamid: int, active: bool = False) -> None: super().__init__(streamid, active=active) self.streamsettingssent = False self._encoder = None self._encodername = None - def setencoder(self, ui, name): + def setencoder(self, ui, name: bytes) -> None: """Set the encoder for this stream. Receives the stream profile name. 
@@ -923,25 +1001,33 @@ self._encoder = STREAM_ENCODERS[name][0](ui) self._encodername = name - def encode(self, data): + def encode(self, data) -> bytes: if not self._encoder: return data return self._encoder.encode(data) - def flush(self): + def flush(self) -> bytes: if not self._encoder: return b'' return self._encoder.flush() - def finish(self): + # TODO: was this supposed to return the result of finish()? + def finish(self): # -> bytes: if not self._encoder: return b'' self._encoder.finish() - def makeframe(self, requestid, typeid, flags, payload, encoded=False): + def makeframe( + self, + requestid: int, + typeid: int, + flags: int, + payload: bytes, + encoded: bool = False, + ) -> bytearray: """Create a frame to be sent out over this stream. Only returns the frame instance. Does not actually send it. @@ -970,7 +1056,7 @@ requestid, self.streamid, streamflags, typeid, flags, payload ) - def makestreamsettingsframe(self, requestid): + def makestreamsettingsframe(self, requestid: int) -> bytearray | None: """Create a stream settings frame for this stream. Returns frame data or None if no stream settings frame is needed or has @@ -988,7 +1074,7 @@ ) -def ensureserverstream(stream): +def ensureserverstream(stream: stream) -> None: if stream.streamid % 2: raise error.ProgrammingError( b'server should only write to even ' @@ -996,7 +1082,7 @@ ) -DEFAULT_PROTOCOL_SETTINGS = { +DEFAULT_PROTOCOL_SETTINGS: dict[bytes, list[bytes]] = { b'contentencodings': [b'identity'], } @@ -1066,7 +1152,9 @@ between who responds to what. """ - def __init__(self, ui, deferoutput=False): + _bufferedframegens: list[Iterator[bytearray]] + + def __init__(self, ui, deferoutput: bool = False) -> None: """Construct a new server reactor. ``deferoutput`` can be used to indicate that no output frames should be @@ -1098,7 +1186,7 @@ populatestreamencoders() - def onframerecv(self, frame): + def onframerecv(self, frame: frame): """Process a frame that has been received off the wire. Returns a dict with an ``action`` key that details what action, @@ -1147,7 +1235,9 @@ return meth(frame) - def oncommandresponsereadyobjects(self, stream, requestid, objs): + def oncommandresponsereadyobjects( + self, stream, requestid: int, objs + ) -> HandleSendFramesReturnT: """Signal that objects are ready to be sent to the client. ``objs`` is an iterable of objects (typically a generator) that will @@ -1286,7 +1376,7 @@ return self._handlesendframes(sendframes()) - def oninputeof(self): + def oninputeof(self) -> tuple[bytes, dict[bytes, Iterator[bytearray]]]: """Signals that end of input has been received. No more frames will be received. 
All pending activity should be @@ -1306,7 +1396,9 @@ b'framegen': makegen(), } - def _handlesendframes(self, framegen): + def _handlesendframes( + self, framegen: Iterator[bytearray] + ) -> HandleSendFramesReturnT: if self._deferoutput: self._bufferedframegens.append(framegen) return b'noop', {} @@ -1315,7 +1407,9 @@ b'framegen': framegen, } - def onservererror(self, stream, requestid, msg): + def onservererror( + self, stream: stream, requestid: int, msg: bytes + ) -> HandleSendFramesReturnT: ensureserverstream(stream) def sendframes(): @@ -1327,7 +1421,9 @@ return self._handlesendframes(sendframes()) - def oncommanderror(self, stream, requestid, message, args=None): + def oncommanderror( + self, stream: stream, requestid: int, message: bytes, args=None + ) -> HandleSendFramesReturnT: """Called when a command encountered an error before sending output.""" ensureserverstream(stream) @@ -1340,7 +1436,7 @@ return self._handlesendframes(sendframes()) - def makeoutputstream(self): + def makeoutputstream(self) -> outputstream: """Create a stream to be used for sending data to the client. If this is called before protocol settings frames are received, we @@ -1362,12 +1458,12 @@ return s - def _makeerrorresult(self, msg): + def _makeerrorresult(self, msg: bytes) -> tuple[bytes, dict[bytes, bytes]]: return b'error', { b'message': msg, } - def _makeruncommandresult(self, requestid): + def _makeruncommandresult(self, requestid: int): entry = self._receivingcommands[requestid] if not entry[b'requestdone']: @@ -1410,12 +1506,12 @@ }, ) - def _makewantframeresult(self): + def _makewantframeresult(self) -> tuple[bytes, dict[bytes, bytes]]: return b'wantframe', { b'state': self._state, } - def _validatecommandrequestframe(self, frame): + def _validatecommandrequestframe(self, frame: frame): new = frame.flags & FLAG_COMMAND_REQUEST_NEW continuation = frame.flags & FLAG_COMMAND_REQUEST_CONTINUATION @@ -1437,7 +1533,7 @@ ) ) - def _onframeinitial(self, frame): + def _onframeinitial(self, frame: frame): # Called when we receive a frame when in the "initial" state. if frame.typeid == FRAME_TYPE_SENDER_PROTOCOL_SETTINGS: self._state = b'protocol-settings-receiving' @@ -1458,7 +1554,7 @@ % frame.typeid ) - def _onframeprotocolsettings(self, frame): + def _onframeprotocolsettings(self, frame: frame): assert self._state == b'protocol-settings-receiving' assert self._protocolsettingsdecoder is not None @@ -1534,7 +1630,7 @@ return self._makewantframeresult() - def _onframeidle(self, frame): + def _onframeidle(self, frame: frame): # The only frame type that should be received in this state is a # command request. if frame.typeid != FRAME_TYPE_COMMAND_REQUEST: @@ -1587,7 +1683,7 @@ self._state = b'command-receiving' return self._makewantframeresult() - def _onframecommandreceiving(self, frame): + def _onframecommandreceiving(self, frame: frame): if frame.typeid == FRAME_TYPE_COMMAND_REQUEST: # Process new command requests as such. if frame.flags & FLAG_COMMAND_REQUEST_NEW: @@ -1665,7 +1761,7 @@ _(b'received unexpected frame type: %d') % frame.typeid ) - def _handlecommanddataframe(self, frame, entry): + def _handlecommanddataframe(self, frame: frame, entry): assert frame.typeid == FRAME_TYPE_COMMAND_DATA # TODO support streaming data instead of buffering it. 
@@ -1680,14 +1776,18 @@ self._state = b'errored' return self._makeerrorresult(_(b'command data frame without flags')) - def _onframeerrored(self, frame): + def _onframeerrored(self, frame: frame): return self._makeerrorresult(_(b'server already errored')) class commandrequest: """Represents a request to run a command.""" - def __init__(self, requestid, name, args, datafh=None, redirect=None): + state: bytes + + def __init__( + self, requestid: int, name, args, datafh=None, redirect=None + ) -> None: self.requestid = requestid self.name = name self.args = args @@ -1743,13 +1843,16 @@ respectively. """ + _hasmultiplesend: bool + _buffersends: bool + def __init__( self, ui, - hasmultiplesend=False, - buffersends=True, + hasmultiplesend: bool = False, + buffersends: bool = True, clientcontentencoders=None, - ): + ) -> None: """Create a new instance. ``hasmultiplesend`` indicates whether multiple sends are supported @@ -1823,7 +1926,7 @@ }, ) - def flushcommands(self): + def flushcommands(self) -> tuple[bytes, dict[bytes, Iterator[bytearray]]]: """Request that all queued commands be sent. If any commands are buffered, this will instruct the caller to send @@ -1856,7 +1959,9 @@ b'framegen': makeframes(), } - def _makecommandframes(self, request): + def _makecommandframes( + self, request: commandrequest + ) -> Iterator[bytearray]: """Emit frames to issue a command request. As a side-effect, update request accounting to reflect its changed @@ -1896,7 +2001,7 @@ request.state = b'sent' - def onframerecv(self, frame): + def onframerecv(self, frame: frame): """Process a frame that has been received off the wire. Returns a 2-tuple of (action, meta) describing further action the @@ -1968,7 +2073,7 @@ return meth(request, frame) - def _onstreamsettingsframe(self, frame): + def _onstreamsettingsframe(self, frame: frame): assert frame.typeid == FRAME_TYPE_STREAM_SETTINGS more = frame.flags & FLAG_STREAM_ENCODING_SETTINGS_CONTINUATION @@ -2056,7 +2161,7 @@ return b'noop', {} - def _oncommandresponseframe(self, request, frame): + def _oncommandresponseframe(self, request: commandrequest, frame: frame): if frame.flags & FLAG_COMMAND_RESPONSE_EOS: request.state = b'received' del self._activerequests[request.requestid] @@ -2071,7 +2176,7 @@ }, ) - def _onerrorresponseframe(self, request, frame): + def _onerrorresponseframe(self, request: commandrequest, frame: frame): request.state = b'errored' del self._activerequests[request.requestid]
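Both reactors hand `(action, meta)` pairs back to their transport; a hedged sketch of the dispatch loop a consumer would run (`drive` and `sendall` are invented for illustration, and only the actions visible in this changeset are handled)::

    def drive(reactor, frames, sendall):
        # frames is an iterator of parsed frames, e.g. from readframe()
        for incoming in frames:
            action, meta = reactor.onframerecv(incoming)
            if action == b'sendframes':
                for data in meta[b'framegen']:
                    sendall(data)
            elif action in (b'noop', b'wantframe'):
                continue
            elif action == b'error':
                raise RuntimeError(meta[b'message'])
            else:
                raise RuntimeError('unhandled action: %r' % action)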
--- a/mercurial/wireprotov1server.py Fri Feb 28 23:25:42 2025 +0100 +++ b/mercurial/wireprotov1server.py Fri Feb 28 23:28:10 2025 +0100 @@ -279,17 +279,20 @@ clonebundlepath=path, ) - bundle_dir = repo.vfs.join(bundlecaches.BUNDLE_CACHE_DIR) - clonebundlepath = repo.vfs.join(bundle_dir, path) + bundle_root = repo.ui.config(b'server', b'peer-bundle-cache-root') + bundle_root_dir = repo.vfs.join(bundle_root) + clonebundlepath = repo.vfs.join(bundle_root, path) if not repo.vfs.exists(clonebundlepath): raise error.Abort(b'clonebundle %s does not exist' % path) - clonebundles_dir = os.path.realpath(bundle_dir) + clonebundles_dir = os.path.realpath(bundle_root_dir) + # audit invariant: the absolute path of the bundle is below the bundle root if not os.path.realpath(clonebundlepath).startswith(clonebundles_dir): raise error.Abort(b'clonebundle %s is using an illegal path' % path) def generator(vfs, bundle_path): - with vfs(bundle_path) as f: + # path already audited above + with vfs(bundle_path, auditpath=False) as f: length = os.fstat(f.fileno())[6] yield util.uvarintencode(length) yield from util.filechunkiter(f)
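The realpath-based containment check above can be restated generically; a sketch (not the exact check used here: it adds a trailing-separator guard against sibling-prefix collisions such as b'/srv/bundles-evil' matching b'/srv/bundles')::

    import os.path

    def is_below(root: bytes, candidate: bytes) -> bool:
        # Resolve symlinks and '..' segments before comparing.
        root = os.path.realpath(root)
        candidate = os.path.realpath(candidate)
        return candidate == root or candidate.startswith(
            root + os.sep.encode()
        )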
--- a/pyproject.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/pyproject.toml Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,9 @@ [build-system] -requires = ["setuptools", "wheel"] +requires = [ + "wheel", + "setuptools>=64", + "setuptools-scm>=8.1.0", + ] build-backend = "setuptools.build_meta" @@ -65,6 +69,71 @@ [tool.cibuildwheel] +build = ["cp38-*", "cp39-*", "cp310-*", "cp311-*", "cp312-*", "cp313-*"] + # Don't stockpile wheels in the pip cache directory when they get built, since # there's no mechanism to age old ones out. build-frontend = { name = "pip", args = ["--no-cache-dir"] } + +# Build translations; requires msgfmt.exe on PATH. +environment = { MERCURIAL_SETUP_FORCE_TRANSLATIONS="1" } + +# Prevent building pypy wheels, which is broken. +skip = "pp*" + +# Tests are run separately, but some values like "*-win_arm64" avoid a warning +# on amd64 Windows about not being able to test without an arm64 runner. That's +# likely to be an issue elsewhere too, like testing amd64 on an arm64 mac. +test-skip = "*" + + +[tool.cibuildwheel.macos] +# See https://cibuildwheel.pypa.io/en/stable/faq/#what-to-provide for reasons +# to also build "x86_64". Further discussion here: +# https://github.com/pypa/cibuildwheel/issues/1333 +# https://github.com/python-cffi/cffi/issues/133 +# +# NOTE: this is overridden in heptapod-ci.yml because the current CI system +# doesn't support arm64 builds. +archs = ["universal2"] + + +[[tool.cibuildwheel.overrides]] +select = "*-macosx_*" + +# The minimum value is adjusted automatically when building for later Pythons +# +# Python Version Minimum macOS +# -------------------------------------- +# Intel CPython 3.6-3.11 10.9 +# Intel CPython 3.12+ 10.13 +# AS CPython or PyPy 11 +inherit.environment = "append" +environment = { MACOSX_DEPLOYMENT_TARGET="10.9" } + + +[tool.cibuildwheel.windows] +archs = ["x86", "AMD64", "ARM64"] + + +[tool.setuptools_scm] +version_file = "mercurial/__version__.py" + +# this uses `<last-tag>.post1.dev<distance>` +# +# To restore the format introduced for the 6.9 nightly builds we would need to +# be able to customise the `post1` section to avoid flip-flopping updates +# between unrelated branches. It would need to be changed to: +# - post0: for the "stable" branch +# - post1: for the "default" branch +# - post2: for any other branch +version_scheme = "no-guess-dev" + +# The "node-and-timestamp" option seems better, but we cannot use it as it +# makes `pip install` freak out with the following warning, resulting in an +# error in the end:: +# +# WARNING: Built wheel for mercurial is invalid: Wheel has unexpected file name: expected 'x.y.z.post1.devXXX+hdeadbeef.d20250218231130', got 'x.y.z.post1.devXXX+hdeadbeef.d20250218231133' +# Failed to build mercurial +# ERROR: Failed to build installable wheels for some pyproject.toml based projects (mercurial) +local_scheme = "node-and-date"
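To preview the version string those two schemes produce without running a full build, setuptools-scm can be queried directly (a hedged sketch, run from a Mercurial checkout)::

    from setuptools_scm import get_version

    # Between tags, no-guess-dev plus node-and-date yields something
    # like 'X.Y.Z.post1.devNNN+h<node>.d<YYYYMMDD>'.
    print(get_version(
        version_scheme='no-guess-dev',
        local_scheme='node-and-date',
    ))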
--- a/relnotes/6.9 Fri Feb 28 23:25:42 2025 +0100 +++ b/relnotes/6.9 Fri Feb 28 23:28:10 2025 +0100 @@ -1,3 +1,18 @@ += Mercurial 6.9.2 = + + * narrow: stricter validation of narrowspec patterns + * narrow: stricter validation of narrowspec patterns in rhg + * rhg: fix a bug where only the first pattern in narrowspec was validated + * extensions: allow wrapping a function with a bytes name again + * upgrade: fix a reference to a missing attribute + * bundles: filter out unsupported requirements for non-packed1 format + * dirstate-race: add more output to highlight a "to-be-revealed" bug + * dirstate-race: simplify some output matching to highlight an error + * dirstate-race: fix a missing synchronisation in the python code + * dirstatemap: stop setting identity after reading the data + * sshpeer: fix deadlock on short writes + * sshpeer: fix another occurrence of mishandled short writes + = Mercurial 6.9.1 = * ci: disable caching of the wheels that get built to save space
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relnotes/7.0 Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,75 @@ += Mercurial 7.0rc0 = + +/!\ These are the release notes for a tentative version of Mercurial, anything +/!\ and everything is subject to change. + +== Packaging Changes == + +The 7.0 release is the first to be compliant with `PEP 517`. + +This required an overhaul of the Mercurial packaging: packagers should pay extra +attention to this release and report any issues they might encounter with the +new system. + +In practice, this means that Mercurial's `setup.py` can no longer be called +directly. +Instead, one should build the Mercurial package using PyPA's `build` package +(https://github.com/pypa/build). + +In the general case, this will take care of the build dependencies, but +packagers might want to explicitly manage them. Currently the build depends on: + +- `wheel` +- `setuptools>=64` +- `setuptools_scm>=8.1.0` +- `docutils` + +The `Makefile` no longer offers a `build` target. +We now use `BuildTools 2022` when building Windows packages. + +== Other Backwards Compatibility Changes == + +- sslutil: bump the default minimum TLS version of the client to 1.2 (BC) (085cc409847d) +- setup: require TLS 1.2 support from the Python interpreter (BC) (a820a7a1fce0) + +== New Features == + +- It is now possible to store inline clone bundles outside of .hg (48572371d478) +- Added a generic `storage.all-slow-path` option to control the default + behavior regarding degraded support for some repository formats. (bbbb12632607) +- Added a `--to` flag to `hg graft` that allows grafting in memory (68dc6cecca32) +- Added a `fix.extra-bin-paths` configuration for the `fix` extension (1330278b9029) + +== New Experimental Features == + +- add a --ignore-changes-from-ancestors option (688665425496) +- stream-clone: use dedicated threads to write the data on disk (7f848cfc4286, 58baa86c7a02, aee193b1c784) +- the experimental `git` extension now supports more commands + +== Bug Fixes == + +- subrepo: fix calling outgoing with multiple paths (85c095c1f8bc) +- stream clone: fix a race condition around volatile files (46574e588017, 3f0cf7bb3086) +- rhg: set the expected dirstate permissions (0o666 minus umask) (a48c688d3e80) +- rhg: fix matcher issue (136e74c2bf8f) +- rhg files correctly implements `--rev` (it previously provided `--revision` instead) + +== Rust == + +- the Rust code is now exposed to Python through PyO3 instead of `rust-cpython` (6673cec8605c)¹ +- rhg: support `status --change`, including `--copies` (bde718849153) +- Rust implementation for the internal part of revlogs +- Rust implementation for `hg annotate` (6183949219b2) +- Rust implementation for `hg update` from a completely empty working copy + +[1] Both `rust-cpython` and `PyO3` bridges are present in this release so that users can switch back (by changing every `importrust` call) if something went really wrong in the translation. The `rust-cpython` code will be removed entirely in Mercurial 7.1.
+ +== Miscellaneous == + +- help: modernize the help text for `hostsecurity.minimumprotocol` (b65085c6d6ff) +- run-tests: add a 4th `HGPORT` value (7f8d0c2c3692) +- rust-ignore: make `debugignorerhg` command show a full regex, with exact files (e2e49069eeb6) +- tests: fix `filtertraceback.py` to handle contiguous `File` lines (8431296a93e8) +- typing: moved `interface` logic from `zope` interfaces to `typing.Protocol` (a1c0f19e7cb4) +- format: add pattern filtering to debugformat (8dede0df9de9) +- run-tests: add a `--tail-report` argument to analyze run parallelism (a814534aaedd)
--- a/rust/Cargo.lock Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/Cargo.lock Fri Feb 28 23:28:10 2025 +0100 @@ -1,6 +1,6 @@ # This file is automatically @generated by Cargo. # It is not intended for manual editing. -version = 3 +version = 4 [[package]] name = "adler2" @@ -112,6 +112,21 @@ checksum = "ace50bade8e6234aa140d9a2f552bbee1db4d353f69b8217bc503490fc1a9f26" [[package]] +name = "bit-set" +version = "0.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "08807e080ed7f9d5433fa9b275196cfc35414f66a0c79d864dc51a0d825231a3" +dependencies = [ + "bit-vec", +] + +[[package]] +name = "bit-vec" +version = "0.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5e764a1d40d510daf35e07be9eb06e75770908c27d411ee6c92109c9840eaaf7" + +[[package]] name = "bitflags" version = "1.3.2" source = "registry+https://github.com/rust-lang/crates.io-index" @@ -345,9 +360,9 @@ [[package]] name = "crossbeam-channel" -version = "0.5.13" +version = "0.5.14" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "33480d6946193aa8033910124896ca395333cae7e2d1113d1fef6c3272217df2" +checksum = "06ba6d68e24814cb8de6bb986db8222d3a027d15872cabc0d18817bc3c0e4471" dependencies = [ "crossbeam-utils", ] @@ -648,10 +663,12 @@ name = "hg-core" version = "0.1.0" dependencies = [ + "bit-set", "bitflags 1.3.2", "bitvec", "byteorder", "bytes-cast", + "cc", "chrono", "clap", "crossbeam-channel", @@ -686,10 +703,12 @@ "self_cell", "serde", "sha-1 0.10.1", + "static_assertions_next", "tempfile", "thread_local", "toml", "twox-hash", + "unicode-width 0.2.0", "uuid", "zstd", ] @@ -714,16 +733,14 @@ name = "hg-pyo3" version = "0.1.0" dependencies = [ - "cpython", + "crossbeam-channel", "derive_more", "env_logger 0.9.3", "hg-core", - "hg-cpython", - "lazy_static", "log", + "logging_timer", "pyo3", "pyo3-sharedref", - "python3-sys", "stable_deref_trait", "vcsgraph", ] @@ -1129,6 +1146,7 @@ dependencies = [ "pyo3", "stable_deref_trait", + "static_assertions_next", ] [[package]] @@ -1493,6 +1511,12 @@ checksum = "a2eb9349b6444b326872e140eb1cf5e7c522154d69e7a0ffb0fb81c06b37543f" [[package]] +name = "static_assertions_next" +version = "1.1.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d7beae5182595e9a8b683fa98c4317f956c9a2dec3b9716990d20023cc60c766" + +[[package]] name = "strsim" version = "0.11.1" source = "registry+https://github.com/rust-lang/crates.io-index"
--- a/rust/Cargo.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/Cargo.toml Fri Feb 28 23:28:10 2025 +0100 @@ -2,3 +2,6 @@ members = ["hg-core", "hg-cpython", "hg-pyo3", "rhg", "pyo3-sharedref"] exclude = ["chg", "hgcli"] resolver = "2" + +[workspace.lints.clippy] +or_fun_call = "deny"
--- a/rust/hg-core/Cargo.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/Cargo.toml Fri Feb 28 23:28:10 2025 +0100 @@ -5,6 +5,9 @@ description = "Mercurial pure Rust core library, with no assumption on Python bindings (FFI)" edition = "2021" +[lints] +workspace = true + [lib] name = "hg" @@ -50,6 +53,9 @@ uuid = { version = "1.10", features = ["v4"] } regex-automata = "0.4.9" regex-syntax = "0.8.5" +unicode-width = "0.2.0" +bit-set = "0.8.0" +static_assertions_next = "1.1.2" # We don't use the `miniz-oxide` backend, so as not to change rhg benchmarks and # until we have a clearer view of which backend is the fastest. @@ -58,6 +64,9 @@ features = ["zlib"] default-features = false +[build-dependencies] +cc = "1.0" + [dev-dependencies] clap = { version = "4", features = ["derive"] } pretty_assertions = "1.1.0"
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-core/build.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,11 @@ +fn main() { + // The relative paths work locally but won't if published to crates.io. + println!("cargo::rerun-if-changed=../../mercurial/bdiff.c"); + println!("cargo::rerun-if-changed=../../mercurial/bdiff.h"); + println!("cargo::rerun-if-changed=../../mercurial/compat.h"); + println!("cargo::rerun-if-changed=../../mercurial/bitmanipulation.h"); + cc::Build::new() + .warnings(true) + .file("../../mercurial/bdiff.c") + .compile("bdiff"); +}
--- a/rust/hg-core/src/ancestors.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/ancestors.rs Fri Feb 28 23:28:10 2025 +0100 @@ -9,9 +9,58 @@ use super::{Graph, GraphError, Revision, NULL_REVISION}; use crate::dagops; +use bit_set::BitSet; use std::cmp::max; use std::collections::{BinaryHeap, HashSet}; +/// A set of revisions backed by a bitset, optimized for descending insertion. +struct DescendingRevisionSet { + /// The underlying bitset storage. + set: BitSet, + /// For a revision `R` we store `ceiling - R` instead of `R` so that + /// memory usage is proportional to how far we've descended. + ceiling: i32, + /// Track length separately because [`BitSet::len`] recounts every time. + len: usize, +} + +impl DescendingRevisionSet { + /// Creates a new empty set that can store revisions up to `ceiling`. + fn new(ceiling: Revision) -> Self { + Self { + set: BitSet::new(), + ceiling: ceiling.0, + len: 0, + } + } + + /// Returns the number of revisions in the set. + fn len(&self) -> usize { + self.len + } + + /// Returns true if the set contains `value`. + fn contains(&self, value: Revision) -> bool { + match self.encode(value) { + Ok(n) => self.set.contains(n), + Err(_) => false, + } + } + + /// Adds `value` to the set. Returns true if it was not already in the set. + /// Returns `Err` if it cannot store it because it is above the ceiling. + fn insert(&mut self, value: Revision) -> Result<bool, GraphError> { + let inserted = self.set.insert(self.encode(value)?); + self.len += inserted as usize; + Ok(inserted) + } + + fn encode(&self, value: Revision) -> Result<usize, GraphError> { + usize::try_from(self.ceiling - value.0) + .map_err(|_| GraphError::ParentOutOfOrder(value)) + } +} + /// Iterator over the ancestors of a given list of revisions /// This is a generic type, defined and implemented for any Graph, so that /// it's easy to @@ -22,7 +71,7 @@ pub struct AncestorsIterator<G: Graph> { graph: G, visit: BinaryHeap<Revision>, - seen: HashSet<Revision>, + seen: DescendingRevisionSet, stoprev: Revision, } @@ -43,12 +92,18 @@ stoprev: Revision, inclusive: bool, ) -> Result<Self, GraphError> { - let filtered_initrevs = initrevs.into_iter().filter(|&r| r >= stoprev); + let filtered_initrevs = initrevs + .into_iter() + .filter(|&r| r >= stoprev) + .collect::<BinaryHeap<_>>(); + let max = *filtered_initrevs.peek().unwrap_or(&NULL_REVISION); + let mut seen = DescendingRevisionSet::new(max); if inclusive { - let visit: BinaryHeap<Revision> = filtered_initrevs.collect(); - let seen = visit.iter().cloned().collect(); + for &rev in &filtered_initrevs { + seen.insert(rev).expect("revs cannot be above their max"); + } return Ok(AncestorsIterator { - visit, + visit: filtered_initrevs, seen, stoprev, graph, @@ -56,24 +111,30 @@ } let mut this = AncestorsIterator { visit: BinaryHeap::new(), - seen: HashSet::new(), + seen, stoprev, graph, }; - this.seen.insert(NULL_REVISION); + this.seen + .insert(NULL_REVISION) + .expect("null is the smallest revision"); for rev in filtered_initrevs { for parent in this.graph.parents(rev)?.iter().cloned() { - this.conditionally_push_rev(parent); + this.conditionally_push_rev(parent)?; } } Ok(this) } #[inline] - fn conditionally_push_rev(&mut self, rev: Revision) { - if self.stoprev <= rev && self.seen.insert(rev) { + fn conditionally_push_rev( + &mut self, + rev: Revision, + ) -> Result<(), GraphError> { + if self.stoprev <= rev && self.seen.insert(rev)? 
{ self.visit.push(rev); } + Ok(()) } /// Consumes partially the iterator to tell if the given target @@ -82,7 +143,7 @@ /// This is meant for iterators actually dedicated to that kind of /// purpose pub fn contains(&mut self, target: Revision) -> Result<bool, GraphError> { - if self.seen.contains(&target) && target != NULL_REVISION { + if self.seen.contains(target) && target != NULL_REVISION { return Ok(true); } for item in self { @@ -110,13 +171,14 @@ if self.visit.len() > 0 { return false; } - if self.seen.len() > 1 { + let seen_len = self.seen.len(); + if seen_len > 1 { return false; } // at this point, the seen set is at most a singleton. // If not `self.inclusive`, it's still possible that it has only // the null revision - self.seen.is_empty() || self.seen.contains(&NULL_REVISION) + seen_len == 0 || self.seen.contains(NULL_REVISION) } } @@ -145,13 +207,23 @@ Ok(ps) => ps, Err(e) => return Some(Err(e)), }; - if p1 < self.stoprev || !self.seen.insert(p1) { + let pop = if p1 < self.stoprev { + true + } else { + match self.seen.insert(p1) { + Ok(inserted) => !inserted, + Err(e) => return Some(Err(e)), + } + }; + if pop { self.visit.pop(); } else { *(self.visit.peek_mut().unwrap()) = p1; }; - self.conditionally_push_rev(p2); + if let Err(e) = self.conditionally_push_rev(p2) { + return Some(Err(e)); + } Some(Ok(current)) } } @@ -406,6 +478,28 @@ } #[test] + fn test_descending_revision_set() { + let mut set = DescendingRevisionSet::new(Revision(1_000_000)); + + assert_eq!(set.len(), 0); + assert!(!set.contains(Revision(999_950))); + + assert_eq!(set.insert(Revision(999_950)), Ok(true)); + assert_eq!(set.insert(Revision(999_950)), Ok(false)); + assert_eq!(set.insert(Revision(1_000_000)), Ok(true)); + assert_eq!( + set.insert(Revision(1_000_001)), + Err(GraphError::ParentOutOfOrder(Revision(1_000_001))) + ); + + assert_eq!(set.len(), 2); + assert!(set.contains(Revision(999_950))); + assert!(!set.contains(Revision(999_951))); + assert!(set.contains(Revision(1_000_000))); + assert!(!set.contains(Revision(1_000_001))); + } + + #[test] /// Same tests as test-ancestor.py, without membership /// (see also test-ancestor.py.out) fn test_list_ancestor() {
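The ceiling-relative encoding in `DescendingRevisionSet` is independent of the bitset crate; a minimal Python restatement of the same idea (illustrative only, not part of Mercurial)::

    class descendingset:
        """Stores ints <= ceiling; memory grows with descent depth."""

        def __init__(self, ceiling):
            self._ceiling = ceiling
            self._bits = 0  # a Python int doubles as an unbounded bitset
            self._len = 0

        def _index(self, rev):
            if rev > self._ceiling:
                raise ValueError('revision above ceiling: %d' % rev)
            # Storing ceiling - rev keeps the low bits hot while the
            # iteration descends from the ceiling.
            return self._ceiling - rev

        def insert(self, rev):
            mask = 1 << self._index(rev)
            inserted = not (self._bits & mask)
            self._bits |= mask
            self._len += inserted
            return inserted

        def __contains__(self, rev):
            return rev <= self._ceiling and bool(
                self._bits & (1 << self._index(rev))
            )

        def __len__(self):
            return self._len

    s = descendingset(1_000_000)
    assert s.insert(999_950) and not s.insert(999_950)
    assert 999_950 in s and 999_951 not in s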
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-core/src/bdiff.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,325 @@ +//! Safe bindings to bdiff.c. + +use crate::errors::HgError; +use std::marker::PhantomData; + +/// A file split into lines, ready for diffing. +pub struct Lines<'a> { + /// The array of lines, allocated by bdiff.c. + /// Must never be mutated by Rust code apart from freeing it in `Drop`. + array: *mut ffi::bdiff_line, + /// Length of the array. + len: u32, + /// Lifetime of the source buffer, since array items store pointers. + _lifetime: PhantomData<&'a [u8]>, +} + +/// Splits `source` into lines that can be diffed. +pub fn split_lines(source: &[u8]) -> Result<Lines, HgError> { + let mut array = std::ptr::null_mut(); + // Safety: The pointer and length are valid since they both come from + // `source`, and the out pointer is non-null. + let result = unsafe { + ffi::bdiff_splitlines( + source.as_ptr() as *const std::ffi::c_char, + source.len() as isize, + &mut array, + ) + }; + match u32::try_from(result) { + Ok(len) => { + assert!(!array.is_null()); + Ok(Lines { + array, + len, + _lifetime: PhantomData, + }) + } + Err(_) => { + Err(HgError::abort_simple("bdiff_splitlines failed to allocate")) + } + } +} + +impl<'a> Lines<'a> { + /// Returns the number of lines. + pub fn len(&self) -> usize { + self.len as usize + } + + /// Returns an iterator over the lines. + pub fn iter(&self) -> LinesIter<'_, 'a> { + LinesIter { + lines: self, + index: 0, + } + } +} + +impl Drop for Lines<'_> { + fn drop(&mut self) { + // Safety: This is the only place that frees the array (no + // double-free), and it's in a `Drop` impl (no use-after-free). + unsafe { + libc::free(self.array as *mut std::ffi::c_void); + } + } +} + +// Safety: It is safe to send `Lines` to a different thread because +// `self.array` is never copied so only one thread will free it. +unsafe impl Send for Lines<'_> {} + +// It is *not* safe to share `&Lines` between threads because `ffi::bdiff_diff` +// mutates lines by storing bookkeeping information in `n` and `e`. +static_assertions_next::assert_impl!(Lines<'_>: !Sync); + +#[derive(Clone)] +pub struct LinesIter<'a, 'b> { + lines: &'a Lines<'b>, + index: usize, +} + +impl<'b> Iterator for LinesIter<'_, 'b> { + type Item = &'b [u8]; + + fn next(&mut self) -> Option<Self::Item> { + if self.index == self.lines.len() { + return None; + } + // Safety: We just checked that the index has not reached the length. + let line = unsafe { *self.lines.array.add(self.index) }; + self.index += 1; + // Safety: We assume bdiff.c sets `l` and `len` correctly. + Some(unsafe { + std::slice::from_raw_parts(line.l as *const u8, line.len as usize) + }) + } + + fn size_hint(&self) -> (usize, Option<usize>) { + let len = self.lines.len() - self.index; + (len, Some(len)) + } +} + +impl ExactSizeIterator for LinesIter<'_, '_> {} + +/// A diff hunk comparing lines [a1,a2) in file A with lines [b1,b2) in file B. +#[derive(Copy, Clone, Debug, PartialEq, Eq)] +pub struct Hunk { + /// Start line index in file A (inclusive). + pub a1: u32, + /// End line index in file A (exclusive). + pub a2: u32, + /// Start line index in file B (inclusive). + pub b1: u32, + /// End line index in file B (exclusive). + pub b2: u32, +} + +/// A list of matching hunks. +pub struct HunkList { + /// The head of the linked list, allocated by bdiff.c. + head: *mut ffi::bdiff_hunk, + /// Length of the list. + len: u32, +} + +/// Returns a list of hunks that match in `a` and `b`. 
+pub fn diff(a: &Lines, b: &Lines) -> Result<HunkList, HgError> { + let mut out = ffi::bdiff_hunk { + a1: 0, + a2: 0, + b1: 0, + b2: 0, + next: std::ptr::null_mut(), + }; + // Safety: We assume bdiff.c sets `array` and `len` correctly; and the + // out pointer is non-null. + let result = unsafe { + ffi::bdiff_diff(a.array, a.len as i32, b.array, b.len as i32, &mut out) + }; + match u32::try_from(result) { + Ok(len) => Ok(HunkList { + // Start with out.next because the first hunk is not meaningful and + // is not included in len. This matches mercurial/cffi/bdiff.py. + head: out.next, + len, + }), + Err(_) => Err(HgError::abort_simple("bdiff_diff failed to allocate")), + } +} + +impl HunkList { + /// Returns the number of hunks. + pub fn len(&self) -> usize { + self.len as usize + } + + /// Returns an iterator over the hunks. + pub fn iter(&self) -> HunkListIter { + HunkListIter { + // Safety: If `self.head` is null, this is safe. If non-null, then: + // - We assume bdiff.c made it properly aligned. + // - It's dereferenceable (any bit pattern is ok for `bdiff_hunk`). + // - It won't be mutated because `HunkListIter` is tied to `&self`. + next: unsafe { self.head.as_ref() }, + remaining: self.len(), + } + } +} + +impl Drop for HunkList { + fn drop(&mut self) { + // Safety: This is the only place that frees `self.head` (no + // double-free), and it's in a `Drop` impl (no use-after-free). + unsafe { + ffi::bdiff_freehunks(self.head); + } + } +} + +pub struct HunkListIter<'a> { + next: Option<&'a ffi::bdiff_hunk>, + remaining: usize, +} + +impl Iterator for HunkListIter<'_> { + type Item = Hunk; + + fn next(&mut self) -> Option<Self::Item> { + match self.next { + Some(hunk) => { + // Safety: Same reasoning as in `HunkList::iter`. + self.next = unsafe { hunk.next.as_ref() }; + self.remaining -= 1; + debug_assert_eq!(hunk.a2 - hunk.a1, hunk.b2 - hunk.b1); + Some(Hunk { + a1: hunk.a1 as u32, + a2: hunk.a2 as u32, + b1: hunk.b1 as u32, + b2: hunk.b2 as u32, + }) + } + None => { + assert_eq!(self.remaining, 0); + None + } + } + } + + fn size_hint(&self) -> (usize, Option<usize>) { + (self.remaining, Some(self.remaining)) + } +} + +impl ExactSizeIterator for HunkListIter<'_> {} + +mod ffi { + #![allow(non_camel_case_types)] + + use std::ffi::{c_char, c_int}; + + #[repr(C)] + #[derive(Debug, Copy, Clone)] + pub struct bdiff_line { + pub hash: c_int, + pub n: c_int, + pub e: c_int, + pub len: isize, + pub l: *const c_char, + } + + #[repr(C)] + #[derive(Debug, Copy, Clone)] + pub struct bdiff_hunk { + pub a1: c_int, + pub a2: c_int, + pub b1: c_int, + pub b2: c_int, + pub next: *mut bdiff_hunk, + } + + #[link(name = "bdiff", kind = "static")] + extern "C" { + /// Splits `a` into lines. On success, stores a pointer to an array of + /// lines in `*lr` and returns its length. On failure, returns + /// -1. The caller is responsible for freeing the array. + /// + /// # Safety + /// + /// - `a` must point to an array of `len` chars. + /// - `lr` must be non-null (but `*lr` can be null). + pub fn bdiff_splitlines( + a: *const c_char, + len: isize, + lr: *mut *mut bdiff_line, + ) -> c_int; + + /// Diffs `a` and `b`. On success, stores the head of a linked list of + /// hunks in `base->next` and returns its length. On failure, returns + /// -1. The caller is responsible for `bdiff_freehunks(base->next)`. + /// + /// # Safety + /// + /// - `a` must point to an array of `an` lines. + /// - `b` must point to an array of `bn` lines. + /// - `base` must be non-null. 
+ pub fn bdiff_diff( + a: *mut bdiff_line, + an: c_int, + b: *mut bdiff_line, + bn: c_int, + base: *mut bdiff_hunk, + ) -> c_int; + + /// Frees the linked list of hunks `l`. + /// + /// # Safety + /// + /// - `l` must be non-null, not already freed, and not used after this. + pub fn bdiff_freehunks(l: *mut bdiff_hunk); + } +} + +#[cfg(test)] +mod tests { + fn split(a: &[u8]) -> Vec<&[u8]> { + super::split_lines(a).unwrap().iter().collect() + } + + fn diff(a: &[u8], b: &[u8]) -> Vec<(u32, u32, u32, u32)> { + let la = super::split_lines(a).unwrap(); + let lb = super::split_lines(b).unwrap(); + let hunks = super::diff(&la, &lb).unwrap(); + hunks.iter().map(|h| (h.a1, h.a2, h.b1, h.b2)).collect() + } + + #[test] + fn test_split_lines() { + assert_eq!(split(b""), [] as [&[u8]; 0]); + assert_eq!(split(b"\n"), [b"\n"]); + assert_eq!(split(b"\r\n"), [b"\r\n"]); + assert_eq!(split(b"X\nY"), [b"X\n" as &[u8], b"Y"]); + assert_eq!(split(b"X\nY\n"), [b"X\n" as &[u8], b"Y\n"]); + assert_eq!(split(b"X\r\nY\r\n"), [b"X\r\n" as &[u8], b"Y\r\n"]); + } + + #[test] + fn test_diff_single_line() { + assert_eq!(diff(b"", b""), &[(0, 0, 0, 0)]); + assert_eq!(diff(b"x", b"x"), &[(0, 1, 0, 1), (1, 1, 1, 1)]); + assert_eq!(diff(b"x", b"y"), &[(1, 1, 1, 1)]); + } + + #[test] + fn test_diff_multiple_lines() { + assert_eq!( + diff( + b" line1 \n line2 \n line3 \n line4 \n REMOVED \n", + b" ADDED \n line1 \n lined2_CHANGED \n line3 \n line4 \n" + ), + &[(0, 1, 1, 2), (2, 4, 3, 5), (5, 5, 5, 5)] + ); + } +}
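
For orientation, here is a minimal sketch of how these bindings compose. It only works inside hg-core (lib.rs declares `mod bdiff;` privately), and `matching_ranges` is a hypothetical helper, not part of the patch:

// Hypothetical in-crate usage sketch of the new bdiff bindings above.
use crate::bdiff;
use crate::errors::HgError;

fn matching_ranges(
    old: &[u8],
    new: &[u8],
) -> Result<Vec<(u32, u32, u32, u32)>, HgError> {
    let a = bdiff::split_lines(old)?;
    let b = bdiff::split_lines(new)?;
    // Each hunk describes a range of lines that *match* on both sides;
    // the gaps between hunks are the actual changes.
    Ok(bdiff::diff(&a, &b)?
        .iter()
        .map(|h| (h.a1, h.a2, h.b1, h.b2))
        .collect())
}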
--- a/rust/hg-core/src/config/layer.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/config/layer.rs Fri Feb 28 23:28:10 2025 +0100 @@ -57,7 +57,7 @@ cli_config_args: impl IntoIterator<Item = impl AsRef<[u8]>>, ) -> Result<Option<Self>, ConfigError> { fn parse_one(arg: &[u8]) -> Option<(Vec<u8>, Vec<u8>, Vec<u8>)> { - use crate::utils::SliceExt; + use crate::utils::strings::SliceExt; let (section_and_item, value) = arg.split_2(b'=')?; let (section, item) = section_and_item.trim().split_2(b'.')?; @@ -169,7 +169,8 @@ let line = Some(index + 1); if let Some(m) = INCLUDE_RE.captures(bytes) { let filename_bytes = &m[1]; - let filename_bytes = crate::utils::expand_vars(filename_bytes); + let filename_bytes = + crate::utils::strings::expand_vars(filename_bytes); // `Path::parent` only fails for the root directory, // which `src` can’t be since we’ve managed to open it as a // file.
--- a/rust/hg-core/src/config/values.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/config/values.rs Fri Feb 28 23:28:10 2025 +0100 @@ -8,7 +8,7 @@ //! details about where the value came from (but omits details of what’s //! invalid inside the value). -use crate::utils::SliceExt; +use crate::utils::strings::SliceExt; pub(super) fn parse_bool(v: &[u8]) -> Option<bool> { match v.to_ascii_lowercase().as_slice() {
--- a/rust/hg-core/src/dirstate.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/dirstate.rs Fri Feb 28 23:28:10 2025 +0100 @@ -46,12 +46,14 @@ dyn Iterator< Item = Result<(&'a HgPath, DirstateEntry), DirstateV2ParseError>, > + Send + + Sync + 'a, >; pub type CopyMapIter<'a> = Box< dyn Iterator<Item = Result<(&'a HgPath, &'a HgPath), DirstateV2ParseError>> + Send + + Sync + 'a, >;
--- a/rust/hg-core/src/dirstate/owning.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/dirstate/owning.rs Fri Feb 28 23:28:10 2025 +0100 @@ -11,7 +11,7 @@ /// Keep a `DirstateMap<'owner>` next to the `owner` buffer that it /// borrows. pub struct OwningDirstateMap { - owner: Box<dyn Deref<Target = [u8]> + Send>, + owner: Box<dyn Deref<Target = [u8]> + Send + Sync>, #[covariant] dependent: DirstateMap, } @@ -23,7 +23,7 @@ identity: Option<DirstateIdentity>, ) -> Self where - OnDisk: Deref<Target = [u8]> + Send + 'static, + OnDisk: Deref<Target = [u8]> + Send + Sync + 'static, { let on_disk = Box::new(on_disk); @@ -39,7 +39,7 @@ identity: Option<DirstateIdentity>, ) -> Result<(Self, DirstateParents), DirstateError> where - OnDisk: Deref<Target = [u8]> + Send + 'static, + OnDisk: Deref<Target = [u8]> + Send + Sync + 'static, { let on_disk = Box::new(on_disk); let mut parents = DirstateParents::NULL; @@ -63,7 +63,7 @@ identity: Option<DirstateIdentity>, ) -> Result<Self, DirstateError> where - OnDisk: Deref<Target = [u8]> + Send + 'static, + OnDisk: Deref<Target = [u8]> + Send + Sync + 'static, { let on_disk = Box::new(on_disk);
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-core/src/encoding.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,291 @@ +//! Character transcoding support. + +use core::str; +use std::borrow::Cow; + +use crate::{errors::HgError, utils::strings::Escaped}; +use unicode_width::UnicodeWidthStr as _; + +/// String encoder and decoder. +#[derive(Copy, Clone, Debug)] +pub struct Encoder { + /// The user's local encoding. + local_encoding: Encoding, + /// What to do when decoding fails. (Encoding always uses + /// `Mode::Replace`). + decoding_mode: Mode, + /// Width to use for characters that can be interpreted either as narrow + /// or wide depending on the context. + pub ambiguous_width: Width, +} + +/// Character encoding. +#[derive(Copy, Clone, Debug)] +pub enum Encoding { + Utf8, + Ascii, +} + +/// Character decoding mode. +#[derive(Copy, Clone, Debug)] +pub enum Mode { + /// Produce an error message for invalid characters. + Strict, + /// Replace invalid characters with a special character. + Replace, +} + +/// The width of a Unicode character. +#[derive(Copy, Clone, Debug)] +pub enum Width { + /// Narrow, taking up 1 terminal column. + Narrow, + /// Wide, taking up 2 terminal columns. + Wide, +} + +impl Default for Encoder { + fn default() -> Self { + Self { + local_encoding: Encoding::Utf8, + decoding_mode: Mode::Strict, + ambiguous_width: Width::Narrow, + } + } +} + +impl Encoder { + /// Creates an encoder from environment variables. + pub fn from_env() -> Result<Self, HgError> { + let default = Encoder::default(); + let local_encoding = match std::env::var_os("HGENCODING") { + None => default.local_encoding, + Some(s) + if s.eq_ignore_ascii_case("utf-8") + || s.eq_ignore_ascii_case("utf8") => + { + Encoding::Utf8 + } + Some(s) if s.eq_ignore_ascii_case("ascii") => Encoding::Ascii, + Some(s) => { + return Err(HgError::unsupported(format!( + "HGENCODING value '{}' is not supported", + s.to_string_lossy() + ))) + } + }; + let decoding_mode = match std::env::var_os("HGENCODINGMODE") { + None => default.decoding_mode, + Some(s) if s == "strict" => Mode::Strict, + Some(s) if s == "replace" => Mode::Replace, + Some(s) => { + return Err(HgError::abort_simple(format!( + "HGENCODINGMODE value '{}' is not supported", + s.to_string_lossy() + ))) + } + }; + let ambiguous_width = match std::env::var_os("HGENCODINGAMBIGUOUS") { + None => default.ambiguous_width, + Some(s) if s == "narrow" => Width::Narrow, + Some(s) if s == "wide" => Width::Wide, + Some(s) => { + return Err(HgError::abort_simple(format!( + "HGENCODINGAMBIGUOUS value '{}' is not supported", + s.to_string_lossy() + ))) + } + }; + Ok(Self { + local_encoding, + decoding_mode, + ambiguous_width, + }) + } + + /// Decodes an internal UTF-8 string from bytes. + pub fn decode_internal<'a>( + &self, + bytes: &'a [u8], + ) -> Result<&'a str, HgError> { + decode_utf8(bytes).map_err(HgError::corrupted) + } + + /// Converts a string from internal UTF-8 to the local character encoding. + pub fn to_local<'a>(&self, str: &'a str) -> Cow<'a, [u8]> { + match self.local_encoding { + Encoding::Utf8 => Cow::Borrowed(str.as_bytes()), + Encoding::Ascii => { + if str.is_ascii() { + Cow::Borrowed(str.as_bytes()) + } else { + Cow::Owned(codepoints_to_ascii_lossy(str).into_bytes()) + } + } + } + } + + /// Converts a string from the local character encoding to UTF-8. 
+ pub fn from_local<'a>( + &self, + bytes: &'a [u8], + ) -> Result<Cow<'a, str>, HgError> { + match (self.local_encoding, self.decoding_mode) { + (Encoding::Utf8, Mode::Strict) => Ok(Cow::Borrowed( + decode_utf8(bytes).map_err(HgError::abort_simple)?, + )), + (Encoding::Utf8, Mode::Replace) => { + Ok(String::from_utf8_lossy(bytes)) + } + (Encoding::Ascii, Mode::Strict) => Ok(Cow::Borrowed( + decode_ascii(bytes).map_err(HgError::abort_simple)?, + )), + (Encoding::Ascii, Mode::Replace) => { + Ok(Cow::Owned(bytes_to_ascii_lossy(bytes))) + } + } + } + + /// Returns the column width of a string for display. + pub fn column_width(&self, str: &str) -> usize { + match self.ambiguous_width { + Width::Narrow => str.width(), + Width::Wide => str.width_cjk(), + } + } + + /// Returns the column width if `bytes` can be decoded as UTF-8, otherwise + /// just returns the length in bytes. + pub fn column_width_bytes(&self, bytes: &[u8]) -> usize { + match str::from_utf8(bytes) { + Ok(str) => self.column_width(str), + Err(_) => bytes.len(), + } + } +} + +/// Decodes bytes as UTF-8 or returns a detailed error message. +fn decode_utf8(bytes: &[u8]) -> Result<&str, String> { + str::from_utf8(bytes).map_err(|err| { + format!( + "invalid UTF-8 at offset {}: \"{}\"", + err.valid_up_to(), + str::from_utf8(&bytes.escaped_bytes()).unwrap() + ) + }) +} + +/// Decodes bytes as ASCII or returns a detailed error message. +fn decode_ascii(bytes: &[u8]) -> Result<&str, String> { + // TODO: Use `as_ascii` https://github.com/rust-lang/rust/issues/110998 + if bytes.is_ascii() { + // Safety: Just checked that it's ASCII. + let str = unsafe { str::from_utf8_unchecked(bytes) }; + Ok(str) + } else { + Err(format!( + "invalid ASCII: \"{}\"", + str::from_utf8(&bytes.escaped_bytes()).unwrap() + )) + } +} + +/// Replaces all non-ASCII codepoints with '?'. +fn codepoints_to_ascii_lossy(str: &str) -> String { + let mut ascii = String::new(); + for char in str.chars() { + ascii.push(if char.is_ascii() { char } else { '?' }); + } + ascii +} + +/// Replaces all non-ASCII bytes with '?'. +fn bytes_to_ascii_lossy(bytes: &[u8]) -> String { + let mut ascii = String::new(); + for &b in bytes { + ascii.push(if b.is_ascii() { b as char } else { '?' }); + } + ascii +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_decode_internal() { + let encoder = Encoder::default(); + assert_eq!(encoder.decode_internal(b"").unwrap(), ""); + assert_eq!(encoder.decode_internal(b"\xc3\xa9").unwrap(), "é"); + match encoder.decode_internal(b"A\xc3") { + Ok(_) => panic!("expected an error"), + Err(HgError::CorruptedRepository(message)) => { + assert_eq!(message, "invalid UTF-8 at offset 1: \"A\\xc3\"") + } + Err(_) => panic!("expected a CorruptedRepository error"), + } + } + + #[test] + fn test_to_local() { + let encoder = Encoder::default(); + assert_eq!(encoder.to_local("").as_ref(), b""); + assert_eq!(encoder.to_local("é").as_ref(), b"\xc3\xa9"); + } + + #[test] + fn test_from_local() { + let encoder = Encoder::default(); + assert_eq!(encoder.from_local(b"").unwrap(), ""); + assert_eq!(encoder.from_local(b"\xc3\xa9").unwrap(), "é"); + match encoder.from_local(b"A\xc3") { + Ok(_) => panic!("expected an error"), + Err(HgError::Abort { message, .. 
}) => {
+                assert_eq!(message, "invalid UTF-8 at offset 1: \"A\\xc3\"")
+            }
+            Err(_) => panic!("expected an Abort error"),
+        }
+    }
+
+    #[test]
+    fn test_from_local_replace() {
+        let encoder = Encoder {
+            decoding_mode: Mode::Replace,
+            ..Default::default()
+        };
+        assert_eq!(encoder.from_local(b"A\xc3").unwrap(), "A\u{fffd}");
+    }
+
+    #[test]
+    fn test_column_width() {
+        let encoder = Encoder::default();
+        assert_eq!(encoder.column_width(""), 0);
+        assert_eq!(encoder.column_width("a"), 1);
+        assert_eq!(encoder.column_width("ab"), 2);
+        assert_eq!(encoder.column_width("été"), 3);
+        assert_eq!(encoder.column_width("\u{1f496}"), 2);
+    }
+
+    #[test]
+    fn test_column_width_ambiguous() {
+        let narrow_encoder = Encoder {
+            ambiguous_width: Width::Narrow,
+            ..Default::default()
+        };
+        assert_eq!(narrow_encoder.column_width("\u{2606}"), 1);
+
+        let wide_encoder = Encoder {
+            ambiguous_width: Width::Wide,
+            ..Default::default()
+        };
+        assert_eq!(wide_encoder.column_width("\u{2606}"), 2);
+    }
+
+    #[test]
+    fn test_column_width_bytes() {
+        let encoder = Encoder::default();
+        assert_eq!(encoder.column_width_bytes(b""), 0);
+        assert_eq!(encoder.column_width_bytes("été".as_bytes()), 3);
+        assert_eq!(encoder.column_width_bytes(b"A\xc3"), 2);
+    }
+}
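
As a rough illustration of how the new `Encoder` is meant to be driven, a sketch follows. It assumes in-crate access, and the `demo` function is hypothetical:

// Sketch only: exercises the Encoder API added above from inside hg-core.
use crate::encoding::Encoder;
use crate::errors::HgError;

fn demo() -> Result<(), HgError> {
    // Honors HGENCODING, HGENCODINGMODE, and HGENCODINGAMBIGUOUS.
    let encoder = Encoder::from_env()?;
    // Local bytes -> UTF-8; strictness depends on the decoding mode.
    let text = encoder.from_local(b"caf\xc3\xa9")?;
    // UTF-8 -> local bytes; non-ASCII becomes '?' under Encoding::Ascii.
    let _bytes = encoder.to_local(&text);
    // Terminal columns, using the configured ambiguous-character width.
    assert_eq!(encoder.column_width("café"), 4);
    Ok(())
}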
--- a/rust/hg-core/src/filepatterns.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/filepatterns.rs Fri Feb 28 23:28:10 2025 +0100 @@ -12,7 +12,7 @@ utils::{ files::{canonical_path, get_bytes_from_path, get_path_from_bytes}, hg_path::{path_to_hg_path_buf, HgPathBuf, HgPathError}, - SliceExt, + strings::SliceExt, }, FastHashMap, };
--- a/rust/hg-core/src/lib.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/lib.rs Fri Feb 28 23:28:10 2025 +0100 @@ -5,7 +5,9 @@ // GNU General Public License version 2 or any later version. mod ancestors; +mod bdiff; pub mod dagops; +pub mod encoding; pub mod errors; pub mod narrow; pub mod sparse;
--- a/rust/hg-core/src/matchers.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/matchers.rs Fri Feb 28 23:28:10 2025 +0100 @@ -24,7 +24,7 @@ utils::{ files::{dir_ancestors, find_dirs}, hg_path::{HgPath, HgPathBuf, HgPathError}, - Escaped, + strings::Escaped, }, FastHashMap, };
--- a/rust/hg-core/src/narrow.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/narrow.rs Fri Feb 28 23:28:10 2025 +0100 @@ -97,19 +97,38 @@ Ok((m, warnings)) } +fn is_whitespace(b: &u8) -> bool { + // should match what .strip() in Python does + b.is_ascii_whitespace() || *b == 0x0b +} + +fn starts_or_ends_with_whitespace(s: &[u8]) -> bool { + let w = |b: Option<&u8>| b.map(is_whitespace).unwrap_or(false); + w(s.first()) || w(s.last()) +} + +fn validate_pattern(pattern: &[u8]) -> Result<(), SparseConfigError> { + if starts_or_ends_with_whitespace(pattern) { + return Err(SparseConfigError::WhitespaceAtEdgeOfPattern( + pattern.to_owned(), + )); + } + for prefix in VALID_PREFIXES.iter() { + if pattern.starts_with(prefix.as_bytes()) { + return Ok(()); + } + } + Err(SparseConfigError::InvalidNarrowPrefix(pattern.to_owned())) +} + fn validate_patterns(patterns: &[u8]) -> Result<(), SparseConfigError> { for pattern in patterns.split(|c| *c == b'\n') { if pattern.is_empty() { + // TODO: probably not intentionally allowed (only because `split` + // produces "fake" empty line at the end) continue; } - for prefix in VALID_PREFIXES.iter() { - if pattern.starts_with(prefix.as_bytes()) { - return Ok(()); - } - } - return Err(SparseConfigError::InvalidNarrowPrefix( - pattern.to_owned(), - )); + validate_pattern(pattern)? } Ok(()) }
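
For intuition, here is roughly what the tightened validation accepts and rejects, written as a hypothetical test next to the new helpers. It assumes `VALID_PREFIXES` covers `path:` and `rootfilesin:`, as in Mercurial's narrowspec:

// Illustrative expectations only; not part of the patch.
#[test]
fn validate_pattern_sketch() {
    assert!(validate_pattern(b"path:dir/subdir").is_ok());
    assert!(validate_pattern(b"rootfilesin:dir").is_ok());
    // Rejected: whitespace at either edge of the pattern.
    assert!(validate_pattern(b" path:dir").is_err());
    assert!(validate_pattern(b"path:dir\t").is_err());
    // Rejected: prefix not in VALID_PREFIXES.
    assert!(validate_pattern(b"glob:*.rs").is_err());
}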
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-core/src/operations/annotate.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,547 @@ +use std::borrow::Cow; + +use crate::{ + bdiff::{self, Lines}, + errors::HgError, + repo::Repo, + revlog::{ + changelog::Changelog, + filelog::{Filelog, FilelogRevisionData}, + manifest::Manifestlog, + }, + utils::{ + self, + hg_path::{HgPath, HgPathBuf}, + strings::{clean_whitespace, CleanWhitespace}, + }, + AncestorsIterator, FastHashMap, Graph, GraphError, Node, Revision, + NULL_REVISION, +}; +use itertools::Itertools as _; +use rayon::prelude::*; +use self_cell::self_cell; + +/// Options for [`annotate`]. +#[derive(Copy, Clone)] +pub struct AnnotateOptions { + pub treat_binary_as_text: bool, + pub follow_copies: bool, + pub whitespace: CleanWhitespace, +} + +/// The final result of annotating a file. +pub enum AnnotateOutput { + /// An annotated text file. + Text(ChangesetAnnotatedFile), + /// The file cannot be annotated because it is binary. + Binary, + /// The file was not found in the repository. + NotFound, +} + +/// A file with user-facing changeset annotations for each line. +pub struct ChangesetAnnotatedFile { + // The lines of the file, including original line endings. + pub lines: Vec<Vec<u8>>, + // List of annotations corresponding to `lines`. + pub annotations: Vec<ChangesetAnnotation>, +} + +/// A user-facing changeset annotation for one line. +pub struct ChangesetAnnotation { + /// The file path as it was at `revision`. This can be different from the + /// file's current path if it was copied or renamed in the past. + pub path: HgPathBuf, + /// The changelog revision that introduced the line. + pub revision: Revision, + /// The one-based line number in the original file. + pub line_number: u32, +} + +self_cell!( + /// A wrapper around [`Lines`] that owns the buffer the lines point into. + /// The buffer contains the file text processed by [`clean_whitespace`]. + struct OwnedLines { + owner: Vec<u8>, + #[covariant] + dependent: Lines, + } +); + +impl OwnedLines { + /// Cleans `data` based on `whitespace` and then splits into lines. + fn split( + data: Vec<u8>, + whitespace: CleanWhitespace, + ) -> Result<Self, HgError> { + let data = match clean_whitespace(&data, whitespace) { + Cow::Borrowed(_) => data, + Cow::Owned(data) => data, + }; + Self::try_new(data, |data| bdiff::split_lines(data)) + } + + fn get(&self) -> &Lines { + self.borrow_dependent() + } +} + +/// A file with filelog annotations for each line. +struct AnnotatedFile { + lines: OwnedLines, + annotations: Vec<Annotation>, +} + +/// A filelog annotation for one line. +#[derive(Copy, Clone)] +struct Annotation { + /// The file revision that introduced the line. + id: FileId, + /// The one-based line number in the original file. + line_number: u32, +} + +/// Helper for keeping track of multiple filelogs. +#[derive(Default)] +struct FilelogSet { + /// List of filelogs. The first one is for the root file being blamed. + /// Others are added only when following copies/renames. + items: Vec<FilelogSetItem>, + /// Mapping of paths to indexes in `items`. + path_to_index: FastHashMap<HgPathBuf, FilelogIndex>, +} + +struct FilelogSetItem { + path: HgPathBuf, + filelog: Filelog, +} + +/// Identifies a filelog in a FilelogSet. +type FilelogIndex = u32; + +/// Identifies a file revision in a FilelogSet. +#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)] +struct FileId { + index: FilelogIndex, + revision: Revision, +} + +impl FilelogSet { + /// Returns filelog item at the given index. 
+ fn get(&self, index: FilelogIndex) -> &FilelogSetItem { + &self.items[index as usize] + } + + /// Opens a filelog by path and returns its index. + fn open( + &mut self, + repo: &Repo, + path: &HgPath, + ) -> Result<FilelogIndex, HgError> { + if let Some(&index) = self.path_to_index.get(path) { + return Ok(index); + } + let index = self.items.len() as FilelogIndex; + self.items.push(FilelogSetItem { + filelog: repo.filelog(path)?, + path: path.into(), + }); + self.path_to_index.insert(path.into(), index); + Ok(index) + } + + /// Opens a new filelog by path and returns the id for the given file node. + fn open_at_node( + &mut self, + repo: &Repo, + path: &HgPath, + node: Node, + ) -> Result<FileId, HgError> { + let index = self.open(repo, path)?; + let revision = + self.get(index).filelog.revlog.rev_from_node(node.into())?; + Ok(FileId { index, revision }) + } + + /// Reads the contents of a file by id. + fn read(&self, id: FileId) -> Result<FilelogRevisionData, HgError> { + self.get(id.index).filelog.entry(id.revision)?.data() + } + + /// Returns the parents of a file. If `follow_copies` is true, it treats + /// the copy source as a parent. In that case, also returns the file data + /// (since it has to read the file to extract the copy metadata). + fn parents( + &mut self, + repo: &Repo, + id: FileId, + follow_copies: bool, + ) -> Result<(Vec<FileId>, Option<Vec<u8>>), HgError> { + let filelog = &self.get(id.index).filelog; + let revisions = + filelog.parents(id.revision).map_err(from_graph_error)?; + let mut parents = Vec::with_capacity(2); + let mut file_data = None; + if revisions[0] != NULL_REVISION { + parents.push(FileId { + index: id.index, + revision: revisions[0], + }); + } else if follow_copies { + // A null p1 indicates there might be copy metadata. + // Check for it, and if present use it as the parent. + let data = filelog.entry(id.revision)?.data()?; + let meta = data.metadata()?.parse()?; + // If copy or copyrev occurs without the other, ignore it. + // This matches filerevisioncopied in storageutil.py. + if let (Some(copy), Some(copyrev)) = (meta.copy, meta.copyrev) { + parents.push(self.open_at_node(repo, copy, copyrev)?); + } + file_data = Some(data.into_file_data()?); + } + if revisions[1] != NULL_REVISION { + parents.push(FileId { + index: id.index, + revision: revisions[1], + }); + } + Ok((parents, file_data)) + } +} + +/// Per [`FileId`] information used in the [`annotate`] algorithm. +#[derive(Default)] +struct FileInfo { + /// Parents of this revision (via p1 and p2 or copy metadata). + parents: Option<Vec<FileId>>, + /// Current state for annotating the file. + file: AnnotatedFileState, + /// Remaining number of times `file` is needed before we can drop it. + needed: usize, + /// Current state for converting to a changelog revision. + revision: ChangelogRevisionState, + /// The value of `revision` from a descendant. If the linkrev needs + /// adjustment, we can start iterating the changelog here. + descendant: Option<Revision>, +} + +/// State enum for reading a file and annotating it. +#[derive(Default)] +enum AnnotatedFileState { + #[default] + None, + Read(OwnedLines), + Annotated(AnnotatedFile), +} + +/// State enum for converting a filelog revision to a changelog revision, but +/// only if needed (because it will appear in the final output). +#[derive(Default)] +enum ChangelogRevisionState { + #[default] + NotNeeded, + Needed, + Done(Revision), +} + +/// A collection of [`FileInfo`], forming a graph via [`FileInfo::parents`]. 
+#[derive(Default)] +struct FileGraph(FastHashMap<FileId, FileInfo>); + +impl FileGraph { + fn get_or_insert_default(&mut self, id: FileId) -> &mut FileInfo { + self.0.entry(id).or_default() + } +} + +impl std::ops::Index<FileId> for FileGraph { + type Output = FileInfo; + fn index(&self, id: FileId) -> &Self::Output { + self.0.get(&id).expect("the graph should be populated") + } +} + +impl std::ops::IndexMut<FileId> for FileGraph { + fn index_mut(&mut self, id: FileId) -> &mut Self::Output { + self.0.get_mut(&id).expect("the graph should be populated") + } +} + +/// Annotates each line of a file with changeset information. +pub fn annotate( + repo: &Repo, + path: &HgPath, + changelog_revision: Revision, + options: AnnotateOptions, +) -> Result<AnnotateOutput, HgError> { + // Step 1: Load the base file and check if it's binary. + let changelog = repo.changelog()?; + let manifestlog = repo.manifestlog()?; + let mut fls = FilelogSet::default(); + let base_id = { + let changelog_data = changelog.entry(changelog_revision)?.data()?; + let manifest = manifestlog + .data_for_node(changelog_data.manifest_node()?.into())?; + let Some(entry) = manifest.find_by_path(path)? else { + return Ok(AnnotateOutput::NotFound); + }; + fls.open_at_node(repo, path, entry.node_id()?)? + }; + let base_file_data = fls.read(base_id)?.into_file_data()?; + if !options.treat_binary_as_text + && utils::files::is_binary(&base_file_data) + { + return Ok(AnnotateOutput::Binary); + } + + // Step 2: DFS to build the graph. + let mut graph = FileGraph::default(); + let mut visit = vec![base_id]; + while let Some(id) = visit.pop() { + let info = graph.get_or_insert_default(id); + if info.parents.is_some() { + continue; + } + let (parents, file_data) = + fls.parents(repo, id, options.follow_copies)?; + info.parents = Some(parents.clone()); + if let Some(data) = file_data { + info.file = AnnotatedFileState::Read(OwnedLines::split( + data, + options.whitespace, + )?); + } + for id in parents { + let info = graph.get_or_insert_default(id); + info.needed += 1; + if info.parents.is_none() { + visit.push(id); + } + } + } + + // Step 3: Read files and split lines. Do the base file with and without + // whitespace cleaning. Do the rest of the files in parallel with rayon. + let base_file_original_lines = match options.whitespace { + CleanWhitespace::None => None, + _ => Some(OwnedLines::split( + base_file_data.clone(), + CleanWhitespace::None, + )?), + }; + graph[base_id].file = AnnotatedFileState::Read(OwnedLines::split( + base_file_data, + options.whitespace, + )?); + graph.0.par_iter_mut().try_for_each( + |(&id, info)| -> Result<(), HgError> { + if let AnnotatedFileState::None = info.file { + info.file = AnnotatedFileState::Read(OwnedLines::split( + fls.read(id)?.into_file_data()?, + options.whitespace, + )?); + } + Ok(()) + }, + )?; + + // Step 4: DFS to do the actual annotate algorithm. + // While we're at it, save the topological order. 
+    let mut topological_order = vec![];
+    visit.push(base_id);
+    while let Some(&id) = visit.last() {
+        let info = &mut graph[id];
+        if let AnnotatedFileState::Annotated(_) = info.file {
+            visit.pop();
+            continue;
+        }
+        let visit_len = visit.len();
+        let parents = info.parents.clone().expect("parents set in step 2");
+        for &id in &parents {
+            match graph[id].file {
+                AnnotatedFileState::Annotated(_) => {}
+                _ => visit.push(id),
+            }
+        }
+        if visit.len() != visit_len {
+            continue;
+        }
+        visit.pop();
+        topological_order.push(id);
+        let lines = match std::mem::take(&mut graph[id].file) {
+            AnnotatedFileState::Read(lines) => lines,
+            _ => unreachable!(),
+        };
+        let mut parent_files = Vec::with_capacity(2);
+        for &id in &parents {
+            match graph[id].file {
+                AnnotatedFileState::Annotated(ref file) => {
+                    parent_files.push(file)
+                }
+                _ => unreachable!(),
+            }
+        }
+        graph[id].file = AnnotatedFileState::Annotated(annotate_pair(
+            id,
+            lines,
+            parent_files,
+        )?);
+        for &id in &parents {
+            let info = &mut graph[id];
+            info.needed -= 1;
+            if info.needed == 0 {
+                info.file = AnnotatedFileState::None;
+            }
+        }
+    }
+
+    // Step 5: Map filelog revisions to changelog revisions.
+    let base_info = &mut graph[base_id];
+    base_info.descendant = Some(changelog_revision);
+    let AnnotatedFileState::Annotated(AnnotatedFile { lines, annotations }) =
+        std::mem::take(&mut base_info.file)
+    else {
+        panic!("the base file should have been annotated in step 4")
+    };
+    // Don't use the lines from the graph if they had whitespace cleaned.
+    let lines = base_file_original_lines.unwrap_or(lines);
+    // Only convert revisions that actually appear in the final output.
+    for &Annotation { id, .. } in &annotations {
+        graph[id].revision = ChangelogRevisionState::Needed;
+    }
+    // Use the same object for all ancestor checks, since it internally
+    // builds a hash set of seen revisions.
+    let mut ancestors = ancestor_iter(&changelog, changelog_revision, None);
+    // Iterate in reverse topological order so that we visit nodes after
+    // their children; that way we can propagate `descendant` correctly.
+    for &id in topological_order.iter().rev() {
+        let info = &mut graph[id];
+        let descendant =
+            info.descendant.expect("descendant set by prior iteration");
+        let propagate = match info.revision {
+            ChangelogRevisionState::NotNeeded => descendant,
+            ChangelogRevisionState::Needed => {
+                let revision = adjust_link_revision(
+                    &changelog,
+                    &manifestlog,
+                    &fls,
+                    &mut ancestors,
+                    descendant,
+                    id,
+                )?;
+                info.revision = ChangelogRevisionState::Done(revision);
+                revision
+            }
+            ChangelogRevisionState::Done(_) => unreachable!(),
+        };
+        for id in info.parents.clone().expect("parents set in step 2") {
+            let descendant = &mut graph[id].descendant;
+            // If the parent had other descendants, choose the smallest one
+            // because we want to skip over as much as possible.
+            *descendant = Some(descendant.unwrap_or(propagate).min(propagate));
+        }
+    }
+
+    // Step 6: Convert to `ChangesetAnnotatedFile`.
+ let mut changeset_annotations = Vec::with_capacity(annotations.len()); + for Annotation { id, line_number } in annotations { + changeset_annotations.push(ChangesetAnnotation { + path: fls.get(id.index).path.clone(), + revision: match graph[id].revision { + ChangelogRevisionState::Done(revision) => revision, + _ => unreachable!(), + }, + line_number, + }); + } + Ok(AnnotateOutput::Text(ChangesetAnnotatedFile { + lines: lines.get().iter().map(ToOwned::to_owned).collect(), + annotations: changeset_annotations, + })) +} + +/// Annotates a file by diffing against its parents, attributing changed lines +/// to `id`, and copying ids from the parent results for unchanged lines. +/// If there are two parents and a line is unchanged in both diffs, p2 wins. +fn annotate_pair( + id: FileId, + lines: OwnedLines, + parents: Vec<&AnnotatedFile>, +) -> Result<AnnotatedFile, HgError> { + let len = lines.get().len(); + let mut annotations = Vec::with_capacity(len); + for line_number in 1..(len + 1) as u32 { + annotations.push(Annotation { id, line_number }); + } + for parent in parents { + for bdiff::Hunk { a1, a2, b1, b2 } in + bdiff::diff(parent.lines.get(), lines.get())?.iter() + { + for (a, b) in (a1..a2).zip(b1..b2) { + annotations[b as usize] = parent.annotations[a as usize]; + } + } + } + Ok(AnnotatedFile { lines, annotations }) +} + +/// Creates an iterator over the ancestors of `base_revision` (inclusive), +/// stopping at `stop_revision` if provided. Panics if `base_revision` is null. +fn ancestor_iter( + changelog: &Changelog, + base_revision: Revision, + stop_revision: Option<Revision>, +) -> AncestorsIterator<&Changelog> { + AncestorsIterator::new( + changelog, + [base_revision], + stop_revision.unwrap_or(NULL_REVISION), + true, + ) + .expect("base_revision should not be null") +} + +/// If the linkrev of `id` is in `ancestors`, returns it. Otherwise, finds and +/// returns the first ancestor of `descendant` that introduced `id`. +fn adjust_link_revision( + changelog: &Changelog, + manifestlog: &Manifestlog, + fls: &FilelogSet, + ancestors: &mut AncestorsIterator<&Changelog>, + descendant: Revision, + id: FileId, +) -> Result<Revision, HgError> { + let FilelogSetItem { filelog, path } = fls.get(id.index); + let linkrev = filelog + .revlog + .link_revision(id.revision, &changelog.revlog)?; + if ancestors.contains(linkrev).map_err(from_graph_error)? { + return Ok(linkrev); + } + let file_node = *filelog.revlog.node_from_rev(id.revision); + for ancestor in ancestor_iter(changelog, descendant, Some(linkrev)) { + let ancestor = ancestor.map_err(from_graph_error)?; + let data = changelog.entry(ancestor)?.data()?; + if data.files().contains(&path.as_ref()) { + let manifest_rev = manifestlog + .revlog + .rev_from_node(data.manifest_node()?.into())?; + if let Some(entry) = manifestlog + .inexact_data_delta_parents(manifest_rev)? + .find_by_path(path)? + { + if entry.node_id()? == file_node { + return Ok(ancestor); + } + } + } + } + // In theory this should be unreachable. But in case it happens, return the + // linkrev. This matches _adjustlinkrev in context.py. + Ok(linkrev) +} + +/// Converts a [`GraphError`] to an [`HgError`]. +fn from_graph_error(err: GraphError) -> HgError { + HgError::corrupted(err.to_string()) +}
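
To show how the pieces above fit together, here is a sketch of a caller. The `print_blame` helper is hypothetical, and the output formatting loosely mirrors `hg annotate -nl`:

// Sketch only: drives the new annotate operation from inside hg-core.
use crate::operations::{annotate, AnnotateOptions, AnnotateOutput};
use crate::repo::Repo;
use crate::utils::hg_path::HgPath;
use crate::utils::strings::CleanWhitespace;
use crate::{errors::HgError, Revision};

fn print_blame(
    repo: &Repo,
    path: &HgPath,
    rev: Revision,
) -> Result<(), HgError> {
    let options = AnnotateOptions {
        treat_binary_as_text: false,
        follow_copies: true,
        whitespace: CleanWhitespace::None,
    };
    match annotate(repo, path, rev, options)? {
        AnnotateOutput::Text(file) => {
            // Lines keep their original endings, hence print! not println!.
            for (line, ann) in file.lines.iter().zip(&file.annotations) {
                print!(
                    "{}:{}: {}",
                    ann.revision,
                    ann.line_number,
                    String::from_utf8_lossy(line)
                );
            }
        }
        AnnotateOutput::Binary => println!("(binary file)"),
        AnnotateOutput::NotFound => println!("(file not found)"),
    }
    Ok(())
}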
--- a/rust/hg-core/src/operations/cat.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/operations/cat.rs Fri Feb 28 23:28:10 2025 +0100 @@ -85,11 +85,8 @@ ) -> Result<CatOutput<'a>, RevlogError> { let rev = crate::revset::resolve_single(revset, repo)?; let manifest = repo.manifest_for_rev(rev.into())?; - let node = *repo - .changelog()? - .node_from_unchecked_rev(rev.into()) - .expect("should succeed when repo.manifest did"); let mut results: Vec<(&'a HgPath, Vec<u8>)> = vec![]; + let node = *repo.changelog()?.node_from_rev(rev); let mut found_any = false; files.sort_unstable();
--- a/rust/hg-core/src/operations/mod.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/operations/mod.rs Fri Feb 28 23:28:10 2025 +0100 @@ -2,10 +2,14 @@ //! An operation is what can be done whereas a command is what is exposed by //! the cli. A single command can use several operations to achieve its goal. +mod annotate; mod cat; mod debugdata; mod list_tracked_files; mod status_rev_rev; +pub use annotate::{ + annotate, AnnotateOptions, AnnotateOutput, ChangesetAnnotation, +}; pub use cat::{cat, CatOutput}; pub use debugdata::debug_data; pub use list_tracked_files::{
--- a/rust/hg-core/src/repo.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/repo.rs Fri Feb 28 23:28:10 2025 +0100 @@ -15,7 +15,7 @@ use crate::utils::debug::debug_wait_for_file_or_print; use crate::utils::files::get_path_from_bytes; use crate::utils::hg_path::HgPath; -use crate::utils::SliceExt; +use crate::utils::strings::SliceExt; use crate::vfs::{is_dir, is_file, Vfs, VfsImpl}; use crate::{exit_codes, requirements, NodePrefix, UncheckedRevision}; use std::cell::{Ref, RefCell, RefMut};
--- a/rust/hg-core/src/requirements.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/requirements.rs Fri Feb 28 23:28:10 2025 +0100 @@ -1,6 +1,6 @@ use crate::errors::{HgError, HgResultExt}; use crate::repo::Repo; -use crate::utils::join_display; +use crate::utils::strings::join_display; use crate::vfs::VfsImpl; use std::collections::HashSet;
--- a/rust/hg-core/src/revlog/changelog.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/changelog.rs Fri Feb 28 23:28:10 2025 +0100 @@ -53,7 +53,7 @@ } /// Same as [`Self::entry_for_unchecked_rev`] for a checked revision - fn entry(&self, rev: Revision) -> Result<ChangelogEntry, RevlogError> { + pub fn entry(&self, rev: Revision) -> Result<ChangelogEntry, RevlogError> { let revlog_entry = self.revlog.get_entry(rev)?; Ok(ChangelogEntry { revlog_entry }) } @@ -71,11 +71,15 @@ self.entry_for_unchecked_rev(rev)?.data() } + pub fn node_from_rev(&self, rev: Revision) -> &Node { + self.revlog.node_from_rev(rev) + } + pub fn node_from_unchecked_rev( &self, rev: UncheckedRevision, ) -> Option<&Node> { - self.revlog.node_from_rev(rev) + self.revlog.node_from_unchecked_rev(rev) } pub fn rev_from_node(
--- a/rust/hg-core/src/revlog/file_io.rs	Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-core/src/revlog/file_io.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -28,10 +28,15 @@
     vfs: Box<dyn Vfs>,
     /// Filename of the open file, relative to the vfs root
     pub filename: PathBuf,
-    /// The current read-only handle on the file, if any
-    pub reading_handle: RefCell<Option<FileHandle>>,
-    /// The current read-write handle on the file, if any
-    pub writing_handle: RefCell<Option<FileHandle>>,
+    /// The current read-only handle on the file, if any.
+    /// Specific to the current thread, since we don't want seeks to overlap
+    pub reading_handle: thread_local::ThreadLocal<RefCell<Option<FileHandle>>>,
+    /// The current read-write handle on the file, if any.
+    /// Specific to the current thread, since we don't want seeks to overlap,
+    /// and we can re-use the write handle for reading in certain contexts.
+    /// Logically, two concurrent writes are impossible because they are only
+    /// accessible through `&mut self` methods, which take a lock.
+    pub writing_handle: thread_local::ThreadLocal<RefCell<Option<FileHandle>>>,
 }
 
 impl RandomAccessFile {
@@ -41,8 +46,8 @@
         Self {
             vfs,
             filename,
-            reading_handle: RefCell::new(None),
-            writing_handle: RefCell::new(None),
+            reading_handle: thread_local::ThreadLocal::new(),
+            writing_handle: thread_local::ThreadLocal::new(),
         }
     }
 
@@ -62,7 +67,7 @@
     /// `pub` only for hg-cpython
     #[doc(hidden)]
     pub fn get_read_handle(&self) -> Result<FileHandle, HgError> {
-        if let Some(handle) = &*self.writing_handle.borrow() {
+        if let Some(handle) = &*self.writing_handle.get_or_default().borrow() {
             // Use a file handle being actively used for writes, if available.
             // There is some danger to doing this because reads will seek the
             // file.
@@ -70,7 +75,7 @@
             // before all writes, so we should be safe.
             return Ok(handle.clone());
         }
-        if let Some(handle) = &*self.reading_handle.borrow() {
+        if let Some(handle) = &*self.reading_handle.get_or_default().borrow() {
             return Ok(handle.clone());
         }
         // early returns done to work around borrowck being overzealous
@@ -81,20 +86,21 @@
             false,
             false,
         )?;
-        *self.reading_handle.borrow_mut() = Some(new_handle.clone());
+        *self.reading_handle.get_or_default().borrow_mut() =
+            Some(new_handle.clone());
         Ok(new_handle)
     }
 
     /// `pub` only for hg-cpython
     #[doc(hidden)]
     pub fn exit_reading_context(&self) {
-        self.reading_handle.take();
+        self.reading_handle.get().map(|h| h.take());
    }
 
     // Returns whether this file is currently open
     pub fn is_open(&self) -> bool {
-        self.reading_handle.borrow().is_some()
-            || self.writing_handle.borrow().is_some()
+        self.reading_handle.get_or_default().borrow().is_some()
+            || self.writing_handle.get_or_default().borrow().is_some()
     }
 }
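
The handle fields now live in `thread_local::ThreadLocal` cells: each thread gets its own `RefCell` slot, so seeks on one thread cannot disturb another. A minimal sketch of that pattern, assuming the `thread_local` crate as used in the diff:

// Illustrative only: one RefCell slot per thread.
use std::cell::RefCell;
use thread_local::ThreadLocal;

fn demo() {
    let slot: ThreadLocal<RefCell<Option<u64>>> = ThreadLocal::new();
    // get_or_default() creates this thread's slot on first use.
    *slot.get_or_default().borrow_mut() = Some(42);
    // Other threads would see a fresh, empty slot instead.
    assert_eq!(*slot.get_or_default().borrow(), Some(42));
}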
--- a/rust/hg-core/src/revlog/filelog.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/filelog.rs Fri Feb 28 23:28:10 2025 +0100 @@ -8,7 +8,7 @@ use crate::revlog::{Revlog, RevlogError}; use crate::utils::files::get_path_from_bytes; use crate::utils::hg_path::HgPath; -use crate::utils::SliceExt; +use crate::utils::strings::SliceExt; use crate::Graph; use crate::GraphError; use crate::Node; @@ -20,7 +20,7 @@ /// A specialized `Revlog` to work with file data logs. pub struct Filelog { /// The generic `revlog` format. - revlog: Revlog, + pub(crate) revlog: Revlog, } impl Graph for Filelog { @@ -57,7 +57,7 @@ file_node: impl Into<NodePrefix>, ) -> Result<FilelogRevisionData, RevlogError> { let file_rev = self.revlog.rev_from_node(file_node.into())?; - self.data_for_unchecked_rev(file_rev.into()) + Ok(self.entry(file_rev)?.data()?) } /// The given revision is that of the file as found in a filelog, not of a @@ -109,7 +109,7 @@ get_path_from_bytes(&encoded_bytes).into() } -pub struct FilelogEntry<'a>(RevlogEntry<'a>); +pub struct FilelogEntry<'a>(pub(crate) RevlogEntry<'a>); impl FilelogEntry<'_> { /// `self.data()` can be expensive, with decompression and delta
--- a/rust/hg-core/src/revlog/index.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/index.rs Fri Feb 28 23:28:10 2025 +0100 @@ -725,13 +725,11 @@ &self, rev: Revision, stop_rev: Option<Revision>, - using_general_delta: Option<bool>, ) -> Result<(Vec<Revision>, bool), HgError> { let mut current_rev = rev; let mut entry = self.get_entry(rev).unwrap(); let mut chain = vec![]; - let using_general_delta = - using_general_delta.unwrap_or_else(|| self.uses_generaldelta()); + let using_general_delta = self.uses_generaldelta(); while current_rev.0 != entry.base_revision_or_base_of_delta_chain().0 && stop_rev.map(|r| r != current_rev).unwrap_or(true) { @@ -882,9 +880,9 @@ if parent_base.0 == p1.0 { break; } - p1 = self.check_revision(parent_base).ok_or( - RevlogError::InvalidRevision(parent_base.to_string()), - )?; + p1 = self.check_revision(parent_base).ok_or_else(|| { + RevlogError::InvalidRevision(parent_base.to_string()) + })?; } while let Some(p2_entry) = self.get_entry(p2) { if p2_entry.compressed_len() != 0 || p2.0 == 0 { @@ -895,16 +893,16 @@ if parent_base.0 == p2.0 { break; } - p2 = self.check_revision(parent_base).ok_or( - RevlogError::InvalidRevision(parent_base.to_string()), - )?; + p2 = self.check_revision(parent_base).ok_or_else(|| { + RevlogError::InvalidRevision(parent_base.to_string()) + })?; } if base == p1.0 || base == p2.0 { return Ok(false); } - rev = self - .check_revision(base.into()) - .ok_or(RevlogError::InvalidRevision(base.to_string()))?; + rev = self.check_revision(base.into()).ok_or_else(|| { + RevlogError::InvalidRevision(base.to_string()) + })?; } Ok(rev == NULL_REVISION) }
--- a/rust/hg-core/src/revlog/inner_revlog.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/inner_revlog.rs Fri Feb 28 23:28:10 2025 +0100 @@ -2,11 +2,10 @@ //! IO work and expensive operations. use std::{ borrow::Cow, - cell::RefCell, io::{ErrorKind, Seek, SeekFrom, Write}, ops::Deref, path::PathBuf, - sync::{Arc, Mutex}, + sync::{Arc, Mutex, RwLock}, }; use schnellru::{ByMemoryUsage, LruMap}; @@ -103,7 +102,7 @@ data_config.uncompressed_cache_factor.map( // Arbitrary initial value // TODO check if using a hasher specific to integers is useful - |_factor| RefCell::new(LruMap::with_memory_budget(65536)), + |_factor| RwLock::new(LruMap::with_memory_budget(65536)), ); let inline = index.is_inline(); @@ -156,7 +155,7 @@ // We don't clear the allocation here because it's probably faster. // We could change our minds later if this ends up being a problem // with regards to memory consumption. - cache.borrow_mut().clear(); + cache.write().expect("lock is poisoned").clear(); } } @@ -293,11 +292,7 @@ rev: Revision, stop_rev: Option<Revision>, ) -> Result<(Vec<Revision>, bool), HgError> { - self.index.delta_chain( - rev, - stop_rev, - self.delta_config.general_delta.into(), - ) + self.index.delta_chain(rev, stop_rev) } /// Generate a possibly-compressed representation of data. @@ -395,8 +390,12 @@ /// Return the uncompressed raw data for `rev` pub fn chunk_for_rev(&self, rev: Revision) -> Result<Arc<[u8]>, HgError> { - if let Some(cache) = self.uncompressed_chunk_cache.as_ref() { - if let Some(chunk) = cache.borrow_mut().get(&rev) { + if let Some(Ok(mut cache)) = self + .uncompressed_chunk_cache + .as_ref() + .map(|c| c.try_write()) + { + if let Some(chunk) = cache.get(&rev) { return Ok(chunk.clone()); } } @@ -410,8 +409,12 @@ ) })?; let uncompressed: Arc<[u8]> = Arc::from(uncompressed.into_owned()); - if let Some(cache) = self.uncompressed_chunk_cache.as_ref() { - cache.borrow_mut().insert(rev, uncompressed.clone()); + if let Some(Ok(mut cache)) = self + .uncompressed_chunk_cache + .as_ref() + .map(|c| c.try_write()) + { + cache.insert(rev, uncompressed.clone()); } Ok(uncompressed) } @@ -479,25 +482,27 @@ } else { None }; - if let Some(cache) = &self.uncompressed_chunk_cache { - let cache = &mut cache.borrow_mut(); - if let Some(size) = raw_size { - // Dynamically update the uncompressed_chunk_cache size to the - // largest revision we've seen in this revlog. - // Do it *before* restoration in case the current revision - // is the largest. - let factor = self - .data_config - .uncompressed_cache_factor - .expect("cache should not exist without factor"); - let candidate_size = (size as f64 * factor) as usize; - let limiter_mut = cache.limiter_mut(); - if candidate_size > limiter_mut.max_memory_usage() { - std::mem::swap( - limiter_mut, - &mut ByMemoryUsage::new(candidate_size), - ); - } + if let (Some(size), Some(Ok(mut cache))) = ( + raw_size, + self.uncompressed_chunk_cache + .as_ref() + .map(|c| c.try_write()), + ) { + // Dynamically update the uncompressed_chunk_cache size to the + // largest revision we've seen in this revlog. + // Do it *before* restoration in case the current revision + // is the largest. 
+ let factor = self + .data_config + .uncompressed_cache_factor + .expect("cache should not exist without factor"); + let candidate_size = (size as f64 * factor) as usize; + let limiter_mut = cache.limiter_mut(); + if candidate_size > limiter_mut.max_memory_usage() { + std::mem::swap( + limiter_mut, + &mut ByMemoryUsage::new(candidate_size), + ); } } entry.rawdata(cached_rev, get_buffer)?; @@ -526,12 +531,15 @@ match self.uncompressed_chunk_cache.as_ref() { Some(cache) => { - let mut cache = cache.borrow_mut(); - for rev in revs.iter() { - match cache.get(rev) { - Some(hit) => chunks.push((*rev, hit.to_owned())), - None => fetched_revs.push(*rev), + if let Ok(mut cache) = cache.try_write() { + for rev in revs.iter() { + match cache.get(rev) { + Some(hit) => chunks.push((*rev, hit.to_owned())), + None => fetched_revs.push(*rev), + } } + } else { + fetched_revs = revs } } None => fetched_revs = revs, @@ -592,8 +600,11 @@ Ok(()) })?; - if let Some(cache) = self.uncompressed_chunk_cache.as_ref() { - let mut cache = cache.borrow_mut(); + if let Some(Ok(mut cache)) = self + .uncompressed_chunk_cache + .as_ref() + .map(|c| c.try_write()) + { for (rev, chunk) in chunks.iter().skip(already_cached) { cache.insert(*rev, chunk.clone()); } @@ -814,8 +825,8 @@ #[doc(hidden)] pub fn exit_writing_context(&mut self) { self.writing_handles.take(); - self.segment_file.writing_handle.take(); - self.segment_file.reading_handle.take(); + self.segment_file.writing_handle.get().map(|h| h.take()); + self.segment_file.reading_handle.get().map(|h| h.take()); } /// `pub` only for use in hg-cpython @@ -878,7 +889,11 @@ index_handle: index_handle.clone(), data_handle: data_handle.clone(), }); - *self.segment_file.reading_handle.borrow_mut() = if self.is_inline() { + *self + .segment_file + .reading_handle + .get_or_default() + .borrow_mut() = if self.is_inline() { Some(index_handle) } else { data_handle @@ -958,7 +973,7 @@ if let Some(handles) = &mut self.writing_handles { handles.index_handle.flush()?; self.writing_handles.take(); - self.segment_file.writing_handle.take(); + self.segment_file.writing_handle.get().map(|h| h.take()); } let mut new_data_file_handle = self.vfs.create(&self.data_file, true)?; @@ -1024,7 +1039,11 @@ index_handle: self.index_write_handle()?, data_handle: new_data_handle.clone(), }); - *self.segment_file.writing_handle.borrow_mut() = new_data_handle; + *self + .segment_file + .writing_handle + .get_or_default() + .borrow_mut() = new_data_handle; } Ok(self.index_file.to_owned()) @@ -1275,10 +1294,8 @@ } } -/// The use of a [`Refcell`] assumes that a given revlog will only -/// be accessed (read or write) by a single thread. type UncompressedChunkCache = - RefCell<LruMap<Revision, Arc<[u8]>, ByMemoryUsage>>; + RwLock<LruMap<Revision, Arc<[u8]>, ByMemoryUsage>>; /// The node, revision and data for the last revision we've seen. Speeds up /// a lot of sequential operations of the revlog.
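
The cache changes above follow a non-blocking pattern: `try_write` is used everywhere, so a contended lock degrades to a cache miss instead of stalling the caller. A generic sketch of that pattern, illustrative only and using a plain HashMap rather than the LruMap from the diff:

// Illustrative only: skip the cache on lock contention instead of blocking.
use std::collections::HashMap;
use std::sync::RwLock;

fn cached_or_compute(cache: &RwLock<HashMap<u32, String>>, key: u32) -> String {
    if let Ok(mut guard) = cache.try_write() {
        if let Some(hit) = guard.get(&key) {
            return hit.clone();
        }
    }
    let value = format!("computed-{key}");
    // Best effort: if another thread holds the lock, simply don't cache.
    if let Ok(mut guard) = cache.try_write() {
        guard.insert(key, value.clone());
    }
    value
}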
--- a/rust/hg-core/src/revlog/manifest.rs	Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-core/src/revlog/manifest.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -1,14 +1,12 @@
 use std::num::NonZeroU8;
 
 use crate::errors::HgError;
-use crate::revlog::{Node, NodePrefix};
-use crate::revlog::{Revlog, RevlogError};
+use crate::revlog::options::RevlogOpenOptions;
+use crate::revlog::{Node, NodePrefix, Revlog, RevlogError};
 use crate::utils::hg_path::HgPath;
-use crate::utils::SliceExt;
+use crate::utils::strings::SliceExt;
 use crate::vfs::VfsImpl;
-use crate::{Graph, GraphError, Revision, UncheckedRevision};
-
-use super::options::RevlogOpenOptions;
+use crate::{Graph, GraphError, Revision, UncheckedRevision, NULL_REVISION};
 
 /// A specialized `Revlog` to work with `manifest` data format.
 pub struct Manifestlog {
@@ -66,6 +64,27 @@
         let bytes = self.revlog.get_data(rev)?.into_owned();
         Ok(Manifest { bytes })
     }
+
+    /// Returns a manifest containing entries for `rev` that are not in its
+    /// parents. It is inexact because it might return a superset of those
+    /// entries. Equivalent to `manifestctx.read_delta_parents(exact=False)`
+    /// in Python.
+    pub fn inexact_data_delta_parents(
+        &self,
+        rev: Revision,
+    ) -> Result<Manifest, RevlogError> {
+        let delta_parent = self.revlog.delta_parent(rev);
+        let parents = self.parents(rev).map_err(|err| {
+            RevlogError::corrupted(format!("rev {rev}: {err}"))
+        })?;
+        if delta_parent == NULL_REVISION || !parents.contains(&delta_parent) {
+            return self.data(rev);
+        }
+        let mut bytes = vec![];
+        for chunk in self.revlog.get_data_incr(rev)?.as_patch_list()?.chunks {
+            bytes.extend_from_slice(chunk.data);
+        }
+        Ok(Manifest { bytes })
+    }
 }
 
 /// `Manifestlog` entry which knows how to interpret the `manifest` data bytes.
--- a/rust/hg-core/src/revlog/mod.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/mod.rs Fri Feb 28 23:28:10 2025 +0100 @@ -136,7 +136,10 @@ #[derive(Clone, Debug, PartialEq)] pub enum GraphError { + /// Parent revision does not exist, i.e. below 0 or above max revision. ParentOutOfRange(Revision), + /// Parent revision number is greater than one of its descendants. + ParentOutOfOrder(Revision), } impl std::fmt::Display for GraphError { @@ -145,6 +148,9 @@ GraphError::ParentOutOfRange(revision) => { write!(f, "parent out of range ({})", revision) } + GraphError::ParentOutOfOrder(revision) => { + write!(f, "parent out of order ({})", revision) + } } } } @@ -343,12 +349,19 @@ /// Returns the node ID for the given revision number, if it exists in this /// revlog - pub fn node_from_rev(&self, rev: UncheckedRevision) -> Option<&Node> { - if rev == NULL_REVISION.into() { - return Some(&NULL_NODE); + pub fn node_from_rev(&self, rev: Revision) -> &Node { + match self.index().get_entry(rev) { + None => &NULL_NODE, + Some(entry) => entry.hash(), } - let rev = self.index().check_revision(rev)?; - Some(self.index().get_entry(rev)?.hash()) + } + + /// Like [`Self::node_from_rev`] but checks `rev` first. + pub fn node_from_unchecked_rev( + &self, + rev: UncheckedRevision, + ) -> Option<&Node> { + Some(self.node_from_rev(self.index().check_revision(rev)?)) } /// Return the revision number for the given node ID, if it exists in this @@ -361,7 +374,9 @@ nodemap .find_bin(self.index(), node) .map_err(|err| (err, format!("{:x}", node)))? - .ok_or(RevlogError::InvalidRevision(format!("{:x}", node))) + .ok_or_else(|| { + RevlogError::InvalidRevision(format!("{:x}", node)) + }) } else { self.index().rev_from_node_no_persistent_nodemap(node) } @@ -386,6 +401,36 @@ self.inner.get_entry_for_unchecked_rev(rev) } + /// Returns the delta parent of the given revision. + pub fn delta_parent(&self, rev: Revision) -> Revision { + if rev == NULL_REVISION { + NULL_REVISION + } else { + self.inner.delta_parent(rev) + } + } + + /// Returns the link revision (a.k.a. "linkrev") of the given revision. + /// Returns an error if the linkrev does not exist in `linked_revlog`. + pub fn link_revision( + &self, + rev: Revision, + linked_revlog: &Self, + ) -> Result<Revision, RevlogError> { + let Some(entry) = self.index().get_entry(rev) else { + return Ok(NULL_REVISION); + }; + linked_revlog + .index() + .check_revision(entry.link_revision()) + .ok_or_else(|| { + RevlogError::corrupted(format!( + "linkrev for rev {} is invalid", + rev + )) + }) + } + /// Return the full data associated to a revision. /// /// All entries required to build the final data out of deltas will be @@ -409,6 +454,26 @@ self.get_entry(rev)?.data() } + /// Gets the raw uncompressed data stored for a revision, which is either + /// the full text or a delta. Panics if `rev` is null. + pub fn get_data_incr( + &self, + rev: Revision, + ) -> Result<RawdataBuf, RevlogError> { + let index = self.index(); + let entry = index.get_entry(rev).expect("rev should not be null"); + let delta_base = entry.base_revision_or_base_of_delta_chain(); + let base = if UncheckedRevision::from(rev) == delta_base { + None + } else if index.uses_generaldelta() { + Some(delta_base) + } else { + Some(UncheckedRevision(rev.0 - 1)) + }; + let data = self.inner.chunk_for_rev(rev)?; + Ok(RawdataBuf { base, data }) + } + /// Check the hash of some given data against the recorded hash. 
pub fn check_hash( &self, @@ -441,6 +506,21 @@ } } +pub struct RawdataBuf { + // If `Some`, data is a delta. + base: Option<UncheckedRevision>, + data: std::sync::Arc<[u8]>, +} + +impl RawdataBuf { + fn as_patch_list(&self) -> Result<patch::PatchList, RevlogError> { + match self.base { + None => Ok(patch::PatchList::full_snapshot(&self.data)), + Some(_) => patch::PatchList::new(&self.data), + } + } +} + type IndexData = Box<dyn Deref<Target = [u8]> + Send + Sync>; /// TODO We should check for version 5.14+ at runtime, but we either should
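
To keep the new lookup variants straight, a tiny in-crate sketch; the `show` helper is hypothetical:

// Sketch: the checked vs. unchecked node lookups introduced above.
use crate::revlog::Revlog;
use crate::{Node, Revision, UncheckedRevision};

fn show(revlog: &Revlog, rev: Revision, raw: UncheckedRevision) {
    // Checked revision: infallible; a missing entry yields NULL_NODE.
    let node: &Node = revlog.node_from_rev(rev);
    // Unchecked revision (e.g. parsed user input): validated first,
    // so the caller gets an Option.
    let maybe: Option<&Node> = revlog.node_from_unchecked_rev(raw);
    let _ = (node, maybe);
}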
--- a/rust/hg-core/src/revlog/nodemap.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/nodemap.rs Fri Feb 28 23:28:10 2025 +0100 @@ -25,6 +25,8 @@ use std::ops::Deref; use std::ops::Index; +type NodeTreeBuffer = Box<dyn Deref<Target = [u8]> + Send + Sync>; + #[derive(Debug, PartialEq)] pub enum NodeMapError { /// A `NodePrefix` matches several [`Revision`]s. @@ -224,7 +226,7 @@ /// The mutable root [`Block`] is kept apart so that we don't have to rebump /// it on each insertion. pub struct NodeTree { - readonly: Box<dyn Deref<Target = [Block]> + Send>, + readonly: Box<dyn Deref<Target = [Block]> + Send + Sync>, growable: Vec<Block>, root: Block, masked_inner_blocks: usize, @@ -299,7 +301,7 @@ /// Initiate a NodeTree from an immutable slice-like of `Block` /// /// We keep `readonly` and clone its root block if it isn't empty. - fn new(readonly: Box<dyn Deref<Target = [Block]> + Send>) -> Self { + fn new(readonly: Box<dyn Deref<Target = [Block]> + Send + Sync>) -> Self { let root = readonly.last().cloned().unwrap_or_else(Block::new); NodeTree { readonly, @@ -321,17 +323,14 @@ /// First use-case for this would be to support Mercurial shell hooks. /// /// panics if `buffer` is smaller than `amount` - pub fn load_bytes( - bytes: Box<dyn Deref<Target = [u8]> + Send>, - amount: usize, - ) -> Self { + pub fn load_bytes(bytes: NodeTreeBuffer, amount: usize) -> Self { NodeTree::new(Box::new(NodeTreeBytes::new(bytes, amount))) } /// Retrieve added [`Block`]s and the original immutable data pub fn into_readonly_and_added( self, - ) -> (Box<dyn Deref<Target = [Block]> + Send>, Vec<Block>) { + ) -> (Box<dyn Deref<Target = [Block]> + Send + Sync>, Vec<Block>) { let mut vec = self.growable; let readonly = self.readonly; if readonly.last() != Some(&self.root) { @@ -344,7 +343,7 @@ /// storage pub fn into_readonly_and_added_bytes( self, - ) -> (Box<dyn Deref<Target = [Block]> + Send>, Vec<u8>) { + ) -> (Box<dyn Deref<Target = [Block]> + Send + Sync>, Vec<u8>) { let (readonly, vec) = self.into_readonly_and_added(); // Prevent running `v`'s destructor so we are in complete control // of the allocation. @@ -562,15 +561,12 @@ } pub struct NodeTreeBytes { - buffer: Box<dyn Deref<Target = [u8]> + Send>, + buffer: NodeTreeBuffer, len_in_blocks: usize, } impl NodeTreeBytes { - fn new( - buffer: Box<dyn Deref<Target = [u8]> + Send>, - amount: usize, - ) -> Self { + fn new(buffer: NodeTreeBuffer, amount: usize) -> Self { assert!(buffer.len() >= amount); let len_in_blocks = amount / size_of::<Block>(); NodeTreeBytes {
--- a/rust/hg-core/src/revlog/options.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/revlog/options.rs Fri Feb 28 23:28:10 2025 +0100 @@ -183,12 +183,6 @@ } } - if let Some(mmap_index_threshold) = config - .get_byte_size(b"storage", b"revlog.mmap.index:size-threshold")? - { - data_config.mmap_index_threshold = Some(mmap_index_threshold); - } - let with_sparse_read = config.get_bool(b"experimental", b"sparse-read")?; if let Some(sr_density_threshold) = config
--- a/rust/hg-core/src/revlog/patch.rs Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-core/src/revlog/patch.rs Fri Feb 28 23:28:10 2025 +0100
@@ -12,13 +12,13 @@
 /// - a replacement when `!data.is_empty() && start < end`
 /// - not doing anything when `data.is_empty() && start == end`
 #[derive(Debug, Clone)]
-struct Chunk<'a> {
+pub(crate) struct Chunk<'a> {
     /// The start position of the chunk of data to replace
-    start: u32,
+    pub(crate) start: u32,
     /// The end position of the chunk of data to replace (open end interval)
-    end: u32,
+    pub(crate) end: u32,
     /// The data replacing the chunk
-    data: &'a [u8],
+    pub(crate) data: &'a [u8],
 }
 
 impl Chunk<'_> {
@@ -60,7 +60,7 @@
     /// - ordered from the left-most replacement to the right-most replacement
     /// - non-overlapping, meaning that two chunks cannot change the same
     ///   chunk of the patched data
-    chunks: Vec<Chunk<'a>>,
+    pub(crate) chunks: Vec<Chunk<'a>>,
 }
 
 impl<'a> PatchList<'a> {
@@ -85,6 +85,17 @@
         Ok(PatchList { chunks })
     }
 
+    /// Creates a patch for a full snapshot, going from nothing to `data`.
+    pub fn full_snapshot(data: &'a [u8]) -> Self {
+        Self {
+            chunks: vec![Chunk {
+                start: 0,
+                end: 0,
+                data,
+            }],
+        }
+    }
+
     /// Apply the patch to some data.
     pub fn apply<T>(
         &self,
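To see why `full_snapshot` is a single chunk over the empty range [0, 0): applying a chunk copies the bytes before `start`, emits `data`, and resumes after `end`. A standalone sketch of that application loop (a simplified stand-in with usize offsets, not the actual `PatchList::apply`):

// Simplified chunk: usize offsets instead of the u32 used above.
struct Chunk<'a> {
    start: usize,
    end: usize,
    data: &'a [u8],
}

fn apply(initial: &[u8], chunks: &[Chunk]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut pos = 0;
    for chunk in chunks {
        out.extend_from_slice(&initial[pos..chunk.start]); // unchanged prefix
        out.extend_from_slice(chunk.data); // replacement bytes
        pos = chunk.end; // skip the replaced span
    }
    out.extend_from_slice(&initial[pos..]); // unchanged suffix
    out
}

fn main() {
    // A "full snapshot" is one insertion over [0, 0) of the empty text.
    let full = [Chunk { start: 0, end: 0, data: b"whole new text" }];
    assert_eq!(apply(b"", &full), b"whole new text".to_vec());

    // An ordinary delta replaces a span of the previous text.
    let delta = [Chunk { start: 6, end: 9, data: b"old" }];
    assert_eq!(apply(b"hello new world", &delta), b"hello old world".to_vec());
}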
--- a/rust/hg-core/src/sparse.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/sparse.rs Fri Feb 28 23:28:10 2025 +0100 @@ -17,7 +17,7 @@ operations::cat, repo::Repo, requirements::SPARSE_REQUIREMENT, - utils::{hg_path::HgPath, SliceExt}, + utils::{hg_path::HgPath, strings::SliceExt}, Revision, NULL_REVISION, }; @@ -89,6 +89,10 @@ /// An invalid pattern prefix was given to the narrow spec. Includes the /// entire pattern for context. InvalidNarrowPrefix(Vec<u8>), + /// Narrow/sparse patterns can not begin or end in whitespace + /// because the Python parser strips the whitespace when parsing + /// the config file. + WhitespaceAtEdgeOfPattern(Vec<u8>), #[from] HgError(HgError), #[from] @@ -138,6 +142,20 @@ VALID_PREFIXES.join(", ") )), }, + SparseConfigError::WhitespaceAtEdgeOfPattern(vec) => { + HgError::Abort { + message: String::from_utf8_lossy(&format_bytes!( + b"narrow pattern with whitespace at the edge: {}", + vec + )) + .to_string(), + detailed_exit_code: STATE_ERROR, + hint: Some( + "narrow patterns can't begin or end in whitespace" + .to_string(), + ), + } + } SparseConfigError::HgError(hg_error) => hg_error, SparseConfigError::PatternError(pattern_error) => HgError::Abort { message: pattern_error.to_string(),
--- a/rust/hg-core/src/utils.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/utils.rs Fri Feb 28 23:28:10 2025 +0100 @@ -8,222 +8,17 @@ //! Contains useful functions, traits, structs, etc. for use in core. use crate::errors::{HgError, IoErrorContext}; -use crate::utils::hg_path::HgPath; use im_rc::ordmap::DiffItem; use im_rc::ordmap::OrdMap; use itertools::EitherOrBoth; use itertools::Itertools; -use std::cell::Cell; use std::cmp::Ordering; -use std::fmt; -use std::{io::Write, ops::Deref}; pub mod debug; pub mod files; pub mod hg_path; pub mod path_auditor; - -/// Useful until rust/issues/56345 is stable -/// -/// # Examples -/// -/// ``` -/// use crate::hg::utils::find_slice_in_slice; -/// -/// let haystack = b"This is the haystack".to_vec(); -/// assert_eq!(find_slice_in_slice(&haystack, b"the"), Some(8)); -/// assert_eq!(find_slice_in_slice(&haystack, b"not here"), None); -/// ``` -pub fn find_slice_in_slice<T>(slice: &[T], needle: &[T]) -> Option<usize> -where - for<'a> &'a [T]: PartialEq, -{ - slice - .windows(needle.len()) - .position(|window| window == needle) -} - -/// Replaces the `from` slice with the `to` slice inside the `buf` slice. -/// -/// # Examples -/// -/// ``` -/// use crate::hg::utils::replace_slice; -/// let mut line = b"I hate writing tests!".to_vec(); -/// replace_slice(&mut line, b"hate", b"love"); -/// assert_eq!( -/// line, -/// b"I love writing tests!".to_vec() -/// ); -/// ``` -pub fn replace_slice<T>(buf: &mut [T], from: &[T], to: &[T]) -where - T: Clone + PartialEq, -{ - if buf.len() < from.len() || from.len() != to.len() { - return; - } - for i in 0..=buf.len() - from.len() { - if buf[i..].starts_with(from) { - buf[i..(i + from.len())].clone_from_slice(to); - } - } -} - -pub trait SliceExt { - fn trim_end(&self) -> &Self; - fn trim_start(&self) -> &Self; - fn trim_end_matches(&self, f: impl FnMut(u8) -> bool) -> &Self; - fn trim_start_matches(&self, f: impl FnMut(u8) -> bool) -> &Self; - fn trim(&self) -> &Self; - fn drop_prefix(&self, needle: &Self) -> Option<&Self>; - fn split_2(&self, separator: u8) -> Option<(&[u8], &[u8])>; - fn split_2_by_slice(&self, separator: &[u8]) -> Option<(&[u8], &[u8])>; -} - -impl SliceExt for [u8] { - fn trim_end(&self) -> &[u8] { - self.trim_end_matches(|byte| byte.is_ascii_whitespace()) - } - - fn trim_start(&self) -> &[u8] { - self.trim_start_matches(|byte| byte.is_ascii_whitespace()) - } - - fn trim_end_matches(&self, mut f: impl FnMut(u8) -> bool) -> &Self { - if let Some(last) = self.iter().rposition(|&byte| !f(byte)) { - &self[..=last] - } else { - &[] - } - } - - fn trim_start_matches(&self, mut f: impl FnMut(u8) -> bool) -> &Self { - if let Some(first) = self.iter().position(|&byte| !f(byte)) { - &self[first..] 
- } else { - &[] - } - } - - /// ``` - /// use hg::utils::SliceExt; - /// assert_eq!( - /// b" to trim ".trim(), - /// b"to trim" - /// ); - /// assert_eq!( - /// b"to trim ".trim(), - /// b"to trim" - /// ); - /// assert_eq!( - /// b" to trim".trim(), - /// b"to trim" - /// ); - /// ``` - fn trim(&self) -> &[u8] { - self.trim_start().trim_end() - } - - fn drop_prefix(&self, needle: &Self) -> Option<&Self> { - if self.starts_with(needle) { - Some(&self[needle.len()..]) - } else { - None - } - } - - fn split_2(&self, separator: u8) -> Option<(&[u8], &[u8])> { - let pos = memchr::memchr(separator, self)?; - Some((&self[..pos], &self[pos + 1..])) - } - - fn split_2_by_slice(&self, separator: &[u8]) -> Option<(&[u8], &[u8])> { - find_slice_in_slice(self, separator) - .map(|pos| (&self[..pos], &self[pos + separator.len()..])) - } -} - -pub trait Escaped { - /// Return bytes escaped for display to the user - fn escaped_bytes(&self) -> Vec<u8>; -} - -impl Escaped for u8 { - fn escaped_bytes(&self) -> Vec<u8> { - let mut acc = vec![]; - match self { - c @ b'\'' | c @ b'\\' => { - acc.push(b'\\'); - acc.push(*c); - } - b'\t' => { - acc.extend(br"\\t"); - } - b'\n' => { - acc.extend(br"\\n"); - } - b'\r' => { - acc.extend(br"\\r"); - } - c if (*c < b' ' || *c >= 127) => { - write!(acc, "\\x{:x}", self).unwrap(); - } - c => { - acc.push(*c); - } - } - acc - } -} - -impl<'a, T: Escaped> Escaped for &'a [T] { - fn escaped_bytes(&self) -> Vec<u8> { - self.iter().flat_map(Escaped::escaped_bytes).collect() - } -} - -impl<T: Escaped> Escaped for Vec<T> { - fn escaped_bytes(&self) -> Vec<u8> { - self.deref().escaped_bytes() - } -} - -impl<'a> Escaped for &'a HgPath { - fn escaped_bytes(&self) -> Vec<u8> { - self.as_bytes().escaped_bytes() - } -} - -#[cfg(unix)] -pub fn shell_quote(value: &[u8]) -> Vec<u8> { - if value.iter().all(|&byte| { - matches!( - byte, - b'a'..=b'z' - | b'A'..=b'Z' - | b'0'..=b'9' - | b'.' - | b'_' - | b'/' - | b'+' - | b'-' - ) - }) { - value.to_owned() - } else { - let mut quoted = Vec::with_capacity(value.len() + 2); - quoted.push(b'\''); - for &byte in value { - if byte == b'\'' { - quoted.push(b'\\'); - } - quoted.push(byte); - } - quoted.push(b'\''); - quoted - } -} +pub mod strings; pub fn current_dir() -> Result<std::path::PathBuf, HgError> { std::env::current_dir().map_err(|error| HgError::IoError { @@ -239,59 +34,6 @@ }) } -/// Expand `$FOO` and `${FOO}` environment variables in the given byte string -pub fn expand_vars(s: &[u8]) -> std::borrow::Cow<[u8]> { - lazy_static::lazy_static! { - /// https://github.com/python/cpython/blob/3.9/Lib/posixpath.py#L301 - /// The `x` makes whitespace ignored. - /// `-u` disables the Unicode flag, which makes `\w` like Python with the ASCII flag. - static ref VAR_RE: regex::bytes::Regex = - regex::bytes::Regex::new(r"(?x-u) - \$ - (?: - (\w+) - | - \{ - ([^}]*) - \} - ) - ").unwrap(); - } - VAR_RE.replace_all(s, |captures: ®ex::bytes::Captures| { - let var_name = files::get_os_str_from_bytes( - captures - .get(1) - .or_else(|| captures.get(2)) - .expect("either side of `|` must participate in match") - .as_bytes(), - ); - std::env::var_os(var_name) - .map(files::get_bytes_from_os_str) - .unwrap_or_else(|| { - // Referencing an environment variable that does not exist. - // Leave the $FOO reference as-is. - captures[0].to_owned() - }) - }) -} - -#[test] -fn test_expand_vars() { - // Modifying process-global state in a test isn’t great, - // but hopefully this won’t collide with anything. 
- std::env::set_var("TEST_EXPAND_VAR", "1"); - assert_eq!( - expand_vars(b"before/$TEST_EXPAND_VAR/after"), - &b"before/1/after"[..] - ); - assert_eq!( - expand_vars(b"before${TEST_EXPAND_VAR}${TEST_EXPAND_VAR}${TEST_EXPAND_VAR}after"), - &b"before111after"[..] - ); - let s = b"before $SOME_LONG_NAME_THAT_WE_ASSUME_IS_NOT_AN_ACTUAL_ENV_VAR after"; - assert_eq!(expand_vars(s), &s[..]); -} - pub(crate) enum MergeResult<V> { Left, Right, @@ -440,46 +182,6 @@ } } -/// Join items of the iterable with the given separator, similar to Python’s -/// `separator.join(iter)`. -/// -/// Formatting the return value consumes the iterator. -/// Formatting it again will produce an empty string. -pub fn join_display( - iter: impl IntoIterator<Item = impl fmt::Display>, - separator: impl fmt::Display, -) -> impl fmt::Display { - JoinDisplay { - iter: Cell::new(Some(iter.into_iter())), - separator, - } -} - -struct JoinDisplay<I, S> { - iter: Cell<Option<I>>, - separator: S, -} - -impl<I, T, S> fmt::Display for JoinDisplay<I, S> -where - I: Iterator<Item = T>, - T: fmt::Display, - S: fmt::Display, -{ - fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { - if let Some(mut iter) = self.iter.take() { - if let Some(first) = iter.next() { - first.fmt(f)?; - } - for value in iter { - self.separator.fmt(f)?; - value.fmt(f)?; - } - } - Ok(()) - } -} - /// Like `Iterator::filter_map`, but over a fallible iterator of `Result`s. /// /// The callback is only called for incoming `Ok` values. Errors are passed
--- a/rust/hg-core/src/utils/files.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/utils/files.rs Fri Feb 28 23:28:10 2025 +0100 @@ -12,7 +12,7 @@ use crate::utils::{ hg_path::{path_to_hg_path_buf, HgPath, HgPathBuf, HgPathError}, path_auditor::PathAuditor, - replace_slice, + strings::replace_slice, }; use lazy_static::lazy_static; use same_file::is_same_file; @@ -333,6 +333,12 @@ .modified() } +/// Returns true if file content is considered to be binary (not text). +pub fn is_binary(content: &[u8]) -> bool { + // Matches binary() in utils/stringutil.py. + !content.is_empty() && memchr::memchr(b'\0', content).is_some() +} + #[cfg(test)] mod tests { use super::*;
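The new `is_binary` mirrors `binary()` from utils/stringutil.py: non-empty content containing a NUL byte is treated as binary. A standalone sketch of the same check (using `slice::contains` in place of the `memchr` call, purely for illustration):

fn is_binary(content: &[u8]) -> bool {
    // Empty content counts as text; any NUL byte means binary.
    !content.is_empty() && content.contains(&0)
}

fn main() {
    assert!(!is_binary(b""));
    assert!(!is_binary(b"plain text\n"));
    assert!(is_binary(b"PK\x03\x04\0\0"));
}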
--- a/rust/hg-core/src/utils/hg_path.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/utils/hg_path.rs Fri Feb 28 23:28:10 2025 +0100 @@ -6,7 +6,7 @@ // GNU General Public License version 2 or any later version. use crate::errors::HgError; -use crate::utils::SliceExt; +use crate::utils::strings::SliceExt; use std::borrow::Borrow; use std::borrow::Cow; use std::ffi::{OsStr, OsString};
--- a/rust/hg-core/src/utils/path_auditor.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-core/src/utils/path_auditor.rs Fri Feb 28 23:28:10 2025 +0100 @@ -8,8 +8,8 @@ use crate::utils::{ files::lower_clean, - find_slice_in_slice, hg_path::{hg_path_to_path_buf, HgPath, HgPathBuf, HgPathError}, + strings::find_slice_in_slice, }; use std::collections::HashSet; use std::path::{Path, PathBuf};
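The hunks above (sparse.rs, files.rs, hg_path.rs, path_auditor.rs) are the mechanical side of this change: the string helpers moved from `hg::utils` into the new `hg::utils::strings` module, so call sites only adjust their import paths. A minimal sketch of the new path (assuming the `hg` crate as a dependency; the trait itself is unchanged):

use hg::utils::strings::SliceExt;

fn main() {
    // Same methods as before, reached through the new module path.
    assert_eq!(b"  padded  ".trim(), &b"padded"[..]);
    assert_eq!(
        b"key=value".split_2(b'='),
        Some((&b"key"[..], &b"value"[..]))
    );
}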
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-core/src/utils/strings.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,381 @@ +//! Contains string-related utilities. + +use crate::utils::hg_path::HgPath; +use lazy_static::lazy_static; +use regex::bytes::Regex; +use std::{borrow::Cow, cell::Cell, fmt, io::Write as _, ops::Deref as _}; + +/// Useful until rust/issues/56345 is stable +/// +/// # Examples +/// +/// ``` +/// use hg::utils::strings::find_slice_in_slice; +/// +/// let haystack = b"This is the haystack".to_vec(); +/// assert_eq!(find_slice_in_slice(&haystack, b"the"), Some(8)); +/// assert_eq!(find_slice_in_slice(&haystack, b"not here"), None); +/// ``` +pub fn find_slice_in_slice<T>(slice: &[T], needle: &[T]) -> Option<usize> +where + for<'a> &'a [T]: PartialEq, +{ + slice + .windows(needle.len()) + .position(|window| window == needle) +} + +/// Replaces the `from` slice with the `to` slice inside the `buf` slice. +/// +/// # Examples +/// +/// ``` +/// use hg::utils::strings::replace_slice; +/// let mut line = b"I hate writing tests!".to_vec(); +/// replace_slice(&mut line, b"hate", b"love"); +/// assert_eq!( +/// line, +/// b"I love writing tests!".to_vec() +/// ); +/// ``` +pub fn replace_slice<T>(buf: &mut [T], from: &[T], to: &[T]) +where + T: Clone + PartialEq, +{ + if buf.len() < from.len() || from.len() != to.len() { + return; + } + for i in 0..=buf.len() - from.len() { + if buf[i..].starts_with(from) { + buf[i..(i + from.len())].clone_from_slice(to); + } + } +} + +pub trait SliceExt { + fn trim_end(&self) -> &Self; + fn trim_start(&self) -> &Self; + fn trim_end_matches(&self, f: impl FnMut(u8) -> bool) -> &Self; + fn trim_start_matches(&self, f: impl FnMut(u8) -> bool) -> &Self; + fn trim(&self) -> &Self; + fn drop_prefix(&self, needle: &Self) -> Option<&Self>; + fn split_2(&self, separator: u8) -> Option<(&[u8], &[u8])>; + fn split_2_by_slice(&self, separator: &[u8]) -> Option<(&[u8], &[u8])>; +} + +impl SliceExt for [u8] { + fn trim_end(&self) -> &[u8] { + self.trim_end_matches(|byte| byte.is_ascii_whitespace()) + } + + fn trim_start(&self) -> &[u8] { + self.trim_start_matches(|byte| byte.is_ascii_whitespace()) + } + + fn trim_end_matches(&self, mut f: impl FnMut(u8) -> bool) -> &Self { + if let Some(last) = self.iter().rposition(|&byte| !f(byte)) { + &self[..=last] + } else { + &[] + } + } + + fn trim_start_matches(&self, mut f: impl FnMut(u8) -> bool) -> &Self { + if let Some(first) = self.iter().position(|&byte| !f(byte)) { + &self[first..] 
+ } else { + &[] + } + } + + /// ``` + /// use hg::utils::strings::SliceExt; + /// assert_eq!( + /// b" to trim ".trim(), + /// b"to trim" + /// ); + /// assert_eq!( + /// b"to trim ".trim(), + /// b"to trim" + /// ); + /// assert_eq!( + /// b" to trim".trim(), + /// b"to trim" + /// ); + /// ``` + fn trim(&self) -> &[u8] { + self.trim_start().trim_end() + } + + fn drop_prefix(&self, needle: &Self) -> Option<&Self> { + if self.starts_with(needle) { + Some(&self[needle.len()..]) + } else { + None + } + } + + fn split_2(&self, separator: u8) -> Option<(&[u8], &[u8])> { + let pos = memchr::memchr(separator, self)?; + Some((&self[..pos], &self[pos + 1..])) + } + + fn split_2_by_slice(&self, separator: &[u8]) -> Option<(&[u8], &[u8])> { + find_slice_in_slice(self, separator) + .map(|pos| (&self[..pos], &self[pos + separator.len()..])) + } +} + +pub trait Escaped { + /// Return bytes escaped for display to the user + fn escaped_bytes(&self) -> Vec<u8>; +} + +impl Escaped for u8 { + fn escaped_bytes(&self) -> Vec<u8> { + let mut acc = vec![]; + match self { + c @ b'\'' | c @ b'\\' => { + acc.push(b'\\'); + acc.push(*c); + } + b'\t' => { + acc.extend(br"\\t"); + } + b'\n' => { + acc.extend(br"\\n"); + } + b'\r' => { + acc.extend(br"\\r"); + } + c if (*c < b' ' || *c >= 127) => { + write!(acc, "\\x{:x}", self).unwrap(); + } + c => { + acc.push(*c); + } + } + acc + } +} + +impl<'a, T: Escaped> Escaped for &'a [T] { + fn escaped_bytes(&self) -> Vec<u8> { + self.iter().flat_map(Escaped::escaped_bytes).collect() + } +} + +impl<T: Escaped> Escaped for Vec<T> { + fn escaped_bytes(&self) -> Vec<u8> { + self.deref().escaped_bytes() + } +} + +impl<'a> Escaped for &'a HgPath { + fn escaped_bytes(&self) -> Vec<u8> { + self.as_bytes().escaped_bytes() + } +} + +#[cfg(unix)] +pub fn shell_quote(value: &[u8]) -> Vec<u8> { + if value.iter().all(|&byte| { + matches!( + byte, + b'a'..=b'z' + | b'A'..=b'Z' + | b'0'..=b'9' + | b'.' + | b'_' + | b'/' + | b'+' + | b'-' + ) + }) { + value.to_owned() + } else { + let mut quoted = Vec::with_capacity(value.len() + 2); + quoted.push(b'\''); + for &byte in value { + if byte == b'\'' { + quoted.push(b'\\'); + } + quoted.push(byte); + } + quoted.push(b'\''); + quoted + } +} + +/// Expand `$FOO` and `${FOO}` environment variables in the given byte string +pub fn expand_vars(s: &[u8]) -> std::borrow::Cow<[u8]> { + lazy_static::lazy_static! { + /// https://github.com/python/cpython/blob/3.9/Lib/posixpath.py#L301 + /// The `x` makes whitespace ignored. + /// `-u` disables the Unicode flag, which makes `\w` like Python with the ASCII flag. + static ref VAR_RE: regex::bytes::Regex = + regex::bytes::Regex::new(r"(?x-u) + \$ + (?: + (\w+) + | + \{ + ([^}]*) + \} + ) + ").unwrap(); + } + VAR_RE.replace_all(s, |captures: ®ex::bytes::Captures| { + let var_name = crate::utils::files::get_os_str_from_bytes( + captures + .get(1) + .or_else(|| captures.get(2)) + .expect("either side of `|` must participate in match") + .as_bytes(), + ); + std::env::var_os(var_name) + .map(crate::utils::files::get_bytes_from_os_str) + .unwrap_or_else(|| { + // Referencing an environment variable that does not exist. + // Leave the $FOO reference as-is. + captures[0].to_owned() + }) + }) +} + +/// Join items of the iterable with the given separator, similar to Python’s +/// `separator.join(iter)`. +/// +/// Formatting the return value consumes the iterator. +/// Formatting it again will produce an empty string. 
+pub fn join_display(
+    iter: impl IntoIterator<Item = impl fmt::Display>,
+    separator: impl fmt::Display,
+) -> impl fmt::Display {
+    JoinDisplay {
+        iter: Cell::new(Some(iter.into_iter())),
+        separator,
+    }
+}
+
+struct JoinDisplay<I, S> {
+    iter: Cell<Option<I>>,
+    separator: S,
+}
+
+impl<I, T, S> fmt::Display for JoinDisplay<I, S>
+where
+    I: Iterator<Item = T>,
+    T: fmt::Display,
+    S: fmt::Display,
+{
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        if let Some(mut iter) = self.iter.take() {
+            if let Some(first) = iter.next() {
+                first.fmt(f)?;
+            }
+            for value in iter {
+                self.separator.fmt(f)?;
+                value.fmt(f)?;
+            }
+        }
+        Ok(())
+    }
+}
+
+/// Returns a short representation of a user name or email address.
+pub fn short_user(user: &[u8]) -> &[u8] {
+    let mut str = user;
+    if let Some(i) = memchr::memchr(b'@', str) {
+        str = &str[..i];
+    }
+    if let Some(i) = memchr::memchr(b'<', str) {
+        str = &str[i + 1..];
+    }
+    if let Some(i) = memchr::memchr(b' ', str) {
+        str = &str[..i];
+    }
+    if let Some(i) = memchr::memchr(b'.', str) {
+        str = &str[..i];
+    }
+    str
+}
+
+/// Options for [`clean_whitespace`].
+#[derive(Copy, Clone)]
+pub enum CleanWhitespace {
+    /// Do nothing.
+    None,
+    /// Remove whitespace at ends of lines.
+    AtEol,
+    /// Collapse consecutive whitespace characters into a single space.
+    Collapse,
+    /// Remove all whitespace characters.
+    All,
+}
+
+/// Normalizes whitespace in text so that it won't appear in diffs.
+/// Returns `Cow::Borrowed(text)` if the result is unchanged.
+pub fn clean_whitespace(text: &[u8], how: CleanWhitespace) -> Cow<[u8]> {
+    lazy_static! {
+        // To match wsclean in mdiff.py, this includes "\f".
+        static ref AT_EOL: Regex =
+            Regex::new(r"(?m)[ \t\r\f]+$").expect("valid regex");
+        // To match fixws in cext/bdiff.c, this does *not* include "\f".
+        static ref MULTIPLE: Regex =
+            Regex::new(r"[ \t\r]+").expect("valid regex");
+    }
+    let replacement: &[u8] = match how {
+        CleanWhitespace::None => return Cow::Borrowed(text),
+        CleanWhitespace::AtEol => return AT_EOL.replace_all(text, b""),
+        CleanWhitespace::Collapse => b" ",
+        CleanWhitespace::All => b"",
+    };
+    let text = MULTIPLE.replace_all(text, replacement);
+    replace_all_cow(&AT_EOL, text, b"")
+}
+
+/// Helper to call [`Regex::replace_all`] with `Cow` as input and output.
+fn replace_all_cow<'a>(
+    regex: &Regex,
+    haystack: Cow<'a, [u8]>,
+    replacement: &[u8],
+) -> Cow<'a, [u8]> {
+    match haystack {
+        Cow::Borrowed(haystack) => regex.replace_all(haystack, replacement),
+        Cow::Owned(haystack) => {
+            Cow::Owned(regex.replace_all(&haystack, replacement).into_owned())
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_expand_vars() {
+        // Modifying process-global state in a test isn’t great,
+        // but hopefully this won’t collide with anything.
+        std::env::set_var("TEST_EXPAND_VAR", "1");
+        assert_eq!(
+            expand_vars(b"before/$TEST_EXPAND_VAR/after"),
+            &b"before/1/after"[..]
+        );
+        assert_eq!(
+            expand_vars(b"before${TEST_EXPAND_VAR}${TEST_EXPAND_VAR}${TEST_EXPAND_VAR}after"),
+            &b"before111after"[..]
+        );
+        let s = b"before $SOME_LONG_NAME_THAT_WE_ASSUME_IS_NOT_AN_ACTUAL_ENV_VAR after";
+        assert_eq!(expand_vars(s), &s[..]);
+    }
+
+    #[test]
+    fn test_short_user() {
+        assert_eq!(short_user(b""), b"");
+        assert_eq!(short_user(b"Name"), b"Name");
+        assert_eq!(short_user(b"First Last"), b"First");
+        assert_eq!(short_user(b"First Last <user@example.com>"), b"user");
+        assert_eq!(short_user(b"First Last <user.name@example.com>"), b"user");
+    }
+}
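For a concrete sense of the `CleanWhitespace` modes, here is a standalone re-run of the two regexes defined above on a small input (illustrative; it mirrors `clean_whitespace` rather than calling into hg-core):

use regex::bytes::Regex;

fn main() {
    let at_eol = Regex::new(r"(?m)[ \t\r\f]+$").expect("valid regex");
    let multiple = Regex::new(r"[ \t\r]+").expect("valid regex");

    let text = b"a  b \t\nc\t d  \n";

    // AtEol: only trailing whitespace on each line goes away.
    assert_eq!(&*at_eol.replace_all(text, &b""[..]), &b"a  b\nc\t d\n"[..]);

    // Collapse: runs become one space, then line ends are trimmed.
    let collapsed = multiple.replace_all(text, &b" "[..]);
    assert_eq!(
        &*at_eol.replace_all(&collapsed, &b""[..]),
        &b"a b\nc d\n"[..]
    );

    // All: every run is deleted outright.
    let stripped = multiple.replace_all(text, &b""[..]);
    assert_eq!(&*at_eol.replace_all(&stripped, &b""[..]), &b"ab\ncd\n"[..]);
}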
--- a/rust/hg-cpython/Cargo.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-cpython/Cargo.toml Fri Feb 28 23:28:10 2025 +0100 @@ -4,9 +4,12 @@ authors = ["Georges Racinet <gracinet@anybox.fr>"] edition = "2021" +[lints] +workspace = true + [lib] name='rusthg' -crate-type = ["cdylib", "rlib"] +crate-type = ["cdylib"] [dependencies] cpython = { version = "0.7.2", features = ["extension-module"] }
--- a/rust/hg-cpython/src/cindex.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-cpython/src/cindex.rs Fri Feb 28 23:28:10 2025 +0100 @@ -171,6 +171,9 @@ Err(GraphError::ParentOutOfRange(rev)) => { Err(vcsgraph::graph::GraphReadError::KeyedInvalidKey(rev.0)) } + Err(GraphError::ParentOutOfOrder(_)) => { + Err(vcsgraph::graph::GraphReadError::InconsistentGraphData) + } } } }
--- a/rust/hg-cpython/src/exceptions.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-cpython/src/exceptions.rs Fri Feb 28 23:28:10 2025 +0100 @@ -28,6 +28,9 @@ hg::GraphError::ParentOutOfRange(r) => { GraphError::new(py, ("ParentOutOfRange", PyRevision(r.0))) } + hg::GraphError::ParentOutOfOrder(r) => { + GraphError::new(py, ("ParentOutOfOrder", PyRevision(r.0))) + } } }
--- a/rust/hg-cpython/src/pybytes_deref.rs Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-cpython/src/pybytes_deref.rs Fri Feb 28 23:28:10 2025 +0100
@@ -18,13 +18,16 @@
 
     /// Borrows the buffer inside `self.keep_alive`,
     /// but the borrow-checker cannot express self-referential structs.
-    data: *const [u8],
+    data: &'static [u8],
 }
 
 impl PyBytesDeref {
     pub fn new(py: Python, bytes: PyBytes) -> Self {
+        let as_raw: *const [u8] = bytes.data(py);
         Self {
-            data: bytes.data(py),
+            // Safety: the raw pointer is valid as long as the PyBytes is still
+            // alive, and the object owns it.
+            data: unsafe { &*as_raw },
             keep_alive: bytes,
         }
     }
@@ -38,9 +41,7 @@
     type Target = [u8];
 
     fn deref(&self) -> &[u8] {
-        // Safety: the raw pointer is valid as long as the PyBytes is still
-        // alive, and the returned slice borrows `self`.
-        unsafe { &*self.data }
+        self.data
    }
 }
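The hunk above swaps the raw pointer field for an erased `&'static [u8]`: the slice really borrows `keep_alive`, which the borrow checker cannot express for a self-referential struct, so the lifetime is erased once at construction instead of re-derived at each `deref`. A standalone sketch of the same pattern over a plain heap buffer (a hypothetical type for illustration; the real code borrows a `PyBytes`):

use std::ops::Deref;

struct OwnedBytes {
    /// Borrows `keep_alive`; the lifetime is erased to 'static because
    /// Rust cannot name "borrows another field of self".
    data: &'static [u8],
    keep_alive: Box<[u8]>,
}

impl OwnedBytes {
    fn new(bytes: Box<[u8]>) -> Self {
        let as_raw: *const [u8] = &*bytes;
        Self {
            // Safety: the heap allocation behind `keep_alive` is stable for
            // as long as it is owned, and the fake 'static reference is
            // never handed out beyond the lifetime of this struct.
            data: unsafe { &*as_raw },
            keep_alive: bytes,
        }
    }
}

impl Deref for OwnedBytes {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        // No unsafe needed here anymore; that is the point of the change.
        self.data
    }
}

fn main() {
    let owned = OwnedBytes::new(b"hello".to_vec().into_boxed_slice());
    assert_eq!(&*owned, &b"hello"[..]);
}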
--- a/rust/hg-cpython/src/revlog.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-cpython/src/revlog.rs Fri Feb 28 23:28:10 2025 +0100 @@ -95,6 +95,9 @@ Err(hg::GraphError::ParentOutOfRange(rev)) => { Err(vcsgraph::graph::GraphReadError::KeyedInvalidKey(rev.0)) } + Err(hg::GraphError::ParentOutOfOrder(_)) => { + Err(vcsgraph::graph::GraphReadError::InconsistentGraphData) + } } } } @@ -920,14 +923,11 @@ } def _deltachain(&self, *args, **kw) -> PyResult<PyObject> { - let inner = self.inner(py).borrow(); - let general_delta = inner.index.uses_generaldelta(); let args = PyTuple::new( py, &[ args.get_item(py, 0), kw.and_then(|d| d.get_item(py, "stoprev")).to_py_object(py), - general_delta.to_py_object(py).into_object(), ] ); self._index_deltachain(py, &args, kw) @@ -1391,11 +1391,8 @@ nodemap_error(py, NodeMapError::RevisionNotInIndex(stop_rev)) })?) } else {None}; - let using_general_delta = args.get_item(py, 2) - .extract::<Option<u32>>(py)? - .map(|i| i != 0); let (chain, stopped) = index.delta_chain( - rev, stop_rev, using_general_delta + rev, stop_rev ).map_err(|e| { PyErr::new::<cpython::exc::ValueError, _>(py, e.to_string()) })?;
--- a/rust/hg-pyo3/Cargo.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-pyo3/Cargo.toml Fri Feb 28 23:28:10 2025 +0100 @@ -3,6 +3,9 @@ version = "0.1.0" edition = "2021" +[lints] +workspace = true + [lib] name='rusthgpyo3' crate-type = ["cdylib"] @@ -14,13 +17,11 @@ [dependencies] pyo3 = { version = "0.23.1" } pyo3-sharedref = { path = "../pyo3-sharedref" } -cpython = { version = "0.7.2", features = ["extension-module"] } -hg-cpython = { path = "../hg-cpython" } -python3-sys = { version = "0.7.2" } hg-core = { path = "../hg-core"} stable_deref_trait = "1.2.0" log = "0.4.17" +logging_timer = "1.1.0" derive_more = "0.99.17" env_logger = "0.9.3" -lazy_static = "*" vcsgraph = "0.2.0" +crossbeam-channel = "0.5.14"
--- a/rust/hg-pyo3/src/ancestors.rs Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-pyo3/src/ancestors.rs Fri Feb 28 23:28:10 2025 +0100
@@ -8,12 +8,11 @@
 //! Bindings for the `hg::ancestors` module provided by the
 //! `hg-core` crate. From Python, this will be seen as `pyo3_rustext.ancestor`
 //! and can be used as replacement for the pure `ancestor` Python module.
-use cpython::UnsafePyLeaked;
 use pyo3::prelude::*;
 use pyo3::types::PyTuple;
+use pyo3_sharedref::SharedByPyObject;
 
 use std::collections::HashSet;
-use std::sync::RwLock;
 
 use hg::MissingAncestors as CoreMissing;
 use vcsgraph::lazy_ancestors::{
@@ -21,18 +20,16 @@
     LazyAncestors as VCGLazyAncestors,
 };
 
-use crate::convert_cpython::{
-    proxy_index_py_leak, py_leaked_borrow, py_leaked_borrow_mut,
-    py_leaked_or_map_err,
+use crate::exceptions::GraphError;
+use crate::revision::{rev_pyiter_collect_with_py_index, PyRevision};
+use crate::revlog::PySharedIndex;
+use crate::utils::{
+    new_submodule, py_rust_index_to_graph, py_shared_or_map_err,
 };
-use crate::exceptions::{map_lock_error, GraphError};
-use crate::revision::{rev_pyiter_collect_with_py_index, PyRevision};
-use crate::util::new_submodule;
-use rusthg::revlog::PySharedIndex;
 
 #[pyclass]
 struct AncestorsIterator {
-    inner: RwLock<UnsafePyLeaked<VCGAncestorsIterator<PySharedIndex>>>,
+    inner: SharedByPyObject<VCGAncestorsIterator<PySharedIndex>>,
 }
 
 #[pymethods]
@@ -44,11 +41,12 @@
         stoprev: PyRevision,
         inclusive: bool,
     ) -> PyResult<Self> {
+        let py = index_proxy.py();
         let initvec: Vec<_> =
            rev_pyiter_collect_with_py_index(initrevs, index_proxy)?;
-        let (py, leaked_idx) = proxy_index_py_leak(index_proxy)?;
+        let shared_idx = py_rust_index_to_graph(index_proxy)?;
         let res_ait = unsafe {
-            leaked_idx.map(py, |idx| {
+            shared_idx.map(py, |idx| {
                 VCGAncestorsIterator::new(
                     idx,
                     initvec.into_iter().map(|r| r.0),
@@ -57,9 +55,8 @@
                 )
             })
         };
-        let ait =
-            py_leaked_or_map_err(py, res_ait, GraphError::from_vcsgraph)?;
-        let inner = ait.into();
+        let inner =
+            py_shared_or_map_err(py, res_ait, GraphError::from_vcsgraph)?;
         Ok(Self { inner })
     }
 
@@ -67,10 +64,10 @@
         slf
     }
 
-    fn __next__(slf: PyRefMut<'_, Self>) -> PyResult<Option<PyRevision>> {
-        let mut leaked = slf.inner.write().map_err(map_lock_error)?;
-        // Safety: we don't leak the inner 'static ref out of UnsafePyLeaked
-        let mut inner = unsafe { py_leaked_borrow_mut(&slf, &mut leaked)?
}; + fn __next__(mut slf: PyRefMut<'_, Self>) -> PyResult<Option<PyRevision>> { + let py = slf.py(); + // Safety: we don't leak the inner 'static ref out of SharedByPyObject + let mut inner = unsafe { slf.inner.try_borrow_mut(py) }?; match inner.next() { Some(Err(e)) => Err(GraphError::from_vcsgraph(e)), None => Ok(None), @@ -81,7 +78,7 @@ #[pyclass(sequence)] struct LazyAncestors { - inner: RwLock<UnsafePyLeaked<VCGLazyAncestors<PySharedIndex>>>, + inner: SharedByPyObject<VCGLazyAncestors<PySharedIndex>>, proxy_index: PyObject, initrevs: PyObject, stoprev: PyRevision, @@ -92,6 +89,7 @@ impl LazyAncestors { #[new] fn new( + py: Python<'_>, index_proxy: &Bound<'_, PyAny>, initrevs: &Bound<'_, PyAny>, stoprev: PyRevision, @@ -100,11 +98,11 @@ let cloned_proxy = index_proxy.clone().unbind(); let initvec: Vec<_> = rev_pyiter_collect_with_py_index(initrevs, index_proxy)?; - let (py, leaked_idx) = proxy_index_py_leak(index_proxy)?; + let shared_idx = py_rust_index_to_graph(index_proxy)?; // Safety: we don't leak the "faked" reference out of - // `UnsafePyLeaked` + // `SharedByPyObject` let res_lazy = unsafe { - leaked_idx.map(py, |idx| { + shared_idx.map(py, |idx| { VCGLazyAncestors::new( idx, initvec.into_iter().map(|r| r.0), @@ -113,10 +111,10 @@ ) }) }; - let lazy = - py_leaked_or_map_err(py, res_lazy, GraphError::from_vcsgraph)?; + let inner = + py_shared_or_map_err(py, res_lazy, GraphError::from_vcsgraph)?; Ok(Self { - inner: lazy.into(), + inner, proxy_index: cloned_proxy, initrevs: initrevs.clone().unbind(), stoprev, @@ -124,23 +122,21 @@ }) } - fn __bool__(slf: PyRef<'_, Self>) -> PyResult<bool> { - let leaked = slf.inner.read().map_err(map_lock_error)?; - // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked` - let inner = unsafe { py_leaked_borrow(&slf, &leaked) }?; + fn __bool__(slf: PyRef<'_, Self>, py: Python<'_>) -> PyResult<bool> { + // Safety: we don't leak the "faked" reference out of + // `SharedByPyObject` + let inner = unsafe { slf.inner.try_borrow(py) }?; Ok(!inner.is_empty()) } fn __contains__( - slf: PyRefMut<'_, Self>, + mut slf: PyRefMut<'_, Self>, obj: &Bound<'_, PyAny>, ) -> PyResult<bool> { PyRevision::extract_bound(obj).map_or(Ok(false), |rev| { - let mut leaked = slf.inner.write().map_err(map_lock_error)?; // Safety: we don't leak the "faked" reference out of - // `UnsafePyLeaked` - let mut inner = - unsafe { py_leaked_borrow_mut(&slf, &mut leaked) }?; + // `SharedByPyObject` + let mut inner = unsafe { slf.inner.try_borrow_mut(obj.py()) }?; inner.contains(rev.0).map_err(GraphError::from_vcsgraph) }) } @@ -158,7 +154,7 @@ #[pyclass] struct MissingAncestors { - inner: RwLock<UnsafePyLeaked<CoreMissing<PySharedIndex>>>, + inner: SharedByPyObject<CoreMissing<PySharedIndex>>, proxy_index: PyObject, } @@ -172,52 +168,54 @@ let cloned_proxy = index_proxy.clone().unbind(); let bases_vec: Vec<_> = rev_pyiter_collect_with_py_index(bases, index_proxy)?; - let (py, leaked_idx) = proxy_index_py_leak(index_proxy)?; + let shared_idx = py_rust_index_to_graph(index_proxy)?; // Safety: we don't leak the "faked" reference out of - // `UnsafePyLeaked` + // `SharedByPyObject` let inner = unsafe { - leaked_idx.map(py, |idx| CoreMissing::new(idx, bases_vec)) + shared_idx + .map(index_proxy.py(), |idx| CoreMissing::new(idx, bases_vec)) }; Ok(Self { - inner: inner.into(), + inner, proxy_index: cloned_proxy, }) } fn hasbases(slf: PyRef<'_, Self>) -> PyResult<bool> { - let leaked = slf.inner.read().map_err(map_lock_error)?; - // Safety: we don't leak the "faked" reference out 
of `UnsafePyLeaked` - let inner = unsafe { py_leaked_borrow(&slf, &leaked) }?; + // Safety: we don't leak the "faked" reference out of + // `SharedByPyObject` + let inner = unsafe { slf.inner.try_borrow(slf.py()) }?; Ok(inner.has_bases()) } fn addbases( - slf: PyRefMut<'_, Self>, + mut slf: PyRefMut<'_, Self>, bases: &Bound<'_, PyAny>, ) -> PyResult<()> { - let index_proxy = slf.proxy_index.bind(slf.py()); + let py = slf.py(); + let index_proxy = slf.proxy_index.bind(py); let bases_vec: Vec<_> = rev_pyiter_collect_with_py_index(bases, index_proxy)?; - let mut leaked = slf.inner.write().map_err(map_lock_error)?; - // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked` - let mut inner = unsafe { py_leaked_borrow_mut(&slf, &mut leaked) }?; + // Safety: we don't leak the "faked" reference out of + // `SharedByPyObject` + let mut inner = unsafe { slf.inner.try_borrow_mut(py) }?; inner.add_bases(bases_vec); Ok(()) } fn bases(slf: PyRef<'_, Self>) -> PyResult<HashSet<PyRevision>> { - let leaked = slf.inner.read().map_err(map_lock_error)?; - // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked` - let inner = unsafe { py_leaked_borrow(&slf, &leaked) }?; + // Safety: we don't leak the "faked" reference out of + // `SharedByPyObject` + let inner = unsafe { slf.inner.try_borrow(slf.py()) }?; Ok(inner.get_bases().iter().map(|r| PyRevision(r.0)).collect()) } fn basesheads(slf: PyRef<'_, Self>) -> PyResult<HashSet<PyRevision>> { - let leaked = slf.inner.read().map_err(map_lock_error)?; - // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked` - let inner = unsafe { py_leaked_borrow(&slf, &leaked) }?; + // Safety: we don't leak the "faked" reference out of + // `SharedByPyObject` + let inner = unsafe { slf.inner.try_borrow(slf.py()) }?; Ok(inner .bases_heads() .map_err(GraphError::from_hg)? 
@@ -227,7 +225,7 @@
     }
 
     fn removeancestorsfrom(
-        slf: PyRef<'_, Self>,
+        mut slf: PyRefMut<'_, Self>,
         revs: &Bound<'_, PyAny>,
     ) -> PyResult<()> {
         // Original comment from hg-cpython:
@@ -243,35 +241,37 @@
         // PyO3 additional comment: the trait approach would probably be
         // simpler because we can implement it without a Py wrapper, just
         // on &Bound<'py, PySet>
-        let index_proxy = slf.proxy_index.bind(slf.py());
+        let py = slf.py();
+        let index_proxy = slf.proxy_index.bind(py);
         let mut revs_set: HashSet<_> =
             rev_pyiter_collect_with_py_index(revs, index_proxy)?;
-        let mut leaked = slf.inner.write().map_err(map_lock_error)?;
-        // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked`
-        let mut inner = unsafe { py_leaked_borrow_mut(&slf, &mut leaked) }?;
+        // Safety: we don't leak the "faked" reference out of
+        // `SharedByPyObject`
+        let mut inner = unsafe { slf.inner.try_borrow_mut(py) }?;
         inner
             .remove_ancestors_from(&mut revs_set)
             .map_err(GraphError::from_hg)?;
 
         // convert as Python tuple and discard from original `revs`
         let remaining_tuple =
-            PyTuple::new(slf.py(), revs_set.iter().map(|r| PyRevision(r.0)))?;
+            PyTuple::new(py, revs_set.iter().map(|r| PyRevision(r.0)))?;
 
         revs.call_method("intersection_update", (remaining_tuple,), None)?;
         Ok(())
     }
 
     fn missingancestors(
-        slf: PyRefMut<'_, Self>,
+        mut slf: PyRefMut<'_, Self>,
         bases: &Bound<'_, PyAny>,
    ) -> PyResult<Vec<PyRevision>> {
-        let index_proxy = slf.proxy_index.bind(slf.py());
+        let py = slf.py();
+        let index_proxy = slf.proxy_index.bind(py);
         let revs_vec: Vec<_> =
             rev_pyiter_collect_with_py_index(bases, index_proxy)?;
-        let mut leaked = slf.inner.write().map_err(map_lock_error)?;
-        // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked`
-        let mut inner = unsafe { py_leaked_borrow_mut(&slf, &mut leaked) }?;
+        // Safety: we don't leak the "faked" reference out of
+        // `SharedByPyObject`
+        let mut inner = unsafe { slf.inner.try_borrow_mut(py) }?;
 
         let missing_vec = inner
             .missing_ancestors(revs_vec)
--- a/rust/hg-pyo3/src/convert_cpython.rs Fri Feb 28 23:25:42 2025 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,283 +0,0 @@ -//! This module takes care of all conversions involving `rusthg` (hg-cpython) -//! objects in the PyO3 call context. -//! -//! For source code clarity, we only import (`use`) [`cpython`] traits and not -//! any of its data objects. We are instead using full qualifiers, such as -//! `cpython::PyObject`, and believe that the added heaviness is an acceptatble -//! price to pay to avoid confusion. -//! -//! Also it, is customary in [`cpython`] to label the GIL lifetime as `'p`, -//! whereas it is `'py` in PyO3 context. We keep both these conventions in -//! the arguments side of function signatures when they are not simply elided. -use pyo3::exceptions::PyTypeError; -use pyo3::prelude::*; -use pyo3::{pyclass::boolean_struct::False, PyClass}; - -use cpython::ObjectProtocol; -use cpython::PythonObject; -use lazy_static::lazy_static; - -use hg::revlog::index::Index as CoreIndex; -use rusthg::revlog::{InnerRevlog, PySharedIndex}; - -/// Marker trait for PyO3 objects with a lifetime representing the acquired GIL -/// -/// # Safety -/// -/// This trait must not be implemented for objects with lifetimes that -/// do not imply in PyO3 that the GIL is acquired during the whole lifetime. -pub unsafe trait WithGIL<'py> {} - -// Safety: the lifetime on these PyO3 objects all represent the acquired GIL -unsafe impl<'py> WithGIL<'py> for Python<'py> {} -unsafe impl<'py, T> WithGIL<'py> for Bound<'py, T> {} -unsafe impl<'py, T: PyClass> WithGIL<'py> for PyRef<'py, T> {} -unsafe impl<'py, T: PyClass<Frozen = False>> WithGIL<'py> - for PyRefMut<'py, T> -{ -} - -/// Force cpython's GIL handle with the appropriate lifetime -/// -/// In `pyo3`, the fact that we have the GIL is expressed by the lifetime of -/// the incoming [`Bound`] smart pointer. We therefore simply instantiate -/// the `cpython` handle and coerce its lifetime by the function signature. -/// -/// Reacquiring the GIL is also a possible alternative, as the CPython -/// documentation explicitely states that "recursive calls are allowed" -/// (we interpret that as saying that acquiring the GIL within a thread that -/// already has it works) *as long as it is properly released* -/// reference: -/// <https://docs.python.org/3.8/c-api/init.html#c.PyGILState_Ensure> -pub(crate) fn cpython_handle<'py, T: WithGIL<'py>>( - _with_gil: &T, -) -> cpython::Python<'py> { - // safety: this is safe because the returned object has the same lifetime - // as the incoming object. - unsafe { cpython::Python::assume_gil_acquired() } -} - -/// Force PyO3 GIL handle from cpython's. -/// -/// Very similar to [`cpython_handle`] -pub fn pyo3_handle(_py: cpython::Python<'_>) -> Python<'_> { - // safety: this is safe because the returned object has the same lifetime - // as the incoming object. - unsafe { Python::assume_gil_acquired() } -} - -/// Convert a PyO3 [`PyObject`] into a [`cpython::PyObject`] -/// -/// During this process, the reference count is increased, then decreased. -/// This means that the GIL (symbolized by the lifetime on the `obj` -/// argument) is needed. -/// -/// We could make something perhaps more handy by simply stealing the -/// pointer, forgetting the incoming and then implement `From` with "newtype". -/// It would be worth the effort for a generic cpython-to-pyo3 crate, perhaps -/// not for the current endeavour. 
-pub(crate) fn to_cpython_py_object<'py>( - obj: &Bound<'py, PyAny>, -) -> (cpython::Python<'py>, cpython::PyObject) { - let py = cpython_handle(obj); - // public alias of the private cpython::fii::PyObject (!) - let raw = obj.as_ptr() as *mut python3_sys::PyObject; - // both pyo3 and rust-cpython will decrement the refcount on drop. - // If we use from_owned_ptr, that's a segfault. - (py, unsafe { cpython::PyObject::from_borrowed_ptr(py, raw) }) -} - -/// Convert a [`cpython::PyObject`] into a PyO3 [`PyObject`] -/// -/// During this process, the reference count is increased, then decreased. -/// This means that the GIL (symbolized by the PyO3 [`Python`] handle is -/// needed. -/// -/// We could make something perhaps more handy by simply stealing the -/// pointer, forgetting the incoming and then implement `From` with "newtype". -/// It would be worth the effort for a generic cpython-to-pyo3 crate, perhaps -/// not for the current endeavour. -pub(crate) fn from_cpython_py_object( - py: Python<'_>, - obj: cpython::PyObject, -) -> PyObject { - let raw = obj.as_ptr() as *mut pyo3::ffi::PyObject; - unsafe { Py::from_borrowed_ptr(py, raw) } -} - -/// Convert [`cpython::PyErr`] into [`pyo3::PyErr`] -/// -/// The exception class remains the same as the original exception, -/// hence if it is also defined in another dylib based on `cpython` crate, -/// it will need to be converted to be downcasted in this crate. -pub(crate) fn from_cpython_pyerr( - py: cpython::Python<'_>, - mut e: cpython::PyErr, -) -> PyErr { - let pyo3_py = pyo3_handle(py); - let cpython_exc_obj = e.instance(py); - let pyo3_exc_obj = from_cpython_py_object(pyo3_py, cpython_exc_obj); - PyErr::from_value(pyo3_exc_obj.into_bound(pyo3_py)) -} - -/// Retrieve the PyType for objects from the `mercurial.rustext` crate. -fn retrieve_cpython_py_type( - submodule_name: &str, - type_name: &str, -) -> cpython::PyResult<cpython::PyType> { - let guard = cpython::Python::acquire_gil(); - let py = guard.python(); - let module = py.import(&format!("mercurial.rustext.{submodule_name}"))?; - module.get(py, type_name)?.extract::<cpython::PyType>(py) -} - -lazy_static! { - static ref INNER_REVLOG_PY_TYPE: cpython::PyType = { - retrieve_cpython_py_type("revlog", "InnerRevlog") - .expect("Could not import InnerRevlog in Python") - }; -} - -/// Downcast [`InnerRevlog`], with the appropriate Python type checking. -/// -/// The PyType object representing the `InnerRevlog` Python class is not the -/// the same in this dylib as it is in the `mercurial.rustext` module. -/// This is because the code created with the [`cpython::py_class!`] -/// macro is itself duplicated in both dylibs. In the case of this crate, this -/// happens by linking to the [`rusthg`] crate and provides the `InnerRevlog` -/// that is visible from this crate. The `InnerRevlog::get_type` associated -/// function turns out to return a `static mut` (look for `TYPE_OBJECT` in -/// `py_class_impl3.rs`), which obviously is different in both dylibs. -/// -/// The consequence of that is that downcasting an `InnerRevlog` originally -/// from the `mecurial.rustext` module to our `InnerRevlog` cannot be done with -/// the usual `extract::<InnerRevlog>(py)`, as it would perform the type -/// checking with the `PyType` that is embedded in `mercurial.pyo3_rustext`. -/// We must check the `PyType` that is within `mercurial.rustext` instead. -/// This is what this function does. 
-fn extract_inner_revlog( - py: cpython::Python, - inner_revlog: cpython::PyObject, -) -> PyResult<InnerRevlog> { - if !(*INNER_REVLOG_PY_TYPE).is_instance(py, &inner_revlog) { - return Err(PyTypeError::new_err("Not an InnerRevlog instance")); - } - // Safety: this is safe because we checked the PyType already, with the - // value embedded in `mercurial.rustext`. - Ok(unsafe { InnerRevlog::unchecked_downcast_from(inner_revlog) }) -} - -/// This is similar to [`rusthg.py_rust_index_to_graph`], with difference in -/// how we retrieve the [`InnerRevlog`]. -pub fn py_rust_index_to_graph( - py: cpython::Python, - index_proxy: cpython::PyObject, -) -> PyResult<cpython::UnsafePyLeaked<PySharedIndex>> { - let inner_revlog = extract_inner_revlog( - py, - index_proxy - .getattr(py, "inner") - .map_err(|e| from_cpython_pyerr(py, e))?, - )?; - - let leaked = inner_revlog.pub_inner(py).leak_immutable(); - // Safety: we don't leak the "faked" reference out of the `UnsafePyLeaked` - Ok(unsafe { leaked.map(py, |idx| PySharedIndex { inner: &idx.index }) }) -} - -pub(crate) fn proxy_index_py_leak<'py>( - index_proxy: &Bound<'py, PyAny>, -) -> PyResult<(cpython::Python<'py>, cpython::UnsafePyLeaked<PySharedIndex>)> { - let (py, idx_proxy) = to_cpython_py_object(index_proxy); - let py_leaked = py_rust_index_to_graph(py, idx_proxy)?; - Ok((py, py_leaked)) -} - -/// Full extraction of the proxy index object as received in PyO3 to a -/// [`CoreIndex`] reference. -/// -/// # Safety -/// -/// The invariants to maintain are those of the underlying -/// [`UnsafePyLeaked::try_borrow`]: the caller must not leak the inner -/// reference. -pub(crate) unsafe fn proxy_index_extract<'py>( - index_proxy: &Bound<'py, PyAny>, -) -> PyResult<&'py CoreIndex> { - let (py, py_leaked) = proxy_index_py_leak(index_proxy)?; - let py_shared = &*unsafe { - py_leaked - .try_borrow(py) - .map_err(|e| from_cpython_pyerr(py, e))? - }; - Ok(py_shared.inner) -} - -/// Generic borrow of [`cpython::UnsafePyLeaked`], with proper mapping. -/// -/// # Safety -/// -/// The invariants to maintain are those of the underlying -/// [`UnsafePyLeaked::try_borrow`]: the caller must not leak the inner -/// static reference. It is possible, depending on `T` that such a leak cannot -/// occur in practice. We may later on define a marker trait for this, -/// which will allow us to make declare this function to be safe. 
-pub(crate) unsafe fn py_leaked_borrow<'a, 'py: 'a, T>( - py: &impl WithGIL<'py>, - leaked: &'a cpython::UnsafePyLeaked<T>, -) -> PyResult<cpython::PyLeakedRef<'a, T>> { - let py = cpython_handle(py); - leaked.try_borrow(py).map_err(|e| from_cpython_pyerr(py, e)) -} - -/// Mutable variant of [`py_leaked_borrow`] -/// -/// # Safety -/// -/// See [`py_leaked_borrow`] -pub(crate) unsafe fn py_leaked_borrow_mut<'a, 'py: 'a, T>( - py: &impl WithGIL<'py>, - leaked: &'a mut cpython::UnsafePyLeaked<T>, -) -> PyResult<cpython::PyLeakedRefMut<'a, T>> { - let py = cpython_handle(py); - leaked - .try_borrow_mut(py) - .map_err(|e| from_cpython_pyerr(py, e)) -} - -/// Error propagation for an [`UnsafePyLeaked`] wrapping a [`Result`] -/// -/// TODO (will consider when implementing UnsafePyLeaked in PyO3): -/// It would be nice for UnsafePyLeaked to provide this directly as a variant -/// of the `map` method with a signature such as: -/// -/// ``` -/// unsafe fn map_or_err(&self, -/// py: Python, -/// f: impl FnOnce(T) -> Result(U, E), -/// convert_err: impl FnOnce(E) -> PyErr) -/// ``` -/// -/// This would spare users of the `cpython` crate the additional `unsafe` deref -/// to inspect the error and return it outside `UnsafePyLeaked`, and the -/// subsequent unwrapping that this function performs. -pub(crate) fn py_leaked_or_map_err<T, E: std::fmt::Debug + Copy>( - py: cpython::Python, - leaked: cpython::UnsafePyLeaked<Result<T, E>>, - convert_err: impl FnOnce(E) -> PyErr, -) -> PyResult<cpython::UnsafePyLeaked<T>> { - // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked` - if let Err(e) = *unsafe { - leaked - .try_borrow(py) - .map_err(|e| from_cpython_pyerr(py, e))? - } { - return Err(convert_err(e)); - } - // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked` - Ok(unsafe { - leaked.map(py, |res| { - res.expect("Error case should have already be treated") - }) - }) -}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/copy_tracing.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,181 @@ +// copy_tracing.rs +// +// Copyright 2025 Mercurial developers +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. + +//! Bindings for the `hg::copy_tracing` module provided by the +//! `hg-core` package. +//! +//! From Python, this will be seen as `mercurial.pyo3_rustext.copy_tracing` + +use hg::copy_tracing::ChangedFiles; +use hg::copy_tracing::CombineChangesetCopies; +use hg::Revision; +use pyo3::types::PyBytes; +use pyo3::types::PyDict; +use pyo3::types::PyList; +use pyo3::types::PyTuple; + +use crate::revision::PyRevision; +use crate::utils::new_submodule; +use crate::utils::PyBytesDeref; + +use pyo3::prelude::*; + +/// Combines copies information contained into revision `revs` to build a copy +/// map. +/// +/// See mercurial/copies.py for details +#[pyfunction] +#[pyo3(name = "combine_changeset_copies")] +pub fn combine_changeset_copies_wrapper( + revs: Bound<'_, PyList>, + children_count: Bound<'_, PyDict>, + target_rev: PyRevision, + rev_info: Bound<'_, PyAny>, + multi_thread: bool, +) -> PyResult<PyObject> { + let py = revs.py(); + let target_rev = Revision(target_rev.0); + let children_count = children_count + .iter() + .map(|(k, v)| { + Ok((Revision(k.extract::<PyRevision>()?.0), v.extract()?)) + }) + .collect::<PyResult<_>>()?; + + /// (Revision number, parent 1, parent 2, copy data for this revision) + type RevInfo<Bytes> = (Revision, Revision, Revision, Option<Bytes>); + + let revs_info = + revs.iter().map(|rev_py| -> PyResult<RevInfo<Py<PyBytes>>> { + let rev = Revision(rev_py.extract::<PyRevision>()?.0); + let ret = rev_info.call1((rev_py,))?; + let tuple: &Bound<'_, PyTuple> = ret.downcast()?; + let p1 = Revision(tuple.get_item(0)?.extract::<PyRevision>()?.0); + let p2 = Revision(tuple.get_item(1)?.extract::<PyRevision>()?.0); + let opt_bytes = tuple.get_item(2)?.extract()?; + Ok((rev, p1, p2, opt_bytes)) + }); + + let path_copies; + if !multi_thread { + let mut combine_changeset_copies = + CombineChangesetCopies::new(children_count); + + for rev_info in revs_info { + let (rev, p1, p2, opt_bytes) = rev_info?; + let files = match &opt_bytes { + Some(bytes) => ChangedFiles::new(bytes.as_bytes(py)), + // Python None was extracted to Option::None, + // meaning there was no copy data. + None => ChangedFiles::new_empty(), + }; + + combine_changeset_copies.add_revision(rev, p1, p2, files) + } + path_copies = combine_changeset_copies.finish(target_rev) + } else { + // Use a bounded channel to provide back-pressure: + // if the child thread is slower to process revisions than this thread + // is to gather data for them, an unbounded channel would keep + // growing and eat memory. + // + // TODO: tweak the bound? + let (rev_info_sender, rev_info_receiver) = + crossbeam_channel::bounded::<RevInfo<PyBytesDeref>>(1000); + + // This channel (going the other way around) however is unbounded. + // If they were both bounded, there might potentially be deadlocks + // where both channels are full and both threads are waiting on each + // other. + let (pybytes_sender, pybytes_receiver) = + crossbeam_channel::unbounded(); + + // Start a thread that does CPU-heavy processing in parallel with the + // loop below. + // + // If the parent thread panics, `rev_info_sender` will be dropped and + // “disconnected”. 
`rev_info_receiver` will be notified of this and + // exit its own loop. + let thread = std::thread::spawn(move || { + let mut combine_changeset_copies = + CombineChangesetCopies::new(children_count); + for (rev, p1, p2, opt_bytes) in rev_info_receiver { + let files = match &opt_bytes { + Some(raw) => ChangedFiles::new(raw.as_ref()), + // Python None was extracted to Option::None, + // meaning there was no copy data. + None => ChangedFiles::new_empty(), + }; + combine_changeset_copies.add_revision(rev, p1, p2, files); + + // Send `PyBytes` back to the parent thread so the parent + // thread can drop it. Otherwise the GIL would be implicitly + // acquired here through `impl Drop for PyBytes`. + if let Some(bytes) = opt_bytes { + if pybytes_sender.send(bytes.unwrap()).is_err() { + // The channel is disconnected, meaning the parent + // thread panicked or returned + // early through + // `?` to propagate a Python exception. + break; + } + } + } + + combine_changeset_copies.finish(target_rev) + }); + + for rev_info in revs_info { + let (rev, p1, p2, opt_bytes) = rev_info?; + let opt_bytes = opt_bytes.map(|b| PyBytesDeref::new(py, b)); + + // We’d prefer to avoid the child thread calling into Python code, + // but this avoids a potential deadlock on the GIL if it does: + py.allow_threads(|| { + rev_info_sender.send((rev, p1, p2, opt_bytes)).expect( + "combine_changeset_copies: channel is disconnected", + ); + }); + + // Drop anything in the channel, without blocking + pybytes_receiver.try_iter().for_each(drop); + } + // We’d prefer to avoid the child thread calling into Python code, + // but this avoids a potential deadlock on the GIL if it does: + path_copies = py.allow_threads(|| { + // Disconnect the channel to signal the child thread to stop: + // the `for … in rev_info_receiver` loop will end. + drop(rev_info_sender); + + // Wait for the child thread to stop, and propagate any panic. + thread.join().unwrap_or_else(|panic_payload| { + std::panic::resume_unwind(panic_payload) + }) + }); + + // Drop anything left in the channel + drop(pybytes_receiver) + }; + + let out = PyDict::new(py); + for (dest, source) in path_copies.into_iter() { + out.set_item( + PyBytes::new(py, &dest.into_vec()), + PyBytes::new(py, &source.into_vec()), + )?; + } + Ok(out.into_any().unbind()) +} + +pub fn init_module<'py>( + py: Python<'py>, + package: &str, +) -> PyResult<Bound<'py, PyModule>> { + let m = new_submodule(py, package, "copy_tracing")?; + m.add_function(wrap_pyfunction!(combine_changeset_copies_wrapper, &m)?)?; + Ok(m) +}
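The channel layout in `combine_changeset_copies_wrapper` is worth isolating: the bounded channel provides back-pressure toward the worker thread, while the unbounded return channel lets the producer, not the worker, drop the buffers. A reduced standalone sketch of the same shape (simplified payloads; the bound of 1000 matches the code above, everything else is hypothetical):

use crossbeam_channel::{bounded, unbounded};

fn main() {
    // Bounded: the producer blocks if it runs too far ahead of the worker.
    let (work_tx, work_rx) = bounded::<Vec<u8>>(1000);
    // Unbounded: buffers flow back so the producer can drop them.
    let (done_tx, done_rx) = unbounded::<Vec<u8>>();

    let worker = std::thread::spawn(move || {
        let mut total = 0usize;
        for buf in work_rx {
            total += buf.len();
            // Hand the buffer back instead of dropping it here.
            if done_tx.send(buf).is_err() {
                break; // Producer went away; stop early.
            }
        }
        total
    });

    for i in 0..10_000u32 {
        work_tx.send(i.to_be_bytes().to_vec()).expect("worker alive");
        // Drain returned buffers without blocking.
        done_rx.try_iter().for_each(drop);
    }
    drop(work_tx); // Disconnect: ends the worker's `for` loop.
    assert_eq!(worker.join().expect("no panic"), 40_000);
}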
--- a/rust/hg-pyo3/src/dagops.rs Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-pyo3/src/dagops.rs Fri Feb 28 23:28:10 2025 +0100
@@ -8,17 +8,16 @@
 //! Bindings for the `hg::dagops` module provided by the
 //! `hg-core` package.
 //!
-//! From Python, this will be seen as `mercurial.pyo3-rustext.dagop`
+//! From Python, this will be seen as `mercurial.pyo3_rustext.dagop`
 use pyo3::prelude::*;
 
 use std::collections::HashSet;
 
 use hg::{dagops, Revision};
 
-use crate::convert_cpython::proxy_index_extract;
 use crate::exceptions::GraphError;
 use crate::revision::{rev_pyiter_collect, PyRevision};
-use crate::util::new_submodule;
+use crate::utils::{new_submodule, proxy_index_extract};
 
 /// Using the `index_proxy`, return heads out of any Python iterable of
 /// Revisions
@@ -29,7 +28,7 @@
     index_proxy: &Bound<'_, PyAny>,
     revs: &Bound<'_, PyAny>,
 ) -> PyResult<HashSet<PyRevision>> {
-    // Safety: we don't leak the "faked" reference out of `UnsafePyLeaked`
+    // Safety: we don't leak the "faked" reference out of `SharedByPyObject`
     let index = unsafe { proxy_index_extract(index_proxy)? };
     let mut as_set: HashSet<Revision> = rev_pyiter_collect(revs, index)?;
     dagops::retain_heads(index, &mut as_set).map_err(GraphError::from_hg)?;
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/dirstate.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,47 @@ +// dirstate.rs +// +// Copyright 2019 Raphaël Gomès <rgomes@octobus.net> +// 2025 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. + +//! Bindings for the `hg::dirstate` module provided by the +//! `hg-core` package. +//! +//! From Python, this will be seen as `mercurial.pyo3_rustext.dirstate` +use crate::{exceptions, utils::new_submodule}; +use pyo3::prelude::*; +mod item; +use item::DirstateItem; +mod dirstate_map; +use dirstate_map::{ + DirstateIdentity, DirstateMap, DirstateMapItemsIterator, + DirstateMapKeysIterator, +}; +mod copy_map; +use copy_map::{CopyMap, CopyMapItemsIterator, CopyMapKeysIterator}; +mod dirs_multiset; +use dirs_multiset::{Dirs, DirsMultisetKeysIterator}; +mod status; + +pub fn init_module<'py>( + py: Python<'py>, + package: &str, +) -> PyResult<Bound<'py, PyModule>> { + let m = new_submodule(py, package, "dirstate")?; + m.add("__doc__", "Dirstate - Rust implementation exposed via PyO3")?; + m.add("FallbackError", py.get_type::<exceptions::FallbackError>())?; + m.add_class::<DirstateIdentity>()?; + m.add_class::<DirstateItem>()?; + m.add_class::<DirstateMap>()?; + m.add_class::<DirstateMapKeysIterator>()?; + m.add_class::<DirstateMapItemsIterator>()?; + m.add_class::<CopyMap>()?; + m.add_class::<CopyMapKeysIterator>()?; + m.add_class::<CopyMapItemsIterator>()?; + m.add_class::<Dirs>()?; + m.add_class::<DirsMultisetKeysIterator>()?; + m.add_function(wrap_pyfunction!(self::status::status, &m)?)?; + Ok(m) +}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/dirstate/copy_map.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,225 @@ +// copy_map.rs +// +// Copyright 2019 Raphaël Gomès <rgomes@octobus.net> +// 2025 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +//! Bindings for `hg::dirstate::dirstate_map::CopyMap` provided by the +//! `hg-core` package. + +use pyo3::exceptions::PyKeyError; +use pyo3::prelude::*; +use pyo3::types::{PyBytes, PyDict, PyTuple}; +use pyo3_sharedref::py_shared_iterator; + +use std::sync::{RwLockReadGuard, RwLockWriteGuard}; + +use hg::{ + dirstate::{ + on_disk::DirstateV2ParseError, owning::OwningDirstateMap, CopyMapIter, + }, + utils::hg_path::HgPath, +}; + +use super::dirstate_map::DirstateMap; +use crate::{ + exceptions::dirstate_v2_error, + path::{PyHgPathBuf, PyHgPathRef}, +}; + +#[pyclass(mapping)] +pub struct CopyMap { + dirstate_map: Py<DirstateMap>, +} + +#[pymethods] +impl CopyMap { + #[new] + pub fn new(dsm: &Bound<'_, DirstateMap>) -> PyResult<Self> { + Ok(Self { + dirstate_map: dsm.clone().unbind(), + }) + } + + fn __getitem__( + &self, + py: Python, + key: &Bound<'_, PyBytes>, + ) -> PyResult<Py<PyBytes>> { + let key = key.as_bytes(); + self.with_dirstate_map_read(py, |inner_dsm| { + inner_dsm + .copy_map_get(HgPath::new(key)) + .map_err(dirstate_v2_error)? + .ok_or_else(|| { + PyKeyError::new_err( + String::from_utf8_lossy(key).to_string(), + ) + }) + .and_then(|copy| { + Ok(PyHgPathRef(copy).into_pyobject(py)?.unbind()) + }) + }) + } + + fn __len__(&self, py: Python) -> PyResult<usize> { + self.with_dirstate_map_read(py, |inner_dsm| { + Ok(inner_dsm.copy_map_len()) + }) + } + + fn __contains__( + &self, + py: Python, + key: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + let key = key.as_bytes(); + self.with_dirstate_map_read(py, |inner_dsm| { + inner_dsm + .copy_map_contains_key(HgPath::new(key)) + .map_err(dirstate_v2_error) + }) + } + + #[pyo3(signature = (key, default=None))] + fn get( + &self, + py: Python, + key: &Bound<'_, PyBytes>, + default: Option<PyObject>, + ) -> PyResult<Option<PyObject>> { + let key = key.as_bytes(); + self.with_dirstate_map_read(py, |inner_dsm| { + match inner_dsm + .copy_map_get(HgPath::new(key)) + .map_err(dirstate_v2_error)? + { + Some(copy) => Ok(Some( + PyHgPathRef(copy).into_pyobject(py)?.unbind().into(), + )), + None => Ok(default), + } + }) + } + + #[pyo3(signature = (key, default=None))] + fn pop( + &self, + py: Python, + key: &Bound<'_, PyBytes>, + default: Option<PyObject>, + ) -> PyResult<Option<PyObject>> { + let path = HgPath::new(key.as_bytes()); + self.with_dirstate_map_write(py, |mut inner_dsm| { + match inner_dsm.copy_map_remove(path).map_err(dirstate_v2_error)? 
{ + Some(copy) => Ok(Some( + PyHgPathBuf(copy).into_pyobject(py)?.unbind().into(), + )), + None => Ok(default), + } + }) + } + + fn __iter__(&self, py: Python) -> PyResult<CopyMapKeysIterator> { + self.keys(py) + } + + fn keys(&self, py: Python) -> PyResult<CopyMapKeysIterator> { + CopyMapKeysIterator::new(self.dirstate_map.bind(py)) + } + + fn items(&self, py: Python) -> PyResult<CopyMapItemsIterator> { + CopyMapItemsIterator::new(self.dirstate_map.bind(py)) + } + + fn __setitem__( + &self, + py: Python, + key: &Bound<'_, PyBytes>, + value: &Bound<'_, PyBytes>, + ) -> PyResult<()> { + let key = HgPath::new(key.as_bytes()); + let value = HgPath::new(value.as_bytes()); + self.with_dirstate_map_write(py, |mut inner_dsm| { + inner_dsm + .copy_map_insert(key, value) + .map_err(dirstate_v2_error) + })?; + Ok(()) + } + + fn copy(&self, py: Python) -> PyResult<Py<PyDict>> { + let dict = PyDict::new(py); + // The `IntoPyDict` trait just does the same, but is not applicable + // here because it is meant to work on infallible iterators + self.with_dirstate_map_read(py, |inner_dsm| { + for item in inner_dsm.copy_map_iter() { + let (key, value) = item.map_err(dirstate_v2_error)?; + dict.set_item(PyHgPathRef(key), PyHgPathRef(value))?; + } + Ok(()) + })?; + Ok(dict.unbind()) + } +} + +py_shared_iterator!( + CopyMapKeysIterator, + PyBytes, + DirstateMap, + inner, + CopyMapIter<'static>, + |dsm| dsm.copy_map_iter(), + CopyMap::keys_next_result +); + +py_shared_iterator!( + CopyMapItemsIterator, + PyTuple, + DirstateMap, + inner, + CopyMapIter<'static>, + |dsm| dsm.copy_map_iter(), + CopyMap::items_next_result +); + +impl CopyMap { + fn keys_next_result( + py: Python, + res: Result<(&HgPath, &HgPath), DirstateV2ParseError>, + ) -> PyResult<Option<Py<PyBytes>>> { + let key = res.map_err(dirstate_v2_error)?.0; + Ok(Some(PyHgPathRef(key).into_pyobject(py)?.unbind())) + } + + fn items_next_result( + py: Python, + res: Result<(&HgPath, &HgPath), DirstateV2ParseError>, + ) -> PyResult<Option<Py<PyTuple>>> { + let (key, value) = res.map_err(dirstate_v2_error)?; + Ok(Some( + (PyHgPathRef(key), PyHgPathRef(value)) + .into_pyobject(py)? + .unbind(), + )) + } + + fn with_dirstate_map_read<T>( + &self, + py: Python, + f: impl FnOnce(RwLockReadGuard<OwningDirstateMap>) -> PyResult<T>, + ) -> PyResult<T> { + let dsm = self.dirstate_map.bind(py); + DirstateMap::with_inner_read(dsm, |_dsm, inner| f(inner)) + } + + fn with_dirstate_map_write<T>( + &self, + py: Python, + f: impl FnOnce(RwLockWriteGuard<OwningDirstateMap>) -> PyResult<T>, + ) -> PyResult<T> { + let dsm = self.dirstate_map.bind(py); + DirstateMap::with_inner_write(dsm, |_dsm, inner| f(inner)) + } +}
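`CopyMap` is a thin view: it owns no data of its own, only a `Py<DirstateMap>` handle, and re-acquires the dirstate map's lock on every mapping operation. A minimal sketch of that delegation pattern, with made-up `Owner`/`View` names standing in for the real types:

    // Illustrative sketch of the `CopyMap` delegation pattern: the view
    // holds a `Py<...>` handle to its owner and takes the owner's lock
    // for the duration of each call.
    use pyo3::prelude::*;
    use std::collections::HashMap;
    use std::sync::RwLock;

    #[pyclass]
    struct Owner {
        map: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
    }

    #[pyclass(mapping)]
    struct View {
        owner: Py<Owner>,
    }

    #[pymethods]
    impl View {
        fn __len__(&self, py: Python<'_>) -> PyResult<usize> {
            // Analogous to `CopyMap::with_dirstate_map_read`: borrow the
            // owner, then hold a read lock only for this call.
            let owner = self.owner.borrow(py);
            let guard = owner.map.read().map_err(|e| {
                pyo3::exceptions::PyRuntimeError::new_err(e.to_string())
            })?;
            Ok(guard.len())
        }
    }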
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/rust/hg-pyo3/src/dirstate/dirs_multiset.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,131 @@
+// dirs_multiset.rs
+//
+// Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
+//           2025 Georges Racinet <georges.racinet@cloudcrane.io>
+//
+// This software may be used and distributed according to the terms of the
+// GNU General Public License version 2 or any later version.
+//! Bindings for the `hg::dirstate::dirs_multiset` file provided by the
+//! `hg-core` package.
+use pyo3::exceptions::PyTypeError;
+use pyo3::prelude::*;
+use pyo3::types::{PyBytes, PyDict};
+use pyo3_sharedref::{py_shared_iterator, PyShareable};
+
+use std::sync::{RwLockReadGuard, RwLockWriteGuard};
+
+use hg::{
+    dirstate::dirs_multiset::{DirsMultiset, DirsMultisetIter},
+    utils::hg_path::{HgPath, HgPathBuf},
+};
+
+use crate::exceptions::{map_try_lock_error, to_string_value_error};
+use crate::path::PyHgPathRef;
+
+#[pyclass(mapping)]
+pub struct Dirs {
+    pub(super) inner: PyShareable<DirsMultiset>,
+}
+
+#[pymethods]
+impl Dirs {
+    #[new]
+    fn new(map: &Bound<'_, PyAny>) -> PyResult<Self> {
+        if map.downcast::<PyDict>().is_ok() {
+            return Err(PyTypeError::new_err(
+                "pathutil.dirs() with a dict should only be used by the \
+                 Python dirstatemap and should not be used \
+                 when Rust is enabled",
+            ));
+        }
+        let map: Result<Vec<_>, PyErr> = map
+            .try_iter()?
+            .map(|o| Ok(HgPathBuf::from_bytes(o?.extract()?)))
+            .collect();
+        Ok(Self {
+            inner: DirsMultiset::from_manifest(&map?)
+                .map_err(to_string_value_error)?
+                .into(),
+        })
+    }
+
+    fn addpath(
+        slf: &Bound<'_, Self>,
+        path: &Bound<'_, PyBytes>,
+    ) -> PyResult<()> {
+        let path = HgPath::new(path.as_bytes());
+        Self::with_inner_write(slf, |mut inner| {
+            inner.add_path(path).map_err(to_string_value_error)
+        })
+    }
+
+    fn delpath(
+        slf: &Bound<'_, Self>,
+        path: &Bound<'_, PyBytes>,
+    ) -> PyResult<()> {
+        let path = HgPath::new(path.as_bytes());
+        Self::with_inner_write(slf, |mut inner| {
+            inner.delete_path(path).map_err(to_string_value_error)
+        })
+    }
+
+    fn __iter__(slf: &Bound<'_, Self>) -> PyResult<DirsMultisetKeysIterator> {
+        DirsMultisetKeysIterator::new(slf)
+    }
+
+    fn __contains__(
+        slf: &Bound<'_, Self>,
+        key: &Bound<'_, PyAny>,
+    ) -> PyResult<bool> {
+        let path = if let Ok(k) = key.extract::<&[u8]>() {
+            HgPath::new(k)
+        } else {
+            return Ok(false);
+        };
+
+        Self::with_inner_read(slf, |inner| Ok(inner.contains(path)))
+    }
+}
+
+py_shared_iterator!(
+    DirsMultisetKeysIterator,
+    PyBytes,
+    Dirs,
+    inner,
+    DirsMultisetIter<'static>,
+    |ms| ms.iter(),
+    Dirs::keys_next_result
+);
+
+impl Dirs {
+    fn keys_next_result(
+        py: Python,
+        res: &HgPathBuf,
+    ) -> PyResult<Option<Py<PyBytes>>> {
+        Ok(Some(PyHgPathRef(res).into_pyobject(py)?.unbind()))
+    }
+
+    pub(super) fn with_inner_read<T>(
+        slf: &Bound<'_, Self>,
+        f: impl FnOnce(RwLockReadGuard<DirsMultiset>) -> PyResult<T>,
+    ) -> PyResult<T> {
+        let self_ref = slf.borrow();
+        // Safety: the owner is the right one. We will anyway
+        // not actually `share` it.
+        let shareable_ref = unsafe { self_ref.inner.borrow_with_owner(slf) };
+        let guard = shareable_ref.try_read().map_err(map_try_lock_error)?;
+        f(guard)
+    }
+
+    pub(super) fn with_inner_write<T>(
+        slf: &Bound<'_, Self>,
+        f: impl FnOnce(RwLockWriteGuard<DirsMultiset>) -> PyResult<T>,
+    ) -> PyResult<T> {
+        let self_ref = slf.borrow();
+        // Safety: the owner is the right one. We will anyway
+        // not actually `share` it.
+ let shareable_ref = unsafe { self_ref.inner.borrow_with_owner(slf) }; + let guard = shareable_ref.try_write().map_err(map_try_lock_error)?; + f(guard) + } +}
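Note the `__contains__` convention above: a key of the wrong type answers `False` instead of raising `TypeError`. Reduced to a standalone sketch:

    // Standalone sketch of the tolerant `__contains__` convention used by
    // `Dirs`: a non-bytes key reports absence instead of raising TypeError.
    use pyo3::prelude::*;

    fn contains_bytes_key(
        known: &[Vec<u8>],
        key: &Bound<'_, PyAny>,
    ) -> PyResult<bool> {
        match key.extract::<&[u8]>() {
            Ok(k) => Ok(known.iter().any(|item| item.as_slice() == k)),
            Err(_) => Ok(false),
        }
    }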
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/rust/hg-pyo3/src/dirstate/dirstate_map.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,557 @@
+// dirstate_map.rs
+//
+// Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
+//           2025 Georges Racinet <georges.racinet@cloudcrane.io>
+//
+// This software may be used and distributed according to the terms of the
+// GNU General Public License version 2 or any later version.
+//! Bindings for the `hg::dirstate::dirstate_map` file provided by the
+//! `hg-core` package.
+
+use pyo3::exceptions::{PyKeyError, PyOSError};
+use pyo3::prelude::*;
+use pyo3::types::{
+    PyBytes, PyBytesMethods, PyDict, PyDictMethods, PyList, PyTuple,
+};
+use pyo3_sharedref::{py_shared_iterator, PyShareable};
+
+use std::sync::{RwLockReadGuard, RwLockWriteGuard};
+
+use hg::{
+    dirstate::{
+        dirstate_map::{
+            DirstateEntryReset, DirstateIdentity as CoreDirstateIdentity,
+            DirstateMapWriteMode,
+        },
+        entry::{DirstateEntry, ParentFileData, TruncatedTimestamp},
+        on_disk::DirstateV2ParseError,
+        owning::OwningDirstateMap,
+        StateMapIter,
+    },
+    utils::{files::normalize_case, hg_path::HgPath},
+    DirstateParents,
+};
+
+use super::{copy_map::CopyMap, item::DirstateItem};
+use crate::{
+    exceptions::{
+        dirstate_error, dirstate_v2_error, map_try_lock_error,
+        to_string_value_error,
+    },
+    node::{node_from_py_bytes, PyNode},
+    path::{PyHgPathBuf, PyHgPathDirstateV2Result, PyHgPathRef},
+    utils::PyBytesDeref,
+};
+
+/// Type alias to satisfy Clippy in `DirstateMap::reset_state()`
+///
+/// It is *not* the same as [`super::item::UncheckedTruncatedTimeStamp`] and
+/// this is worth reviewing.
+type UncheckedTruncatedTimeStamp = Option<(i64, u32, bool)>;
+
+#[pyclass(mapping)]
+pub struct DirstateMap {
+    pub(super) inner: PyShareable<OwningDirstateMap>,
+}
+
+#[pymethods]
+impl DirstateMap {
+    #[staticmethod]
+    #[pyo3(signature = (on_disk, identity))]
+    /// Returns a `(dirstate_map, parents)` tuple
+    ///
+    /// The Python call site is using the positional argument style, hence
+    /// despite the fact that `identity` can be `None`, we specify the
+    /// matching signature.
+    fn new_v1(
+        py: Python,
+        on_disk: Py<PyBytes>,
+        identity: Option<&Bound<'_, DirstateIdentity>>,
+    ) -> PyResult<Py<PyTuple>> {
+        let on_disk = PyBytesDeref::new(py, on_disk);
+        let (map, parents) = OwningDirstateMap::new_v1(
+            on_disk,
+            identity.map(|i| i.borrow().inner),
+        )
+        .map_err(dirstate_error)?;
+        let map = Self { inner: map.into() };
+        let parents = (PyNode(parents.p1), PyNode(parents.p2));
+        Ok((map, parents).into_pyobject(py)?.into())
+    }
+
+    #[staticmethod]
+    #[pyo3(signature = (on_disk, data_size, tree_metadata, uuid, identity))]
+    fn new_v2(
+        py: Python,
+        on_disk: Py<PyBytes>,
+        data_size: usize,
+        tree_metadata: &Bound<'_, PyBytes>,
+        uuid: &Bound<'_, PyBytes>,
+        identity: Option<&Bound<'_, DirstateIdentity>>,
+    ) -> PyResult<Self> {
+        Ok(Self {
+            inner: OwningDirstateMap::new_v2(
+                PyBytesDeref::new(py, on_disk),
+                data_size,
+                tree_metadata.as_bytes(),
+                uuid.as_bytes().to_owned(),
+                identity.map(|i| i.borrow().inner),
+            )
+            .map_err(dirstate_error)?
+ .into(), + }) + } + + #[staticmethod] + fn new_empty() -> PyResult<Self> { + Ok(Self { + inner: OwningDirstateMap::new_empty(vec![], None).into(), + }) + } + + fn clear(slf: &Bound<'_, Self>) -> PyResult<()> { + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner.clear(); + Ok(()) + }) + } + + #[pyo3(signature = (key, default=None))] + fn get( + slf: &Bound<'_, Self>, + key: &Bound<'_, PyBytes>, + default: Option<PyObject>, + ) -> PyResult<Option<PyObject>> { + let path = HgPath::new(key.as_bytes()); + + Self::with_inner_read(slf, |_self_ref, inner| { + match inner.get(path).map_err(dirstate_v2_error)? { + Some(entry) => Ok(Some( + DirstateItem::new_as_py(slf.py(), entry)?.into_any(), + )), + None => Ok(default), + } + }) + } + + fn set_tracked( + slf: &Bound<'_, Self>, + f: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner + .set_tracked(HgPath::new(f.as_bytes())) + .map_err(dirstate_v2_error) + }) + } + + fn set_untracked( + slf: &Bound<'_, Self>, + f: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + Self::with_inner_write(slf, |_self_ref, mut inner| { + // here it would be more straightforward to use dirstate_v2_error, + // but that raises ValueError instead of OSError + inner + .set_untracked(HgPath::new(f.as_bytes())) + .map_err(|_| PyOSError::new_err("Dirstate error")) + }) + } + + fn set_clean( + slf: &Bound<'_, Self>, + f: &Bound<'_, PyBytes>, + mode: u32, + size: u32, + mtime: (i64, u32, bool), + ) -> PyResult<()> { + let (mtime_s, mtime_ns, second_ambiguous) = mtime; + let timestamp = TruncatedTimestamp::new_truncate( + mtime_s, + mtime_ns, + second_ambiguous, + ); + + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner + .set_clean(HgPath::new(f.as_bytes()), mode, size, timestamp) + .map_err(dirstate_error) + }) + } + + fn set_possibly_dirty( + slf: &Bound<'_, Self>, + f: &Bound<'_, PyBytes>, + ) -> PyResult<()> { + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner + .set_possibly_dirty(HgPath::new(f.as_bytes())) + .map_err(dirstate_error) + }) + } + + #[pyo3(signature = (f, + wc_tracked=false, + p1_tracked=false, + p2_info=false, + has_meaningful_mtime=true, + parentfiledata=None))] + fn reset_state( + slf: &Bound<'_, Self>, + f: &Bound<'_, PyBytes>, + wc_tracked: bool, + p1_tracked: bool, + p2_info: bool, + has_meaningful_mtime: bool, + parentfiledata: Option<(u32, u32, UncheckedTruncatedTimeStamp)>, + ) -> PyResult<()> { + let mut has_meaningful_mtime = has_meaningful_mtime; + let parent_file_data = match parentfiledata { + None => { + has_meaningful_mtime = false; + None + } + Some(data) => { + let (mode, size, mtime_info) = data; + let mtime = if let Some(mtime_info) = mtime_info { + let (mtime_s, mtime_ns, second_ambiguous) = mtime_info; + let timestamp = TruncatedTimestamp::new_truncate( + mtime_s, + mtime_ns, + second_ambiguous, + ); + Some(timestamp) + } else { + has_meaningful_mtime = false; + None + }; + Some(ParentFileData { + mode_size: Some((mode, size)), + mtime, + }) + } + }; + + let reset = DirstateEntryReset { + filename: HgPath::new(f.as_bytes()), + wc_tracked, + p1_tracked, + p2_info, + has_meaningful_mtime, + parent_file_data_opt: parent_file_data, + from_empty: false, + }; + + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner.reset_state(reset).map_err(dirstate_error) + }) + } + + fn hastrackeddir( + slf: &Bound<'_, Self>, + d: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner + 
.has_tracked_dir(HgPath::new(d.as_bytes())) + .map_err(to_string_value_error) + }) + } + + fn hasdir( + slf: &Bound<'_, Self>, + d: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + Self::with_inner_write(slf, |_self_ref, mut inner| { + inner + .has_dir(HgPath::new(d.as_bytes())) + .map_err(to_string_value_error) + }) + } + + /// Returns suitable data for writing on disk in v1 format + /// + /// Despite the name, this is not a mutation of the object. + fn write_v1( + slf: &Bound<'_, Self>, + py: Python, + p1: &Bound<'_, PyBytes>, + p2: &Bound<'_, PyBytes>, + ) -> PyResult<Py<PyBytes>> { + Self::with_inner_read(slf, |_self_ref, inner| { + let parents = DirstateParents { + p1: node_from_py_bytes(p1)?, + p2: node_from_py_bytes(p2)?, + }; + let packed = inner.pack_v1(parents).map_err(dirstate_error)?; + // TODO optim, see `write_v2()` + Ok(PyBytes::new(py, &packed).unbind()) + }) + } + + /// Returns suitable new data for writing on disk in v2 format + /// + /// Despite the name, this is not a mutation of the object. + /// + /// The new data together with whether that data should be appended to + /// the existing data file whose content is at `self.on_disk` (True), + /// instead of written to a new data file (False). + fn write_v2( + slf: &Bound<'_, Self>, + py: Python, + write_mode: usize, + ) -> PyResult<Py<PyTuple>> { + Self::with_inner_read(slf, |_self_ref, inner| { + let rust_write_mode = match write_mode { + 0 => DirstateMapWriteMode::Auto, + 1 => DirstateMapWriteMode::ForceNewDataFile, + 2 => DirstateMapWriteMode::ForceAppend, + _ => DirstateMapWriteMode::Auto, // XXX should we error out? + }; + let (packed, tree_metadata, append, _old_data_size) = + inner.pack_v2(rust_write_mode).map_err(dirstate_error)?; + // TODO optim. In theory we should be able to avoid these copies, + // since we have full ownership of `packed` and `tree_metadata`. + // But the copy is done by CPython itself, in + // `PyBytes_FromStringAndSize()`. Perhaps something better can + // be done with `PyBytes_FromObject` (buffer protocol). + let packed = PyBytes::new(py, &packed).unbind(); + let tree_metadata = + PyBytes::new(py, tree_metadata.as_bytes()).unbind(); + Ok((packed, tree_metadata, append).into_pyobject(py)?.into()) + }) + } + + fn filefoldmapasdict( + slf: &Bound<'_, Self>, + py: Python, + ) -> PyResult<Py<PyDict>> { + let dict = PyDict::new(py); + Self::with_inner_read(slf, |_self_ref, inner| { + for item in inner.iter() { + let (path, entry) = item.map_err(dirstate_v2_error)?; + if !entry.removed() { + let key = normalize_case(path); + dict.set_item(PyHgPathBuf(key), PyHgPathRef(path))?; + } + } + Ok(()) + })?; + Ok(dict.unbind()) + } + + fn __len__(slf: &Bound<'_, Self>) -> PyResult<usize> { + Self::with_inner_read(slf, |_self_ref, inner| Ok(inner.len())) + } + + fn __contains__( + slf: &Bound<'_, Self>, + // TODO we should accept PyAny and return false if wrong type + // review similar "protocol" methods (see example in dirs_multiset) + key: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + Self::with_inner_read(slf, |_self_ref, inner| { + inner + .contains_key(HgPath::new(key.as_bytes())) + .map_err(dirstate_v2_error) + }) + } + + fn __getitem__( + slf: &Bound<'_, Self>, + key: &Bound<'_, PyBytes>, + ) -> PyResult<Py<DirstateItem>> { + let key_bytes = key.as_bytes(); + let path = HgPath::new(key_bytes); + Self::with_inner_read(slf, |_self_ref, inner| { + match inner.get(path).map_err(dirstate_v2_error)? 
{ + Some(entry) => DirstateItem::new_as_py(slf.py(), entry), + None => Err(PyKeyError::new_err( + String::from_utf8_lossy(key_bytes).to_string(), + )), + } + }) + } + + fn keys(slf: &Bound<'_, Self>) -> PyResult<DirstateMapKeysIterator> { + DirstateMapKeysIterator::new(slf) + } + + fn items(slf: &Bound<'_, Self>) -> PyResult<DirstateMapItemsIterator> { + DirstateMapItemsIterator::new(slf) + } + + fn __iter__(slf: &Bound<'_, Self>) -> PyResult<DirstateMapKeysIterator> { + Self::keys(slf) + } + + fn copymap(slf: &Bound<'_, Self>) -> PyResult<Py<CopyMap>> { + CopyMap::new(slf).and_then(|cm| Py::new(slf.py(), cm)) + } + + fn tracked_dirs( + slf: &Bound<'_, Self>, + py: Python, + ) -> PyResult<Py<PyList>> { + // core iterator is not exact sized, we cannot use `PyList::new` + let dirs = PyList::empty(py); + Self::with_inner_write(slf, |_self_ref, mut inner| { + for path in inner.iter_tracked_dirs().map_err(dirstate_error)? { + dirs.append(PyHgPathDirstateV2Result(path))?; + } + Ok(()) + })?; + Ok(dirs.unbind()) + } + + fn setparents_fixup( + slf: &Bound<'_, Self>, + py: Python, + ) -> PyResult<Py<PyDict>> { + let dict = PyDict::new(py); + let copies = Self::with_inner_write(slf, |_self_ref, mut inner| { + inner.setparents_fixup().map_err(dirstate_v2_error) + })?; + + // it might be interesting to try and use the `IntoPyDict` trait, + // but it does about the same thing + // but that would require performing the inner `as_bytes()` as well + for (key, value) in copies { + dict.set_item(PyHgPathBuf(key), PyHgPathBuf(value))?; + } + Ok(dict.unbind()) + } + + fn debug_iter( + slf: &Bound<'_, Self>, + py: Python, + all: bool, + ) -> PyResult<PyObject> { + Self::with_inner_read(slf, |_self_ref, inner| { + // the iterator returned by `debug_iter()` does not + // implement ExactSizeIterator, which is needed by + // `PyList::new()`, so we need to collect. Probably not a + // performance issue, as this is a debug method. + let as_vec: PyResult<Vec<_>> = inner + .debug_iter(all) + .map(|item| { + let (path, (state, mode, size, mtime)) = + item.map_err(dirstate_v2_error)?; + Ok((PyHgPathRef(path), state, mode, size, mtime)) + }) + .collect(); + // `IntoPyObject` on `Vec` and `&[T]` gives `PyList` or `PyBytes` + Ok(as_vec?.into_pyobject(py)?.unbind()) + }) + } +} + +py_shared_iterator!( + DirstateMapKeysIterator, + PyBytes, + DirstateMap, + inner, + StateMapIter<'static>, + |dsm| dsm.iter(), + DirstateMap::keys_next_result +); + +py_shared_iterator!( + DirstateMapItemsIterator, + PyTuple, + DirstateMap, + inner, + StateMapIter<'static>, + |dsm| dsm.iter(), + DirstateMap::items_next_result +); + +impl DirstateMap { + fn keys_next_result( + py: Python, + res: Result<(&HgPath, DirstateEntry), DirstateV2ParseError>, + ) -> PyResult<Option<Py<PyBytes>>> { + let key = res.map_err(dirstate_v2_error)?.0; + Ok(Some(PyHgPathRef(key).into_pyobject(py)?.unbind())) + } + + fn items_next_result( + py: Python, + res: Result<(&HgPath, DirstateEntry), DirstateV2ParseError>, + ) -> PyResult<Option<Py<PyTuple>>> { + let (key, entry) = res.map_err(dirstate_v2_error)?; + let py_entry = DirstateItem::new_as_py(py, entry)?; + Ok(Some((PyHgPathRef(key), py_entry).into_pyobject(py)?.into())) + } + + pub(super) fn with_inner_read<'py, T>( + slf: &Bound<'py, Self>, + f: impl FnOnce( + &PyRef<'py, Self>, + RwLockReadGuard<OwningDirstateMap>, + ) -> PyResult<T>, + ) -> PyResult<T> { + let self_ref = slf.borrow(); + // Safety: the owner is the right one. We will anyway + // not actually `share` it. 
+ let shareable_ref = unsafe { self_ref.inner.borrow_with_owner(slf) }; + let guard = shareable_ref.try_read().map_err(map_try_lock_error)?; + f(&self_ref, guard) + } + + pub(super) fn with_inner_write<'py, T>( + slf: &Bound<'py, Self>, + f: impl FnOnce( + &PyRef<'py, Self>, + RwLockWriteGuard<OwningDirstateMap>, + ) -> PyResult<T>, + ) -> PyResult<T> { + let self_ref = slf.borrow(); + // Safety: the owner is the right one. We will anyway + // not actually `share` it. + let shareable_ref = unsafe { self_ref.inner.borrow_with_owner(slf) }; + let guard = shareable_ref.try_write().map_err(map_try_lock_error)?; + f(&self_ref, guard) + } +} + +#[pyclass] +pub struct DirstateIdentity { + #[allow(dead_code)] + inner: CoreDirstateIdentity, +} + +#[pymethods] +impl DirstateIdentity { + #[new] + #[allow(clippy::too_many_arguments)] + fn new( + mode: u32, + dev: u64, + ino: u64, + nlink: u64, + uid: u32, + gid: u32, + size: u64, + mtime: i64, + mtime_nsec: i64, + ctime: i64, + ctime_nsec: i64, + ) -> PyResult<Self> { + Ok(Self { + inner: CoreDirstateIdentity { + mode, + dev, + ino, + nlink, + uid, + gid, + size, + mtime, + mtime_nsec, + ctime, + ctime_nsec, + }, + }) + } +}
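The `has_meaningful_mtime` downgrades in `reset_state` above are easier to follow in isolation. A simplified, self-contained restatement with plain tuples (illustrative code, not part of the changeset): a missing `parentfiledata`, or a missing mtime inside it, forces the flag to false.

    // Simplified restatement of the mtime normalization in `reset_state`.
    type UncheckedMtime = Option<(i64, u32, bool)>;

    fn normalize_mtime(
        mut has_meaningful_mtime: bool,
        parentfiledata: Option<(u32, u32, UncheckedMtime)>,
    ) -> (bool, Option<(u32, u32)>, UncheckedMtime) {
        match parentfiledata {
            // No parent file data at all: the mtime cannot be meaningful.
            None => (false, None, None),
            Some((mode, size, mtime)) => {
                if mtime.is_none() {
                    has_meaningful_mtime = false;
                }
                (has_meaningful_mtime, Some((mode, size)), mtime)
            }
        }
    }

    fn main() {
        assert_eq!(normalize_mtime(true, None), (false, None, None));
        let data = Some((0o644, 12, None));
        assert_eq!(
            normalize_mtime(true, data),
            (false, Some((0o644, 12)), None)
        );
    }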
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/dirstate/item.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,251 @@ +// dirstate/item.rs +// +// Copyright 2019 Raphaël Gomès <rgomes@octobus.net> +// 2025 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +//! Bindings for the `hg::dirstate::entry` module of the `hg-core` package. + +use pyo3::exceptions::PyValueError; +use pyo3::prelude::*; +use pyo3::types::PyBytes; + +use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard}; + +use hg::dirstate::entry::{DirstateEntry, DirstateV2Data, TruncatedTimestamp}; + +use crate::exceptions::map_lock_error; + +#[pyclass] +pub struct DirstateItem { + entry: RwLock<DirstateEntry>, +} + +/// Type alias to satisfy Clippy in `DirstateItem::new()` +type UncheckedTruncatedTimeStamp = Option<(u32, u32, bool)>; + +#[pymethods] +impl DirstateItem { + #[new] + #[allow(clippy::too_many_arguments)] + #[pyo3(signature = (wc_tracked=false, + p1_tracked=false, + p2_info=false, + has_meaningful_data=true, + has_meaningful_mtime=true, + parentfiledata=None, + fallback_exec=None, + fallback_symlink=None))] + fn new( + wc_tracked: bool, + p1_tracked: bool, + p2_info: bool, + has_meaningful_data: bool, + has_meaningful_mtime: bool, + parentfiledata: Option<(u32, u32, UncheckedTruncatedTimeStamp)>, + fallback_exec: Option<bool>, + fallback_symlink: Option<bool>, + ) -> PyResult<Self> { + let mut mode_size_opt = None; + let mut mtime_opt = None; + if let Some((mode, size, mtime)) = parentfiledata { + if has_meaningful_data { + mode_size_opt = Some((mode, size)) + } + if has_meaningful_mtime { + if let Some(m) = mtime { + mtime_opt = Some(timestamp(m)?); + } + } + } + Ok(Self { + entry: DirstateEntry::from_v2_data(DirstateV2Data { + wc_tracked, + p1_tracked, + p2_info, + mode_size: mode_size_opt, + mtime: mtime_opt, + fallback_exec, + fallback_symlink, + }) + .into(), + }) + } + + #[getter] + fn state(&self, py: Python) -> PyResult<Py<PyBytes>> { + let state_byte = self.read()?.state(); + Ok(PyBytes::new(py, &[state_byte.into()]).unbind()) + } + + #[getter] + fn mode(&self) -> PyResult<i32> { + Ok(self.read()?.mode()) + } + + #[getter] + fn size(&self) -> PyResult<i32> { + Ok(self.read()?.size()) + } + + #[getter] + fn mtime(&self) -> PyResult<i32> { + Ok(self.read()?.mtime()) + } + + #[getter] + fn has_fallback_exec(&self) -> PyResult<bool> { + Ok(self.read()?.get_fallback_exec().is_some()) + } + + #[getter] + fn fallback_exec(&self) -> PyResult<Option<bool>> { + Ok(self.read()?.get_fallback_exec()) + } + + #[setter] + fn set_fallback_exec( + &self, + value: Option<Bound<'_, PyAny>>, + ) -> PyResult<()> { + let mut writable = self.write()?; + match value { + None => { + writable.set_fallback_exec(None); + } + Some(value) => { + if value.is_none() { + // gracinet: this case probably cannot happen, + // because PyO3 setters have a fixed signature, that + // is not defaulting to kwargs, hence there is no + // difference between an explicit None and a default + // (kwarg) None. Still keeping it for safety, it could + // be cleaned up afterwards. 
+ writable.set_fallback_exec(None); + } else { + writable.set_fallback_exec(Some(value.is_truthy()?)); + } + } + } + Ok(()) + } + + #[getter] + fn has_fallback_symlink(&self) -> PyResult<bool> { + Ok(self.read()?.get_fallback_symlink().is_some()) + } + + #[getter] + fn fallback_symlink(&self) -> PyResult<Option<bool>> { + Ok(self.read()?.get_fallback_symlink()) + } + + #[getter] + fn tracked(&self) -> PyResult<bool> { + Ok(self.read()?.tracked()) + } + + #[getter] + fn p1_tracked(&self) -> PyResult<bool> { + Ok(self.read()?.p1_tracked()) + } + + #[getter] + fn added(&self) -> PyResult<bool> { + Ok(self.read()?.added()) + } + + #[getter] + fn modified(&self) -> PyResult<bool> { + Ok(self.read()?.modified()) + } + + #[getter] + fn p2_info(&self) -> PyResult<bool> { + Ok(self.read()?.p2_info()) + } + + #[getter] + fn removed(&self) -> PyResult<bool> { + Ok(self.read()?.removed()) + } + + #[getter] + fn maybe_clean(&self) -> PyResult<bool> { + Ok(self.read()?.maybe_clean()) + } + + #[getter] + fn any_tracked(&self) -> PyResult<bool> { + Ok(self.read()?.any_tracked()) + } + + fn mtime_likely_equal_to( + &self, + other: (u32, u32, bool), + ) -> PyResult<bool> { + if let Some(mtime) = self.read()?.truncated_mtime() { + Ok(mtime.likely_equal(timestamp(other)?)) + } else { + Ok(false) + } + } + + fn drop_merge_data(&self) -> PyResult<()> { + self.write()?.drop_merge_data(); + Ok(()) + } + + fn set_clean( + &self, + mode: u32, + size: u32, + mtime: (u32, u32, bool), + ) -> PyResult<()> { + self.write()?.set_clean(mode, size, timestamp(mtime)?); + Ok(()) + } + + fn set_possibly_dirty(&self) -> PyResult<()> { + self.write()?.set_possibly_dirty(); + Ok(()) + } + + fn set_tracked(&self) -> PyResult<()> { + self.write()?.set_tracked(); + Ok(()) + } + + fn set_untracked(&self) -> PyResult<()> { + self.write()?.set_untracked(); + Ok(()) + } +} + +impl DirstateItem { + pub fn new_as_py(py: Python, entry: DirstateEntry) -> PyResult<Py<Self>> { + Ok(Self { + entry: entry.into(), + } + .into_pyobject(py)? + .unbind()) + } + + fn read(&self) -> PyResult<RwLockReadGuard<DirstateEntry>> { + self.entry.read().map_err(map_lock_error) + } + + fn write(&self) -> PyResult<RwLockWriteGuard<DirstateEntry>> { + self.entry.write().map_err(map_lock_error) + } +} + +pub(crate) fn timestamp( + (s, ns, second_ambiguous): (u32, u32, bool), +) -> PyResult<TruncatedTimestamp> { + TruncatedTimestamp::from_already_truncated(s, ns, second_ambiguous) + .map_err(|_| { + PyValueError::new_err("expected mtime truncated to 31 bits") + }) +}
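The `set_fallback_exec` setter above distinguishes a missing value from an explicit Python `None`, and coerces everything else through truthiness. The same tri-state logic as a standalone sketch:

    // Sketch of the `set_fallback_exec` convention: Python `None` clears
    // the flag, anything else is coerced with `is_truthy`.
    use pyo3::prelude::*;

    fn coerce_optional_flag(
        value: Option<&Bound<'_, PyAny>>,
    ) -> PyResult<Option<bool>> {
        match value {
            None => Ok(None),
            // Defensive: an explicit Python `None` also clears the flag
            // (see the in-code comment about PyO3 setter signatures).
            Some(v) if v.is_none() => Ok(None),
            Some(v) => Ok(Some(v.is_truthy()?)),
        }
    }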
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/rust/hg-pyo3/src/dirstate/status.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,274 @@
+// status.rs
+//
+// Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
+//           2025 Georges Racinet <georges.racinet@cloudcrane.io>
+//
+// This software may be used and distributed according to the terms of the
+// GNU General Public License version 2 or any later version.
+//! Bindings for the `hg::status` module provided by the
+//! `hg-core` crate. From Python, this will be seen as the
+//! `mercurial.pyo3_rustext.dirstate.status` function.
+use pyo3::intern;
+use pyo3::prelude::*;
+use pyo3::types::{PyBytes, PyList, PyTuple};
+
+use hg::{
+    dirstate::status::{
+        BadMatch, DirstateStatus, StatusError, StatusOptions, StatusPath,
+    },
+    filepatterns::{
+        parse_pattern_syntax_kind, IgnorePattern, PatternError,
+        PatternFileWarning,
+    },
+    matchers::{
+        AlwaysMatcher, DifferenceMatcher, FileMatcher, IncludeMatcher,
+        IntersectionMatcher, Matcher, NeverMatcher, PatternMatcher,
+        UnionMatcher,
+    },
+    utils::{
+        files::{get_bytes_from_path, get_path_from_bytes},
+        hg_path::HgPath,
+    },
+};
+
+use super::dirstate_map::DirstateMap;
+use crate::{
+    exceptions::{to_string_value_error, FallbackError},
+    path::{paths_py_list, paths_pyiter_collect, PyHgPathRef},
+};
+
+fn status_path_py_list(
+    py: Python,
+    paths: &[StatusPath<'_>],
+) -> PyResult<Py<PyList>> {
+    paths_py_list(py, paths.iter().map(|item| &*item.path))
+}
+
+fn collect_bad_matches(
+    py: Python,
+    collection: &[(impl AsRef<HgPath>, BadMatch)],
+) -> PyResult<Py<PyList>> {
+    let get_error_message = |code: i32| -> String {
+        // hg-cpython was calling into the Python interpreter here,
+        // using `os.strerror`. This seems to be equivalent and infallible
+        std::io::Error::from_raw_os_error(code).to_string()
+    };
+    Ok(PyList::new(
+        py,
+        collection.iter().map(|(path, bad_match)| {
+            let message = match bad_match {
+                BadMatch::OsError(code) => get_error_message(*code),
+                BadMatch::BadType(bad_type) => {
+                    format!("unsupported file type (type is {})", bad_type)
+                }
+            };
+            (PyHgPathRef(path.as_ref()), message)
+        }),
+    )?
+    .unbind())
+}
+
+fn collect_kindpats(
+    py: Python,
+    matcher: &Bound<'_, PyAny>,
+) -> PyResult<Vec<IgnorePattern>> {
+    matcher
+        .getattr(intern!(py, "_kindpats"))?
+        .try_iter()?
+        .map(|k| {
+            let k = k?;
+            let py_syntax = k.get_item(0)?;
+            let py_pattern = k.get_item(1)?;
+            let py_source = k.get_item(2)?;
+
+            Ok(IgnorePattern::new(
+                parse_pattern_syntax_kind(
+                    py_syntax.downcast::<PyBytes>()?.as_bytes(),
+                )
+                .map_err(|e| handle_fallback(StatusError::Pattern(e)))?,
+                py_pattern.downcast::<PyBytes>()?.as_bytes(),
+                get_path_from_bytes(
+                    py_source.downcast::<PyBytes>()?.as_bytes(),
+                ),
+            ))
+        })
+        .collect()
+}
+
+fn extract_matcher(
+    matcher: &Bound<'_, PyAny>,
+) -> PyResult<Box<dyn Matcher + Sync>> {
+    let py = matcher.py();
+    let tampered = matcher
+        .call_method0(intern!(py, "was_tampered_with_nonrec"))?
+        .extract::<bool>()?;
+    if tampered {
+        return Err(handle_fallback(StatusError::Pattern(
+            PatternError::UnsupportedSyntax(
+                "Pattern matcher was tampered with!".to_string(),
+            ),
+        )));
+    };
+
+    match matcher.get_type().name()?.to_str()?
{ + "alwaysmatcher" => Ok(Box::new(AlwaysMatcher)), + "nevermatcher" => Ok(Box::new(NeverMatcher)), + "exactmatcher" => { + let files = matcher.call_method0(intern!(py, "files"))?; + let files: Vec<_> = paths_pyiter_collect(&files)?; + Ok(Box::new( + FileMatcher::new(files).map_err(to_string_value_error)?, + )) + } + "includematcher" => { + // Get the patterns from Python even though most of them are + // redundant with those we will parse later on, as they include + // those passed from the command line. + let ignore_patterns = collect_kindpats(py, matcher)?; + Ok(Box::new( + IncludeMatcher::new(ignore_patterns) + .map_err(|e| handle_fallback(e.into()))?, + )) + } + "unionmatcher" => { + let matchers: PyResult<Vec<_>> = matcher + .getattr("_matchers")? + .try_iter()? + .map(|py_matcher| extract_matcher(&py_matcher?)) + .collect(); + + Ok(Box::new(UnionMatcher::new(matchers?))) + } + "intersectionmatcher" => { + let m1 = extract_matcher(&matcher.getattr("_m1")?)?; + let m2 = extract_matcher(&matcher.getattr("_m2")?)?; + Ok(Box::new(IntersectionMatcher::new(m1, m2))) + } + "differencematcher" => { + let m1 = extract_matcher(&matcher.getattr("_m1")?)?; + let m2 = extract_matcher(&matcher.getattr("_m2")?)?; + Ok(Box::new(DifferenceMatcher::new(m1, m2))) + } + "patternmatcher" => { + let patterns = collect_kindpats(py, matcher)?; + Ok(Box::new( + PatternMatcher::new(patterns) + .map_err(|e| handle_fallback(e.into()))?, + )) + } + + m => Err(FallbackError::new_err(format!("Unsupported matcher {m}"))), + } +} + +fn handle_fallback(err: StatusError) -> PyErr { + match err { + StatusError::Pattern(e) => { + let as_string = e.to_string(); + log::trace!("Rust status fallback, `{}`", &as_string); + FallbackError::new_err(as_string) + } + e => to_string_value_error(e), + } +} + +#[pyfunction] +#[allow(clippy::too_many_arguments)] +pub(super) fn status( + py: Python, + dmap: &Bound<'_, DirstateMap>, + matcher: &Bound<'_, PyAny>, + root_dir: &Bound<'_, PyBytes>, + ignore_files: &Bound<'_, PyList>, + check_exec: bool, + list_clean: bool, + list_ignored: bool, + list_unknown: bool, + collect_traversed_dirs: bool, +) -> PyResult<Py<PyTuple>> { + let root_dir = get_path_from_bytes(root_dir.as_bytes()); + + let ignore_files: PyResult<Vec<_>> = ignore_files + .try_iter()? 
+ .map(|res| { + let ob = res?; + let file = ob.downcast::<PyBytes>()?.as_bytes(); + Ok(get_path_from_bytes(file).to_owned()) + }) + .collect(); + let ignore_files = ignore_files?; + // The caller may call `copymap.items()` separately + let list_copies = false; + + let after_status = |res: Result<(DirstateStatus<'_>, _), StatusError>| { + let (status_res, warnings) = res.map_err(handle_fallback)?; + build_response(py, status_res, warnings) + }; + + let matcher = extract_matcher(matcher)?; + DirstateMap::with_inner_write(dmap, |_dm_ref, mut inner| { + inner.with_status( + &*matcher, + root_dir.to_path_buf(), + ignore_files, + StatusOptions { + check_exec, + list_clean, + list_ignored, + list_unknown, + list_copies, + collect_traversed_dirs, + }, + after_status, + ) + }) +} + +fn build_response( + py: Python, + status_res: DirstateStatus, + warnings: Vec<PatternFileWarning>, +) -> PyResult<Py<PyTuple>> { + let modified = status_path_py_list(py, &status_res.modified)?; + let added = status_path_py_list(py, &status_res.added)?; + let removed = status_path_py_list(py, &status_res.removed)?; + let deleted = status_path_py_list(py, &status_res.deleted)?; + let clean = status_path_py_list(py, &status_res.clean)?; + let ignored = status_path_py_list(py, &status_res.ignored)?; + let unknown = status_path_py_list(py, &status_res.unknown)?; + let unsure = status_path_py_list(py, &status_res.unsure)?; + let bad = collect_bad_matches(py, &status_res.bad)?; + let traversed = paths_py_list(py, status_res.traversed.iter())?; + let py_warnings = PyList::empty(py); + for warning in warnings.iter() { + // We use duck-typing on the Python side for dispatch, good enough for + // now. + match warning { + PatternFileWarning::InvalidSyntax(file, syn) => { + py_warnings.append(( + PyBytes::new(py, &get_bytes_from_path(file)), + PyBytes::new(py, syn), + ))?; + } + PatternFileWarning::NoSuchFile(file) => py_warnings + .append(PyBytes::new(py, &get_bytes_from_path(file)))?, + } + } + + Ok(( + unsure.into_pyobject(py)?, + modified.into_pyobject(py)?, + added.into_pyobject(py)?, + removed.into_pyobject(py)?, + deleted.into_pyobject(py)?, + clean.into_pyobject(py)?, + ignored.into_pyobject(py)?, + unknown.into_pyobject(py)?, + py_warnings.into_pyobject(py)?, + bad.into_pyobject(py)?, + traversed.into_pyobject(py)?, + status_res.dirty.into_pyobject(py)?, + ) + .into_pyobject(py)? + .into()) +}
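`extract_matcher` dispatches on the Python class name of the matcher object and recurses into composite matchers (`union`, `intersection`, `difference`). A reduced sketch of that dispatch skeleton, returning plain labels instead of `hg-core` matcher objects:

    // Reduced sketch of the class-name dispatch in `extract_matcher`:
    // the Python type name of the matcher selects the Rust equivalent.
    use pyo3::exceptions::PyValueError;
    use pyo3::prelude::*;

    fn matcher_kind(matcher: &Bound<'_, PyAny>) -> PyResult<&'static str> {
        match matcher.get_type().name()?.to_str()? {
            "alwaysmatcher" => Ok("always"),
            "nevermatcher" => Ok("never"),
            "unionmatcher" => Ok("union"),
            other => Err(PyValueError::new_err(format!(
                "Unsupported matcher {}",
                other
            ))),
        }
    }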
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/rust/hg-pyo3/src/discovery.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,216 @@
+//! Discovery of common node sets
+use std::collections::HashSet;
+
+use hg::{discovery::PartialDiscovery as CorePartialDiscovery, Revision};
+use pyo3::{
+    intern, pyclass, pymethods,
+    types::{PyAnyMethods, PyDict, PyModule, PyModuleMethods, PyTuple},
+    Bound, Py, PyAny, PyObject, PyResult, Python,
+};
+use pyo3_sharedref::SharedByPyObject;
+
+use crate::{
+    exceptions::GraphError,
+    revision::{rev_pyiter_collect, PyRevision},
+    revlog::PySharedIndex,
+    utils::{new_submodule, py_rust_index_to_graph},
+};
+
+#[pyclass]
+struct PartialDiscovery {
+    inner: SharedByPyObject<CorePartialDiscovery<PySharedIndex>>,
+    idx: SharedByPyObject<PySharedIndex>,
+}
+
+#[pymethods]
+impl PartialDiscovery {
+    #[pyo3(signature = (repo, targetheads, respectsize, randomize=true))]
+    #[new]
+    fn new(
+        py: Python,
+        repo: &Bound<'_, PyAny>,
+        targetheads: &Bound<'_, PyAny>,
+        respectsize: bool,
+        randomize: bool,
+    ) -> PyResult<Self> {
+        let index = repo
+            .getattr(intern!(py, "changelog"))?
+            .getattr(intern!(py, "index"))?;
+        let cloned_index = py_rust_index_to_graph(&index.clone())?;
+        let index = py_rust_index_to_graph(&index)?;
+
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let target_heads = {
+            let borrowed_idx = unsafe { index.try_borrow(py)? };
+            rev_pyiter_collect(targetheads, &*borrowed_idx)?
+        };
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let lazy_disco = unsafe {
+            index.map(py, |idx| {
+                CorePartialDiscovery::new(
+                    idx,
+                    target_heads,
+                    respectsize,
+                    randomize,
+                )
+            })
+        };
+        Ok(Self {
+            inner: lazy_disco,
+            idx: cloned_index,
+        })
+    }
+
+    fn addcommons(
+        &mut self,
+        py: Python,
+        commons: &Bound<'_, PyAny>,
+    ) -> PyResult<PyObject> {
+        let commons = self.pyiter_to_vec(commons)?;
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let mut inner = unsafe { self.inner.try_borrow_mut(py)? };
+        inner
+            .add_common_revisions(commons)
+            .map_err(GraphError::from_hg)?;
+        Ok(py.None())
+    }
+
+    fn addmissings(
+        &mut self,
+        py: Python,
+        missings: &Bound<'_, PyAny>,
+    ) -> PyResult<PyObject> {
+        let missings = self.pyiter_to_vec(missings)?;
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let mut inner = unsafe { self.inner.try_borrow_mut(py)? };
+        inner
+            .add_missing_revisions(missings)
+            .map_err(GraphError::from_hg)?;
+        Ok(py.None())
+    }
+
+    fn addinfo(
+        &mut self,
+        py: Python,
+        sample: &Bound<'_, PyAny>,
+    ) -> PyResult<PyObject> {
+        let mut missing: Vec<Revision> = vec![];
+        let mut common: Vec<Revision> = vec![];
+        for info in sample.try_iter()? {
+            // info is a pair (Revision, bool)
+            let info = info?;
+            let info = info.downcast::<PyTuple>()?;
+            let rev: PyRevision = info.get_item(0)?.extract()?;
+            // This is fine since we're just using revisions as integers
+            // for the purposes of discovery
+            let rev = Revision(rev.0);
+            let known: bool = info.get_item(1)?.extract()?;
+            if known {
+                common.push(rev);
+            } else {
+                missing.push(rev);
+            }
+        }
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let mut inner = unsafe { self.inner.try_borrow_mut(py)? };
+        inner
+            .add_common_revisions(common)
+            .map_err(GraphError::from_hg)?;
+        inner
+            .add_missing_revisions(missing)
+            .map_err(GraphError::from_hg)?;
+        Ok(py.None())
+    }
+
+    fn hasinfo(&self, py: Python<'_>) -> PyResult<bool> {
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let inner = unsafe { self.inner.try_borrow(py)? };
+        Ok(inner.has_info())
+    }
+
+    fn iscomplete(&self, py: Python<'_>) -> PyResult<bool> {
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let inner = unsafe { self.inner.try_borrow(py)? };
+        Ok(inner.is_complete())
+    }
+
+    fn stats(&self, py: Python<'_>) -> PyResult<Py<PyDict>> {
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let inner = unsafe { self.inner.try_borrow(py)? };
+        let stats = inner.stats();
+        let as_dict = PyDict::new(py);
+        as_dict.set_item("undecided", stats.undecided)?;
+        Ok(as_dict.unbind())
+    }
+
+    fn commonheads(&self, py: Python<'_>) -> PyResult<HashSet<PyRevision>> {
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let inner = unsafe { self.inner.try_borrow(py)? };
+        let common_heads =
+            inner.common_heads().map_err(GraphError::from_hg)?;
+        Ok(common_heads.into_iter().map(Into::into).collect())
+    }
+
+    fn takefullsample(
+        &mut self,
+        py: Python,
+        _headrevs: &Bound<'_, PyAny>,
+        size: usize,
+    ) -> PyResult<Py<PyTuple>> {
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let mut inner = unsafe { self.inner.try_borrow_mut(py)? };
+        let sample =
+            inner.take_full_sample(size).map_err(GraphError::from_hg)?;
+        let as_pyrevision = sample.into_iter().map(|rev| PyRevision(rev.0));
+        Ok(PyTuple::new(py, as_pyrevision)?.unbind())
+    }
+
+    fn takequicksample(
+        &mut self,
+        py: Python,
+        headrevs: &Bound<'_, PyAny>,
+        size: usize,
+    ) -> PyResult<Py<PyTuple>> {
+        let revs = self.pyiter_to_vec(headrevs)?;
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let mut inner = unsafe { self.inner.try_borrow_mut(py)? };
+        let sample = inner
+            .take_quick_sample(revs, size)
+            .map_err(GraphError::from_hg)?;
+        let as_pyrevision = sample.into_iter().map(|rev| PyRevision(rev.0));
+        Ok(PyTuple::new(py, as_pyrevision)?.unbind())
+    }
+}
+
+impl PartialDiscovery {
+    /// Convert a Python iterator of revisions into a vector
+    fn pyiter_to_vec(
+        &self,
+        iter: &Bound<'_, PyAny>,
+    ) -> PyResult<Vec<Revision>> {
+        // Safety: we don't leak any reference derived from the "faked" one in
+        // `SharedByPyObject`
+        let index = unsafe { self.idx.try_borrow(iter.py())? };
+        rev_pyiter_collect(iter, &*index)
+    }
+}
+
+pub fn init_module<'py>(
+    py: Python<'py>,
+    package: &str,
+) -> PyResult<Bound<'py, PyModule>> {
+    let m = new_submodule(py, package, "discovery")?;
+    m.add_class::<PartialDiscovery>()?;
+    Ok(m)
+}
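`addinfo` partitions a sample of `(revision, known)` pairs before feeding the discovery object. The partitioning alone, as a runnable sketch with plain integers standing in for revisions:

    // Standalone restatement of the partitioning done in `addinfo`.
    fn partition_sample(sample: &[(i32, bool)]) -> (Vec<i32>, Vec<i32>) {
        let mut common = Vec::new();
        let mut missing = Vec::new();
        for &(rev, known) in sample {
            if known {
                common.push(rev);
            } else {
                missing.push(rev);
            }
        }
        (common, missing)
    }

    fn main() {
        let (common, missing) = partition_sample(&[(0, true), (3, false)]);
        assert_eq!((common, missing), (vec![0], vec![3]));
    }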
--- a/rust/hg-pyo3/src/exceptions.rs	Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/hg-pyo3/src/exceptions.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -1,10 +1,18 @@
-use pyo3::exceptions::{PyRuntimeError, PyValueError};
+use pyo3::exceptions::{PyOSError, PyRuntimeError, PyValueError};
 use pyo3::import_exception;
 use pyo3::{create_exception, PyErr};
+use std::fmt::Display;
+
+use hg::dirstate::{on_disk::DirstateV2ParseError, DirstateError};
+
+use hg::revlog::nodemap::NodeMapError;
+use hg::UncheckedRevision;
+
 use crate::revision::PyRevision;
 
 create_exception!(pyo3_rustext, GraphError, PyValueError);
+create_exception!(pyo3_rustext, FallbackError, PyRuntimeError);
 import_exception!(mercurial.error, WdirUnsupported);
 
 impl GraphError {
@@ -13,6 +21,9 @@
             hg::GraphError::ParentOutOfRange(r) => {
                 GraphError::new_err(("ParentOutOfRange", PyRevision(r.0)))
             }
+            hg::GraphError::ParentOutOfOrder(r) => {
+                GraphError::new_err(("ParentOutOfOrder", PyRevision(r.0)))
+            }
         }
     }
     pub fn from_vcsgraph(inner: vcsgraph::graph::GraphReadError) -> PyErr {
@@ -36,3 +47,57 @@
 pub fn map_lock_error<T>(e: std::sync::PoisonError<T>) -> PyErr {
     PyRuntimeError::new_err(format!("In Rust PyO3 bindings: {e}"))
 }
+
+pub fn map_try_lock_error<T>(e: std::sync::TryLockError<T>) -> PyErr {
+    PyRuntimeError::new_err(format!("In Rust PyO3 bindings: {e}"))
+}
+
+pub fn to_string_value_error<T: Display>(e: T) -> PyErr {
+    PyValueError::new_err(e.to_string())
+}
+
+pub mod mercurial_py_errors {
+    pyo3::import_exception!(mercurial.error, RevlogError);
+}
+
+pub fn revlog_error_from_msg(e: impl ToString) -> PyErr {
+    mercurial_py_errors::RevlogError::new_err(e.to_string().into_bytes())
+}
+
+pub fn revlog_error_bare() -> PyErr {
+    mercurial_py_errors::RevlogError::new_err((None::<String>,))
+}
+
+pub fn rev_not_in_index(rev: UncheckedRevision) -> PyErr {
+    PyValueError::new_err(format!("revlog index out of range: {}", rev))
+}
+
+pub fn nodemap_error(err: NodeMapError) -> PyErr {
+    match err {
+        NodeMapError::MultipleResults => {
+            mercurial_py_errors::RevlogError::new_err("")
+        }
+
+        NodeMapError::RevisionNotInIndex(rev) => {
+            PyValueError::new_err(format!(
+                "Inconsistency: Revision {} found in nodemap \
+                 is not in revlog index",
+                rev
+            ))
+        }
+    }
+}
+
+pub fn graph_error(_err: hg::GraphError) -> PyErr {
+    // Whatever the `hg::GraphError` variant, the C index always raises
+    // this simple ValueError, so mirror that here.
+    PyValueError::new_err("parent out of range")
+}
+
+pub fn dirstate_error(err: DirstateError) -> PyErr {
+    PyOSError::new_err(format!("Dirstate error: {:?}", err))
+}
+
+pub fn dirstate_v2_error(_err: DirstateV2ParseError) -> PyErr {
+    PyValueError::new_err("corrupted dirstate-v2")
+}
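`to_string_value_error` is the workhorse among the new helpers: any `Display`able error becomes a Python `ValueError` at the call site via `map_err`. A usage sketch (the helper is inlined here so the example is self-contained):

    // Sketch: funneling an arbitrary Display error into a ValueError.
    use pyo3::exceptions::PyValueError;
    use pyo3::prelude::*;

    fn to_string_value_error<T: std::fmt::Display>(e: T) -> PyErr {
        PyValueError::new_err(e.to_string())
    }

    fn parse_rev_number(s: &str) -> PyResult<i64> {
        // On failure, the ParseIntError's Display text becomes the
        // ValueError message seen from Python.
        s.parse::<i64>().map_err(to_string_value_error)
    }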
--- a/rust/hg-pyo3/src/lib.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-pyo3/src/lib.rs Fri Feb 28 23:28:10 2025 +0100 @@ -1,25 +1,38 @@ use pyo3::prelude::*; mod ancestors; -mod convert_cpython; +mod copy_tracing; mod dagops; +mod dirstate; +mod discovery; mod exceptions; +mod node; +mod path; +mod repo; mod revision; -mod util; +mod revlog; +mod store; +mod transaction; +mod update; +mod utils; #[pymodule] fn pyo3_rustext(py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> { + m.add("__package__", "mercurial")?; m.add( "__doc__", "Mercurial core concepts - Rust implementation exposed via PyO3", )?; - // the module's __name__ is pyo3_rustext, not mercurial.pyo3_rustext - // (at least at this point). - let name: String = m.getattr("__name__")?.extract()?; - let dotted_name = format!("mercurial.{}", name); + let dotted_name: String = m.getattr("__name__")?.extract()?; + env_logger::init(); m.add_submodule(&ancestors::init_module(py, &dotted_name)?)?; + m.add_submodule(©_tracing::init_module(py, &dotted_name)?)?; m.add_submodule(&dagops::init_module(py, &dotted_name)?)?; + m.add_submodule(&dirstate::init_module(py, &dotted_name)?)?; + m.add_submodule(&discovery::init_module(py, &dotted_name)?)?; + m.add_submodule(&revlog::init_module(py, &dotted_name)?)?; + m.add_submodule(&update::init_module(py, &dotted_name)?)?; m.add("GraphError", py.get_type::<exceptions::GraphError>())?; Ok(()) }
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/rust/hg-pyo3/src/node.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,72 @@
+use pyo3::exceptions::PyValueError;
+use pyo3::prelude::*;
+use pyo3::types::PyBytes;
+
+use std::convert::Infallible;
+
+use hg::revlog::RevlogIndex;
+use hg::{
+    revlog::index::Index, revlog::node::NODE_BYTES_LENGTH, Node, NodePrefix,
+    Revision,
+};
+
+#[derive(Debug, Copy, Clone, PartialEq, derive_more::From)]
+pub struct PyNode(pub Node);
+
+impl<'py> IntoPyObject<'py> for PyNode {
+    type Target = PyBytes;
+    type Output = Bound<'py, Self::Target>;
+    type Error = Infallible;
+
+    fn into_pyobject(
+        self,
+        py: Python<'py>,
+    ) -> Result<Self::Output, Self::Error> {
+        Ok(PyBytes::new(py, self.0.as_bytes()))
+    }
+}
+
+/// Copy incoming Python binary Node ID into [`Node`]
+///
+/// # Python exceptions
+/// Raises `ValueError` if length is not as expected
+pub fn node_from_py_bytes(bytes: &Bound<'_, PyBytes>) -> PyResult<Node> {
+    Node::try_from(bytes.as_bytes()).map_err(|_| {
+        PyValueError::new_err(format!(
+            "{}-byte hash required",
+            NODE_BYTES_LENGTH
+        ))
+    })
+}
+
+/// Convert a Python hexadecimal node ID or prefix given as `bytes` into a
+/// [`NodePrefix`].
+///
+/// # Python exceptions
+/// Raises `ValueError` if the incoming `bytes` is invalid.
+pub fn node_prefix_from_py_bytes(
+    bytes: &Bound<'_, PyBytes>,
+) -> PyResult<NodePrefix> {
+    let as_bytes = bytes.as_bytes();
+    NodePrefix::from_hex(as_bytes).map_err(|_| {
+        PyValueError::new_err(format!(
+            "Invalid node or prefix '{}'",
+            String::from_utf8_lossy(as_bytes)
+        ))
+    })
+}
+
+/// Return the binary node from a checked revision
+///
+/// This is meant to be used on revisions already checked to exist,
+/// typically obtained from a NodeTree lookup.
+///
+/// # Panics
+/// Panics if the revision does not exist
+pub fn py_node_for_rev<'py>(
+    py: Python<'py>,
+    idx: &Index,
+    rev: Revision,
+) -> Bound<'py, PyBytes> {
+    PyBytes::new(py, idx.node(rev).expect("node should exist").as_bytes())
+}
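`node_from_py_bytes` is essentially a length check: Mercurial nodes are 20-byte SHA-1 hashes, which is what `NODE_BYTES_LENGTH` stands for. The same check as a standalone sketch, with the conversion reduced to a fixed-size array:

    // Sketch of the length check behind `node_from_py_bytes`; the constant
    // value 20 assumes Mercurial's SHA-1 node size.
    use pyo3::exceptions::PyValueError;
    use pyo3::prelude::*;
    use pyo3::types::{PyBytes, PyBytesMethods};

    const NODE_BYTES_LENGTH: usize = 20;

    fn node_bytes(
        bytes: &Bound<'_, PyBytes>,
    ) -> PyResult<[u8; NODE_BYTES_LENGTH]> {
        // `try_into` fails exactly when the length is wrong, which maps
        // to the same ValueError as the real binding.
        bytes.as_bytes().try_into().map_err(|_| {
            PyValueError::new_err(format!(
                "{}-byte hash required",
                NODE_BYTES_LENGTH
            ))
        })
    }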
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/path.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,106 @@ +// path.rs +// +// Copyright 2019 Raphaël Gomès <rgomes@octobus.net> +// 2025 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +//! Utilities about `HgPath` and related objects provided by the `hg-core` +//! package. + +use pyo3::prelude::*; +use pyo3::types::{PyBytes, PyList}; + +use std::convert::Infallible; + +use hg::dirstate::on_disk::DirstateV2ParseError; +use hg::utils::hg_path::{HgPath, HgPathBuf}; + +use crate::exceptions::dirstate_v2_error; + +#[derive(Eq, Ord, PartialEq, PartialOrd, Hash, derive_more::From)] +pub struct PyHgPathRef<'a>(pub &'a HgPath); + +impl<'py> IntoPyObject<'py> for PyHgPathRef<'_> { + type Target = PyBytes; + type Output = Bound<'py, Self::Target>; + type Error = Infallible; + + fn into_pyobject( + self, + py: Python<'py>, + ) -> Result<Self::Output, Self::Error> { + Ok(PyBytes::new(py, self.0.as_bytes())) + } +} + +#[derive(Eq, Ord, PartialEq, PartialOrd, Hash, derive_more::From)] +pub struct PyHgPathBuf(pub HgPathBuf); + +// This is for now equivalent to taking a ref as `HgPath` and using +// `HgPathRef`. One day, perhaps, this variant for owned data could be +// implemented without allocation. +impl<'py> IntoPyObject<'py> for PyHgPathBuf { + type Target = PyBytes; + type Output = Bound<'py, Self::Target>; + type Error = Infallible; + + fn into_pyobject( + self, + py: Python<'py>, + ) -> Result<Self::Output, Self::Error> { + Ok(PyBytes::new(py, self.0.as_bytes())) + } +} + +pub struct PyHgPathDirstateV2Result<'a>( + pub Result<&'a HgPath, DirstateV2ParseError>, +); + +impl<'py> IntoPyObject<'py> for PyHgPathDirstateV2Result<'_> { + type Target = PyBytes; + type Output = Bound<'py, Self::Target>; + type Error = PyErr; + + fn into_pyobject( + self, + py: Python<'py>, + ) -> Result<Self::Output, Self::Error> { + Ok(PyBytes::new( + py, + self.0.map_err(dirstate_v2_error)?.as_bytes(), + )) + } +} + +pub fn paths_py_list<I, U>( + py: Python<'_>, + paths: impl IntoIterator<Item = I, IntoIter = U>, +) -> PyResult<Py<PyList>> +where + I: AsRef<HgPath>, + U: ExactSizeIterator<Item = I>, +{ + Ok(PyList::new( + py, + paths + .into_iter() + .map(|p| PyBytes::new(py, p.as_ref().as_bytes())), + )? + .unbind()) +} + +pub fn paths_pyiter_collect<C>(paths: &Bound<'_, PyAny>) -> PyResult<C> +where + C: FromIterator<HgPathBuf>, +{ + paths + .try_iter()? + .map(|p| { + let path = p?; + Ok(HgPathBuf::from_bytes( + path.downcast::<PyBytes>()?.as_bytes(), + )) + }) + .collect() +}
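The wrapper types above all follow one newtype pattern: implement pyo3's `IntoPyObject` so the wrapped path converts to Python `bytes` at the boundary. The pattern generalizes to any borrowed byte slice (illustrative `ByteRef` name, pyo3 0.23-style trait):

    // Newtype sketch: expose any borrowed byte slice to Python as bytes.
    use pyo3::prelude::*;
    use pyo3::types::PyBytes;
    use std::convert::Infallible;

    struct ByteRef<'a>(&'a [u8]);

    impl<'py> IntoPyObject<'py> for ByteRef<'_> {
        type Target = PyBytes;
        type Output = Bound<'py, Self::Target>;
        type Error = Infallible;

        fn into_pyobject(
            self,
            py: Python<'py>,
        ) -> Result<Self::Output, Self::Error> {
            // The allocation of the Python bytes object is the only copy.
            Ok(PyBytes::new(py, self.0))
        }
    }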
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/rust/hg-pyo3/src/repo.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -0,0 +1,21 @@
+use hg::{config::Config, repo::Repo, utils::files::get_path_from_bytes};
+use pyo3::{
+    types::{PyBytes, PyBytesMethods},
+    Bound, PyResult,
+};
+
+use crate::utils::HgPyErrExt;
+
+/// Get a repository from a given path passed as Python `bytes`, and bubble
+/// up any error that comes up.
+pub fn repo_from_path(repo_path: &Bound<'_, PyBytes>) -> PyResult<Repo> {
+    // TODO make the Config a Python class and downcast it here, otherwise we
+    // lose CLI args and runtime overrides done in Python.
+    let config = Config::load_non_repo().into_pyerr(repo_path.py())?;
+    let repo = Repo::find(
+        &config,
+        Some(get_path_from_bytes(repo_path.as_bytes()).to_owned()),
+    )
+    .into_pyerr(repo_path.py())?;
+    Ok(repo)
+}
--- a/rust/hg-pyo3/src/revision.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/hg-pyo3/src/revision.rs Fri Feb 28 23:28:10 2025 +0100 @@ -1,10 +1,11 @@ use pyo3::prelude::*; +use pyo3::types::{PyList, PySet}; use hg::revlog::RevlogIndex; use hg::{BaseRevision, Revision, UncheckedRevision}; -use crate::convert_cpython::proxy_index_extract; -use crate::exceptions::GraphError; +use crate::exceptions::{rev_not_in_index, GraphError}; +use crate::utils::proxy_index_extract; /// Revision as exposed to/from the Python layer. /// @@ -35,6 +36,22 @@ } } +impl From<PyRevision> for UncheckedRevision { + fn from(val: PyRevision) -> Self { + val.0.into() + } +} + +pub fn check_revision( + index: &impl RevlogIndex, + rev: impl Into<UncheckedRevision>, +) -> PyResult<Revision> { + let rev = rev.into(); + index + .check_revision(rev) + .ok_or_else(|| rev_not_in_index(rev)) +} + /// Utility function to convert a Python iterable into various collections /// /// We need this in particular @@ -89,3 +106,23 @@ }) .collect() } + +pub fn revs_py_list<U>( + py: Python<'_>, + revs: impl IntoIterator<Item = Revision, IntoIter = U>, +) -> PyResult<Py<PyList>> +where + U: ExactSizeIterator<Item = Revision>, +{ + Ok(PyList::new(py, revs.into_iter().map(PyRevision::from))?.unbind()) +} + +pub fn revs_py_set<U>( + py: Python<'_>, + revs: impl IntoIterator<Item = Revision, IntoIter = U>, +) -> PyResult<Py<PySet>> +where + U: ExactSizeIterator<Item = Revision>, +{ + Ok(PySet::new(py, revs.into_iter().map(PyRevision::from))?.unbind()) +}
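The `ExactSizeIterator` bounds on `revs_py_list` and `revs_py_set` exist because `PyList::new` and `PySet::new` preallocate from the iterator's known length. The same shape with plain integers, as a sketch:

    // Sketch of the `ExactSizeIterator` bound required by `PyList::new`.
    // With a plain iterator one has to collect first (as the comment in
    // `debug_iter` notes for the dirstate map).
    use pyo3::prelude::*;
    use pyo3::types::PyList;

    fn ints_py_list<'py, U>(
        py: Python<'py>,
        ints: impl IntoIterator<Item = u32, IntoIter = U>,
    ) -> PyResult<Bound<'py, PyList>>
    where
        U: ExactSizeIterator<Item = u32>,
    {
        PyList::new(py, ints)
    }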
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/revlog/config.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,236 @@ +// revlog/config.rs +// +// Copyright 2020-2024 Raphaël Gomès <raphael.gomes@octobus.net> +// 2024 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +use pyo3::conversion::FromPyObject; +use pyo3::exceptions::PyValueError; +use pyo3::intern; + +use pyo3::prelude::*; +use pyo3::types::{PyBytes, PyDict, PyDictMethods}; + +use std::sync::OnceLock; + +use hg::revlog::{ + compression::CompressionConfig, + options::{RevlogDataConfig, RevlogDeltaConfig, RevlogFeatureConfig}, + RevlogType, +}; + +/// Helper trait for configuration dicts +/// +/// In Mercurial, it is customary for such dicts to have bytes keys. +trait ConfigPyDict<'a, 'py: 'a, D: FromPyObject<'py>> { + fn extract_item(&'a self, key: &[u8]) -> PyResult<Option<D>>; +} + +impl<'a, 'py, D> ConfigPyDict<'a, 'py, D> for Bound<'py, PyDict> +where + 'py: 'a, + D: FromPyObject<'py>, +{ + fn extract_item(&'a self, key: &[u8]) -> PyResult<Option<D>> { + let py_item = self.get_item(PyBytes::new(self.py(), key))?; + match py_item { + Some(value) => { + if value.is_none() { + Ok(None) + } else { + Ok(Some(value.extract()?)) + } + } + None => Ok(None), + } + } +} + +/// Extraction helper for PyObject attributes. +/// +/// `$obj` is a `Bound('_, PyAny)` and `$attr` is a static String slice. +/// This is both syntactic sugar and more efficient than using `getattr()` +/// manually, as this uses [`intern!`] for efficiency. +/// +/// See the many examples in this module. +/// +/// This does not work to return references (e.g. bytes). Quoting the +/// compiler: "returns a value referencing data owned by the current function" +macro_rules! extract_attr { + ($obj: expr, $attr: expr) => { + $obj.getattr(intern!($obj.py(), $attr)) + .and_then(|a| a.extract()) + }; +} + +// There are no static generics in Rust (because their implementation is +// hard, I'm guessing it's due to different compilation stages, etc.). +// So manually generate all three caches and use them in +// `with_filelog_cache`. +static DELTA_CONFIG_CACHE: OnceLock<(PyObject, RevlogDeltaConfig)> = + OnceLock::new(); +static DATA_CONFIG_CACHE: OnceLock<(PyObject, RevlogDataConfig)> = + OnceLock::new(); +static FEATURE_CONFIG_CACHE: OnceLock<(PyObject, RevlogFeatureConfig)> = + OnceLock::new(); + +/// TODO don't do this and build a `Config` in Rust, expose it to Python and +/// downcast it (after refactoring Python to re-use the same config objects?). +/// +/// Cache the first conversion from Python of filelog config. Other +/// revlog types are not cached. +/// +/// All filelogs in a given repository *most likely* have the +/// exact same config, hence it makes a difference to look it up +/// from Python code only once, especially given that it can be in a +/// loop. +fn with_filelog_config_cache<T: Copy>( + py_config: &Bound<'_, PyAny>, + revlog_type: RevlogType, + cache: &OnceLock<(PyObject, T)>, + callback: impl Fn() -> PyResult<T>, +) -> PyResult<T> { + let mut was_cached = false; + if revlog_type == RevlogType::Filelog { + if let Some((cached_py_config, rust_config)) = cache.get() { + was_cached = true; + // it's not impossible that some extensions + // do some magic with configs or that this code will be used + // for longer-running processes. 
So compare the source + // `PyObject` in case the source changed, at + // the cost of some overhead. We can't use + // `py_config.eq(cached_py_config)` because all config + // objects are different in Python and `a is b` is false. + if py_config.compare(cached_py_config)?.is_eq() { + return Ok(*rust_config); + } + } + } + let config = callback()?; + // Do not call the lock unnecessarily if it's already been set. + if !was_cached && revlog_type == RevlogType::Filelog { + cache.set((py_config.clone().unbind(), config)).ok(); + } + Ok(config) +} + +pub fn extract_delta_config( + conf: &Bound<'_, PyAny>, + revlog_type: RevlogType, +) -> PyResult<RevlogDeltaConfig> { + with_filelog_config_cache(conf, revlog_type, &DELTA_CONFIG_CACHE, || { + let max_deltachain_span: i64 = + extract_attr!(conf, "max_deltachain_span")?; + let revlog_delta_config = RevlogDeltaConfig { + general_delta: extract_attr!(conf, "general_delta")?, + sparse_revlog: extract_attr!(conf, "sparse_revlog")?, + max_chain_len: extract_attr!(conf, "max_chain_len")?, + max_deltachain_span: if max_deltachain_span < 0 { + None + } else { + Some(max_deltachain_span as u64) + }, + upper_bound_comp: extract_attr!(conf, "upper_bound_comp")?, + delta_both_parents: extract_attr!(conf, "delta_both_parents")?, + candidate_group_chunk_size: extract_attr!( + conf, + "candidate_group_chunk_size" + )?, + debug_delta: extract_attr!(conf, "debug_delta")?, + lazy_delta: extract_attr!(conf, "lazy_delta")?, + lazy_delta_base: extract_attr!(conf, "lazy_delta_base")?, + }; + Ok(revlog_delta_config) + }) +} + +pub fn extract_data_config( + conf: &Bound<'_, PyAny>, + revlog_type: RevlogType, +) -> PyResult<RevlogDataConfig> { + with_filelog_config_cache(conf, revlog_type, &DATA_CONFIG_CACHE, || { + Ok(RevlogDataConfig { + try_pending: extract_attr!(conf, "try_pending")?, + try_split: extract_attr!(conf, "try_split")?, + check_ambig: extract_attr!(conf, "check_ambig")?, + mmap_large_index: extract_attr!(conf, "mmap_large_index")?, + mmap_index_threshold: extract_attr!(conf, "mmap_index_threshold")?, + chunk_cache_size: extract_attr!(conf, "chunk_cache_size")?, + uncompressed_cache_factor: extract_attr!( + conf, + "uncompressed_cache_factor" + )?, + uncompressed_cache_count: extract_attr!( + conf, + "uncompressed_cache_count" + )?, + with_sparse_read: extract_attr!(conf, "with_sparse_read")?, + sr_density_threshold: extract_attr!(conf, "sr_density_threshold")?, + sr_min_gap_size: extract_attr!(conf, "sr_min_gap_size")?, + general_delta: extract_attr!(conf, "generaldelta")?, + }) + }) +} + +fn extract_compression_config( + conf: &Bound<'_, PyAny>, +) -> PyResult<CompressionConfig> { + let compression_options: Bound<'_, PyDict> = + extract_attr!(conf, "compression_engine_options")?; + + let name_bound = conf.getattr("compression_engine")?; + let name_bytes: &[u8] = name_bound.extract()?; + + let compression_engine = match name_bytes { + b"zlib" => { + let level = compression_options.extract_item(b"zlib.level")?; + let mut engine = CompressionConfig::default(); + if let Some(level) = level { + engine + .set_level(level) + .expect("invalid compression level from Python"); + } + engine + } + b"zstd" => { + let zstd_level = + compression_options.extract_item(b"zstd.level")?; + let level = if let Some(level) = zstd_level { + Some(level) + } else { + compression_options.extract_item(b"level")? 
+ }; + + CompressionConfig::zstd(level) + .expect("invalid compression level from Python") + } + b"none" => CompressionConfig::None, + unknown => { + return Err(PyValueError::new_err(format!( + "invalid compression engine {}", + String::from_utf8_lossy(unknown) + ))); + } + }; + Ok(compression_engine) +} + +pub fn extract_feature_config( + conf: &Bound<'_, PyAny>, + revlog_type: RevlogType, +) -> PyResult<RevlogFeatureConfig> { + with_filelog_config_cache(conf, revlog_type, &FEATURE_CONFIG_CACHE, || { + Ok(RevlogFeatureConfig { + compression_engine: extract_compression_config(conf)?, + censorable: extract_attr!(conf, "censorable")?, + has_side_data: extract_attr!(conf, "has_side_data")?, + compute_rank: extract_attr!(conf, "compute_rank")?, + canonical_parent_order: extract_attr!( + conf, + "canonical_parent_order" + )?, + enable_ellipsis: extract_attr!(conf, "enable_ellipsis")?, + }) + }) +}
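Note on the caching pattern above: `with_filelog_config_cache` keys each `OnceLock` on the originating Python object and only reuses the converted value while that source still compares equal. Below is a minimal, std-only sketch of the same idea, with plain strings standing in for the Python config objects; `Config`, `CONFIG_CACHE` and `with_config_cache` are hypothetical names, not part of the module:

```rust
use std::sync::OnceLock;

// Stand-in for the converted Rust config; `Copy` mirrors the `T: Copy`
// bound on `with_filelog_config_cache`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Config {
    level: i32,
}

// One static cache per config kind, as in the module above.
static CONFIG_CACHE: OnceLock<(String, Config)> = OnceLock::new();

/// Convert `source` into a `Config`, caching the first conversion. Later
/// calls reuse the cached value only while the source still compares
/// equal, mirroring the `py_config.compare(cached_py_config)` check.
fn with_config_cache(source: &str, convert: impl Fn() -> Config) -> Config {
    if let Some((cached_source, cached)) = CONFIG_CACHE.get() {
        if cached_source == source {
            return *cached;
        }
    }
    let config = convert();
    // `set` fails if the cell is already filled; ignored with `.ok()`,
    // exactly as the module does.
    CONFIG_CACHE.set((source.to_owned(), config)).ok();
    config
}

fn main() {
    let first = with_config_cache("level=3", || Config { level: 3 });
    // Equal source: served from the cache, the closure never runs.
    let second = with_config_cache("level=3", || unreachable!());
    assert_eq!(first, second);
}
```

As in the module, losing a race on `set` is harmless: the other thread cached an identical conversion.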
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/revlog/index.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,137 @@ +// revlog/index.rs +// +// Copyright 2019-2020 Georges Racinet <georges.racinet@octobus.net> +// 2020-2024 Raphaël Gomès <raphael.gomes@octobus.net> +// 2024 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +//! Utilities for dealing with the index at the Python boundary +use hg::{BaseRevision, Graph}; +use pyo3::prelude::*; +use pyo3::types::{PyBytes, PyTuple}; +use vcsgraph::graph::Graph as VCSGraph; + +use hg::revlog::{ + index::{Index, RevisionDataParams}, + Node, Revision, RevlogIndex, +}; + +#[derive(derive_more::From, Clone)] +pub struct PySharedIndex { + /// The underlying hg-core index + inner: &'static Index, +} + +impl PySharedIndex { + /// Return a reference to the inner index, bound by `self` + pub fn inner(&self) -> &Index { + self.inner + } + + /// Return an unsafe "faked" `'static` reference to the inner index, for + /// the purposes of Python <-> Rust memory sharing. + pub unsafe fn static_inner(&self) -> &'static Index { + self.inner + } +} + +impl RevlogIndex for PySharedIndex { + fn len(&self) -> usize { + self.inner.len() + } + fn node(&self, rev: Revision) -> Option<&Node> { + self.inner.node(rev) + } +} + +impl Graph for PySharedIndex { + #[inline(always)] + fn parents(&self, rev: Revision) -> Result<[Revision; 2], hg::GraphError> { + self.inner.parents(rev) + } +} + +impl VCSGraph for PySharedIndex { + #[inline(always)] + fn parents( + &self, + rev: BaseRevision, + ) -> Result<vcsgraph::graph::Parents, vcsgraph::graph::GraphReadError> + { + // FIXME This trait should be reworked to decide between Revision + // and UncheckedRevision, get better errors names, etc. + match Graph::parents(self, Revision(rev)) { + Ok(parents) => { + Ok(vcsgraph::graph::Parents([parents[0].0, parents[1].0])) + } + Err(hg::GraphError::ParentOutOfRange(rev)) => { + Err(vcsgraph::graph::GraphReadError::KeyedInvalidKey(rev.0)) + } + Err(hg::GraphError::ParentOutOfOrder(rev)) => { + Err(vcsgraph::graph::GraphReadError::KeyedInvalidKey(rev.0)) + } + } + } +} + +pub fn py_tuple_to_revision_data_params( + tuple: &Bound<'_, PyTuple>, +) -> PyResult<RevisionDataParams> { + // no need to check length: in PyO3 tup.get_item() does return + // proper errors + let offset_or_flags: u64 = tuple.get_item(0)?.extract()?; + let node_id = tuple + .get_item(7)? + .downcast::<PyBytes>()? + .as_bytes() + .try_into() + .expect("nodeid should be set"); + let flags = (offset_or_flags & 0xFFFF) as u16; + let data_offset = offset_or_flags >> 16; + Ok(RevisionDataParams { + flags, + data_offset, + data_compressed_length: tuple.get_item(1)?.extract()?, + data_uncompressed_length: tuple.get_item(2)?.extract()?, + data_delta_base: tuple.get_item(3)?.extract()?, + link_rev: tuple.get_item(4)?.extract()?, + parent_rev_1: tuple.get_item(5)?.extract()?, + parent_rev_2: tuple.get_item(6)?.extract()?, + node_id, + ..Default::default() + }) +} + +pub fn revision_data_params_to_py_tuple( + py: Python<'_>, + params: RevisionDataParams, +) -> PyResult<Bound<'_, PyTuple>> { + PyTuple::new( + py, + &[ + params.data_offset.into_pyobject(py)?.into_any(), + params.data_compressed_length.into_pyobject(py)?.into_any(), + params + .data_uncompressed_length + .into_pyobject(py)? 
+ .into_any(), + params.data_delta_base.into_pyobject(py)?.into_any(), + params.link_rev.into_pyobject(py)?.into_any(), + params.parent_rev_1.into_pyobject(py)?.into_any(), + params.parent_rev_2.into_pyobject(py)?.into_any(), + PyBytes::new(py, ¶ms.node_id).into_any().into_any(), + params._sidedata_offset.into_pyobject(py)?.into_any(), + params + ._sidedata_compressed_length + .into_pyobject(py)? + .into_any(), + params.data_compression_mode.into_pyobject(py)?.into_any(), + params + ._sidedata_compression_mode + .into_pyobject(py)? + .into_any(), + params._rank.into_pyobject(py)?.into_any(), + ], + ) +}
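For reference, the first tuple item consumed by `py_tuple_to_revision_data_params` packs two fields into one integer: the low 16 bits hold the revision flags and the remaining bits the data offset. A self-contained round-trip sketch of that layout (`pack`/`unpack` are illustrative names, not part of the module):

```rust
/// Split the packed field exactly as the extraction above does:
/// flags are the low 16 bits, the data offset is the rest.
fn unpack(offset_or_flags: u64) -> (u64, u16) {
    let flags = (offset_or_flags & 0xFFFF) as u16;
    let data_offset = offset_or_flags >> 16;
    (data_offset, flags)
}

/// Inverse operation: shift the offset up and OR the flags back in.
fn pack(data_offset: u64, flags: u16) -> u64 {
    (data_offset << 16) | u64::from(flags)
}

fn main() {
    let packed = pack(0x1234, 0x0001);
    assert_eq!(packed, 0x1234_0001);
    assert_eq!(unpack(packed), (0x1234, 0x0001));
}
```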
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/revlog/mod.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,1613 @@ +// revlog.rs +// +// Copyright 2019-2020 Georges Racinet <georges.racinet@octobus.net> +// 2020-2024 Raphaël Gomès <raphael.gomes@octobus.net> +// 2024 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +#![allow(non_snake_case)] +use hg::revlog::index::IndexHeader; +use hg::revlog::nodemap::Block; +use hg::utils::files::get_bytes_from_path; +use pyo3::buffer::PyBuffer; +use pyo3::conversion::IntoPyObject; +use pyo3::exceptions::{PyIndexError, PyTypeError, PyValueError}; +use pyo3::types::{ + PyBool, PyBytes, PyBytesMethods, PyDict, PyList, PySet, PyTuple, +}; +use pyo3::{prelude::*, IntoPyObjectExt}; +use pyo3_sharedref::{PyShareable, SharedByPyObject}; + +use std::collections::{HashMap, HashSet}; +use std::os::fd::AsRawFd; +use std::sync::{ + atomic::{AtomicUsize, Ordering}, + RwLock, RwLockReadGuard, RwLockWriteGuard, +}; + +use hg::{ + errors::HgError, + revlog::{ + index::{ + Index, Phase, RevisionDataParams, SnapshotsCache, INDEX_ENTRY_SIZE, + }, + inner_revlog::InnerRevlog as CoreInnerRevlog, + nodemap::{NodeMap, NodeMapError, NodeTree as CoreNodeTree}, + options::RevlogOpenOptions, + RevlogError, RevlogIndex, RevlogType, + }, + utils::files::get_path_from_bytes, + vfs::FnCacheVfs, + BaseRevision, Revision, UncheckedRevision, NULL_REVISION, +}; + +use crate::utils::PyBytesDeref; +use crate::{ + exceptions::{ + graph_error, map_lock_error, map_try_lock_error, nodemap_error, + rev_not_in_index, revlog_error_bare, revlog_error_from_msg, + }, + node::{node_from_py_bytes, node_prefix_from_py_bytes, py_node_for_rev}, + revision::{ + check_revision, rev_pyiter_collect, rev_pyiter_collect_or_else, + revs_py_list, revs_py_set, PyRevision, + }, + store::PyFnCache, + transaction::PyTransaction, + utils::{new_submodule, take_buffer_with_slice, with_pybytes_buffer}, +}; + +mod config; +use config::*; +mod index; +pub use index::PySharedIndex; +use index::{ + py_tuple_to_revision_data_params, revision_data_params_to_py_tuple, +}; + +#[pyclass] +struct ReadingContextManager { + inner_revlog: Py<InnerRevlog>, +} + +#[pymethods] +impl ReadingContextManager { + fn __enter__(slf: PyRef<'_, Self>) -> PyResult<()> { + let inner_bound = slf.inner_revlog.bind(slf.py()); + let shareable = &inner_bound.borrow().irl; + // Safety: the owner is correct and we won't use `share()` anyway + let core_irl = + unsafe { shareable.borrow_with_owner(inner_bound) }.read(); + core_irl + .enter_reading_context() + .map_err(revlog_error_from_msg) + .inspect_err(|_e| { + // `__exit__` is not called from Python if `__enter__` fails + core_irl.exit_reading_context(); + }) + } + + #[pyo3(signature = (*_args))] + fn __exit__(slf: PyRef<'_, Self>, _args: &Bound<'_, PyTuple>) { + let inner_bound = slf.inner_revlog.bind(slf.py()); + let shareable = &inner_bound.borrow().irl; + // Safety: the owner is correct and we won't use `share()` anyway + let core_irl_ref = unsafe { shareable.borrow_with_owner(inner_bound) }; + core_irl_ref.read().exit_reading_context(); + } +} + +#[pyclass] +struct WritingContextManager { + inner_revlog: Py<InnerRevlog>, + transaction: RwLock<PyTransaction>, + data_end: Option<usize>, +} + +#[pymethods] +impl WritingContextManager { + fn __enter__(slf: PyRefMut<'_, Self>) -> PyResult<()> { + let inner_bound = 
slf.inner_revlog.bind(slf.py()); + let shareable = &inner_bound.borrow_mut().irl; + // Safety: the owner is correct and we won't use `share()` anyway + let mut core_irl = + unsafe { shareable.borrow_with_owner(inner_bound) }.write(); + core_irl + .enter_writing_context( + slf.data_end, + &mut *slf + .transaction + .try_write() + .expect("transaction should be protected by the GIL"), + ) + .map_err(revlog_error_from_msg) + .inspect_err(|_e| { + // `__exit__` is not called from Python if `__enter__` fails + core_irl.exit_writing_context(); + }) + } + + #[pyo3(signature = (*_args))] + fn __exit__(slf: PyRef<'_, Self>, _args: &Bound<'_, PyTuple>) { + let inner_bound = slf.inner_revlog.bind(slf.py()); + let shareable = &inner_bound.borrow().irl; + // Safety: the owner is correct and we won't use `share()` anyway + let core_irl_ref = unsafe { shareable.borrow_with_owner(inner_bound) }; + core_irl_ref.write().exit_writing_context(); + } +} + +struct PySnapshotsCache<'a, 'py: 'a>(&'a Bound<'py, PyDict>); + +impl<'a, 'py> PySnapshotsCache<'a, 'py> { + fn insert_for_with_py_result( + &self, + rev: BaseRevision, + value: BaseRevision, + ) -> PyResult<()> { + match self.0.get_item(rev)? { + Some(obj) => obj.downcast::<PySet>()?.add(value), + None => { + let set = PySet::new(self.0.py(), vec![value])?; + self.0.set_item(rev, set) + } + } + } +} + +impl<'a, 'py> SnapshotsCache for PySnapshotsCache<'a, 'py> { + fn insert_for( + &mut self, + rev: BaseRevision, + value: BaseRevision, + ) -> Result<(), RevlogError> { + self.insert_for_with_py_result(rev, value).map_err(|_| { + RevlogError::Other(HgError::unsupported( + "Error in Python caches handling", + )) + }) + } +} + +// Only used from Python *tests* +#[doc(hidden)] +#[pyclass] +pub struct PyFileHandle { + inner_file: std::os::fd::RawFd, +} + +#[pymethods] +impl PyFileHandle { + #[new] + fn new(handle: std::os::fd::RawFd) -> Self { + Self { inner_file: handle } + } + + fn tell(&self, py: Python<'_>) -> PyResult<PyObject> { + let locals = PyDict::new(py); + locals.set_item("os", py.import("os")?)?; + locals.set_item("fd", self.inner_file)?; + let f = py.eval(c"os.fdopen(fd)", None, Some(&locals))?; + + // Prevent Python from closing the file after garbage collecting. + // This is fine since Rust is still holding on to the actual File. + // (and also because it's only used in tests). + std::mem::forget(f.clone()); + + locals.set_item("f", f)?; + let res = py.eval(c"f.tell()", None, Some(&locals))?; + Ok(res.unbind()) + } +} + +#[pyclass] +#[allow(dead_code)] +pub(crate) struct InnerRevlog { + pub(crate) irl: PyShareable<CoreInnerRevlog>, + nt: RwLock<Option<CoreNodeTree>>, + docket: Option<PyObject>, + // Holds a reference to the mmap'ed persistent nodemap data + nodemap_mmap: Option<PyBuffer<u8>>, + // Holds a reference to the mmap'ed persistent index data + index_mmap: Option<PyBuffer<u8>>, + revision_cache: Option<PyObject>, + head_revs_py_list: Option<Py<PyList>>, + head_node_ids_py_list: Option<Py<PyList>>, + use_persistent_nodemap: bool, + nodemap_queries: AtomicUsize, +} + +#[pymethods] +impl InnerRevlog { + #[new] + // The Python side has authority on this signature. 
+ #[allow(clippy::too_many_arguments)] + fn new( + vfs_base: &Bound<'_, PyBytes>, + fncache: &Bound<'_, PyAny>, + vfs_is_readonly: bool, + index_data: &Bound<'_, PyAny>, + index_file: &Bound<'_, PyBytes>, + data_file: &Bound<'_, PyBytes>, + sidedata_file: &Bound<'_, PyAny>, + inline: bool, + data_config: &Bound<'_, PyAny>, + delta_config: &Bound<'_, PyAny>, + feature_config: &Bound<'_, PyAny>, + chunk_cache: &Bound<'_, PyAny>, + default_compression_header: &Bound<'_, PyAny>, + revlog_type: usize, + use_persistent_nodemap: bool, + ) -> PyResult<Self> { + // Let clippy accept the unused arguments. This is a bit better than + // a blank `allow` directive + let _ = sidedata_file; + let _ = chunk_cache; + let _ = default_compression_header; + + let index_file = get_path_from_bytes(index_file.as_bytes()).to_owned(); + let data_file = get_path_from_bytes(data_file.as_bytes()).to_owned(); + let revlog_type = RevlogType::try_from(revlog_type) + .map_err(revlog_error_from_msg)?; + let data_config = extract_data_config(data_config, revlog_type)?; + let delta_config = extract_delta_config(delta_config, revlog_type)?; + let feature_config = + extract_feature_config(feature_config, revlog_type)?; + let options = RevlogOpenOptions::new( + inline, + data_config, + delta_config, + feature_config, + ); + + // Safety: we keep the buffer around inside the returned instance as + // `index_mmap` + let (buf, bytes) = unsafe { take_buffer_with_slice(index_data)? }; + let index = Index::new(bytes, options.index_header()) + .map_err(revlog_error_from_msg)?; + + let base = get_path_from_bytes(vfs_base.as_bytes()).to_owned(); + let core = CoreInnerRevlog::new( + Box::new(FnCacheVfs::new( + base, + vfs_is_readonly, + Box::new(PyFnCache::new(fncache.clone().unbind())), + )), + index, + index_file, + data_file, + data_config, + delta_config, + feature_config, + ); + Ok(Self { + irl: core.into(), + nt: None.into(), + docket: None, + nodemap_mmap: None, + index_mmap: buf.into(), + head_revs_py_list: None, + head_node_ids_py_list: None, + revision_cache: None, + use_persistent_nodemap, + nodemap_queries: AtomicUsize::new(0), + }) + } + + #[getter] + fn canonical_index_file( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Py<PyBytes>> { + Self::with_core_read(slf, |_self_ref, irl| { + let path = irl.canonical_index_file(); + Ok(PyBytes::new(py, &get_bytes_from_path(path)).into()) + }) + } + + #[getter] + fn is_delaying(slf: &Bound<'_, Self>) -> PyResult<bool> { + Self::with_core_read(slf, |_self_ref, irl| Ok(irl.is_delaying())) + } + + #[getter] + fn inline(slf: &Bound<'_, Self>) -> PyResult<bool> { + Self::with_core_read(slf, |_self_ref, irl| Ok(irl.is_inline())) + } + + #[setter] + fn set_inline(slf: &Bound<'_, Self>, inline: bool) -> PyResult<()> { + Self::with_core_write(slf, |_self_ref, mut irl| { + irl.inline = inline; + Ok(()) + }) + } + + #[getter] + fn is_writing(slf: &Bound<'_, Self>) -> PyResult<bool> { + Self::with_core_read(slf, |_self_ref, irl| Ok(irl.is_writing())) + } + + #[getter] + fn is_open(slf: &Bound<'_, Self>) -> PyResult<bool> { + Self::with_core_read(slf, |_self_ref, irl| Ok(irl.is_open())) + } + + #[getter] + fn _revisioncache(&self, py: Python<'_>) -> PyResult<PyObject> { + match &self.revision_cache { + None => Ok(py.None()), + Some(cache) => Ok(cache.clone_ref(py)), + } + } + + #[setter] + fn set__revisioncache( + slf: &Bound<'_, Self>, + py: Python<'_>, + value: Option<PyObject>, + ) -> PyResult<()> { + let mut self_ref = slf.borrow_mut(); + self_ref.revision_cache = 
value.as_ref().map(|v| v.clone_ref(py)); + + match value { + None => { + // This means the property has been deleted, *not* that the + // property has been set to `None`. Whatever happens is up + // to the implementation. Here we just set it to `None`. + self_ref.revision_cache.take(); + } + Some(tuple) => { + if tuple.is_none(py) { + self_ref.revision_cache.take(); + return Ok(()); + } + drop(self_ref); + let tuple: &Bound<'_, PyTuple> = tuple.downcast_bound(py)?; + let node = tuple.get_item(0)?; + let node = node_from_py_bytes(node.downcast()?)?; + let rev: BaseRevision = tuple.get_item(1)?.extract()?; + // Ok because Python only sets this if the revision has been + // checked + let rev = Revision(rev); + let data = tuple.get_item(2)?; + let bytes = data.downcast_into::<PyBytes>()?.unbind(); + Self::with_core_read(slf, |_self_ref, irl| { + let mut last_revision_cache = irl + .last_revision_cache + .lock() + .expect("lock should not be held"); + *last_revision_cache = Some(( + node, + rev, + Box::new(PyBytesDeref::new(py, bytes)), + )); + Ok(()) + })?; + } + } + Ok(()) + } + + #[getter] + fn index_file( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Py<PyBytes>> { + Self::with_core_read(slf, |_self_ref, irl| { + let path = get_bytes_from_path(&irl.index_file); + Ok(PyBytes::new(py, &path).unbind()) + }) + } + + #[setter] + fn set_index_file( + slf: &Bound<'_, Self>, + path: &Bound<'_, PyBytes>, + ) -> PyResult<()> { + Self::with_core_write(slf, |_self_ref, mut irl| { + let path = get_path_from_bytes(path.as_bytes()); + path.clone_into(&mut irl.index_file); + Ok(()) + }) + } + + // This is only used in Python *tests* + #[getter] + #[doc(hidden)] + fn _writinghandles( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<PyObject> { + Self::with_core_read(slf, |_self_ref, irl| { + let handles = irl.python_writing_handles(); + match handles.as_ref() { + None => Ok(py.None()), + Some(handles) => { + let index_handle = PyFileHandle::new( + handles.index_handle.file.as_raw_fd(), + ); + let data_handle = handles + .data_handle + .as_ref() + .map(|h| PyFileHandle::new(h.file.as_raw_fd())); + Ok(PyTuple::new( + py, + &[ + index_handle.into_py_any(py)?, + data_handle.into_py_any(py)?, + py.None(), // Sidedata handle + ], + )? + .unbind() + .into()) + } + } + }) + } + + fn clear_cache(slf: &Bound<'_, Self>) -> PyResult<PyObject> { + assert!(!Self::is_delaying(slf)?); + let mut self_ref = slf.borrow_mut(); + self_ref.revision_cache.take(); + self_ref.nodemap_queries.store(0, Ordering::Relaxed); + drop(self_ref); + + Self::with_core_write(slf, |_self_ref, mut irl| { + irl.clear_cache(); + Ok(slf.py().None()) + }) + } + + fn issnapshot(slf: &Bound<'_, Self>, rev: PyRevision) -> PyResult<bool> { + Self::_index_issnapshot(slf, rev) + } + + #[pyo3(signature = (rev, stoprev=None))] + fn _deltachain( + slf: &Bound<'_, Self>, + py: Python<'_>, + rev: PyRevision, + stoprev: Option<PyRevision>, + ) -> PyResult<Py<PyTuple>> { + Self::_index_deltachain(slf, py, rev, stoprev) + } + + fn compress( + slf: &Bound<'_, Self>, + py: Python<'_>, + data: &Bound<'_, PyAny>, + ) -> PyResult<Py<PyTuple>> { + Self::with_core_read(slf, |_self_ref, irl| { + // Safety: we only hold on to the data for as long as `_buf` + // is alive + let (_buf, data) = unsafe { take_buffer_with_slice(data)? 
}; + let compressed = + irl.compress(&data).map_err(revlog_error_from_msg)?; + let compressed = compressed.as_deref(); + let header = if compressed.is_some() { + PyBytes::new(py, &b""[..]) + } else { + PyBytes::new(py, &b"u"[..]) + }; + Ok(PyTuple::new( + py, + &[header, PyBytes::new(py, compressed.unwrap_or(&data))], + )? + .unbind()) + }) + } + + #[pyo3(signature = (tr, header, new_index_file_path=None))] + fn split_inline( + slf: &Bound<'_, Self>, + py: Python<'_>, + tr: PyObject, + header: i32, + new_index_file_path: Option<&Bound<'_, PyBytes>>, + ) -> PyResult<Py<PyBytes>> { + // Also unused in Python, TODO clean this up. + let _ = tr; + + Self::with_core_write(slf, |_self_ref, mut irl| { + let new_index_file_path = new_index_file_path + .map(|path| get_path_from_bytes(path.as_bytes()).to_owned()); + let header = IndexHeader::parse(&header.to_be_bytes()) + .expect("invalid header bytes"); + let old_path = irl + .split_inline(header, new_index_file_path) + .map_err(revlog_error_from_msg)?; + Ok(PyBytes::new(py, &get_bytes_from_path(old_path)).unbind()) + }) + } + + fn get_segment_for_revs( + slf: &Bound<'_, Self>, + py: Python<'_>, + startrev: PyRevision, + endrev: PyRevision, + ) -> PyResult<Py<PyTuple>> { + Self::with_core_read(slf, |_self_ref, irl| { + // Here both revisions only come from revlog code, so we assume + // them to be valid. + // Panics will alert the offending programmer if not. + let (offset, data) = irl + .get_segment_for_revs(Revision(startrev.0), Revision(endrev.0)) + .map_err(revlog_error_from_msg)?; + let data = PyBytes::new(py, &data); + Ok(PyTuple::new( + py, + &[offset.into_py_any(py)?, data.into_py_any(py)?], + )? + .unbind()) + }) + } + + fn raw_text( + slf: &Bound<'_, Self>, + py: Python<'_>, + _node: PyObject, + rev: PyRevision, + ) -> PyResult<Py<PyBytes>> { + Self::with_core_read(slf, |_self_ref, irl| { + let mut py_bytes = PyBytes::new(py, &[]).unbind(); + irl.raw_text(Revision(rev.0), |size, f| { + py_bytes = with_pybytes_buffer(py, size, f)?; + Ok(()) + }) + .map_err(revlog_error_from_msg)?; + Ok(py_bytes) + }) + } + + #[allow(clippy::too_many_arguments)] + #[pyo3(signature = ( + transaction, + entry, + data, + _link, + offset, + _sidedata, + _sidedata_offset, + index_end, + data_end, + _sidedata_end + ))] + fn write_entry( + slf: &Bound<'_, Self>, + py: Python<'_>, + transaction: PyObject, + entry: &Bound<'_, PyBytes>, + data: &Bound<'_, PyTuple>, + // TODO remove and also from Python + _link: PyObject, + offset: usize, + // Other underscore args are for revlog-v2, which is unimplemented + _sidedata: PyObject, + _sidedata_offset: u64, + index_end: Option<u64>, + data_end: Option<u64>, + _sidedata_end: Option<u64>, + ) -> PyResult<Py<PyTuple>> { + Self::with_core_write(slf, |_self_ref, mut irl| { + let transaction = PyTransaction::new(transaction); + let header = data.get_borrowed_item(0)?; + let header = header.downcast::<PyBytes>()?; + let data = data.get_borrowed_item(1)?; + let data = data.downcast::<PyBytes>()?; + let (idx_pos, data_pos) = irl + .write_entry( + transaction, + entry.as_bytes(), + (header.as_bytes(), data.as_bytes()), + offset, + index_end, + data_end, + ) + .map_err(revlog_error_from_msg)?; + let tuple = PyTuple::new( + py, + [idx_pos.into_py_any(py)?, data_pos.into_py_any(py)?], + )?; + Ok(tuple.unbind()) + }) + } + + fn delay( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Option<Py<PyBytes>>> { + Self::with_core_write(slf, |_self_ref, mut irl| { + let path = irl.delay().map_err(revlog_error_from_msg)?; + Ok(path + 
.map(|p| PyBytes::new(py, &get_bytes_from_path(p)).unbind())) + }) + } + + fn write_pending( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Py<PyTuple>> { + Self::with_core_write(slf, |_self_ref, mut irl| { + let (path, any_pending) = + irl.write_pending().map_err(revlog_error_from_msg)?; + let maybe_path = match path { + Some(path) => PyBytes::new(py, &get_bytes_from_path(path)) + .unbind() + .into_any(), + None => py.None(), + }; + Ok( + PyTuple::new(py, [maybe_path, any_pending.into_py_any(py)?])? + .unbind(), + ) + }) + } + + fn finalize_pending( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Py<PyBytes>> { + Self::with_core_write(slf, |_self_ref, mut irl| { + let path = + irl.finalize_pending().map_err(revlog_error_from_msg)?; + Ok(PyBytes::new(py, &get_bytes_from_path(path)).unbind()) + }) + } + + fn _chunk( + slf: &Bound<'_, Self>, + py: Python<'_>, + rev: PyRevision, + ) -> PyResult<Py<PyBytes>> { + Self::with_core_read(slf, |_self_ref, irl| { + let chunk = irl + .chunk_for_rev(Revision(rev.0)) + .map_err(revlog_error_from_msg)?; + Ok(PyBytes::new(py, &chunk).unbind()) + }) + } + + fn reading(slf: &Bound<'_, Self>) -> PyResult<ReadingContextManager> { + Ok(ReadingContextManager { + inner_revlog: slf.clone().unbind(), + }) + } + + #[pyo3(signature = (transaction, data_end=None, sidedata_end=None))] + fn writing( + slf: &Bound<'_, Self>, + transaction: PyObject, + data_end: Option<usize>, + sidedata_end: Option<usize>, + ) -> PyResult<WritingContextManager> { + // Only useful in revlog v2 + let _ = sidedata_end; + Ok(WritingContextManager { + inner_revlog: slf.clone().unbind(), + transaction: RwLock::new(PyTransaction::new(transaction)), + data_end, + }) + } + + // + // -- forwarded index methods -- + // + + fn _index_get_rev( + slf: &Bound<'_, Self>, + node: &Bound<'_, PyBytes>, + ) -> PyResult<Option<PyRevision>> { + let node = node_from_py_bytes(node)?; + + // Do not rewrite this with `Self::with_index_nt_read`: it makes + // inconditionally a volatile nodetree, and that is not the intent + // here: the code below specifically avoids that. + Self::with_core_read(slf, |self_ref, irl| { + let idx = &irl.index; + + let prev_queries = + self_ref.nodemap_queries.fetch_add(1, Ordering::Relaxed); + // Filelogs have no persistent nodemaps and are often small, + // use a brute force lookup from the end + // backwards. If there is a very large filelog + // (automation file that changes every + // commit etc.), it also seems to work quite well for + // all measured purposes so far. + if !self_ref.use_persistent_nodemap && prev_queries <= 3 { + return Ok(idx + .rev_from_node_no_persistent_nodemap(node.into()) + .ok() + .map(Into::into)); + } + + let opt = + self_ref.get_nodetree(idx)?.read().map_err(map_lock_error)?; + let nt = opt.as_ref().expect("nodetree should be set"); + + let rust_rev = + nt.find_bin(idx, node.into()).map_err(nodemap_error)?; + Ok(rust_rev.map(Into::into)) + }) + } + + /// same as `_index_get_rev()` but raises a bare `error.RevlogError` if + /// node is not found. 
+ /// + /// No need to repeat `node` in the exception, `mercurial/revlog.py` + /// will catch and rewrap with it + fn _index_rev( + slf: &Bound<'_, Self>, + node: &Bound<'_, PyBytes>, + ) -> PyResult<PyRevision> { + Self::_index_get_rev(slf, node)?.ok_or_else(revlog_error_bare) + } + + /// return True if the node exist in the index + fn _index_has_node( + slf: &Bound<'_, Self>, + node: &Bound<'_, PyBytes>, + ) -> PyResult<bool> { + Self::_index_get_rev(slf, node).map(|opt| opt.is_some()) + } + + /// find length of shortest hex nodeid of a binary ID + fn _index_shortest( + slf: &Bound<'_, Self>, + node: &Bound<'_, PyBytes>, + ) -> PyResult<usize> { + Self::with_index_nt_read(slf, |idx, nt| { + match nt.unique_prefix_len_node(idx, &node_from_py_bytes(node)?) { + Ok(Some(l)) => Ok(l), + Ok(None) => Err(revlog_error_bare()), + Err(e) => Err(nodemap_error(e)), + } + }) + } + + fn _index_partialmatch<'py>( + slf: &Bound<'py, Self>, + node: &Bound<'py, PyBytes>, + ) -> PyResult<Option<Bound<'py, PyBytes>>> { + Self::with_index_nt_read(slf, |idx, nt| { + Ok(nt + .find_bin(idx, node_prefix_from_py_bytes(node)?) + .map_err(nodemap_error)? + .map(|rev| py_node_for_rev(slf.py(), idx, rev))) + }) + } + + /// append an index entry + fn _index_append( + slf: &Bound<'_, Self>, + tup: &Bound<'_, PyTuple>, + ) -> PyResult<()> { + // no need to check length: in PyO3 tup.get_item() does return + // proper errors + let node_bytes = tup.get_item(7)?.extract()?; + let node = node_from_py_bytes(&node_bytes)?; + + Self::with_index_nt_write(slf, |idx, nt| { + let rev = idx.len() as BaseRevision; + // This is ok since we will immediately add the revision to the + // index + let rev = Revision(rev); + idx.append(py_tuple_to_revision_data_params(tup)?) + .map_err(revlog_error_from_msg)?; + + nt.insert(idx, &node, rev).map_err(nodemap_error)?; + Ok(()) + }) + } + + /// Removes one or several entries from the index. + /// + /// Historically, on the Mercurial revlog index, `__delitem__` has always + /// been both for `del idx[r1]` and `del idx[r1:r2]`. In both cases, + /// all entries starting from `r1` are removed anyway. + fn _index___delitem__( + slf: &Bound<'_, Self>, + arg: &Bound<'_, PyAny>, + ) -> PyResult<()> { + let start = if let Ok(rev) = arg.extract() { + UncheckedRevision(rev) + } else { + // here we could downcast to `PySlice` and use `indices()`, *but* + // the rust-cpython based version could not do that, and + // `indices()` does some resolving that makes it not equivalent, + // e.g., `idx[-1::]` has `start=0`. As we are currently in + // transition, we keep it the old way (hoping it was consistent + // with the C index). + let start = arg.getattr("start")?; + UncheckedRevision(start.extract()?) 
+ }; + + Self::with_index_nt_write(slf, |idx, nt| { + // In the case of a slice, the check is possibly already done by + // `slice.indices`, which is itself an FFI wrapper for CPython's + // `PySlice_GetIndicesEx` + // (Python integration tests will tell us) + let start = idx.check_revision(start).ok_or_else(|| { + nodemap_error(NodeMapError::RevisionNotInIndex(start)) + })?; + idx.remove(start).map_err(revlog_error_from_msg)?; + nt.invalidate_all(); + Self::fill_nodemap(idx, nt)?; + Ok(()) + }) + } + + /// return the gca set of the given revs + #[pyo3(signature = (*revs))] + fn _index_ancestors( + slf: &Bound<'_, Self>, + revs: &Bound<'_, PyTuple>, + ) -> PyResult<PyObject> { + Self::with_index_read(slf, |idx| { + let revs: Vec<_> = rev_pyiter_collect(revs, idx)?; + Ok(PyList::new( + slf.py(), + idx.ancestors(&revs) + .map_err(graph_error)? + .into_iter() + .map(PyRevision::from), + )? + .into_any() + .unbind()) + }) + } + + /// return the heads of the common ancestors of the given revs + #[pyo3(signature = (*revs))] + fn _index_commonancestorsheads( + slf: &Bound<'_, Self>, + revs: &Bound<'_, PyTuple>, + ) -> PyResult<Py<PyList>> { + Self::with_index_read(slf, |idx| { + let revs: Vec<_> = rev_pyiter_collect(revs, idx)?; + revs_py_list( + slf.py(), + idx.common_ancestor_heads(&revs).map_err(graph_error)?, + ) + }) + } + + /// Clear the index caches and inner py_class data. + /// It is Python's responsibility to call `update_nodemap_data` again. + fn _index_clearcaches(slf: &Bound<'_, Self>) -> PyResult<()> { + Self::with_index_write(slf, |idx| { + idx.clear_caches(); + Ok(()) + })?; + + let mut self_ref = slf.borrow_mut(); + self_ref.nt.write().map_err(map_lock_error)?.take(); + self_ref.docket.take(); + self_ref.nodemap_mmap.take(); + self_ref.head_revs_py_list.take(); + self_ref.head_node_ids_py_list.take(); + Ok(()) + } + + /// return the raw binary string representing a revision + fn _index_entry_binary( + slf: &Bound<'_, Self>, + rev: PyRevision, + ) -> PyResult<Py<PyBytes>> { + let rev: UncheckedRevision = rev.into(); + Self::with_index_read(slf, |idx| { + idx.check_revision(rev) + .and_then(|r| idx.entry_binary(r)) + .map(|rust_bytes| PyBytes::new(slf.py(), rust_bytes).unbind()) + .ok_or_else(|| rev_not_in_index(rev)) + }) + } + + /// return a binary packed version of the header + fn _index_pack_header( + slf: &Bound<'_, Self>, + header: i32, + ) -> PyResult<Py<PyBytes>> { + let packed = + Self::with_index_read(slf, |idx| Ok(idx.pack_header(header)))?; + Ok(PyBytes::new(slf.py(), &packed).unbind()) + } + + /// compute phases + fn _index_computephasesmapsets( + slf: &Bound<'_, Self>, + py: Python<'_>, + roots: &Bound<'_, PyDict>, + ) -> PyResult<Py<PyTuple>> { + let (len, phase_maps) = Self::with_index_read(slf, |idx| { + let extracted_roots: PyResult<HashMap<Phase, Vec<Revision>>> = + roots + .iter() + .map(|(phase, revs)| { + let phase = Phase::try_from(phase.extract::<usize>()?) + .map_err(|_| revlog_error_bare())?; + let revs: Vec<Revision> = + rev_pyiter_collect(&revs, idx)?; + Ok((phase, revs)) + }) + .collect(); + idx.compute_phases_map_sets(extracted_roots?) + .map_err(graph_error) + })?; + // Ugly hack, but temporary (!) 
+ const IDX_TO_PHASE_NUM: [usize; 4] = [1, 2, 32, 96]; + let py_phase_maps = PyDict::new(py); + for (i, roots) in phase_maps.into_iter().enumerate() { + py_phase_maps.set_item( + IDX_TO_PHASE_NUM[i], + revs_py_set(py, roots)?.into_any(), + )?; + } + Ok((len, py_phase_maps).into_pyobject(py)?.unbind()) + } + + /// reachableroots + #[pyo3(signature = (*args))] + fn _index_reachableroots2( + slf: &Bound<'_, Self>, + py: Python<'_>, + args: &Bound<'_, PyTuple>, + ) -> PyResult<Py<PyList>> { + // TODO what was the point of having a signature with variable args? + let min_root = UncheckedRevision(args.get_item(0)?.extract()?); + let heads = args.get_item(1)?; + let roots = args.get_item(2)?; + let include_path: bool = args.get_item(3)?.extract()?; + + let as_set = Self::with_index_read(slf, |idx| { + let heads = rev_pyiter_collect_or_else(&heads, idx, |_rev| { + PyIndexError::new_err("head out of range") + })?; + let roots: Result<_, _> = roots + .try_iter()? + .map(|r| { + r.and_then(|o| match o.extract::<PyRevision>() { + Ok(r) => Ok(UncheckedRevision(r.0)), + Err(e) => Err(e), + }) + }) + .collect(); + idx.reachable_roots(min_root, heads, roots?, include_path) + .map_err(graph_error) + })?; + + revs_py_list(py, as_set) + } + + #[pyo3(signature = (*args))] + fn _index_headrevs( + slf: &Bound<'_, Self>, + py: Python<'_>, + args: &Bound<'_, PyTuple>, + ) -> PyResult<Py<PyList>> { + let (filtered_revs, stop_rev) = match args.len() { + 0 => Ok((None, None)), + 1 => Ok((Some(args.get_item(0)?), None)), + 2 => Ok((Some(args.get_item(0)?), Some(args.get_item(1)?))), + _ => Err(PyTypeError::new_err("too many arguments")), + }?; + let stop_rev = stop_rev + .map(|o| o.extract::<Option<i32>>()) + .transpose()? + .flatten(); + let filtered_revs = filtered_revs.filter(|o| !o.is_none()); + + let (from_core, stop_rev) = Self::with_index_read(slf, |idx| { + let stop_rev = stop_rev + // should this not just be the normal checking? + .filter(|rev| 0 <= *rev && *rev < idx.len() as BaseRevision) + .map(Revision); + + let from_core = if let Some(filtered_revs) = filtered_revs { + let filtered_revs = rev_pyiter_collect(&filtered_revs, idx)?; + idx.head_revs_advanced( + &filtered_revs, + stop_rev, + stop_rev.is_none(), + ) + } else if stop_rev.is_some() { + idx.head_revs_advanced(&HashSet::new(), stop_rev, false) + } else { + idx.head_revs_shortcut() + } + .map_err(graph_error)?; + Ok((from_core, stop_rev)) + })?; + + if stop_rev.is_some() { + // we don't cache result for now + let new_heads = + from_core.expect("this case should not be cached yet"); + + revs_py_list(py, new_heads) + } else { + if let Some(new_heads) = from_core { + Self::cache_new_heads_py_list(slf, new_heads)?; + } + + Ok(slf + .borrow() + .head_revs_py_list + .as_ref() + .expect("head revs should be cached") + .clone_ref(py)) + } + } + + /// get head nodeids + fn _index_head_node_ids( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Py<PyList>> { + let (head_revs, head_nodes) = Self::with_index_read(slf, |idx| { + // We don't use the shortcut here, as it's actually slower to loop + // through the cached `PyList` than to re-do the whole + // conversion for large lists, which are the performance + // sensitive ones anyway. + let head_revs = idx.head_revs().map_err(graph_error)?; + let head_nodes = PyList::new( + py, + head_revs.iter().map(|r| { + PyBytes::new( + py, + idx.node(*r) + .expect("rev should have been in the index") + .as_bytes(), + ) + .unbind() + }), + )? 
+ .unbind(); + Ok((head_revs, head_nodes)) + })?; + + Self::cache_new_heads_py_list(slf, head_revs)?; + // TODO discussion with Alphare: in hg-cpython, + // `cache_new_heads_node_ids_py_list` reconverts `head_nodes`, + // to store it in the cache attr that is **not actually used**. + // Should we drop the idea of this cache definition or actually + // use it? Perhaps in a later move for perf assessment? + Ok(head_nodes) + } + + /// get diff in head revisions + fn _index_headrevsdiff( + slf: &Bound<'_, Self>, + py: Python<'_>, + begin: PyRevision, + end: PyRevision, + ) -> PyResult<Py<PyTuple>> { + let begin: BaseRevision = begin.0 - 1; + let end: BaseRevision = end.0 - 1; + let (removed, added) = Self::with_index_read(slf, |idx| { + idx.head_revs_diff( + check_revision(idx, begin)?, + check_revision(idx, end)?, + ) + .map_err(graph_error) + })?; + let py_removed = revs_py_list(py, removed)?; + let py_added = revs_py_list(py, added)?; + Ok((py_removed, py_added).into_pyobject(py)?.unbind()) + } + + /// True if the object is a snapshot + fn _index_issnapshot( + slf: &Bound<'_, Self>, + rev: PyRevision, + ) -> PyResult<bool> { + let rev: UncheckedRevision = rev.into(); + let rev = Self::with_index_read(slf, |idx| { + idx.check_revision(rev).ok_or_else(|| rev_not_in_index(rev)) + })?; + Self::with_core_read(slf, |_self_ref, irl| { + irl.is_snapshot(rev) + .map_err(|e| PyValueError::new_err(e.to_string())) + }) + } + + /// Gather snapshot data in a cache dict + fn _index_findsnapshots( + slf: &Bound<'_, Self>, + cache: &Bound<'_, PyDict>, + start_rev: PyRevision, + end_rev: PyRevision, + ) -> PyResult<()> { + let mut cache = PySnapshotsCache(cache); + Self::with_index_read(slf, |idx| { + idx.find_snapshots(start_rev.into(), end_rev.into(), &mut cache) + .map_err(|_| revlog_error_bare()) + })?; + Ok(()) + } + + /// determine revisions with deltas to reconstruct fulltext + #[pyo3(signature = (rev, stop_rev))] + fn _index_deltachain( + slf: &Bound<'_, Self>, + py: Python<'_>, + rev: PyRevision, + stop_rev: Option<PyRevision>, + ) -> PyResult<Py<PyTuple>> { + let rev: UncheckedRevision = rev.into(); + let stop_rev: Option<UncheckedRevision> = stop_rev.map(Into::into); + + let (chain, stopped) = Self::with_index_read(slf, |idx| { + let rev = idx.check_revision(rev).ok_or_else(|| { + nodemap_error(NodeMapError::RevisionNotInIndex(rev)) + })?; + let stop_rev = stop_rev + .map(|r| { + idx.check_revision(r).ok_or_else(|| { + nodemap_error(NodeMapError::RevisionNotInIndex( + rev.into(), + )) + }) + }) + .transpose()?; + idx.delta_chain(rev, stop_rev) + .map_err(|e| PyValueError::new_err(e.to_string())) + })?; + + let py_chain = revs_py_list(py, chain)?.into_any(); + let py_stopped = + PyBool::new(py, stopped).to_owned().unbind().into_any(); + Ok((py_chain, py_stopped).into_pyobject(py)?.unbind()) + } + + /// slice planned chunk read to reach a density threshold + fn _index_slicechunktodensity( + slf: &Bound<'_, Self>, + py: Python<'_>, + revs: &Bound<'_, PyAny>, + target_density: f64, + min_gap_size: usize, + ) -> PyResult<PyObject> { + let as_nested_vec = + Self::with_index_read(slf, |idx| { + let revs: Vec<_> = rev_pyiter_collect(revs, idx)?; + Ok(idx.slice_chunk_to_density( + &revs, + target_density, + min_gap_size, + )) + })?; + let res_len = as_nested_vec.len(); + + // cannot build the outer sequence from iterator, because + // `rev_py_list()` returns `Result<T>` instead of `T`. 
+ let mut res = Vec::with_capacity(res_len); + for chunk in as_nested_vec { + res.push(revs_py_list(py, chunk)?.into_any()); + } + // This is just to do the same as C, not sure why it does this + Ok(if res_len == 1 { + PyTuple::new(py, res)?.unbind().into_any() + } else { + PyList::new(py, res)?.unbind().into_any() + }) + } + + fn _index___len__(slf: &Bound<'_, Self>) -> PyResult<usize> { + Self::with_index_read(slf, |idx| Ok(idx.len())) + } + + fn _index___getitem__( + slf: &Bound<'_, Self>, + py: Python<'_>, + key: &Bound<'_, PyAny>, + ) -> PyResult<PyObject> { + Self::with_index_read(slf, |idx| { + match key.extract::<BaseRevision>() { + Ok(key_as_int) => { + let entry_params = if key_as_int == NULL_REVISION.0 { + RevisionDataParams::default() + } else { + let rev = UncheckedRevision(key_as_int); + match idx.entry_as_params(rev) { + Some(e) => e, + None => { + return Err(PyIndexError::new_err( + "revlog index out of range", + )); + } + } + }; + Ok(revision_data_params_to_py_tuple(py, entry_params)? + .into_any() + .unbind()) + } + // Case when key is a binary Node ID (lame: we're re-unlocking) + _ => Self::_index_get_rev(slf, key.downcast::<PyBytes>()?)? + .map_or_else( + || Ok(py.None()), + |py_rev| Ok(py_rev.into_pyobject(py)?.unbind().into()), + ), + } + }) + } + + /// Returns the full nodemap bytes to be written as-is to disk + fn _index_nodemap_data_all( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<Py<PyBytes>> { + Self::with_index_nt_write(slf, |idx, nt| { + let old_nt = std::mem::take(nt); + let (readonly, bytes) = old_nt.into_readonly_and_added_bytes(); + + // If there's anything readonly, we need to build the data again + // from scratch + let bytes = if readonly.len() > 0 { + let mut nt = + CoreNodeTree::load_bytes(Box::<Vec<_>>::default(), 0); + Self::fill_nodemap(idx, &mut nt)?; + + let (readonly, bytes) = nt.into_readonly_and_added_bytes(); + assert_eq!(readonly.len(), 0); + + bytes + } else { + bytes + }; + + let bytes = PyBytes::new(py, &bytes); + Ok(bytes.unbind()) + }) + } + + /// Returns the last saved docket along with the size of any changed data + /// (in number of blocks), and said data as bytes. + fn _index_nodemap_data_incremental( + slf: &Bound<'_, Self>, + py: Python<'_>, + ) -> PyResult<PyObject> { + let mut self_ref = slf.borrow_mut(); + let docket = &mut self_ref.docket; + let docket = match docket.as_ref() { + Some(d) => d.clone_ref(py), + None => return Ok(py.None()), + }; + drop(self_ref); + + Self::with_core_write(slf, |self_ref, irl| { + let mut nt = self_ref + .get_nodetree(&irl.index)? + .write() + .map_err(map_lock_error)?; + let nt = nt.take().expect("nodetree should be set"); + let masked_blocks = nt.masked_readonly_blocks(); + let (_, data) = nt.into_readonly_and_added_bytes(); + let changed = masked_blocks * std::mem::size_of::<Block>(); + + Ok(PyTuple::new( + py, + [ + docket, + changed.into_py_any(py)?, + PyBytes::new(py, &data).into_py_any(py)?, + ], + )? + .unbind() + .into_any()) + }) + } + + /// Update the nodemap from the new (mmaped) data. + /// The docket is kept as a reference for later incremental calls. + fn _index_update_nodemap_data( + slf: &Bound<'_, Self>, + py: Python<'_>, + docket: &Bound<'_, PyAny>, + nm_data: &Bound<'_, PyAny>, + ) -> PyResult<PyObject> { + // Safety: we keep the buffer around inside the class as `nodemap_mmap` + let (buf, bytes) = unsafe { take_buffer_with_slice(nm_data)? 
}; + let len = buf.item_count(); + let data_tip = + docket.getattr("tip_rev")?.extract::<BaseRevision>()?.into(); + + let mut nt = CoreNodeTree::load_bytes(bytes, len); + + Self::with_index_read(slf, |idx| { + let data_tip = idx.check_revision(data_tip).ok_or_else(|| { + nodemap_error(NodeMapError::RevisionNotInIndex(data_tip)) + })?; + let current_tip = idx.len(); + + for r in (data_tip.0 + 1)..current_tip as BaseRevision { + let rev = Revision(r); + // in this case node() won't ever return None + nt.insert(idx, idx.node(rev).expect("node should exist"), rev) + .map_err(nodemap_error)?; + } + + Ok(py.None()) + })?; + + let mut self_ref = slf.borrow_mut(); + self_ref.docket.replace(docket.clone().unbind()); + self_ref.nodemap_mmap = Some(buf); + self_ref.nt.write().map_err(map_lock_error)?.replace(nt); + + Ok(py.None()) + } + + #[getter] + fn _index_entry_size(&self) -> usize { + INDEX_ENTRY_SIZE + } + + #[getter] + fn _index_rust_ext_compat(&self) -> i32 { + 1 + } + + #[getter] + fn _index_is_rust(&self) -> bool { + true + } +} + +impl InnerRevlog { + /// Take the lock on `slf.irl` for reading and call a closure. + /// + /// This serves the purpose to keep the needed intermediate [`PyRef`] + /// that must be obtained to access the data from the [`Bound`] reference + /// and of which the locked [`CoreInnerRevlog`] depends. + /// This also provides releasing of the [`PyRef`] as soon as the closure + /// is done, which is crucial if the caller needs to obtain a [`PyRefMut`] + /// later on. + /// + /// In the closure, we hand back the intermediate [`PyRef`] that + /// has been generated so that the closure can access more attributes. + fn with_core_read<'py, T>( + slf: &Bound<'py, Self>, + f: impl FnOnce( + &PyRef<'py, Self>, + RwLockReadGuard<CoreInnerRevlog>, + ) -> PyResult<T>, + ) -> PyResult<T> { + let self_ref = slf.borrow(); + // Safety: the owner is the right one. We will anyway + // not actually `share` it. Perhaps pyo3-sharedref should provide + // something less scary for this kind of usage. + let shareable_ref = unsafe { self_ref.irl.borrow_with_owner(slf) }; + let guard = shareable_ref.try_read().map_err(map_try_lock_error)?; + f(&self_ref, guard) + } + + /// Take the lock on `slf.irl` for writing and call a closure. + /// + /// See [`Self::with_core_read`] for more explanations. + fn with_core_write<'py, T>( + slf: &Bound<'py, Self>, + f: impl FnOnce( + &PyRef<'py, Self>, + RwLockWriteGuard<CoreInnerRevlog>, + ) -> PyResult<T>, + ) -> PyResult<T> { + let self_ref = slf.borrow(); + // Safety: the owner is the right one. We will anyway + // not actually `share` it. Perhaps pyo3-sharedref should provide + // something less scary for this kind of usage. + let shareable_ref = unsafe { self_ref.irl.borrow_with_owner(slf) }; + let guard = shareable_ref.try_write().map_err(map_try_lock_error)?; + f(&self_ref, guard) + } + + fn with_index_read<T>( + slf: &Bound<'_, Self>, + f: impl FnOnce(&Index) -> PyResult<T>, + ) -> PyResult<T> { + Self::with_core_read(slf, |_, guard| f(&guard.index)) + } + + fn with_index_write<T>( + slf: &Bound<'_, Self>, + f: impl FnOnce(&mut Index) -> PyResult<T>, + ) -> PyResult<T> { + Self::with_core_write(slf, |_, mut guard| f(&mut guard.index)) + } + + /// Lock `slf` for reading and execute a closure on its [`Index`] and + /// [`NodeTree`] + /// + /// The [`NodeTree`] is initialized an filled before hand if needed. 
+ fn with_index_nt_read<T>( + slf: &Bound<'_, Self>, + f: impl FnOnce(&Index, &CoreNodeTree) -> PyResult<T>, + ) -> PyResult<T> { + Self::with_core_read(slf, |self_ref, guard| { + let idx = &guard.index; + let nt = + self_ref.get_nodetree(idx)?.read().map_err(map_lock_error)?; + let nt = nt.as_ref().expect("nodetree should be set"); + f(idx, nt) + }) + } + + fn with_index_nt_write<T>( + slf: &Bound<'_, Self>, + f: impl FnOnce(&mut Index, &mut CoreNodeTree) -> PyResult<T>, + ) -> PyResult<T> { + Self::with_core_write(slf, |self_ref, mut guard| { + let idx = &mut guard.index; + let mut nt = self_ref + .get_nodetree(idx)? + .write() + .map_err(map_lock_error)?; + let nt = nt.as_mut().expect("nodetree should be set"); + f(idx, nt) + }) + } + + /// Fill a [`CoreNodeTree`] by doing a full iteration on the given + /// [`Index`] + /// + /// # Python exceptions + /// Raises `ValueError` if `nt` has existing data that is inconsistent + /// with `idx`. + fn fill_nodemap(idx: &Index, nt: &mut CoreNodeTree) -> PyResult<()> { + for r in 0..idx.len() { + let rev = Revision(r as BaseRevision); + // in this case node() won't ever return None + nt.insert(idx, idx.node(rev).expect("node should exist"), rev) + .map_err(nodemap_error)? + } + Ok(()) + } + + /// Return a working NodeTree of this InnerRevlog + /// + /// In case the NodeTree has not been initialized yet (in particular + /// not from persistent data at instantiation), it is created and + /// filled right away from the index. + /// + /// Technically, the returned NodeTree is still behind the lock of + /// the `nt` field, hence still wrapped in an [`Option`]. Callers + /// will need to take the lock and unwrap with `expect()`. + /// + /// # Python exceptions + /// The case mentioned in [`Self::fill_nodemap()`] cannot happen, as the + /// NodeTree is empty when it is called. + fn get_nodetree( + &self, + idx: &Index, + ) -> PyResult<&RwLock<Option<CoreNodeTree>>> { + if self.nt.read().map_err(map_lock_error)?.is_none() { + let readonly = Box::<Vec<_>>::default(); + let mut nt = CoreNodeTree::load_bytes(readonly, 0); + Self::fill_nodemap(idx, &mut nt)?; + self.nt.write().map_err(map_lock_error)?.replace(nt); + } + Ok(&self.nt) + } + + fn cache_new_heads_py_list( + slf: &Bound<'_, Self>, + new_heads: Vec<Revision>, + ) -> PyResult<Py<PyList>> { + let py = slf.py(); + let new_heads_py_list = revs_py_list(py, new_heads)?; + slf.borrow_mut().head_revs_py_list = + Some(new_heads_py_list.clone_ref(py)); + // TODO is returning really useful? + Ok(new_heads_py_list) + } +} + +#[pyclass] +struct NodeTree { + nt: RwLock<CoreNodeTree>, + index: SharedByPyObject<PySharedIndex>, +} + +#[pymethods] +impl NodeTree { + #[new] + // The share/mapping should be set apart to become the PyO3 homolog of + // `py_rust_index_to_graph` + fn new(index_proxy: &Bound<'_, PyAny>) -> PyResult<Self> { + let py_irl = index_proxy.getattr("inner")?; + let py_irl_ref = py_irl.downcast::<InnerRevlog>()?.borrow(); + let shareable_irl = &py_irl_ref.irl; + + // Safety: the owner is the actual one and we do not leak any + // internal reference. 
+ let index = unsafe { + shareable_irl.share_map(&py_irl, |irl| (&irl.index).into()) + }; + let nt = CoreNodeTree::default(); // in-RAM, fully mutable + + Ok(Self { + nt: nt.into(), + index, + }) + } + + /// Tell whether the NodeTree is still valid + /// + /// In case of mutation of the index, the given results are not + /// guaranteed to be correct, and in fact, the methods borrowing + /// the inner index would fail because of `PySharedRef` poisoning + /// (generation-based guard), same as iterating on a `dict` that has + /// been meanwhile mutated. + fn is_invalidated(&self, py: Python<'_>) -> PyResult<bool> { + // Safety: we don't leak any reference derived from self.index, as + // we only check errors + let result = unsafe { self.index.try_borrow(py) }; + // two cases for result to be an error: + // - the index has previously been mutably borrowed + // - there is currently a mutable borrow + // in both cases this amounts for previous results related to + // the index to still be valid. + Ok(result.is_err()) + } + + fn insert(&self, py: Python<'_>, rev: PyRevision) -> PyResult<()> { + // Safety: we don't leak any reference derived from self.index, + // as `nt.insert` does not store direct references + let idx = &*unsafe { self.index.try_borrow(py)? }; + + let rev = check_revision(idx, rev)?; + if rev == NULL_REVISION { + return Err(rev_not_in_index(rev.into())); + } + + let entry = idx.inner().get_entry(rev).expect("entry should exist"); + let mut nt = self.nt.write().map_err(map_lock_error)?; + nt.insert(idx, entry.hash(), rev).map_err(nodemap_error) + } + + fn shortest( + &self, + py: Python<'_>, + node: &Bound<'_, PyBytes>, + ) -> PyResult<usize> { + let nt = self.nt.read().map_err(map_lock_error)?; + // Safety: we don't leak any reference derived from self.index + // as returned type is Copy + let idx = &*unsafe { self.index.try_borrow(py)? }; + nt.unique_prefix_len_node(idx, &node_from_py_bytes(node)?) + .map_err(nodemap_error)? + .ok_or_else(revlog_error_bare) + } + + /// Lookup by node hex prefix in the NodeTree, returning revision number. + /// + /// This is not part of the classical NodeTree API, but is good enough + /// for unit testing, as in `test-rust-revlog.py`. + fn prefix_rev_lookup( + &self, + py: Python<'_>, + node_prefix: &Bound<'_, PyBytes>, + ) -> PyResult<Option<PyRevision>> { + let prefix = node_prefix_from_py_bytes(node_prefix)?; + let nt = self.nt.read().map_err(map_lock_error)?; + // Safety: we don't leak any reference derived from self.index + // as returned type is Copy + let idx = &*unsafe { self.index.try_borrow(py)? }; + Ok(nt + .find_bin(idx, prefix) + .map_err(nodemap_error)? + .map(|r| r.into())) + } +} + +pub fn init_module<'py>( + py: Python<'py>, + package: &str, +) -> PyResult<Bound<'py, PyModule>> { + let m = new_submodule(py, package, "revlog")?; + m.add_class::<InnerRevlog>()?; + m.add_class::<NodeTree>()?; + m.add_class::<ReadingContextManager>()?; + Ok(m) +}
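Most of the methods above funnel through `with_core_read`/`with_core_write`, which scope the `RwLock` guard to a closure so that the lock is always released before any later mutable borrow of the Python object. Stripped of the PyO3 sharing machinery, the shape reduces to the following std-only sketch (`Core` and `Wrapper` are illustrative stand-ins, not the real types):

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

// Stand-in for CoreInnerRevlog.
struct Core {
    entries: usize,
}

struct Wrapper {
    core: RwLock<Core>,
}

impl Wrapper {
    // The guard lives only for the duration of the closure, mirroring
    // `InnerRevlog::with_core_read`.
    fn with_read<T>(
        &self,
        f: impl FnOnce(RwLockReadGuard<'_, Core>) -> Result<T, String>,
    ) -> Result<T, String> {
        let guard = self.core.try_read().map_err(|e| e.to_string())?;
        f(guard)
    }

    // Same scheme for writes, mirroring `with_core_write`.
    fn with_write<T>(
        &self,
        f: impl FnOnce(RwLockWriteGuard<'_, Core>) -> Result<T, String>,
    ) -> Result<T, String> {
        let guard = self.core.try_write().map_err(|e| e.to_string())?;
        f(guard)
    }
}

fn main() -> Result<(), String> {
    let w = Wrapper { core: RwLock::new(Core { entries: 0 }) };
    w.with_write(|mut core| {
        core.entries += 1;
        Ok(())
    })?;
    // The write guard is gone by now, so taking a read lock cannot fail.
    let n = w.with_read(|core| Ok(core.entries))?;
    assert_eq!(n, 1);
    Ok(())
}
```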
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/store.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,76 @@ +// store.rs +// +// Copyright 2020-2024 Raphaël Gomès <raphael.gomes@octobus.net> +// 2024 Georges Racinet <georges.racinet@cloudcrane.io> +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. +use pyo3::prelude::*; +use pyo3::types::PyBytes; + +use std::sync::atomic::{AtomicBool, Ordering}; + +use hg::{fncache::FnCache, utils::files::get_bytes_from_path}; + +pub struct PyFnCache { + fncache: PyObject, +} + +impl PyFnCache { + pub fn new(fncache: PyObject) -> Self { + Self { fncache } + } +} + +impl Clone for PyFnCache { + fn clone(&self) -> Self { + Python::with_gil(|py| Self { + fncache: self.fncache.clone_ref(py), + }) + } +} + +/// Cache whether the fncache is loaded to avoid Python round-trip every time. +/// Once the fncache is loaded, it stays loaded unless we're in a very +/// long-running process, none of which we actually support for now. +static FN_CACHE_IS_LOADED: AtomicBool = AtomicBool::new(false); + +// TODO perhaps a bit of magic with `Bound<'_, PyFnCache>` would spare us +// the GIL reacquisitions +impl FnCache for PyFnCache { + fn is_loaded(&self) -> bool { + if FN_CACHE_IS_LOADED.load(Ordering::Relaxed) { + return true; + } + Python::with_gil(|py| { + // TODO raise in case of error? + let is_loaded = self + .fncache + .getattr(py, "is_loaded") + .ok() + .map(|o| { + o.extract::<bool>(py).expect( + "is_loaded returned something other than a bool", + ) + }) + .unwrap_or(false); + if is_loaded { + FN_CACHE_IS_LOADED.store(true, Ordering::Relaxed); + } + is_loaded + }) + } + fn add(&self, path: &std::path::Path) { + Python::with_gil(|py| { + // TODO raise in case of error? + self.fncache + .call_method( + py, + "add", + (PyBytes::new(py, &get_bytes_from_path(path)),), + None, + ) + .ok(); + }) + } +}
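`FN_CACHE_IS_LOADED` above is a one-way latch: once the Python fncache has reported itself loaded, every later `is_loaded()` call is answered from the atomic and skips the GIL round-trip entirely. Condensed to a std-only sketch (function and static names are ours):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Once the expensive check succeeds, remember it for the process
// lifetime, like FN_CACHE_IS_LOADED. `Relaxed` is enough: the flag only
// ever goes from false to true and guards no other memory.
static IS_LOADED: AtomicBool = AtomicBool::new(false);

fn is_loaded(expensive_check: impl Fn() -> bool) -> bool {
    if IS_LOADED.load(Ordering::Relaxed) {
        return true;
    }
    let loaded = expensive_check();
    if loaded {
        IS_LOADED.store(true, Ordering::Relaxed);
    }
    loaded
}

fn main() {
    assert!(!is_loaded(|| false)); // miss: the check ran, flag stays false
    assert!(is_loaded(|| true)); // the check ran and latched the flag
    assert!(is_loaded(|| unreachable!())); // served from the flag alone
}
```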
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/transaction.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,33 @@ +use hg::{transaction::Transaction, utils::files::get_bytes_from_path}; +use pyo3::{intern, types::PyBytes, PyObject, Python}; + +/// Wrapper around a Python transaction object, to keep `hg-core` oblivious +/// of the fact it's being called from Python. +pub struct PyTransaction { + inner: PyObject, +} + +impl PyTransaction { + pub fn new(inner: PyObject) -> Self { + Self { inner } + } +} + +impl Clone for PyTransaction { + fn clone(&self) -> Self { + Python::with_gil(|py| Self { + inner: self.inner.clone_ref(py), + }) + } +} + +impl Transaction for PyTransaction { + fn add(&mut self, file: impl AsRef<std::path::Path>, offset: usize) { + Python::with_gil(|py| { + let file = PyBytes::new(py, &get_bytes_from_path(file.as_ref())); + self.inner + .call_method(py, intern!(py, "add"), (file, offset), None) + .expect("transaction add failed"); + }) + } +}
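`PyTransaction` lets `hg-core` drive a Python transaction object through a plain Rust trait, keeping the core free of any Python types. The boundary reduces to the sketch below, where a recording implementation replaces the Python callback; `RecordingTransaction` is hypothetical, and this reduced `Transaction` trait only mirrors the `add` signature as implemented above:

```rust
use std::path::{Path, PathBuf};

// Reduced stand-in for `hg::transaction::Transaction`: the core only
// needs to report "I am about to touch `file` at `offset`".
trait Transaction {
    fn add(&mut self, file: impl AsRef<Path>, offset: usize);
}

// The Python-backed impl forwards to `tr.add(file, offset)`; a plain Vec
// records the calls here so the example stays self-contained.
#[derive(Default)]
struct RecordingTransaction {
    journal: Vec<(PathBuf, usize)>,
}

impl Transaction for RecordingTransaction {
    fn add(&mut self, file: impl AsRef<Path>, offset: usize) {
        self.journal.push((file.as_ref().to_owned(), offset));
    }
}

fn main() {
    let mut tr = RecordingTransaction::default();
    tr.add("store/00changelog.i", 0);
    tr.add("store/00changelog.d", 128);
    assert_eq!(tr.journal.len(), 2);
}
```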
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/update.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,49 @@ +// update.rs +// +// Copyright 2025 Mercurial developers +// +// This software may be used and distributed according to the terms of the +// GNU General Public License version 2 or any later version. + +//! Bindings for the `hg::update` module provided by the +//! `hg-core` package. +//! +//! From Python, this will be seen as `mercurial.pyo3_rustext.update` +use pyo3::prelude::*; + +use hg::progress::{HgProgressBar, Progress}; +use hg::update::update_from_null as core_update_from_null; +use hg::BaseRevision; +use pyo3::types::PyBytes; + +use crate::exceptions::FallbackError; +use crate::repo::repo_from_path; +use crate::utils::{new_submodule, with_sigint_wrapper, HgPyErrExt}; + +/// See [`core_update_from_null`]. +#[pyfunction] +#[pyo3(signature = (repo_path, to, num_cpus))] +pub fn update_from_null( + repo_path: &Bound<'_, PyBytes>, + to: BaseRevision, + num_cpus: Option<usize>, +) -> PyResult<usize> { + log::trace!("Using update from null fastpath"); + let repo = repo_from_path(repo_path)?; + let progress: &dyn Progress = &HgProgressBar::new("updating"); + + with_sigint_wrapper(repo_path.py(), || { + core_update_from_null(&repo, to.into(), progress, num_cpus) + })? + .into_pyerr(repo_path.py()) +} + +pub fn init_module<'py>( + py: Python<'py>, + package: &str, +) -> PyResult<Bound<'py, PyModule>> { + let m = new_submodule(py, package, "update")?; + m.add("FallbackError", py.get_type::<FallbackError>())?; + m.add_function(wrap_pyfunction!(update_from_null, &m)?)?; + Ok(m) +}
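`update_from_null` hands the core routine a `&dyn Progress` trait object, so progress reporting stays decoupled from the binding layer. A self-contained sketch of that shape; this two-method `Progress` trait and `StderrProgress` are simplified stand-ins, not hg-core's actual definitions:

```rust
// Reduced stand-in for `hg::progress::Progress`.
trait Progress {
    fn update(&self, done: u64, total: Option<u64>);
    fn complete(&self);
}

// Stand-in reporter, playing the role of `HgProgressBar::new("updating")`.
struct StderrProgress;

impl Progress for StderrProgress {
    fn update(&self, done: u64, total: Option<u64>) {
        match total {
            Some(t) => eprintln!("updating {}/{}", done, t),
            None => eprintln!("updating {}", done),
        }
    }
    fn complete(&self) {
        eprintln!("done");
    }
}

// The core routine only ever sees `&dyn Progress`, as in
// `core_update_from_null(&repo, to, progress, num_cpus)`.
fn long_running(progress: &dyn Progress) -> usize {
    let total = 3;
    for done in 1..=total {
        progress.update(done, Some(total));
    }
    progress.complete();
    total as usize
}

fn main() {
    let progress: &dyn Progress = &StderrProgress;
    assert_eq!(long_running(progress), 3);
}
```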
--- a/rust/hg-pyo3/src/util.rs Fri Feb 28 23:25:42 2025 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,28 +0,0 @@ -use pyo3::prelude::*; -use pyo3::types::PyDict; -/// Create the module, with `__package__` given from parent -/// -/// According to PyO3 documentation, which links to -/// <https://github.com/PyO3/pyo3/issues/1517>, the same convoluted -/// write to sys.modules has to be made as with the `cpython` crate. -pub(crate) fn new_submodule<'py>( - py: Python<'py>, - package_name: &str, - name: &str, -) -> PyResult<Bound<'py, PyModule>> { - let dotted_name = &format!("{}.{}", package_name, name); - let m = PyModule::new(py, name)?; - m.add("__package__", package_name)?; - m.add("__doc__", "DAG operations - Rust implementation")?; - - let sys = PyModule::import(py, "sys")?; - // according to the doc, we could make a static PyString out of - // "modules" with the `intern!` macro, but this is used only at - // registration so it may not be worth the effort. - let sys_modules: Bound<'_, PyDict> = sys.getattr("modules")?.extract()?; - sys_modules.set_item(dotted_name, &m)?; - // Example C code (see pyexpat.c and import.c) will "give away the - // reference", but we won't because it will be consumed once the - // Rust PyObject is dropped. - Ok(m) -}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/hg-pyo3/src/utils.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,375 @@ +use hg::errors::HgError; +use hg::revlog::index::Index as CoreIndex; +use hg::revlog::inner_revlog::RevisionBuffer; +use pyo3::buffer::{Element, PyBuffer}; +use pyo3::exceptions::{ + PyIOError, PyKeyboardInterrupt, PyRuntimeError, PyValueError, +}; +use pyo3::types::{PyBytes, PyDict}; +use pyo3::{intern, prelude::*}; +use pyo3_sharedref::SharedByPyObject; +use stable_deref_trait::StableDeref; + +use crate::exceptions::FallbackError; +use crate::revlog::{InnerRevlog, PySharedIndex}; + +/// Create the module, with `__package__` given from parent +/// +/// According to PyO3 documentation, which links to +/// <https://github.com/PyO3/pyo3/issues/1517>, the same convoluted +/// write to sys.modules has to be made as with the `cpython` crate. +pub(crate) fn new_submodule<'py>( + py: Python<'py>, + package_name: &str, + name: &str, +) -> PyResult<Bound<'py, PyModule>> { + let dotted_name = &format!("{}.{}", package_name, name); + let m = PyModule::new(py, name)?; + m.add("__package__", package_name)?; + + let sys = PyModule::import(py, "sys")?; + // according to the doc, we could make a static PyString out of + // "modules" with the `intern!` macro, but this is used only at + // registration so it may not be worth the effort. + let sys_modules: Bound<'_, PyDict> = sys.getattr("modules")?.extract()?; + sys_modules.set_item(dotted_name, &m)?; + // Example C code (see pyexpat.c and import.c) will "give away the + // reference", but we won't because it will be consumed once the + // Rust PyObject is dropped. + Ok(m) +} + +/// Retrieve the shared index wrapper (which contains the core index) +/// from the Python index proxy. +pub fn py_rust_index_to_graph( + index_proxy: &Bound<'_, PyAny>, +) -> PyResult<SharedByPyObject<PySharedIndex>> { + let py_irl = index_proxy.getattr("inner")?; + let py_irl_ref = py_irl.downcast::<InnerRevlog>()?.borrow(); + let shareable_irl = &py_irl_ref.irl; + + // Safety: the owner is the actual one and we do not leak any + // internal reference. + let index = + unsafe { shareable_irl.share_map(&py_irl, |irl| (&irl.index).into()) }; + Ok(index) +} + +/// Error propagation for an [`SharedByPyObject`] wrapping a [`Result`] +/// +/// It would be nice for [`SharedByPyObject`] to provide this directly as +/// a variant of the `map` method with a signature such as: +/// +/// ``` +/// unsafe fn map_or_err(&self, +/// py: Python, +/// f: impl FnOnce(T) -> Result(U, E), +/// convert_err: impl FnOnce(E) -> PyErr) +/// ``` +/// +/// This would spare users of the `pyo3` crate the additional `unsafe` deref +/// to inspect the error and return it outside `SharedByPyObject`, and the +/// subsequent unwrapping that this function performs. +pub(crate) fn py_shared_or_map_err<T, E: std::fmt::Debug + Copy>( + py: Python, + leaked: SharedByPyObject<Result<T, E>>, + convert_err: impl FnOnce(E) -> PyErr, +) -> PyResult<SharedByPyObject<T>> { + // Safety: we don't leak the "faked" reference out of `SharedByPyObject` + if let Err(e) = *unsafe { leaked.try_borrow(py)? } { + return Err(convert_err(e)); + } + // Safety: we don't leak the "faked" reference out of `SharedByPyObject` + Ok(unsafe { + leaked.map(py, |res| { + res.expect("Error case should have already be treated") + }) + }) +} + +/// Full extraction of the proxy index object as received in PyO3 to a +/// [`CoreIndex`] reference. 
+///
+/// # Safety
+///
+/// The invariants to maintain are those of the underlying
+/// [`SharedByPyObject::try_borrow`]: the caller must not leak the inner
+/// reference.
+pub(crate) unsafe fn proxy_index_extract<'py>(
+    index_proxy: &Bound<'py, PyAny>,
+) -> PyResult<&'py CoreIndex> {
+    let py_shared = py_rust_index_to_graph(index_proxy)?;
+    let py_shared = &*unsafe { py_shared.try_borrow(index_proxy.py())? };
+    Ok(unsafe { py_shared.static_inner() })
+}
+
+/// Type shortcut for the kind of bytes slice trait objects that are used in
+/// particular for mmap data
+type BoxedBytesSlice =
+    Box<dyn std::ops::Deref<Target = [u8]> + Send + Sync + 'static>;
+
+/// Take a Python object backed by a Python buffer, and return the underlying
+/// [`PyBuffer`] along with the Rust slice into said buffer.
+///
+/// The caller needs to make sure that the Python buffer is not freed before
+/// the slice, otherwise we'd get a dangling pointer once the incoming
+/// object is freed from the Python side. This can be achieved by storing it
+/// in a Python object.
+///
+/// The typical use case is to extract mmap data to make it usable in the
+/// constructs from the `hg` crate.
+///
+/// # Safety
+///
+/// The caller must make sure that the incoming Python object is kept around
+/// for at least as long as the returned [`BoxedBytesSlice`].
+// TODO in PyO3, we already get a reference with two lifetimes, and we
+// could even take a `Borrowed<'a, 'py, T>`.
+// So perhaps we could tie everything together with a lifetime so that it
+// is, after all, safe, and this could be called something like
+// `share_buffer`.
+#[deny(unsafe_op_in_unsafe_fn)]
+pub unsafe fn take_buffer_with_slice(
+    data: &Bound<'_, PyAny>,
+) -> PyResult<(PyBuffer<u8>, BoxedBytesSlice)> {
+    let buf = PyBuffer::<u8>::get(data)?;
+    let len = buf.item_count();
+
+    // Build a slice from the buffer data
+    let cbuf = buf.buf_ptr();
+    let bytes = if std::mem::size_of::<u8>() == buf.item_size()
+        && buf.is_c_contiguous()
+        && u8::is_compatible_format(buf.format())
+        && buf.dimensions() == 1
+        && buf.readonly()
+    {
+        unsafe { std::slice::from_raw_parts(cbuf as *const u8, len) }
+    } else {
+        return Err(PyValueError::new_err(
+            "buffer has an invalid memory representation",
+        ));
+    };
+
+    Ok((buf, Box::new(bytes)))
+}
+
+/// Takes an initialization function `init` which writes bytes to a
+/// Python-backed buffer, to save on a (potentially large) memory allocation
+/// and copy. If `init` fails to write the full expected length `len`, an
+/// error is raised.
+pub fn with_pybytes_buffer<F>(
+    py: Python,
+    len: usize,
+    init: F,
+) -> Result<Py<PyBytes>, hg::revlog::RevlogError>
+where
+    F: FnOnce(
+        &mut dyn RevisionBuffer<Target = Py<PyBytes>>,
+    ) -> Result<(), hg::revlog::RevlogError>,
+{
+    // Largely inspired by code in PyO3
+    // https://pyo3.rs/main/doc/pyo3/types/struct.pybytes#method.new_bound_with
+    unsafe {
+        let pyptr = pyo3::ffi::PyBytes_FromStringAndSize(
+            std::ptr::null(),
+            len as pyo3::ffi::Py_ssize_t,
+        );
+        let pybytes = Bound::from_owned_ptr_or_err(py, pyptr)
+            .map_err(|e| HgError::abort_simple(e.to_string()))?
+            .downcast_into_unchecked();
+        let buffer: *mut u8 = pyo3::ffi::PyBytes_AsString(pyptr).cast();
+        debug_assert!(!buffer.is_null());
+        let mut rev_buf = PyRevisionBuffer::new(pybytes.unbind(), buffer, len);
+        // Initialise the bytestring in init
+        // If init returns an Err, the buffer is deallocated by `pybytes`
+        init(&mut rev_buf).map(|_| rev_buf.finish())
+    }
+}
+
+/// Safe abstraction over a `PyBytes` together with the `&[u8]` slice
+/// that borrows it. Implements `Deref<Target = [u8]>`.
+///
+/// Calling `PyBytes::data` requires a GIL marker but we want to access the
+/// data in a thread that (ideally) does not need to acquire the GIL.
+/// This type allows separating the call and the use.
+///
+/// It also enables using a (wrapped) `PyBytes` in GIL-unaware generic code.
+pub struct PyBytesDeref {
+    #[allow(unused)]
+    keep_alive: Py<PyBytes>,
+
+    /// Borrows the buffer inside `self.keep_alive`,
+    /// but the borrow-checker cannot express self-referential structs.
+    data: &'static [u8],
+}
+
+impl PyBytesDeref {
+    pub fn new(py: Python, bytes: Py<PyBytes>) -> Self {
+        let as_raw: *const [u8] = bytes.as_bytes(py);
+        Self {
+            // Safety: the raw pointer is valid as long as the PyBytes is
+            // still alive, and the object owns it.
+            data: unsafe { &*as_raw },
+            keep_alive: bytes,
+        }
+    }
+
+    #[allow(unused)]
+    pub fn unwrap(self) -> Py<PyBytes> {
+        self.keep_alive
+    }
+}
+
+impl std::ops::Deref for PyBytesDeref {
+    type Target = [u8];
+
+    fn deref(&self) -> &[u8] {
+        self.data
+    }
+}
+
+unsafe impl StableDeref for PyBytesDeref {}
+
+fn require_send<T: Send>() {}
+
+#[allow(unused)]
+fn static_assert_pybytes_is_send() {
+    #[allow(clippy::no_effect)]
+    require_send::<Py<PyBytes>>;
+}
+
+// Safety: `Py<PyBytes>` is Send. Raw pointers are not by default,
+// but here sending one to another thread is fine since we ensure it stays
+// valid.
+unsafe impl Send for PyBytesDeref {}
+
+/// Wrapper around a Python-provided buffer into which the revision contents
+/// will be written. Done for speed in order to save a large allocation + copy.
+struct PyRevisionBuffer {
+    py_bytes: Py<PyBytes>,
+    _buf: *mut u8,
+    len: usize,
+    current_buf: *mut u8,
+    current_len: usize,
+}
+
+impl PyRevisionBuffer {
+    /// # Safety
+    ///
+    /// `buf` should be the start of the allocated bytes of `bytes`, and `len`
+    /// exactly the length of said allocated bytes.
+    #[inline]
+    unsafe fn new(bytes: Py<PyBytes>, buf: *mut u8, len: usize) -> Self {
+        Self {
+            py_bytes: bytes,
+            _buf: buf,
+            len,
+            current_len: 0,
+            current_buf: buf,
+        }
+    }
+
+    /// Number of bytes that have been copied in so far. Will be different
+    /// from the total allocated length of the buffer unless the revision is
+    /// done being written.
+    #[inline]
+    fn current_len(&self) -> usize {
+        self.current_len
+    }
+}
+
+impl RevisionBuffer for PyRevisionBuffer {
+    type Target = Py<PyBytes>;
+
+    #[inline]
+    fn extend_from_slice(&mut self, slice: &[u8]) {
+        assert!(self.current_len + slice.len() <= self.len);
+        unsafe {
+            // We cannot use `copy_from_nonoverlapping` since it's *possible*
+            // to create a slice from the same Python memory region using
+            // [`PyBytesDeref`]. LLVM probably has an optimization for this
+            // anyway.
+            self.current_buf.copy_from(slice.as_ptr(), slice.len());
+            self.current_buf = self.current_buf.add(slice.len());
+        }
+        self.current_len += slice.len()
+    }
+
+    #[inline]
+    fn finish(self) -> Self::Target {
+        // catch unzeroed bytes before they become undefined behavior
+        assert_eq!(
+            self.current_len(),
+            self.len,
+            "not enough bytes read for revision"
+        );
+        self.py_bytes
+    }
+}
+
+/// Extension trait to help with generic error conversions from hg-core to
+/// Python.
+pub(crate) trait HgPyErrExt<T> {
+    fn into_pyerr(self, py: Python) -> PyResult<T>;
+}
+
+impl<T, E> HgPyErrExt<T> for Result<T, E>
+where
+    HgError: From<E>,
+{
+    fn into_pyerr(self, py: Python) -> PyResult<T> {
+        self.map_err(|e| match e.into() {
+            err @ HgError::IoError { .. } => {
+                PyIOError::new_err(err.to_string())
+            }
+            HgError::UnsupportedFeature(e) => {
+                FallbackError::new_err(e.to_string())
+            }
+            HgError::RaceDetected(_) => {
+                unreachable!("must not surface to the user")
+            }
+            HgError::Path(path_error) => {
+                let msg = PyBytes::new(py, path_error.to_string().as_bytes());
+                let cls = py
+                    .import(intern!(py, "mercurial.error"))
+                    .and_then(|m| m.getattr(intern!(py, "InputError")))
+                    .expect("failed to import InputError");
+                PyErr::from_value(
+                    cls.call1((msg,))
+                        .expect("initializing an InputError failed"),
+                )
+            }
+            HgError::InterruptReceived => PyKeyboardInterrupt::new_err(()),
+            e => PyRuntimeError::new_err(e.to_string()),
+        })
+    }
+}
+
+/// Wrap a call to `func` so that Python's `SIGINT` handler is first stored,
+/// then restored after the call to `func`, and the signal finally re-raised
+/// in Python if `func` returns a [`HgError::InterruptReceived`].
+///
+/// We cannot use [`Python::check_signals`] because it only works from the
+/// main thread of the main interpreter. To that end, long-running Rust
+/// functions need to cooperate by listening to their own `SIGINT` signal and
+/// returning the appropriate error on catching that signal: this is
+/// especially helpful in multithreaded operations.
+pub fn with_sigint_wrapper<R>(
+    py: Python,
+    func: impl Fn() -> Result<R, HgError>,
+) -> PyResult<Result<R, HgError>> {
+    let signal_py_mod = py.import(intern!(py, "signal"))?;
+    let sigint_py_const = signal_py_mod.getattr(intern!(py, "SIGINT"))?;
+    let old_handler = signal_py_mod
+        .call_method1(intern!(py, "getsignal"), (sigint_py_const.clone(),))?;
+    let res = func();
+    // Reset the old signal handler in Python because we may have changed it
+    signal_py_mod.call_method1(
+        intern!(py, "signal"),
+        (sigint_py_const.clone(), old_handler),
+    )?;
+    if let Err(HgError::InterruptReceived) = res {
+        // Trigger the signal in Python
+        signal_py_mod
+            .call_method1(intern!(py, "raise_signal"), (sigint_py_const,))?;
+    }
+    Ok(res)
+}
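The two building blocks above are designed to meet at the Python entry points: `with_sigint_wrapper` brackets a long-running core call, and `HgPyErrExt::into_pyerr` translates whatever error comes back. A minimal sketch of such a call site, assuming the items above are in scope; `scan` and `expensive_scan` are hypothetical names, not part of the patch:

```
use hg::errors::HgError;
use pyo3::prelude::*;

// Hypothetical stand-in for a long-running hg-core operation that
// watches for SIGINT itself and reports HgError::InterruptReceived.
fn expensive_scan() -> Result<usize, HgError> {
    Ok(42)
}

#[pyfunction]
fn scan(py: Python<'_>) -> PyResult<usize> {
    // Store Python's SIGINT handler, run the closure, restore the
    // handler, and re-raise the signal in Python if the closure
    // reported an interrupt.
    let res = with_sigint_wrapper(py, expensive_scan)?;
    // Map any remaining hg-core error onto the matching Python exception.
    res.into_pyerr(py)
}
```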
--- a/rust/pyo3-sharedref/Cargo.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/pyo3-sharedref/Cargo.toml Fri Feb 28 23:28:10 2025 +0100 @@ -3,10 +3,13 @@ version = "0.1.0" edition = "2021" +[lints] +workspace = true + [lib] name='pyo3_sharedref' -crate-type = ["rlib"] [dependencies] pyo3 = { version = "0.23.1" } stable_deref_trait = "1.2.0" +static_assertions_next = "1.1.2" \ No newline at end of file
--- a/rust/pyo3-sharedref/src/lib.rs	Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/pyo3-sharedref/src/lib.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -482,9 +482,16 @@
     data: T,
 }
 
-// DO NOT implement Deref for SharedByPyObject<T>! Dereferencing
+// DO NOT implement Deref or DerefMut for SharedByPyObject<T>! Dereferencing
 // SharedByPyObject without taking Python GIL wouldn't be safe. Also, the
 // underlying reference is invalid if generation != state.generation.
+static_assertions_next::assert_impl!(
+    for(T) SharedByPyObject<T>: !Deref
+);
+
+static_assertions_next::assert_impl!(
+    for(T) SharedByPyObject<T>: !DerefMut
+);
 
 impl<T: ?Sized> SharedByPyObject<T> {
     // No panicking version of borrow() and borrow_mut() are implemented
@@ -739,3 +746,139 @@
         self.data
     }
 }
+
+/// Defines a Python iterator over a Rust iterator.
+///
+/// TODO: this is a bit awkward to use, and a better (more complicated)
+/// procedural macro would simplify the interface a lot.
+///
+/// # Parameters
+///
+/// * `$name` is the identifier to give to the resulting Rust struct.
+/// * `$success_type` is the resulting Python object. It can be a builtin
+///   type (e.g., `PyBytes`), or any `PyClass`.
+/// * `$owner_type` is the type owning the data
+/// * `$owner_attr` is the name of the shareable attribute in `$owner_type`
+/// * `$shared_type` is the type wrapped in `SharedByPyObject`, typically
+///   `SomeIter<'static>`
+/// * `$iter_func` is a function to obtain the Rust iterator from the content
+///   of the shareable attribute. It can be a closure.
+/// * `$result_func` is a function for converting items returned by the Rust
+///   iterator into `PyResult<Option<Py<$success_type>>>`.
+///
+/// # Safety
+///
+/// `$result_func` may take a reference, whose lifetime may be artificial.
+/// Do not copy it out of the function call (this would be possible only
+/// with inner mutability).
+///
+/// # Example
+///
+/// The iterator example in [`PyShareable`] can be rewritten as
+///
+/// ```
+/// use pyo3::prelude::*;
+/// use pyo3_sharedref::*;
+///
+/// use pyo3::types::{PyTuple, PyInt};
+/// use std::collections::{hash_set::Iter as IterHashSet, HashSet};
+/// use std::vec::Vec;
+///
+/// #[pyclass(sequence)]
+/// struct Set {
+///     rust_set: PyShareable<HashSet<i32>>,
+/// }
+///
+/// #[pymethods]
+/// impl Set {
+///     #[new]
+///     fn new(values: &Bound<'_, PyTuple>) -> PyResult<Self> {
+///         let as_vec = values.extract::<Vec<i32>>()?;
+///         let s: HashSet<_> = as_vec.iter().copied().collect();
+///         Ok(Self { rust_set: s.into() })
+///     }
+///
+///     fn __iter__(slf: &Bound<'_, Self>) -> PyResult<SetIterator> {
+///         SetIterator::new(slf)
+///     }
+/// }
+///
+/// py_shared_iterator!(
+///     SetIterator,
+///     PyInt,
+///     Set,
+///     rust_set,
+///     IterHashSet<'static, i32>,
+///     |hash_set| hash_set.iter(),
+///     it_next_result
+/// );
+///
+/// fn it_next_result(py: Python, res: &i32) -> PyResult<Option<Py<PyInt>>> {
+///     Ok(Some((*res).into_pyobject(py)?.unbind()))
+/// }
+/// ```
+///
+/// In the example above, `$result_func` is fairly trivial, and can be replaced
+/// by a closure, but things can get more complicated if the Rust
+/// iterator itself returns `Result<T, E>` with `T` not implementing
+/// `IntoPyObject` and `E` needing to be converted.
+/// Also the closure variant is fairly obscure: +/// +/// ```ignore +/// py_shared_iterator!( +/// SetIterator, +/// PyInt, +/// Set, +/// rust_set, +/// IterHashSet<'static, i32>, +/// |hash_set| hash_set.iter(), +/// (|py, i: &i32| Ok(Some((*i).into_pyobject(py)?.unbind()))) +/// ) +/// ``` +#[macro_export] +macro_rules! py_shared_iterator { + ( + $name: ident, + $success_type: ty, + $owner_type: ident, + $owner_attr: ident, + $shared_type: ty, + $iter_func: expr, + $result_func: expr + ) => { + #[pyclass] + pub struct $name { + inner: pyo3_sharedref::SharedByPyObject<$shared_type>, + } + + #[pymethods] + impl $name { + #[new] + fn new(owner: &Bound<'_, $owner_type>) -> PyResult<Self> { + let inner = &owner.borrow().$owner_attr; + // Safety: the data is indeed owned by `owner` + let shared_iter = + unsafe { inner.share_map(owner, $iter_func) }; + Ok(Self { inner: shared_iter }) + } + + fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> { + slf + } + + fn __next__( + mut slf: PyRefMut<'_, Self>, + ) -> PyResult<Option<Py<$success_type>>> { + let py = slf.py(); + let shared = &mut slf.inner; + // Safety: we do not leak references derived from the internal + // 'static reference. + let mut inner = unsafe { shared.try_borrow_mut(py) }?; + match inner.next() { + None => Ok(None), + Some(res) => $result_func(py, res), + } + } + } + }; +}
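One detail worth highlighting from this lib.rs change: the `assert_impl!` guards turn the long-standing "DO NOT implement Deref" comment into a compile-time check. The same technique in a standalone sketch; `Opaque` is a hypothetical stand-in, and the macro usage mirrors the one above:

```
use std::ops::{Deref, DerefMut};

/// A wrapper whose inner value must never be reachable through plain
/// dereferencing, like `SharedByPyObject`.
struct Opaque<T>(T);

// Compilation fails here if anyone later adds a Deref or DerefMut impl,
// so the invariant is machine-checked instead of comment-only.
static_assertions_next::assert_impl!(for(T) Opaque<T>: !Deref);
static_assertions_next::assert_impl!(for(T) Opaque<T>: !DerefMut);

fn main() {
    let _guarded = Opaque(1u32);
}
```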
--- a/rust/rhg/Cargo.toml Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/Cargo.toml Fri Feb 28 23:28:10 2025 +0100 @@ -7,6 +7,9 @@ ] edition = "2021" +[lints] +workspace = true + [dependencies] hg-core = { path = "../hg-core"} chrono = "0.4.23"
--- a/rust/rhg/src/blackbox.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/src/blackbox.rs Fri Feb 28 23:28:10 2025 +0100 @@ -4,7 +4,7 @@ use format_bytes::format_bytes; use hg::errors::HgError; use hg::repo::Repo; -use hg::utils::{files::get_bytes_from_os_str, shell_quote}; +use hg::utils::{files::get_bytes_from_os_str, strings::shell_quote}; use std::ffi::OsString; // Python does not support %.3f, only %f
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rust/rhg/src/commands/annotate.rs Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,406 @@ +use core::str; +use std::{collections::hash_map::Entry, ffi::OsString}; + +use format_bytes::format_bytes; +use hg::{ + encoding::Encoder, + operations::{ + annotate, AnnotateOptions, AnnotateOutput, ChangesetAnnotation, + }, + revlog::changelog::Changelog, + utils::strings::CleanWhitespace, + FastHashMap, Revision, +}; + +use crate::{error::CommandError, utils::path_utils::resolve_file_args}; + +pub const HELP_TEXT: &str = " +show changeset information by line for each file +"; + +pub fn args() -> clap::Command { + clap::command!("annotate") + .alias("blame") + .arg( + clap::Arg::new("files") + .help("files to annotate") + .required(true) + .num_args(1..) + .value_name("FILE") + .value_parser(clap::value_parser!(OsString)), + ) + .arg( + clap::Arg::new("rev") + .help("annotate the specified revision") + .short('r') + .long("rev") + .value_name("REV") + .default_value("."), + ) + .arg( + clap::Arg::new("no-follow") + .help("don't follow copies and renames") + .long("no-follow") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("text") + .help("treat all files as text") + .short('a') + .long("text") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("user") + .help("list the author (long with -v)") + .short('u') + .long("user") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("number") + .help("list the revision number (default)") + .short('n') + .long("number") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("changeset") + .help("list the changeset") + .short('c') + .long("changeset") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("date") + .help("list the date (short with -q)") + .short('d') + .long("date") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("file") + .help("list the filename") + .short('f') + .long("file") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("line-number") + .help("show the line number at the first appearance") + .short('l') + .long("line-number") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("quiet") + .help("show short date for -d") + .short('q') + .long("quiet") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("verbose") + .help("show full username for -u") + .short('v') + .long("verbose") + .action(clap::ArgAction::SetTrue) + .conflicts_with("quiet"), + ) + .arg( + clap::Arg::new("ignore-all-space") + .help("ignore white space when comparing lines") + .short('w') + .long("ignore-all-space") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("ignore-space-change") + .help("ignore changes in the amount of white space") + .short('b') + .long("ignore-space-change") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("ignore-blank-lines") + .help("ignore changes whose lines are all blank") + .short('B') + .long("ignore-blank-lines") + .action(clap::ArgAction::SetTrue), + ) + .arg( + clap::Arg::new("ignore-space-at-eol") + .help("ignore changes in whitespace at EOL") + .short('Z') + .long("ignore-space-at-eol") + .action(clap::ArgAction::SetTrue), + ) + .about(HELP_TEXT) +} + +pub fn run(invocation: &crate::CliInvocation) -> Result<(), CommandError> { + let config = invocation.config; + if config.has_non_empty_section(b"annotate") { + return Err(CommandError::unsupported( + "rhg annotate does not support any [annotate] configs", + )); + } + + let 
repo = invocation.repo?; + let args = invocation.subcommand_args; + + let rev = args.get_one::<String>("rev").expect("rev has a default"); + let rev = hg::revset::resolve_single(rev, repo)?; + + let files = match args.get_many::<OsString>("files") { + None => vec![], + Some(files) => resolve_file_args(repo, files)?, + }; + + let options = AnnotateOptions { + treat_binary_as_text: args.get_flag("text"), + follow_copies: !args.get_flag("no-follow"), + whitespace: if args.get_flag("ignore-all-space") { + CleanWhitespace::All + } else if args.get_flag("ignore-space-change") { + CleanWhitespace::Collapse + } else if args.get_flag("ignore-space-at-eol") { + CleanWhitespace::AtEol + } else { + // We ignore the --ignore-blank-lines flag (present for consistency + // with other commands) since it has no effect on annotate. + CleanWhitespace::None + }, + }; + + let mut include = Include { + user: args.get_flag("user"), + number: args.get_flag("number"), + changeset: args.get_flag("changeset"), + date: args.get_flag("date"), + file: args.get_flag("file"), + line_number: args.get_flag("line-number"), + }; + if !(include.user || include.file || include.date || include.changeset) { + include.number = true; + } + if include.line_number && !(include.number || include.changeset) { + return Err(CommandError::abort( + "at least one of -n/-c is required for -l", + )); + } + + let verbosity = match (args.get_flag("quiet"), args.get_flag("verbose")) { + (false, false) => Verbosity::Default, + (true, false) => Verbosity::Quiet, + (false, true) => Verbosity::Verbose, + (true, true) => unreachable!(), + }; + + let changelog = repo.changelog()?; + let mut formatter = Formatter::new( + &changelog, + invocation.ui.encoder(), + include, + verbosity, + ); + let mut stdout = invocation.ui.stdout_buffer(); + for path in files { + match annotate(repo, &path, rev, options)? { + AnnotateOutput::Text(text) => { + let annotations = formatter.format(text.annotations)?; + for (annotation, line) in annotations.iter().zip(&text.lines) { + stdout.write_all(&format_bytes!( + b"{}: {}", annotation, line + ))?; + } + if let Some(line) = text.lines.last() { + if !line.ends_with(b"\n") { + stdout.write_all(b"\n")?; + } + } + } + AnnotateOutput::Binary => { + stdout.write_all(&format_bytes!( + b"{}: binary file\n", + path.as_bytes() + ))?; + } + AnnotateOutput::NotFound => { + let short = changelog.node_from_rev(rev).short(); + return Err(CommandError::abort(format!( + "{path}: no such file in rev {short:x}", + ))); + } + } + } + stdout.flush()?; + + Ok(()) +} + +struct Formatter<'a> { + changelog: &'a Changelog, + encoder: &'a Encoder, + include: Include, + verbosity: Verbosity, + cache: FastHashMap<Revision, ChangesetData>, +} + +#[derive(Copy, Clone)] +struct Include { + user: bool, + number: bool, + changeset: bool, + date: bool, + file: bool, + line_number: bool, +} + +impl Include { + fn count(&self) -> usize { + // Rust guarantees false is 0 and true is 1. 
+ self.user as usize + + self.number as usize + + self.changeset as usize + + self.date as usize + + self.file as usize + + self.line_number as usize + } +} + +#[derive(Copy, Clone)] +enum Verbosity { + Quiet, + Default, + Verbose, +} + +#[derive(Default)] +struct ChangesetData { + user: Option<Vec<u8>>, + changeset: Option<Vec<u8>>, + date: Option<Vec<u8>>, +} + +impl ChangesetData { + fn create( + revision: Revision, + changelog: &Changelog, + include: Include, + verbosity: Verbosity, + ) -> Result<ChangesetData, CommandError> { + let mut result = ChangesetData::default(); + if !(include.user || include.changeset || include.date) { + return Ok(result); + } + let entry = changelog.entry(revision)?; + let data = entry.data()?; + if include.user { + let user = match verbosity { + Verbosity::Verbose => data.user(), + _ => hg::utils::strings::short_user(data.user()), + }; + result.user = Some(user.to_vec()); + } + if include.changeset { + let changeset = entry.as_revlog_entry().node().short(); + result.changeset = Some(format!("{:x}", changeset).into_bytes()); + } + if include.date { + let date = data.timestamp()?.format(match verbosity { + Verbosity::Quiet => "%Y-%m-%d", + _ => "%a %b %d %H:%M:%S %Y %z", + }); + result.date = Some(format!("{}", date).into_bytes()); + } + Ok(result) + } +} + +impl<'a> Formatter<'a> { + fn new( + changelog: &'a Changelog, + encoder: &'a Encoder, + include: Include, + verbosity: Verbosity, + ) -> Self { + let cache = FastHashMap::default(); + Self { + changelog, + encoder, + include, + verbosity, + cache, + } + } + + fn format( + &mut self, + annotations: Vec<ChangesetAnnotation>, + ) -> Result<Vec<Vec<u8>>, CommandError> { + let mut lines: Vec<Vec<Vec<u8>>> = + Vec::with_capacity(annotations.len()); + let num_fields = self.include.count(); + let mut widths = vec![0usize; num_fields]; + for annotation in annotations { + let revision = annotation.revision; + let data = match self.cache.entry(revision) { + Entry::Occupied(occupied) => occupied.into_mut(), + Entry::Vacant(vacant) => vacant.insert(ChangesetData::create( + revision, + self.changelog, + self.include, + self.verbosity, + )?), + }; + let mut fields = Vec::with_capacity(num_fields); + if let Some(user) = &data.user { + fields.push(user.clone()); + } + if self.include.number { + fields.push(format_bytes!(b"{}", revision)); + } + if let Some(changeset) = &data.changeset { + fields.push(changeset.clone()); + } + if let Some(date) = &data.date { + fields.push(date.clone()); + } + if self.include.file { + fields.push(annotation.path.into_vec()); + } + if self.include.line_number { + fields.push(format_bytes!(b"{}", annotation.line_number)); + } + for (field, width) in fields.iter().zip(widths.iter_mut()) { + *width = std::cmp::max( + *width, + self.encoder.column_width_bytes(field), + ); + } + lines.push(fields); + } + let total_width = widths.iter().sum::<usize>() + num_fields - 1; + Ok(lines + .iter() + .map(|fields| { + let mut bytes = Vec::with_capacity(total_width); + for (i, (field, width)) in + fields.iter().zip(widths.iter()).enumerate() + { + if i > 0 { + let colon = + self.include.line_number && i == num_fields - 1; + bytes.push(if colon { b':' } else { b' ' }); + } + let padding = + width - self.encoder.column_width_bytes(field); + bytes.resize(bytes.len() + padding, b' '); + bytes.extend_from_slice(field); + } + bytes + }) + .collect()) + } +}
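`Formatter::format` above aligns its output in two passes: a first pass records each column's maximum display width, a second pass right-pads every field to that width before joining. A simplified standalone sketch of the same scheme, using plain `str` lengths instead of the encoder-aware `column_width_bytes` and omitting the label and separator handling:

```
/// Right-align each column to its widest entry.
fn align(rows: &[Vec<String>]) -> Vec<String> {
    let cols = rows.first().map_or(0, Vec::len);
    let mut widths = vec![0usize; cols];
    // First pass: per-column maxima.
    for row in rows {
        for (width, field) in widths.iter_mut().zip(row) {
            *width = (*width).max(field.len());
        }
    }
    // Second pass: pad each field to its column width, then join.
    rows.iter()
        .map(|row| {
            row.iter()
                .zip(&widths)
                .map(|(field, &width)| format!("{field:>width$}"))
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect()
}

fn main() {
    let rows = vec![
        vec!["7".to_owned(), "alice".to_owned()],
        vec!["1234".to_owned(), "bob".to_owned()],
    ];
    for line in align(&rows) {
        println!("{line}");
    }
}
```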
--- a/rust/rhg/src/commands/config.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/src/commands/config.rs Fri Feb 28 23:28:10 2025 +0100 @@ -2,7 +2,7 @@ use clap::Arg; use format_bytes::format_bytes; use hg::errors::HgError; -use hg::utils::SliceExt; +use hg::utils::strings::SliceExt; pub const HELP_TEXT: &str = " With one argument of the form section.name, print just the value of that config item.
--- a/rust/rhg/src/commands/files.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/src/commands/files.rs Fri Feb 28 23:28:10 2025 +0100 @@ -26,7 +26,7 @@ Arg::new("rev") .help("search the repository as it is in REV") .short('r') - .long("revision") + .long("rev") .value_name("REV"), ) .arg(
--- a/rust/rhg/src/commands/status.rs	Fri Feb 28 23:25:42 2025 +0100
+++ b/rust/rhg/src/commands/status.rs	Fri Feb 28 23:28:10 2025 +0100
@@ -462,14 +462,18 @@
     let (narrow_matcher, narrow_warnings) = narrow::matcher(repo)?;
     let (sparse_matcher, sparse_warnings) = sparse::matcher(repo)?;
 
-    let matcher = match (repo.has_narrow(), repo.has_sparse()) {
-        (true, true) => {
-            Box::new(IntersectionMatcher::new(narrow_matcher, sparse_matcher))
-        }
-        (true, false) => narrow_matcher,
-        (false, true) => sparse_matcher,
-        (false, false) => Box::new(AlwaysMatcher),
-    };
+    // Sparse is only applicable for the working copy, not history.
+    let sparse_is_applicable = revpair.is_none() && change.is_none();
+    let matcher =
+        match (repo.has_narrow(), repo.has_sparse() && sparse_is_applicable) {
+            (true, true) => Box::new(IntersectionMatcher::new(
+                narrow_matcher,
+                sparse_matcher,
+            )),
+            (true, false) => narrow_matcher,
+            (false, true) => sparse_matcher,
+            (false, false) => Box::new(AlwaysMatcher),
+        };
     let matcher = match args.get_many::<std::ffi::OsString>("file") {
         None => matcher,
         Some(files) => {
@@ -687,8 +691,7 @@
         mut paths: Vec<StatusPath<'_>>,
     ) -> Result<(), CommandError> {
         paths.sort_unstable();
-        // TODO: get the stdout lock once for the whole loop
-        // instead of in each write
+        let mut stdout = self.ui.stdout_buffer();
         for StatusPath { path, copy_source } in paths {
             let relative_path;
             let relative_source;
@@ -702,25 +705,27 @@
             } else {
                 (path.as_bytes(), copy_source.as_ref().map(|s| s.as_bytes()))
             };
-            // TODO: Add a way to use `write_bytes!` instead of `format_bytes!`
-            // in order to stream to stdout instead of allocating an
-            // itermediate `Vec<u8>`.
+            // TODO: Add a way to use `write_bytes!` instead of
+            //       `format_bytes!` in order to stream to stdout
+            //       instead of allocating an intermediate
+            //       `Vec<u8>`.
             if !self.no_status {
-                self.ui.write_stdout_labelled(status_prefix, label)?
+                stdout.write_stdout_labelled(status_prefix, label)?
             }
             let linebreak = if self.print0 { b"\x00" } else { b"\n" };
-            self.ui.write_stdout_labelled(
+            stdout.write_stdout_labelled(
                 &format_bytes!(b"{}{}", path, linebreak),
                 label,
            )?;
             if let Some(source) = copy_source {
                 let label = "status.copied";
-                self.ui.write_stdout_labelled(
+                stdout.write_stdout_labelled(
                     &format_bytes!(b" {}{}", source, linebreak),
                     label,
                 )?
             }
         }
+        stdout.flush()?;
         Ok(())
     }
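The key change above is the `sparse_is_applicable` guard: sparse profiles describe the working copy only, so they must not filter a status run between two historical revisions, while narrow clones always apply. A self-contained sketch of that selection logic with toy matcher types; none of these are the real hg-core types:

```
trait Matcher {
    fn matches(&self, path: &str) -> bool;
}

struct Always;
impl Matcher for Always {
    fn matches(&self, _path: &str) -> bool {
        true
    }
}

struct Intersection(Box<dyn Matcher>, Box<dyn Matcher>);
impl Matcher for Intersection {
    fn matches(&self, path: &str) -> bool {
        self.0.matches(path) && self.1.matches(path)
    }
}

fn select_matcher(
    narrow: Option<Box<dyn Matcher>>,
    sparse: Option<Box<dyn Matcher>>,
    looking_at_history: bool,
) -> Box<dyn Matcher> {
    // Sparse only restricts the working copy, never history.
    let sparse = if looking_at_history { None } else { sparse };
    match (narrow, sparse) {
        (Some(n), Some(s)) => Box::new(Intersection(n, s)),
        (Some(n), None) => n,
        (None, Some(s)) => s,
        (None, None) => Box::new(Always),
    }
}

fn main() {
    // With --rev/--change, the sparse matcher is dropped entirely.
    let matcher = select_matcher(None, Some(Box::new(Always)), true);
    assert!(matcher.matches("any/path"));
}
```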
--- a/rust/rhg/src/error.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/src/error.rs Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,3 @@ -use crate::ui::utf8_to_local; use crate::ui::UiError; use crate::NoRepoInCwdError; use format_bytes::format_bytes; @@ -53,7 +52,7 @@ // TODO: bytes-based (instead of Unicode-based) formatting // of error messages to handle non-UTF-8 filenames etc: // https://www.mercurial-scm.org/wiki/EncodingStrategy#Mixing_output - message: utf8_to_local(message.as_ref()).into(), + message: message.as_ref().as_bytes().to_owned(), detailed_exit_code, hint: None, } @@ -65,9 +64,9 @@ hint: Option<impl AsRef<str>>, ) -> Self { CommandError::Abort { - message: utf8_to_local(message.as_ref()).into(), + message: message.as_ref().as_bytes().to_owned(), detailed_exit_code, - hint: hint.map(|h| utf8_to_local(h.as_ref()).into()), + hint: hint.map(|h| h.as_ref().as_bytes().to_owned()), } } @@ -86,7 +85,7 @@ pub fn unsupported(message: impl AsRef<str>) -> Self { CommandError::UnsupportedFeature { - message: utf8_to_local(message.as_ref()).into(), + message: message.as_ref().as_bytes().to_owned(), } } } @@ -286,6 +285,15 @@ exit_codes::ABORT, ) } + SparseConfigError::WhitespaceAtEdgeOfPattern(prefix) => { + Self::abort_with_exit_code_bytes( + format_bytes!( + b"narrow pattern with whitespace at the edge: {}", + &prefix + ), + exit_codes::ABORT, + ) + } SparseConfigError::IncludesInNarrow => Self::abort( "including other spec files using '%include' \ is not supported in narrowspec",
--- a/rust/rhg/src/main.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/src/main.rs Fri Feb 28 23:28:10 2025 +0100 @@ -1,12 +1,12 @@ extern crate log; use crate::error::CommandError; -use crate::ui::{local_to_utf8, Ui}; +use crate::ui::Ui; use clap::{command, Arg, ArgMatches}; use format_bytes::{format_bytes, join}; use hg::config::{Config, ConfigSource, PlainInfo}; use hg::repo::{Repo, RepoError}; use hg::utils::files::{get_bytes_from_os_str, get_path_from_bytes}; -use hg::utils::SliceExt; +use hg::utils::strings::SliceExt; use hg::{exit_codes, requirements}; use std::borrow::Cow; use std::collections::{HashMap, HashSet}; @@ -446,7 +446,7 @@ on_unsupported = OnUnsupported::Abort } else { log::debug!("falling back (see trace-level log)"); - log::trace!("{}", local_to_utf8(message)); + log::trace!("{}", String::from_utf8_lossy(message)); if let Err(err) = which::which(executable_path) { exit_no_fallback( ui, @@ -526,6 +526,7 @@ } mod commands { + pub mod annotate; pub mod cat; pub mod config; pub mod debugdata; @@ -609,6 +610,7 @@ fn subcommands() -> Subcommands { let subcommands = vec![ + subcommand!(annotate), subcommand!(cat), subcommand!(debugdata), subcommand!(debugrequirements),
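For context on the two one-line additions above: each rhg command is a module exposing `args()` (its clap definition) and `run()`, and the `subcommand!` macro pairs a module with those entry points in the dispatch table. A loose, self-contained sketch of that registration pattern; all names here are illustrative, not rhg's actual definitions:

```
use std::collections::HashMap;

mod annotate {
    // Stand-ins for the real `args() -> clap::Command` and
    // `run(&CliInvocation) -> Result<(), CommandError>` pair.
    pub fn name() -> &'static str {
        "annotate"
    }
    pub fn run(input: &str) -> Result<(), String> {
        println!("annotating {input}");
        Ok(())
    }
}

type RunFn = fn(&str) -> Result<(), String>;

// Pair a command module with its entry points, as `subcommand!` does.
macro_rules! subcommand {
    ($command:ident) => {
        ($command::name(), $command::run as RunFn)
    };
}

fn main() -> Result<(), String> {
    let subcommands: HashMap<&str, RunFn> =
        [subcommand!(annotate)].into_iter().collect();
    subcommands["annotate"]("some/file.rs")
}
```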
--- a/rust/rhg/src/ui.rs Fri Feb 28 23:25:42 2025 +0100 +++ b/rust/rhg/src/ui.rs Fri Feb 28 23:28:10 2025 +0100 @@ -5,20 +5,23 @@ use format_bytes::write_bytes; use hg::config::Config; use hg::config::PlainInfo; +use hg::encoding::Encoder; use hg::errors::HgError; use hg::filepatterns::PatternFileWarning; use hg::repo::Repo; use hg::sparse; use hg::utils::files::get_bytes_from_path; -use std::borrow::Cow; use std::io; +use std::io::BufWriter; use std::io::IsTerminal; +use std::io::StdoutLock; use std::io::{ErrorKind, Write}; pub struct Ui { stdout: std::io::Stdout, stderr: std::io::Stderr, colors: Option<ColorConfig>, + encoder: Encoder, } /// The kind of user interface error @@ -38,6 +41,7 @@ stderr: std::io::stderr(), colors: ColorConfig::new(config)?, + encoder: Encoder::from_env()?, }) } @@ -51,13 +55,17 @@ stderr: std::io::stderr(), colors: ColorConfig::new(config).unwrap_or(None), + encoder: Encoder::default(), } } /// Returns a buffered handle on stdout for faster batch printing /// operations. - pub fn stdout_buffer(&self) -> StdoutBuffer<std::io::StdoutLock> { - StdoutBuffer::new(self.stdout.lock()) + pub fn stdout_buffer(&self) -> StdoutBuffer<'_, BufWriter<StdoutLock>> { + StdoutBuffer { + stdout: BufWriter::new(self.stdout.lock()), + colors: &self.colors, + } } /// Write bytes to stdout @@ -78,12 +86,24 @@ stderr.flush().or_else(handle_stderr_error) } + pub fn encoder(&self) -> &Encoder { + &self.encoder + } +} + +/// A buffered stdout writer for faster batch printing operations. +pub struct StdoutBuffer<'a, W> { + colors: &'a Option<ColorConfig>, + stdout: W, +} + +impl<'a, W: Write> StdoutBuffer<'a, W> { /// Write bytes to stdout with the given label /// /// Like the optional `label` parameter in `mercurial/ui.py`, /// this label influences the color used for this output. pub fn write_stdout_labelled( - &self, + &mut self, bytes: &[u8], label: &str, ) -> Result<(), UiError> { @@ -96,15 +116,15 @@ } } } - self.write_stdout(bytes) + self.write_all(bytes) } fn write_stdout_with_effects( - &self, + &mut self, bytes: &[u8], effects: &[Effect], ) -> io::Result<()> { - let stdout = &mut self.stdout.lock(); + let stdout = &mut self.stdout; let mut write_line = |line: &[u8], first: bool| { // `line` does not include the newline delimiter if !first { @@ -130,7 +150,17 @@ write_line(line, false)? } } - stdout.flush() + Ok(()) + } + + /// Write bytes to stdout buffer + pub fn write_all(&mut self, bytes: &[u8]) -> Result<(), UiError> { + self.stdout.write_all(bytes).or_else(handle_stdout_error) + } + + /// Flush bytes to stdout + pub fn flush(&mut self) -> Result<(), UiError> { + self.stdout.flush().or_else(handle_stdout_error) } } @@ -144,28 +174,6 @@ } } -/// A buffered stdout writer for faster batch printing operations. -pub struct StdoutBuffer<W: Write> { - buf: io::BufWriter<W>, -} - -impl<W: Write> StdoutBuffer<W> { - pub fn new(writer: W) -> Self { - let buf = io::BufWriter::new(writer); - Self { buf } - } - - /// Write bytes to stdout buffer - pub fn write_all(&mut self, bytes: &[u8]) -> Result<(), UiError> { - self.buf.write_all(bytes).or_else(handle_stdout_error) - } - - /// Flush bytes to stdout - pub fn flush(&mut self) -> Result<(), UiError> { - self.buf.flush().or_else(handle_stdout_error) - } -} - /// Sometimes writing to stdout is not possible, try writing to stderr to /// signal that failure, otherwise just bail. 
fn handle_stdout_error(error: io::Error) -> Result<(), UiError> { @@ -197,19 +205,6 @@ Err(UiError::StdoutError(error)) } -/// Encode rust strings according to the user system. -pub fn utf8_to_local(s: &str) -> Cow<[u8]> { - // TODO encode for the user's system // - let bytes = s.as_bytes(); - Cow::Borrowed(bytes) -} - -/// Decode user system bytes to Rust string. -pub fn local_to_utf8(s: &[u8]) -> Cow<str> { - // TODO decode from the user's system - String::from_utf8_lossy(s) -} - /// Should formatted output be used? /// /// Note: rhg does not have the formatter mechanism yet,
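The `StdoutBuffer` rework above encapsulates a standard pattern: take the stdout lock once per batch, buffer the writes, and flush explicitly so write errors surface at a known point rather than in a destructor. The underlying std idiom, as a standalone sketch:

```
use std::io::{self, BufWriter, Write};

fn print_batch(lines: &[&[u8]]) -> io::Result<()> {
    // Lock and buffer stdout once for the whole batch; each write lands
    // in the buffer without re-acquiring the lock per line.
    let stdout = io::stdout();
    let mut out = BufWriter::new(stdout.lock());
    for line in lines {
        out.write_all(line)?;
        out.write_all(b"\n")?;
    }
    // Flush explicitly so errors surface here rather than in drop.
    out.flush()
}

fn main() -> io::Result<()> {
    print_batch(&[b"M file1".as_slice(), b"A file2".as_slice()])
}
```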
--- a/setup.py Fri Feb 28 23:25:42 2025 +0100 +++ b/setup.py Fri Feb 28 23:28:10 2025 +0100 @@ -11,7 +11,6 @@ import sys import sysconfig import tempfile -import time if not ssl.HAS_TLSv1_2: error = """ @@ -64,7 +63,6 @@ from setuptools.command.build_py import build_py from setuptools.command.install import install from setuptools.command.install_lib import install_lib -from setuptools.command.install_scripts import install_scripts from setuptools.errors import ( CCompilerError, @@ -80,6 +78,17 @@ from distutils.sysconfig import get_python_inc from distutils.ccompiler import new_compiler +# raise an explicit error if setuptools_scm is not importable +try: + import setuptools_scm +except ImportError: + raise SystemExit( + "Couldn't import setuptools_scm (direct call of setup.py?)." + ) +else: + del setuptools_scm + + ispypy = "PyPy" in sys.version @@ -219,84 +228,6 @@ return b'\n'.join(b' ' + e for e in err) -def findhg(): - """Try to figure out how we should invoke hg for examining the local - repository contents. - - Returns an hgcommand object.""" - # By default, prefer the "hg" command in the user's path. This was - # presumably the hg command that the user used to create this repository. - # - # This repository may require extensions or other settings that would not - # be enabled by running the hg script directly from this local repository. - hgenv = os.environ.copy() - # Use HGPLAIN to disable hgrc settings that would change output formatting, - # and disable localization for the same reasons. - hgenv['HGPLAIN'] = '1' - hgenv['LANGUAGE'] = 'C' - # PYTHONPATH and co can be used for isolated builds, which can break hg - hgenv.pop("PYTHONNOUSERSITE", None) - if "PYTHONPATH" in hgenv: - hgenv["PYTHONPATH"] = os.pathsep.join( - [ - path - for path in hgenv["PYTHONPATH"].split(os.pathsep) - if "-build-env-" not in path - ] - ) - hgcmd = ['hg'] - # Run a simple "hg log" command just to see if using hg from the user's - # path works and can successfully interact with this repository. Windows - # gives precedence to hg.exe in the current directory, so fall back to the - # python invocation of local hg, where pythonXY.dll can always be found. 
- check_cmd = ['log', '-r.', '-Ttest'] - attempts = [] - - def attempt(cmd, env): - try: - retcode, out, err = runcmd(hgcmd + check_cmd, hgenv) - res = (True, retcode, out, err) - if retcode == 0 and not filterhgerr(err): - return True - except OSError as e: - res = (False, e) - attempts.append((cmd, res)) - return False - - if os.name != 'nt' or not os.path.exists("hg.exe"): - if attempt(hgcmd + check_cmd, hgenv): - return hgcommand(hgcmd, hgenv) - - # Fall back to trying the local hg installation (pure python) - repo_hg = os.path.join(os.path.dirname(__file__), 'hg') - hgenv = localhgenv() - hgcmd = [sys.executable, repo_hg] - if attempt(hgcmd + check_cmd, hgenv): - return hgcommand(hgcmd, hgenv) - # Fall back to trying the local hg installation (whatever we can) - hgenv = localhgenv(pure_python=False) - hgcmd = [sys.executable, repo_hg] - if attempt(hgcmd + check_cmd, hgenv): - return hgcommand(hgcmd, hgenv) - - eprint("/!\\") - eprint(r"/!\ Unable to find a working hg binary") - eprint(r"/!\ Version cannot be extracted from the repository") - eprint(r"/!\ Re-run the setup once a first version is built") - eprint(r"/!\ Attempts:") - for i, e in enumerate(attempts): - eprint(r"/!\ attempt #%d:" % (i)) - eprint(r"/!\ cmd: ", e[0]) - res = e[1] - if res[0]: - eprint(r"/!\ return code:", res[1]) - eprint("/!\\ std output:\n%s" % (res[2].decode()), end="") - eprint("/!\\ std error:\n%s" % (res[3].decode()), end="") - else: - eprint(r"/!\ exception: ", res[1]) - return None - - def localhgenv(pure_python=True): """Get an environment dictionary to use for invoking or importing mercurial from the local repository.""" @@ -321,158 +252,6 @@ version = '' -def _try_get_version(): - hg = findhg() - if hg is None: - return '' - hgid = None - numerictags = [] - cmd = ['log', '-r', '.', '--template', '{tags}\n'] - pieces = sysstr(hg.run(cmd)).split() - numerictags = [t for t in pieces if t[0:1].isdigit()] - hgid = sysstr(hg.run(['id', '-i'])).strip() - if hgid.count('+') == 2: - hgid = hgid.replace("+", ".", 1) - if not hgid: - eprint("/!\\") - eprint(r"/!\ Unable to determine hg version from local repository") - eprint(r"/!\ Failed to retrieve current revision tags") - return '' - if numerictags: # tag(s) found - return _version(tag=numerictags[-1], dirty=hgid.endswith('+')) - else: # no tag found on the checked out revision - ltagcmd = ['log', '--rev', 'wdir()', '--template', '{latesttag}'] - ltag = sysstr(hg.run(ltagcmd)) - if not ltag: - eprint("/!\\") - eprint(r"/!\ Unable to determine hg version from local repository") - eprint( - r"/!\ Failed to retrieve current revision distance to lated tag" - ) - return '' - changessincecmd = [ - 'log', - '-T', - 'x\n', - '-r', - "only(parents(),'%s')" % ltag, - ] - changessince = len(hg.run(changessincecmd).splitlines()) - branch = hg.run(["branch"]).strip() - return _version( - tag=ltag, - branch=branch, - hgid=hgid.rstrip('+'), - changes_since=changessince, - dirty=hgid.endswith('+'), - ) - - -def _version( - tag: str, - branch: str = '', - hgid: str = '', - changes_since: int = 0, - dirty: bool = False, -): - """compute a version number from available information""" - version = tag - if changes_since > 0: - assert branch - if branch == b'stable': - post_nb = 0 - elif branch == b'default': - # we use 1 here to be greater than 0 to make sure change from - # default are considered newer than change on stable - post_nb = 1 - else: - # what is this branch ? probably a local variant ? 
- post_nb = 2 - - assert hgid - - # logic of the scheme - # - '.postX' to mark the version as "above" the tagged version - # X is 0 for stable, 1 for default, 2 for anything else - # - use '.devY' - # Y is the number of extra revision compared to the tag. So that - # revision with more change are "above" previous ones. - # - '+hg.NODEID.local.DATE' if there is any uncommitted changes. - version += '.post%d.dev%d+hg.%s' % (post_nb, changes_since, hgid) - if dirty: - version = version[:-1] + '.local.' + time.strftime('%Y%m%d') - # try to give warning early about bad version if possible - try: - from packaging.version import Version - - Version(version) - except ImportError: - pass - except ValueError as exc: - eprint(r"/!\ generated version is invalid") - eprint(r"/!\ error: %s" % exc) - return version - - -if os.path.isdir('.hg'): - version = _try_get_version() -elif os.path.exists('.hg_archival.txt'): - kw = dict( - [[t.strip() for t in l.split(':', 1)] for l in open('.hg_archival.txt')] - ) - if 'tag' in kw: - version = _version(tag=kw['tag']) - elif 'latesttag' in kw: - distance = int(kw.get('changessincelatesttag', kw['latesttagdistance'])) - version = _version( - tag=kw['latesttag'], - branch=kw['branch'], - changes_since=distance, - hgid=kw['node'][:12], - ) - else: - version = _version( - tag='0', - branch='unknown-source', - changes_since=1, - hgid=kw.get('node', 'unknownid')[:12], - dirty=True, - ) -elif os.path.exists('mercurial/__version__.py'): - with open('mercurial/__version__.py') as f: - data = f.read() - version = re.search('version = b"(.*)"', data).group(1) -if not version: - if os.environ.get("MERCURIAL_SETUP_MAKE_LOCAL") == "1": - version = "0.0+0" - eprint("/!\\") - eprint(r"/!\ Using '0.0+0' as the default version") - eprint(r"/!\ Re-run make local once that first version is built") - eprint("/!\\") - else: - eprint("/!\\") - eprint(r"/!\ Could not determine the Mercurial version") - eprint(r"/!\ You need to build a local version first") - eprint(r"/!\ Run `make local` and try again") - eprint("/!\\") - msg = "Run `make local` first to get a working local version" - raise SystemExit(msg) - -versionb = version -if not isinstance(versionb, bytes): - versionb = versionb.encode('ascii') - -write_if_changed( - 'mercurial/__version__.py', - b''.join( - [ - b'# this file is autogenerated by setup.py\n' - b'version = b"%s"\n' % versionb, - ] - ), -) - - class hgbuild(build): # Insert hgbuildmo first so that files in mercurial/locale/ are found # when build_py is run next. @@ -637,7 +416,7 @@ ) for rustext in ruststandalones: - rustext.build('' if self.inplace else self.build_lib) + rustext.build('' if self.editable_mode else self.build_lib) return build_ext.build_extensions(self) @@ -715,9 +494,6 @@ ) def run(self): - basepath = os.path.join(self.build_lib, 'mercurial') - self.mkpath(basepath) - rust = self.distribution.rust if self.distribution.pure: modulepolicy = 'py' @@ -733,6 +509,14 @@ b'modulepolicy = b"%s"\n' % modulepolicy.encode('ascii'), ] ) + + if self.editable_mode: + here = os.path.dirname(__file__) + basepath = os.path.join(here, 'mercurial') + else: + basepath = os.path.join(self.build_lib, 'mercurial') + self.mkpath(basepath) + write_if_changed(os.path.join(basepath, '__modulepolicy__.py'), content) build_py.run(self) @@ -1151,78 +935,6 @@ file_util.copy_file = realcopyfile -class hginstallscripts(install_scripts): - """ - This is a specialization of install_scripts that replaces the @LIBDIR@ with - the configured directory for modules. 
If possible, the path is made relative - to the directory for scripts. - """ - - def initialize_options(self): - install_scripts.initialize_options(self) - - self.install_lib = None - - def finalize_options(self): - install_scripts.finalize_options(self) - self.set_undefined_options('install', ('install_lib', 'install_lib')) - - def run(self): - install_scripts.run(self) - - # It only makes sense to replace @LIBDIR@ with the install path if - # the install path is known. For wheels, the logic below calculates - # the libdir to be "../..". This is because the internal layout of a - # wheel archive looks like: - # - # mercurial-3.6.1.data/scripts/hg - # mercurial/__init__.py - # - # When installing wheels, the subdirectories of the "<pkg>.data" - # directory are translated to system local paths and files therein - # are copied in place. The mercurial/* files are installed into the - # site-packages directory. However, the site-packages directory - # isn't known until wheel install time. This means we have no clue - # at wheel generation time what the installed site-packages directory - # will be. And, wheels don't appear to provide the ability to register - # custom code to run during wheel installation. This all means that - # we can't reliably set the libdir in wheels: the default behavior - # of looking in sys.path must do. - - if ( - os.path.splitdrive(self.install_dir)[0] - != os.path.splitdrive(self.install_lib)[0] - ): - # can't make relative paths from one drive to another, so use an - # absolute path instead - libdir = self.install_lib - else: - libdir = os.path.relpath(self.install_lib, self.install_dir) - - for outfile in self.outfiles: - with open(outfile, 'rb') as fp: - data = fp.read() - - # skip binary files - if b'\0' in data: - continue - - # During local installs, the shebang will be rewritten to the final - # install path. During wheel packaging, the shebang has a special - # value. - if data.startswith(b'#!python'): - logging.info( - 'not rewriting @LIBDIR@ in %s because install path ' - 'not known', - outfile, - ) - continue - - data = data.replace(b'@LIBDIR@', libdir.encode('unicode_escape')) - with open(outfile, 'wb') as fp: - fp.write(data) - - class hginstallcompletion(Command): description = 'Install shell completion' @@ -1255,87 +967,6 @@ self.copy_file(os.path.join('contrib', src), dest) -# virtualenv installs custom distutils/__init__.py and -# distutils/distutils.cfg files which essentially proxy back to the -# "real" distutils in the main Python install. The presence of this -# directory causes py2exe to pick up the "hacked" distutils package -# from the virtualenv and "import distutils" will fail from the py2exe -# build because the "real" distutils files can't be located. -# -# We work around this by monkeypatching the py2exe code finding Python -# modules to replace the found virtualenv distutils modules with the -# original versions via filesystem scanning. This is a bit hacky. But -# it allows us to use virtualenvs for py2exe packaging, which is more -# deterministic and reproducible. -# -# It's worth noting that the common StackOverflow suggestions for this -# problem involve copying the original distutils files into the -# virtualenv or into the staging directory after setup() is invoked. -# The former is very brittle and can easily break setup(). Our hacking -# of the found modules routine has a similar result as copying the files -# manually. But it makes fewer assumptions about how py2exe works and -# is less brittle. 
-
-# This only catches virtualenvs made with virtualenv (as opposed to
-# venv, which is likely what Python 3 uses).
-py2exehacked = py2exeloaded and getattr(sys, 'real_prefix', None) is not None
-
-if py2exehacked:
-    from distutils.command.py2exe import py2exe as buildpy2exe
-    from py2exe.mf import Module as py2exemodule
-
-    class hgbuildpy2exe(buildpy2exe):
-        def find_needed_modules(self, mf, files, modules):
-            res = buildpy2exe.find_needed_modules(self, mf, files, modules)
-
-            # Replace virtualenv's distutils modules with the real ones.
-            modules = {}
-            for k, v in res.modules.items():
-                if k != 'distutils' and not k.startswith('distutils.'):
-                    modules[k] = v
-
-            res.modules = modules
-
-            import opcode
-
-            distutilsreal = os.path.join(
-                os.path.dirname(opcode.__file__), 'distutils'
-            )
-
-            for root, dirs, files in os.walk(distutilsreal):
-                for f in sorted(files):
-                    if not f.endswith('.py'):
-                        continue
-
-                    full = os.path.join(root, f)
-
-                    parents = ['distutils']
-
-                    if root != distutilsreal:
-                        rel = os.path.relpath(root, distutilsreal)
-                        parents.extend(p for p in rel.split(os.sep))
-
-                    modname = '%s.%s' % ('.'.join(parents), f[:-3])
-
-                    if modname.startswith('distutils.tests.'):
-                        continue
-
-                    if modname.endswith('.__init__'):
-                        modname = modname[: -len('.__init__')]
-                        path = os.path.dirname(full)
-                    else:
-                        path = None
-
-                    res.modules[modname] = py2exemodule(
-                        modname, full, path=path
-                    )
-
-            if 'distutils' not in res.modules:
-                raise SystemExit('could not find distutils modules')
-
-            return res
-
-
 cmdclass = {
     'build': hgbuild,
     'build_doc': hgbuilddoc,
@@ -1347,13 +978,9 @@
     'install': hginstall,
     'install_completion': hginstallcompletion,
     'install_lib': hginstalllib,
-    'install_scripts': hginstallscripts,
     'build_hgexe': buildhgexe,
 }
 
-if py2exehacked:
-    cmdclass['py2exe'] = hgbuildpy2exe
-
 packages = [
     'mercurial',
     'mercurial.admin',
@@ -1495,6 +1122,9 @@
         if os.path.exists(cargo_lock):
             self.depends.append(cargo_lock)
         for dirpath, subdir, fnames in os.walk(os.path.join(srcdir, 'src')):
+            if dirpath == os.path.join(srcdir, "target"):
+                # Skip this large artifacts tree
+                continue
             self.depends.extend(
                 os.path.join(dirpath, fname)
                 for fname in fnames
@@ -1538,6 +1168,7 @@
         env['PYTHON_SYS_EXECUTABLE'] = sys.executable
         env['PYTHONEXECUTABLE'] = sys.executable
         env['PYTHON'] = sys.executable
+        env['PYO3_PYTHON'] = sys.executable
 
         cargocmd = ['cargo', 'rustc', '--release']
 
@@ -1730,13 +1361,6 @@
             packagename = curdir.replace(os.sep, '.')
             packagedata[packagename] = list(filter(ordinarypath, files))
 
-# distutils expects version to be str/unicode. Converting it to
-# unicode on Python 2 still works because it won't contain any
-# non-ascii bytes and will be implicitly converted back to bytes
-# when operated on.
-assert isinstance(version, str)
-setupversion = version
-
 extra = {}
 
 py2exepackages = [
@@ -1794,7 +1418,6 @@
     hgbuild.sub_commands.insert(0, ('build_hgextindex', None))
 
 setup(
-    version=setupversion,
     long_description=(
         'Mercurial is a distributed SCM tool written in Python.'
         ' It is used by a number of large projects that require'
--- a/tests/dumbhttp.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/dumbhttp.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """
--- a/tests/dummysmtpd.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/dummysmtpd.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """dummy SMTP server for use in tests"""
--- a/tests/filterpyflakes.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/filterpyflakes.py Fri Feb 28 23:28:10 2025 +0100 @@ -15,6 +15,7 @@ # for cffi, allow re-exports from pure.* r"cffi/[^:]*:.*\bimport \*' used", r"cffi/[^:]*:.*\*' imported but unused", + r"mercurial/interfaces/types.py:.+' imported but unused", ] keep = True
--- a/tests/get-with-headers.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/get-with-headers.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """This does HTTP requests (GET by default) given a host:port and path and returns a subset of the headers plus the body of the result."""
--- a/tests/hghave Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/hghave Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """Test the running system for features availability. Exit with zero if all features are there, non-zero otherwise. If a feature name is prefixed with "no-", the absence of feature is tested.
--- a/tests/hghave.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/hghave.py Fri Feb 28 23:28:10 2025 +0100 @@ -141,7 +141,7 @@ return env -def matchoutput(cmd, regexp, ignorestatus=False): +def matchoutput(cmd, regexp: bytes, ignorestatus: bool = False): """Return the match object if cmd executes successfully and its output is matched by the supplied regular expression. """ @@ -175,7 +175,7 @@ import breezy.revision import breezy.revisionspec - breezy.revisionspec.RevisionSpec + del breezy.revisionspec if breezy.__doc__ is None or breezy.version_info[:2] < (3, 1): return False except (AttributeError, ImportError): @@ -340,7 +340,7 @@ try: import _lsprof - _lsprof.Profiler # silence unused import warning + del _lsprof # silence unused import warning return True except ImportError: return False @@ -349,8 +349,8 @@ def _gethgversion(): m = matchoutput('hg --version --quiet 2>&1', br'(\d+)\.(\d+)') if not m: - return (0, 0) - return (int(m.group(1)), int(m.group(2))) + return 0, 0 + return int(m.group(1)), int(m.group(2)) _hgversion = None @@ -386,21 +386,23 @@ def has_hg08(): if checks["hg09"][0](): return True - return matchoutput('hg help annotate 2>&1', '--date') + return matchoutput('hg help annotate 2>&1', b'--date') @check("hg07", "Mercurial >= 0.7") def has_hg07(): if checks["hg08"][0](): return True - return matchoutput('hg --version --quiet 2>&1', 'Mercurial Distributed SCM') + return matchoutput( + 'hg --version --quiet 2>&1', b'Mercurial Distributed SCM' + ) @check("hg06", "Mercurial >= 0.6") def has_hg06(): if checks["hg07"][0](): return True - return matchoutput('hg --version --quiet 2>&1', 'Mercurial version') + return matchoutput('hg --version --quiet 2>&1', b'Mercurial version') @check("gettext", "GNU Gettext (msgfmt)") @@ -416,8 +418,8 @@ def getgitversion(): m = matchoutput('git --version 2>&1', br'git version (\d+)\.(\d+)') if not m: - return (0, 0) - return (int(m.group(1)), int(m.group(2))) + return 0, 0 + return int(m.group(1)), int(m.group(2)) @check("pygit2", "pygit2 Python library") @@ -425,7 +427,7 @@ try: import pygit2 - pygit2.Oid # silence unused import + del pygit2 # silence unused import return True except ImportError: return False @@ -454,7 +456,7 @@ try: import docutils.core - docutils.core.publish_cmdline # silence unused import + del docutils.core # silence unused import return True except ImportError: return False @@ -463,8 +465,8 @@ def getsvnversion(): m = matchoutput('svn --version --quiet 2>&1', br'^(\d+)\.(\d+)') if not m: - return (0, 0) - return (int(m.group(1)), int(m.group(2))) + return 0, 0 + return int(m.group(1)), int(m.group(2)) @checkvers("svn", "subversion client and admin tools >= %s", ('1.3', '1.5')) @@ -646,7 +648,7 @@ try: import pygments - pygments.highlight # silence unused import warning + del pygments # silence unused import warning return True except ImportError: return False @@ -659,9 +661,9 @@ v = pygments.__version__ parts = v.split(".") - return (int(parts[0]), int(parts[1])) + return int(parts[0]), int(parts[1]) except ImportError: - return (0, 0) + return 0, 0 @checkvers("pygments", "Pygments version >= %s", ('2.5', '2.11', '2.14')) @@ -681,7 +683,7 @@ try: import ssl - ssl.CERT_NONE + del ssl return True except ImportError: return False @@ -767,7 +769,7 @@ try: import curses - curses.COLOR_BLUE + del curses # Windows doesn't have a `tic` executable, but the windows_curses # package is sufficient to run the tests without it. 
@@ -811,12 +813,12 @@ def has_osxpackaging(): try: return ( - matchoutput('pkgbuild', br'Usage: pkgbuild ', ignorestatus=1) + matchoutput('pkgbuild', br'Usage: pkgbuild ', ignorestatus=True) and matchoutput( - 'productbuild', br'Usage: productbuild ', ignorestatus=1 + 'productbuild', br'Usage: productbuild ', ignorestatus=True ) - and matchoutput('lsbom', br'Usage: lsbom', ignorestatus=1) - and matchoutput('xar --help', br'Usage: xar', ignorestatus=1) + and matchoutput('lsbom', br'Usage: lsbom', ignorestatus=True) + and matchoutput('xar --help', br'Usage: xar', ignorestatus=True) ) except ImportError: return False @@ -929,7 +931,7 @@ try: import hypothesis - hypothesis.given + del hypothesis return True except ImportError: return False @@ -945,7 +947,7 @@ try: import mercurial.zstd - mercurial.zstd.__version__ + del mercurial.zstd return True except ImportError: return False @@ -961,7 +963,7 @@ try: import ensurepip - ensurepip.bootstrap + del ensurepip return True except ImportError: return False @@ -990,7 +992,7 @@ try: import fuzzywuzzy - fuzzywuzzy.__version__ + del fuzzywuzzy return True except ImportError: return False @@ -1026,65 +1028,6 @@ return 'HGTESTEXTRAEXTENSIONS' in os.environ -def getrepofeatures(): - """Obtain set of repository features in use. - - HGREPOFEATURES can be used to define or remove features. It contains - a space-delimited list of feature strings. Strings beginning with ``-`` - mean to remove. - """ - # Default list provided by core. - features = { - 'bundlerepo', - 'revlogstore', - 'fncache', - } - - # Features that imply other features. - implies = { - 'simplestore': ['-revlogstore', '-bundlerepo', '-fncache'], - } - - for override in os.environ.get('HGREPOFEATURES', '').split(' '): - if not override: - continue - - if override.startswith('-'): - if override[1:] in features: - features.remove(override[1:]) - else: - features.add(override) - - for imply in implies.get(override, []): - if imply.startswith('-'): - if imply[1:] in features: - features.remove(imply[1:]) - else: - features.add(imply) - - return features - - -@check('reporevlogstore', 'repository using the default revlog store') -def has_reporevlogstore(): - return 'revlogstore' in getrepofeatures() - - -@check('reposimplestore', 'repository using simple storage extension') -def has_reposimplestore(): - return 'simplestore' in getrepofeatures() - - -@check('repobundlerepo', 'whether we can open bundle files as repos') -def has_repobundlerepo(): - return 'bundlerepo' in getrepofeatures() - - -@check('repofncache', 'repository has an fncache') -def has_repofncache(): - return 'fncache' in getrepofeatures() - - @check('dirstate-v2', 'using the v2 format of .hg/dirstate') def has_dirstate_v2(): # Keep this logic in sync with `newreporequirements()` in `mercurial/localrepo.py` @@ -1114,7 +1057,7 @@ try: import vcr - vcr.VCR + del vcr return True except (ImportError, AttributeError): pass @@ -1184,7 +1127,7 @@ try: import _lzma - _lzma.FORMAT_XZ + del _lzma return True except ImportError: return False
--- a/tests/run-tests.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/run-tests.py Fri Feb 28 23:28:10 2025 +0100 @@ -54,6 +54,7 @@ import json import multiprocessing import os +import pathlib import platform import queue import random @@ -72,7 +73,6 @@ import uuid import xml.dom.minidom as minidom - # Don't compare sys.version_info directly, to prevent pyupgrade from dropping # the conditional. sys_version_info = sys.version_info @@ -760,8 +760,25 @@ ) path_local_hg = os.path.join(reporootdir, venv_local, BINDIRNAME, b"hg") if not os.path.exists(path_local_hg): - # no local environment but we can still use ./hg to please test-run-tests.t - path_local_hg = os.path.join(reporootdir, b"hg") + if "HGTEST_REAL_HG" in os.environ: + # this file is run from a test (typically test-run-tests.t) + # no local environment but we can still use ./hg to please test-run-tests.t + path_local_hg = os.path.join(reporootdir, b"hg") + else: + message = ( + f"run-tests.py called with --local but {_bytes2sys(venv_local)} does not exist.\n" + f'To create it, run \nmake local PYTHON="{sys.executable}"' + ) + paths_venv = sorted( + pathlib.Path(_bytes2sys(reporootdir)).glob(".venv_*") + ) + if paths_venv: + message += ( + "\nAlternatively, call run-tests.py with a Python " + f"corresponding to {[p.name for p in paths_venv]}." + ) + print(message, file=sys.stderr) + sys.exit(1) pathandattrs = [(path_local_hg, 'with_hg')] if options.chg: @@ -3291,10 +3308,9 @@ self._pythondir = get_site_packages_dir(python_exe) except (FileNotFoundError, subprocess.CalledProcessError): self._pythondir = self._bindir - elif self.options.local: - assert WINDOWS - python_exe = os.path.join(self._bindir, b"python.exe") - self._pythondir = get_site_packages_dir(python_exe) + if self.options.local: + self._python = _bytes2sys(python_exe) + # If it looks like our in-repo Rust binary, use the source root. # This is a bit hacky. But rhg is still not supported outside the # source directory. So until it is, do the simple thing. @@ -3425,6 +3441,7 @@ if self.options.pure: os.environ["HGTEST_RUN_TESTS_PURE"] = "--pure" os.environ["HGMODULEPOLICY"] = "py" + os.environ.pop("HGWITHRUSTEXT", None) if self.options.rust: os.environ["HGMODULEPOLICY"] = "rust+c" if self.options.no_rust:
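The new `--local` error path above globs for `.venv_*` directories before telling the user how to create one with `make local`. A simplified sketch of that discovery step (`candidate_venvs` is a hypothetical helper name, not run-tests.py's API):

import pathlib
import sys


def candidate_venvs(repo_root: str = ".") -> list:
    """Return the names of .venv_* environments under the repo root."""
    return sorted(p.name for p in pathlib.Path(repo_root).glob(".venv_*"))


if __name__ == "__main__":
    names = candidate_venvs()
    if not names:
        print(f'no test venv found; run: make local PYTHON="{sys.executable}"',
              file=sys.stderr)
    else:
        print("usable environments:", names)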
--- a/tests/test-annotate.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-annotate.t Fri Feb 28 23:28:10 2025 +0100 @@ -1066,9 +1066,9 @@ adding a $ sed 's/EOL$//g' > a <<EOF > a a - > > EOL > b b + > > EOF $ hg ci -m "changea" @@ -1076,33 +1076,41 @@ $ hg annotate a 1: a a - 0: 1: 1: b b + 0: Annotate with --ignore-space-change $ hg annotate --ignore-space-change a 1: a a - 1: 0: 0: b b + 1: Annotate with --ignore-all-space $ hg annotate --ignore-all-space a 0: a a - 0: - 1: + 0: 0: b b + 1: Annotate with --ignore-blank-lines (similar to no options case) $ hg annotate --ignore-blank-lines a 1: a a - 0: 1: 1: b b + 0: + +Annotate with --ignore-space-at-eol + + $ hg annotate --ignore-space-at-eol a + 1: a a + 0: + 1: b b + 1: $ cd .. @@ -1217,6 +1225,58 @@ $ cd .. +Annotate should use the starting revision (-r) as base for ancestor checks. + + $ hg init repo-base + $ cd repo-base + $ echo A > file + $ hg commit -Am "initial" + adding file + $ echo B >> file + $ hg commit -m "linkrev" + $ hg up 0 + 1 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ echo B >> file + $ hg ci -m "linkrev alias" + created new head + $ echo C >> file + $ hg commit -m "change" + $ hg merge 1 --tool :local + 0 files updated, 1 files merged, 0 files removed, 0 files unresolved + (branch merge, don't forget to commit) + $ hg commit -m "merge" + $ hg debugindex file + rev linkrev nodeid p1-nodeid p2-nodeid + 0 0 45f17b21388f 000000000000 000000000000 + 1 1 e338fefefb89 45f17b21388f 000000000000 + 2 3 b2f3b2eded93 e338fefefb89 000000000000 + $ hg log -G --template '{rev}: {desc}' + @ 4: merge + |\ + | o 3: change + | | + | o 2: linkrev alias + | | + o | 1: linkrev + |/ + o 0: initial + +Line B should be attributed to the linkrev 1, because we base ancestor checks +from 4 (starting revision), not from 3 (most recent change to the file). + $ hg annotate file + 0: A + 1: B + 3: C + $ echo D >> file + $ hg commit -m "another change" + $ hg annotate file + 0: A + 1: B + 3: C + 5: D + + $ cd .. + Issue5360: Deleted chunk in p1 of a merge changeset $ hg init repo-5360
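The reordered annotate cases differ only in how whitespace is folded before lines are matched across revisions. A rough sketch of the normalization each flag implies, with semantics inferred from the test output above (hg's real matching happens inside its diff machinery, not via per-line keys like this):

import re


def normalize(line: str, mode: str) -> str:
    """Key used to pair up lines when annotating under a whitespace flag."""
    if mode == "ignore-all-space":
        return re.sub(r"[ \t]+", "", line)            # drop all blanks
    if mode == "ignore-space-change":
        return re.sub(r"[ \t]+", " ", line).rstrip()  # fold runs, trim end
    if mode == "ignore-space-at-eol":
        return line.rstrip()                          # trim trailing blanks
    return line


for mode in ("none", "ignore-space-at-eol", "ignore-space-change",
             "ignore-all-space"):
    print(mode, repr(normalize("b  b \t", mode)))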
--- a/tests/test-audit-path.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-audit-path.t Fri Feb 28 23:28:10 2025 +0100
@@ -1,6 +1,5 @@
 The simple store doesn't escape paths robustly and can't store paths with
 periods, etc. So much of this test fails with it.
-#require no-reposimplestore
 
   $ hg init repo
   $ cd repo
--- a/tests/test-bundle.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-bundle.t Fri Feb 28 23:28:10 2025 +0100 @@ -67,8 +67,6 @@ [1] $ hg -R empty verify -q -#if repobundlerepo - Pull full.hg into test (using --cwd) $ hg --cwd test pull ../full.hg @@ -271,8 +269,6 @@ (run 'hg heads' to see heads, 'hg merge' to merge) -#endif - Cannot produce streaming clone bundles with "hg bundle" $ hg -R test bundle -t packed1 packed.hg @@ -283,7 +279,7 @@ packed1 is produced properly -#if reporevlogstore rust +#if rust $ hg -R test debugcreatestreamclonebundle packed.hg writing 2665 bytes for 6 files (no-rust !) @@ -303,7 +299,7 @@ none-packed1;requirements%3Dgeneraldelta%2Crevlog-compression-zstd%2Crevlogv1%2Csparserevlog #endif -#if reporevlogstore no-rust zstd +#if no-rust zstd $ hg -R test debugcreatestreamclonebundle packed.hg writing 2665 bytes for 7 files @@ -319,7 +315,7 @@ none-packed1;requirements%3Dgeneraldelta%2Crevlog-compression-zstd%2Crevlogv1%2Csparserevlog #endif -#if reporevlogstore no-rust no-zstd +#if no-rust no-zstd $ hg -R test debugcreatestreamclonebundle packed.hg writing 2664 bytes for 7 files @@ -335,8 +331,6 @@ none-packed1;requirements%3Dgeneraldelta%2Crevlogv1%2Csparserevlog #endif -#if reporevlogstore - generaldelta requirement is not listed in stream clone bundles unless used $ hg --config format.usegeneraldelta=false init testnongd @@ -345,9 +339,7 @@ $ hg -q commit -A -m initial $ cd .. -#endif - -#if reporevlogstore rust +#if rust $ hg -R testnongd debugcreatestreamclonebundle packednongd.hg writing 301 bytes for 3 files (no-rust !) @@ -369,7 +361,7 @@ #endif -#if reporevlogstore no-rust zstd +#if no-rust zstd $ hg -R testnongd debugcreatestreamclonebundle packednongd.hg writing 301 bytes for 4 files @@ -388,7 +380,7 @@ #endif -#if reporevlogstore no-rust no-zstd +#if no-rust no-zstd $ hg -R testnongd debugcreatestreamclonebundle packednongd.hg writing 301 bytes for 4 files @@ -407,8 +399,6 @@ #endif -#if reporevlogstore - Warning emitted when packed bundles contain secret changesets $ hg init testsecret @@ -418,9 +408,7 @@ $ hg phase --force --secret -r . $ cd .. -#endif - -#if reporevlogstore rust +#if rust $ hg -R testsecret debugcreatestreamclonebundle packedsecret.hg (warning: stream clone bundle will contain secret revisions) @@ -430,7 +418,7 @@ #endif -#if reporevlogstore no-rust zstd +#if no-rust zstd $ hg -R testsecret debugcreatestreamclonebundle packedsecret.hg (warning: stream clone bundle will contain secret revisions) @@ -439,7 +427,7 @@ #endif -#if reporevlogstore no-rust no-zstd +#if no-rust no-zstd $ hg -R testsecret debugcreatestreamclonebundle packedsecret.hg (warning: stream clone bundle will contain secret revisions) @@ -448,8 +436,6 @@ #endif -#if reporevlogstore - Unpacking packed1 bundles with "hg unbundle" isn't allowed $ hg init packed @@ -492,8 +478,8 @@ 9 files to transfer, 2.85 KB of data (rust !) pretxnopen: 000000000000 pretxnclose: aa35859c02ea - transferred 2.60 KB in * seconds (* */sec) (glob) (no-rust !) - transferred 2.85 KB in * seconds (* */sec) (glob) (rust !) + stream-cloned 7 files / 2.60 KB in * seconds (* */sec) (glob) (no-rust !) + stream-cloned 9 files / 2.85 KB in * seconds (* */sec) (glob) (rust !) 
txnclose: aa35859c02ea (for safety, confirm visibility of streamclone-ed changes by another @@ -513,8 +499,6 @@ abort: cannot apply stream clone bundle on non-empty repo [255] -#endif - Create partial clones $ rm -r empty @@ -532,8 +516,6 @@ 1 files updated, 0 files merged, 0 files removed, 0 files unresolved $ cd partial -#if repobundlerepo - Log -R full.hg in partial $ hg -R bundle://../full.hg log -T phases @@ -669,8 +651,6 @@ abort: *../does-not-exist.hg* (glob) [255] -#endif - $ cd .. hide outer repo @@ -678,8 +658,6 @@ Direct clone from bundle (all-history) -#if repobundlerepo - $ hg clone full.hg full-clone requesting all changes adding changesets @@ -761,8 +739,6 @@ $ cd .. -#endif - test for 540d1059c802 $ hg init orig @@ -785,7 +761,6 @@ $ cd .. -#if repobundlerepo $ cd orig $ hg incoming ../bundle.hg comparing with ../bundle.hg @@ -815,8 +790,6 @@ [255] $ cd .. -#endif - test to bundle revisions on the newly created branch (issue3828): $ hg -q clone -U test test-clone @@ -827,10 +800,8 @@ $ hg -q outgoing ../test-clone 9:b4f5acb1ee27 $ hg -q bundle --branch foo foo.hg ../test-clone -#if repobundlerepo $ hg -R foo.hg -q log -r "bundle()" 9:b4f5acb1ee27 -#endif $ cd .. @@ -846,17 +817,14 @@ full history bundle, refuses to verify non-local repo -#if repobundlerepo $ hg -R all.hg verify abort: cannot verify bundle or remote repos [255] -#endif but, regular verify must continue to work $ hg -R orig verify -q -#if repobundlerepo diff against bundle $ hg init b @@ -871,7 +839,6 @@ -2 -3 $ cd .. -#endif bundle single branch @@ -930,13 +897,11 @@ files: x 3/3 files (100.00%) bundle2-output-part: "cache:rev-branch-cache" (advisory) streamed payload -#if repobundlerepo == Test for issue3441 $ hg clone -q -r0 . part2 $ hg -q -R part2 pull bundle.hg $ hg -R part2 verify -q -#endif == Test bundling no commits @@ -996,8 +961,6 @@ date: Thu Jan 01 00:00:00 1970 +0000 summary: 0 - -#if repobundlerepo $ hg bundle --base 1 -r 3 ../update2bundled.hg 1 changesets found $ hg strip -r 3 @@ -1019,7 +982,6 @@ $ hg update -R ../update2bundled.hg -r 0 0 files updated, 0 files merged, 2 files removed, 0 files unresolved -#endif Test the option that create slim bundle
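Throughout this file the expected output changes from `transferred SIZE in TIME` to `stream-cloned N files / SIZE in TIME`, i.e. the file count moves into the final summary line. A toy formatter reproducing the shape of the new message (rounding is approximate; hg's own byte formatting may differ):

def stream_cloned_line(files: int, nbytes: float, seconds: float) -> str:
    """Build a 'stream-cloned N files / SIZE in T seconds (RATE/sec)' line."""
    def fmt(n: float) -> str:
        return "%d bytes" % n if n < 1024 else "%.2f KB" % (n / 1024)

    rate = nbytes / max(seconds, 1e-9)
    return "stream-cloned %d files / %s in %.1f seconds (%s/sec)" % (
        files, fmt(nbytes), seconds, fmt(rate))


print(stream_cloned_line(7, 2662, 0.5))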
--- a/tests/test-censor.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-censor.t Fri Feb 28 23:28:10 2025 +0100
@@ -1,4 +1,3 @@
-#require no-reposimplestore
 #testcases revlogv1 revlogv2
 
 #if revlogv2
--- a/tests/test-check-code.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-check-code.t Fri Feb 28 23:28:10 2025 +0100
@@ -57,14 +57,16 @@
   .arcconfig
   .clang-format
   .editorconfig
+  .flake8
   .gitattributes
   .hgignore
   .hgsigs
   .hgtags
   .jshintrc
-  CONTRIBUTING
+  CONTRIBUTING.md
   CONTRIBUTORS
   COPYING
+  MANIFEST.in
   Makefile
   README.rst
   hg
--- a/tests/test-clone-stream-format.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-clone-stream-format.t Fri Feb 28 23:28:10 2025 +0100
@@ -1,6 +1,6 @@
 This file contains test cases that deal with format changes across stream clones
 
-#require serve no-reposimplestore no-chg
+#require serve no-chg
 
 #testcases stream-legacy stream-bundle2
--- a/tests/test-clone-stream-revlog-split.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-clone-stream-revlog-split.t Fri Feb 28 23:28:10 2025 +0100
@@ -124,12 +124,14 @@
   adding [c] rbc-names-v2 (7 bytes)
   adding [c] rbc-revs-v2 (24 bytes)
   updating the branch cache
-  transferred 2.11 KB in * seconds (* */sec) (glob) (no-rust !)
-  transferred 2.29 KB in * seconds (* */sec) (glob) (rust !)
+  stream-cloned 9 files / 2.11 KB in * seconds (* */sec) (glob) (no-rust stream-bundle2-v3 !)
+  stream-cloned 11 files / 2.29 KB in * seconds (* */sec) (glob) (rust stream-bundle2-v3 !)
   bundle2-input-part: total payload size 2285 (stream-bundle2-v2 no-rust !)
   bundle2-input-part: total payload size 2518 (stream-bundle2-v2 rust !)
   bundle2-input-part: total payload size 2313 (stream-bundle2-v3 no-rust !)
   bundle2-input-part: total payload size 2546 (stream-bundle2-v3 rust !)
+  stream-cloned 8 files / 2.11 KB in * seconds (* */sec) (glob) (no-rust stream-bundle2-v2 !)
+  stream-cloned 10 files / 2.29 KB in * seconds (* */sec) (glob) (rust stream-bundle2-v2 !)
   bundle2-input-part: "listkeys" (params: 1 mandatory) supported
   bundle2-input-bundle: 2 parts total
   checking for updated bookmarks
--- a/tests/test-clone-stream.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-clone-stream.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require serve no-reposimplestore no-chg +#require serve no-chg #testcases stream-legacy stream-bundle2-v2 stream-bundle2-v3 @@ -191,9 +191,9 @@ $ hg clone --stream -U http://localhost:$HGPORT clone1 streaming all changes 1091 files to transfer, 102 KB of data (no-zstd !) - transferred 102 KB in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 1091 files / 102 KB in * seconds (* */sec) (glob) (no-zstd !) 1091 files to transfer, 98.8 KB of data (zstd !) - transferred 98.8 KB in * seconds (* */sec) (glob) (zstd !) + stream-cloned 1091 files / 98.8 KB in * seconds (* */sec) (glob) (zstd !) searching for changes no changes found #endif @@ -201,20 +201,20 @@ $ hg clone --stream -U http://localhost:$HGPORT clone1 streaming all changes 1094 files to transfer, 102 KB of data (no-zstd !) - transferred 102 KB in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 1094 files / 102 KB in * seconds (* */sec) (glob) (no-zstd !) 1094 files to transfer, 98.9 KB of data (zstd no-rust !) - transferred 98.9 KB in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 1094 files / 98.9 KB in * seconds (* */sec) (glob) (zstd no-rust !) 1096 files to transfer, 99.0 KB of data (zstd rust !) - transferred 99.0 KB in * seconds (* */sec) (glob) (zstd rust !) + stream-cloned 1096 files / 99.0 KB in * seconds (* */sec) (glob) (zstd rust !) #endif #if stream-bundle2-v3 $ hg clone --stream -U http://localhost:$HGPORT clone1 streaming all changes 1093 entries to transfer - transferred 102 KB in * seconds (* */sec) (glob) (no-zstd !) - transferred 98.9 KB in * seconds (* */sec) (glob) (zstd no-rust !) - transferred 99.0 KB in * seconds (* */sec) (glob) (zstd rust !) + stream-cloned 1094 files / 102 KB in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 1094 files / 98.9 KB in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 1096 files / 99.0 KB in * seconds (* */sec) (glob) (zstd rust !) #endif #if no-stream-legacy @@ -257,7 +257,7 @@ streaming all changes * files to transfer* (glob) (no-stream-bundle2-v3 !) * entries to transfer (glob) (stream-bundle2-v3 !) - transferred * KB in * seconds (* */sec) (glob) + stream-cloned * files / * KB in * seconds (* */sec) (glob) searching for changes (stream-legacy !) no changes found (stream-legacy !) @@ -304,7 +304,7 @@ streaming all changes * files to transfer* (glob) (no-stream-bundle2-v3 !) * entries to transfer (glob) (stream-bundle2-v3 !) - transferred * KB in * seconds (* */sec) (glob) + stream-cloned * files / * KB in * seconds (* */sec) (glob) searching for changes (stream-legacy !) no changes found (stream-legacy !) @@ -463,7 +463,7 @@ 1097 files to transfer, * KB of data (glob) (stream-bundle2-v2 no-rust !) 1099 files to transfer, * KB of data (glob) (stream-bundle2-v2 rust !) 1096 entries to transfer (stream-bundle2-v3 !) - transferred * KB in * seconds (* */sec) (glob) + stream-cloned * files / * KB in * seconds (* */sec) (glob) searching for changes (stream-legacy !) no changes found (stream-legacy !) updating to branch default @@ -491,7 +491,7 @@ 1097 files to transfer, * KB of data (glob) (stream-bundle2-v2 no-rust !) 1099 files to transfer, * KB of data (glob) (stream-bundle2-v2 rust !) 1096 entries to transfer (stream-bundle2-v3 !) - transferred * KB in * seconds (* */sec) (glob) + stream-cloned * files * KB in * seconds (* */sec) (glob) searching for changes (stream-legacy !) 
no changes found (stream-legacy !) updating to branch default @@ -514,11 +514,9 @@ $ hg clone --stream http://localhost:$HGPORT phase-no-publish streaming all changes - 1091 files to transfer, * KB of data (glob) (stream-legacy !) - 1098 files to transfer, * KB of data (glob) (stream-bundle2-v2 no-rust !) - 1100 files to transfer, * KB of data (glob) (stream-bundle2-v2 rust !) - 1097 entries to transfer (stream-bundle2-v3 !) - transferred * KB in * seconds (* */sec) (glob) + * files to transfer, * KB of data (glob) (no-stream-bundle2-v3 !) + * entries to transfer (glob) (stream-bundle2-v3 !) + stream-cloned * / * KB in * seconds (* */sec) (glob) searching for changes (stream-legacy !) no changes found (stream-legacy !) updating to branch default @@ -578,7 +576,7 @@ 1099 files to transfer, * KB of data (glob) (stream-bundle2-v2 no-rust !) 1101 files to transfer, * KB of data (glob) (stream-bundle2-v2 rust !) 1098 entries to transfer (no-stream-bundle2-v2 !) - transferred * KB in * seconds (* */sec) (glob) + stream-cloned * files / * KB in * seconds (* */sec) (glob) $ hg -R with-obsolescence log -T '{rev}: {phase}\n' 2: draft 1: draft
--- a/tests/test-clone.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-clone.t Fri Feb 28 23:28:10 2025 +0100
@@ -18,14 +18,12 @@
 
 List files in store/data (should show a 'b.d'):
 
-#if reporevlogstore
   $ for i in .hg/store/data/*; do
   >   echo $i
   > done
   .hg/store/data/a.i
   .hg/store/data/b.d
   .hg/store/data/b.i
-#endif
 
 Trigger branchcache creation:
--- a/tests/test-clonebundles-autogen.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-clonebundles-autogen.t Fri Feb 28 23:28:10 2025 +0100
@@ -1,5 +1,5 @@
 
-#require no-reposimplestore no-chg
+#require no-chg
 
 initial setup
--- a/tests/test-clonebundles.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-clonebundles.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require no-reposimplestore no-chg +#require no-chg Set up a server @@ -255,6 +255,26 @@ no changes found 2 local changesets published +Out-of-repo storage for inline bundle +------------------------------------- + + $ cp -R server server-extern + $ cat >> server-extern/.hg/hgrc << EOF + > [server] + > peer-bundle-cache-root = `pwd`/server/.hg/bundle-cache + > EOF + $ rm -r server-extern/.hg/bundle-cache + $ hg clone -U ssh://user@dummy/server-extern ssh-inline-clone-extern + applying clone bundle from peer-bundle-cache://full.hg + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files + finished applying clone bundle + searching for changes + no changes found + 2 local changesets published + HTTP Supports ------------- @@ -397,10 +417,8 @@ $ hg clone -U http://localhost:$HGPORT stream-clone-no-spec applying clone bundle from http://localhost:$HGPORT1/packed.hg - 5 files to transfer, 613 bytes of data (no-rust !) - transferred 613 bytes in * seconds (* */sec) (glob) (no-rust !) - 7 files to transfer, 739 bytes of data (rust !) - transferred 739 bytes in * seconds (* */sec) (glob) (rust !) + * files to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) finished applying clone bundle searching for changes no changes found @@ -414,7 +432,7 @@ $ hg clone -U http://localhost:$HGPORT stream-clone-vanilla-spec applying clone bundle from http://localhost:$HGPORT1/packed.hg * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) finished applying clone bundle searching for changes no changes found @@ -428,7 +446,7 @@ $ hg clone -U http://localhost:$HGPORT stream-clone-supported-requirements applying clone bundle from http://localhost:$HGPORT1/packed.hg * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) finished applying clone bundle searching for changes no changes found @@ -575,7 +593,7 @@ (you may want to report this to the server operator) streaming all changes * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) A manifest with a stream clone but no BUNDLESPEC @@ -588,7 +606,7 @@ (you may want to report this to the server operator) streaming all changes * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) A manifest with a gzip bundle and a stream clone @@ -600,7 +618,7 @@ $ hg clone -U --stream http://localhost:$HGPORT uncompressed-gzip-packed applying clone bundle from http://localhost:$HGPORT1/packed.hg * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) finished applying clone bundle searching for changes no changes found @@ -615,7 +633,7 @@ $ hg clone -U --stream http://localhost:$HGPORT uncompressed-gzip-packed-requirements applying clone bundle from http://localhost:$HGPORT1/packed.hg * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* 
*/sec) (glob) finished applying clone bundle searching for changes no changes found @@ -632,7 +650,7 @@ (you may want to report this to the server operator) streaming all changes * files to transfer, * bytes of data (glob) - transferred * bytes in * seconds (* */sec) (glob) + stream-cloned * files / * bytes in * seconds (* */sec) (glob) Test clone bundle retrieved through bundle2
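The new "Out-of-repo storage for inline bundle" test above points `server.peer-bundle-cache-root` at a directory outside the served repository's own .hg, and clients still fetch `peer-bundle-cache://full.hg` as before. A sketch of the path resolution this implies (illustrative only, not hg's actual lookup code):

import os


def resolve_inline_bundle(url: str, cache_root: str) -> str:
    """Map a peer-bundle-cache:// URL to a file under the configured root."""
    prefix = "peer-bundle-cache://"
    if not url.startswith(prefix):
        raise ValueError("not an inline clone-bundle URL: %s" % url)
    return os.path.join(cache_root, url[len(prefix):])


print(resolve_inline_bundle("peer-bundle-cache://full.hg",
                            "server/.hg/bundle-cache"))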
--- a/tests/test-commandserver.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-commandserver.t Fri Feb 28 23:28:10 2025 +0100 @@ -1173,3 +1173,47 @@ $ cd .. #endif + +Test the --config-file behavior (this will be used by SCM Manager to add auth +and proxy info instead of rewriting the repo hgrc file during pulls and +imports). + + $ cat > config-file.rc <<EOF + > [auth] + > temporary.schemes = https + > temporary.prefix = server.org + > temporary.password = password + > temporary.username = user + > EOF + + >>> from hgclient import check, readchannel, runcommand + >>> @check + ... def checkruncommand(server): + ... # hello block + ... readchannel(server) + ... + ... # no file + ... runcommand(server, [b'config', b'auth']) + ... # with file + ... runcommand(server, + ... [b'config', b'auth', b'--config-file', b'config-file.rc']) + ... # with file and overriding --config + ... runcommand(server, + ... [b'config', b'auth', b'--config-file', b'config-file.rc', + ... b'--config', b'auth.temporary.username=cli-user']) + ... # previous configs aren't cached + ... runcommand(server, [b'config', b'auth']) + *** runcommand config auth + [1] + *** runcommand config auth --config-file config-file.rc + auth.temporary.schemes=https + auth.temporary.prefix=server.org + auth.temporary.password=password + auth.temporary.username=user + *** runcommand config auth --config-file config-file.rc --config auth.temporary.username=cli-user + auth.temporary.schemes=https + auth.temporary.prefix=server.org + auth.temporary.password=password + auth.temporary.username=cli-user + *** runcommand config auth + [1]
--- a/tests/test-completion.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-completion.t Fri Feb 28 23:28:10 2025 +0100 @@ -176,6 +176,7 @@ $ hg debugcomplete --options | sort --color --config + --config-file --cwd --debug --debugger @@ -206,6 +207,7 @@ --cmdserver --color --config + --config-file --cwd --daemon --daemon-postexec @@ -356,7 +358,7 @@ debugwhyunstable: debugwireargs: three, four, five, ssh, remotecmd, insecure debugwireproto: localssh, peer, noreadstderr, nologhandshake, ssh, remotecmd, insecure - diff: rev, from, to, change, text, git, binary, nodates, noprefix, show-function, reverse, ignore-all-space, ignore-space-change, ignore-blank-lines, ignore-space-at-eol, unified, stat, root, include, exclude, subrepos + diff: rev, from, to, change, ignore-changes-from-ancestors, text, git, binary, nodates, noprefix, show-function, reverse, ignore-all-space, ignore-space-change, ignore-blank-lines, ignore-space-at-eol, unified, stat, root, include, exclude, subrepos export: bookmark, output, switch-parent, rev, text, git, binary, nodates, template files: rev, print0, include, exclude, template, subrepos forget: interactive, include, exclude, dry-run
--- a/tests/test-config.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-config.t Fri Feb 28 23:28:10 2025 +0100 @@ -3,7 +3,7 @@ #if windows $ path_list_var() { - > echo $1 | sed 's/:/;/' + > echo $1 | sed 's/:/;/g' > } #else $ path_list_var() { @@ -456,6 +456,20 @@ > logtemplate = "value-B\n" > EOF + $ cat > config-file.rc <<EOF + > [config-test] + > basic = value-CONFIG-FILE + > [ui] + > logtemplate = "value-CONFIG-FILE\n" + > EOF + + $ cat > config-file2.rc <<EOF + > [config-test] + > basic = value-CONFIG-FILE-2 + > [ui] + > logtemplate = "value-CONFIG-FILE-2\n" + > EOF + $ cat > included.rc << EOF > [config-test] @@ -510,6 +524,25 @@ $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg config config-test.basic --config config-test.basic=value-CLI value-CLI + $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg config config-test.basic --config-file config-file.rc + value-CONFIG-FILE + +--config-file args are processed in order of appearance + + $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg config config-test.basic \ + > --config-file config-file.rc --config-file config-file2.rc + value-CONFIG-FILE-2 + +--config overrides --config-file, regardless of order + + $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg config config-test.basic \ + > --config config-test.basic=value-CLI --config-file config-file.rc + value-CLI + + $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg config config-test.basic \ + > --config-file config-file.rc --config config-test.basic=value-CLI + value-CLI + Alias ordering -------------- @@ -544,3 +577,46 @@ $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg log -r . --config ui.logtemplate="value-CLI\n" value-CLI + + + $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg log -r . --config-file config-file.rc + value-CONFIG-FILE + +--config overrides --config-file + + $ HGRCPATH=`path_list_var "file-A.rc:file-B.rc"` hg log -r . \ + > --config ui.logtemplate="value-CLI\n" --config-file config-file.rc + value-CLI + + +Bad --config-file usage +----------------------- + +For some reason, chg doesn't honor the detailed-exit-code=True setting, and +exits with 255 for these cases that would normally exit with 10 for InputError. +The exit code alone can't be conditionalized in the test here, and this failure +to honor the setting is also true when passing a malformed --config, so disable +the config for now. + + $ cat >> $HGRCPATH <<EOF + > [ui] + > detailed-exit-code = False + > EOF + + $ cat > not-a-config-file.txt <<EOF + > this is a bad config file line + > EOF + + $ hg config auth --config-file not-a-config-file.txt + abort: invalid --config-file content at not-a-config-file.txt:1 + (unexpected leading whitespace: this is a bad config file line) + [255] + + $ hg config auth --config-file non-existent-file.rc + abort: missing file "non-existent-file.rc" for --config-file + [255] + + $ hg config auth --config-file non-existent-file.rc --cwd .. + abort: missing file "non-existent-file.rc" for --config-file + (this file is resolved before --cwd is processed) + [255]
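The precedence these tests pin down: `--config-file` arguments apply in order of appearance (later files win), and a plain `--config` override beats any file regardless of order, with nothing cached between invocations. A minimal sketch of that layering using configparser (an illustration of the observed semantics, not hg's config code):

from configparser import ConfigParser


def effective_config(config_files, cli_overrides) -> ConfigParser:
    """Layer config files in order, then apply --config-style overrides."""
    cfg = ConfigParser()
    for path in config_files:
        cfg.read(path)  # later files override earlier ones
    for spec in cli_overrides:  # section.name=value always wins
        key, value = spec.split("=", 1)
        section, name = key.split(".", 1)
        if not cfg.has_section(section):
            cfg.add_section(section)
        cfg.set(section, name, value)
    return cfg


cfg = effective_config(["config-file.rc", "config-file2.rc"],
                       ["config-test.basic=value-CLI"])
print(cfg.get("config-test", "basic", fallback="(unset)"))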
--- a/tests/test-contrib-dumprevlog.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-contrib-dumprevlog.t Fri Feb 28 23:28:10 2025 +0100
@@ -1,5 +1,3 @@
-#require reporevlogstore
-
   $ CONTRIBDIR="$TESTDIR/../contrib"
 
   $ hg init repo-a
--- a/tests/test-contrib-perf.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-contrib-perf.t Fri Feb 28 23:28:10 2025 +0100
@@ -251,7 +251,6 @@
   $ hg perfdirstatedirs
   $ hg perfdirstatefoldmap
   $ hg perfdirstatewrite
-#if repofncache
   $ hg perffncacheencode
   $ hg perffncacheload
   $ hg debugrebuildfncache
@@ -259,7 +258,6 @@
   $ hg perffncachewrite
   $ hg debugrebuildfncache
   fncache already up to date
-#endif
   $ hg perfheads
   $ hg perfignore
   $ hg perfindex
@@ -280,9 +278,7 @@
   $ hg perfprogress --total 1000
   $ hg perfrawfiles 2
   $ hg perfrevlogindex -c
-#if reporevlogstore
   $ hg perfrevlogrevisions .hg/store/data/a.i
-#endif
 
 #if no-rust
 Cannot test in Rust because these are highly invasive and expect a certain
--- a/tests/test-convert-filemap.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-convert-filemap.t Fri Feb 28 23:28:10 2025 +0100
@@ -283,23 +283,14 @@
   > exclude dir/subdir
   > include dir/subdir/file3
   > EOF
-#if reporevlogstore
   $ rm source/.hg/store/data/dir/file3.i
   $ rm source/.hg/store/data/dir/file4.i
-#endif
-#if reposimplestore
-  $ rm -rf source/.hg/store/data/dir/file3
-  $ rm -rf source/.hg/store/data/dir/file4
-#endif
   $ hg -q convert --filemap renames.fmap --datesort source dummydest
-  abort: dir/file3@e96dce0bc6a217656a3a410e5e6bec2c4f42bf7c: no match found (reporevlogstore !)
-  abort: data/dir/file3/index@e96dce0bc6a2: no node (reposimplestore !)
+  abort: dir/file3@e96dce0bc6a217656a3a410e5e6bec2c4f42bf7c: no match found
   [50]
   $ hg -q convert --filemap renames.fmap --datesort --config convert.hg.ignoreerrors=1 source renames.repo
-  ignoring: dir/file3@e96dce0bc6a217656a3a410e5e6bec2c4f42bf7c: no match found (reporevlogstore !)
-  ignoring: dir/file4@6edd55f559cdce67132b12ca09e09cee08b60442: no match found (reporevlogstore !)
-  ignoring: data/dir/file3/index@e96dce0bc6a2: no node (reposimplestore !)
-  ignoring: data/dir/file4/index@6edd55f559cd: no node (reposimplestore !)
+  ignoring: dir/file3@e96dce0bc6a217656a3a410e5e6bec2c4f42bf7c: no match found
+  ignoring: dir/file4@6edd55f559cdce67132b12ca09e09cee08b60442: no match found
   $ hg up -q -R renames.repo
   $ glog -R renames.repo
   @ 4 "8: change foo" files: foo2
--- a/tests/test-convert-hg-source.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-convert-hg-source.t Fri Feb 28 23:28:10 2025 +0100
@@ -169,12 +169,7 @@
 
 break it
 
-#if reporevlogstore
   $ rm .hg/store/data/b.*
-#endif
-#if reposimplestore
-  $ rm .hg/store/data/b/*
-#endif
   $ cd ..
   $ hg --config convert.hg.ignoreerrors=True convert broken fixed
   initializing destination fixed repository
@@ -182,8 +177,7 @@
   sorting...
   converting...
   4 init
-  ignoring: b@1e88685f5ddec574a34c70af492f95b6debc8741: no match found (reporevlogstore !)
-  ignoring: data/b/index@1e88685f5dde: no node (reposimplestore !)
+  ignoring: b@1e88685f5ddec574a34c70af492f95b6debc8741: no match found
   3 changeall
   2 changebagain
   1 merge
--- a/tests/test-convert.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-convert.t Fri Feb 28 23:28:10 2025 +0100
@@ -532,11 +532,9 @@
 
 contents of fncache file:
 
-#if repofncache
   $ cat b/.hg/store/fncache | sort
-  data/a.i (reporevlogstore !)
-  data/b.i (reporevlogstore !)
-#endif
+  data/a.i
+  data/b.i
 
 test bogus URL
--- a/tests/test-copy.t Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-copy.t Fri Feb 28 23:28:10 2025 +0100
@@ -86,10 +86,8 @@
   copy: a
   copyrev: b789fdd96dc2f3bd229c1dd8eedf0fc60e2b68e3
 
-#if reporevlogstore
   $ md5sum.py .hg/store/data/b.i
   44913824c8f5890ae218f9829535922e  .hg/store/data/b.i
-#endif
   $ hg cat b > bsum
   $ md5sum.py bsum
   60b725f10c9c85c70d97880dfe8191b3  bsum
--- a/tests/test-debugcommands.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-debugcommands.t Fri Feb 28 23:28:10 2025 +0100 @@ -14,7 +14,6 @@ $ hg revert --all -r 0 adding a $ hg ci -Am make-it-full -#if reporevlogstore $ hg debugrevlog -c format : 1 flags : (none) @@ -125,14 +124,12 @@ full revision size (min/max/avg) : 3 / 3 / 3 inter-snapshot size (min/max/avg) : 0 / 0 / 0 delta size (min/max/avg) : 0 / 0 / 0 -#endif Test debugindex, with and without the --verbose/--debug flag $ hg debugrevlogindex a rev linkrev nodeid p1 p2 0 0 b789fdd96dc2 000000000000 000000000000 -#if no-reposimplestore $ hg --verbose debugrevlogindex a rev offset length linkrev nodeid p1 p2 0 0 3 0 b789fdd96dc2 000000000000 000000000000 @@ -140,13 +137,11 @@ $ hg --debug debugrevlogindex a rev offset length linkrev nodeid p1 p2 0 0 3 0 b789fdd96dc2f3bd229c1dd8eedf0fc60e2b68e3 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000 -#endif $ hg debugrevlogindex -f 1 a rev flag size link p1 p2 nodeid 0 0000 2 0 -1 -1 b789fdd96dc2 -#if no-reposimplestore $ hg --verbose debugrevlogindex -f 1 a rev flag offset length size link p1 p2 nodeid 0 0000 0 3 2 0 -1 -1 b789fdd96dc2 @@ -154,7 +149,6 @@ $ hg --debug debugrevlogindex -f 1 a rev flag offset length size link p1 p2 nodeid 0 0000 0 3 2 0 -1 -1 b789fdd96dc2f3bd229c1dd8eedf0fc60e2b68e3 -#endif $ hg debugindex -c rev linkrev nodeid p1-nodeid p2-nodeid @@ -185,12 +179,7 @@ debugdelta chain basic output -#if reporevlogstore pure rust - $ hg debugindexstats - abort: debugindexstats only works with native C code - [255] -#endif -#if reporevlogstore no-pure no-rust +#if no-pure no-rust $ hg debugindexstats node trie capacity: 4 node trie count: 2 @@ -202,9 +191,13 @@ node trie misses: 1 node trie splits: 1 revs in memory: 3 +#else + $ hg debugindexstats + abort: debugindexstats only works with native C code + [255] #endif -#if reporevlogstore no-pure +#if no-pure $ hg debugdeltachain -m --all-info rev p1 p2 chain# chainlen prev delta size rawsize chainsize ratio lindist extradist extraratio readsize largestblk rddensity srchunks 0 -1 -1 1 1 -1 base 44 43 44 1.02326 44 0 0.00000 44 44 1.00000 1
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/test-diff-patches.t Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,467 @@ +============================================================= +Testing comparing changeset regardless of change from parents +============================================================= + +Setup +===== + +Add a bunch of changes some related to each other some not. + + $ hg init test-repo + $ cd test-repo + $ cat << EOF > file-a.txt + > one + > two + > three + > four + > five + > six + > seven + > eight + > nine + > ten + > EOF + $ hg add file-a.txt + $ hg commit -m 'commit_root' + + $ sed s/two/deux/ file-a.txt > a + $ mv a file-a.txt + $ hg commit -m 'commit_A1_change' + + $ sed s/five/cinq/ file-a.txt > a + $ mv a file-a.txt + $ hg commit -m 'commit_A2_change' + + $ cat << EOF > file-b.txt + > egg + > salade + > orange + > EOF + $ hg add file-b.txt + $ hg commit -m 'commit_A3_change' + + $ cat << EOF > file-b.txt + > butter + > egg + > salade + > orange + > EOF + $ hg commit -m 'commit_A4_change' + + $ hg up 'desc("commit_root")' + 1 files updated, 0 files merged, 1 files removed, 0 files unresolved + $ sed s/two/deux/ file-a.txt > a + $ mv a file-a.txt + $ sed s/ten/dix/ file-a.txt > a + $ mv a file-a.txt + $ hg commit -m 'commit_B1_change' + created new head + + $ sed s/five/funf/ file-a.txt > a + $ mv a file-a.txt + $ sed s/eight/acht/ file-a.txt > a + $ mv a file-a.txt + $ hg commit -m 'commit_B2_change' + + $ cat << EOF > file-b.txt + > milk + > egg + > salade + > apple + > EOF + $ hg add file-b.txt + $ hg commit -m 'commit_B3_change' + + $ cat << EOF > file-b.txt + > butter + > milk + > egg + > salade + > apple + > EOF + $ hg commit -m 'commit_B4_change' + + $ hg log -G --patch + @ changeset: 8:0d6b02d59faf + | tag: tip + | user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: commit_B4_change + | + | diff -r 59c9679fd24c -r 0d6b02d59faf file-b.txt + | --- a/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + | +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + | @@ -1,3 +1,4 @@ + | +butter + | milk + | egg + | salade + | + o changeset: 7:59c9679fd24c + | user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: commit_B3_change + | + | diff -r 1e73118ddc3a -r 59c9679fd24c file-b.txt + | --- /dev/null Thu Jan 01 00:00:00 1970 +0000 + | +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + | @@ -0,0 +1,4 @@ + | +milk + | +egg + | +salade + | +apple + | + o changeset: 6:1e73118ddc3a + | user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: commit_B2_change + | + | diff -r 30a40f18d81e -r 1e73118ddc3a file-a.txt + | --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | @@ -2,9 +2,9 @@ + | deux + | three + | four + | -five + | +funf + | six + | seven + | -eight + | +acht + | nine + | dix + | + o changeset: 5:30a40f18d81e + | parent: 0:9c17110ca844 + | user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: commit_B1_change + | + | diff -r 9c17110ca844 -r 30a40f18d81e file-a.txt + | --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | @@ -1,5 +1,5 @@ + | one + | -two + | +deux + | three + | four + | five + | @@ -7,4 +7,4 @@ + | seven + | eight + | nine + | -ten + | +dix + | + | o changeset: 4:e6f5655bdf2e + | | user: test + | | date: Thu Jan 01 00:00:00 1970 +0000 + | | summary: commit_A4_change + | | + | | diff -r 074ad64f5cd7 -r e6f5655bdf2e file-b.txt + | | --- a/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + | | +++ b/file-b.txt Thu 
Jan 01 00:00:00 1970 +0000 + | | @@ -1,3 +1,4 @@ + | | +butter + | | egg + | | salade + | | orange + | | + | o changeset: 3:074ad64f5cd7 + | | user: test + | | date: Thu Jan 01 00:00:00 1970 +0000 + | | summary: commit_A3_change + | | + | | diff -r 37c330f02452 -r 074ad64f5cd7 file-b.txt + | | --- /dev/null Thu Jan 01 00:00:00 1970 +0000 + | | +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + | | @@ -0,0 +1,3 @@ + | | +egg + | | +salade + | | +orange + | | + | o changeset: 2:37c330f02452 + | | user: test + | | date: Thu Jan 01 00:00:00 1970 +0000 + | | summary: commit_A2_change + | | + | | diff -r 7bcbc987bcfe -r 37c330f02452 file-a.txt + | | --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | | +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | | @@ -2,7 +2,7 @@ + | | deux + | | three + | | four + | | -five + | | +cinq + | | six + | | seven + | | eight + | | + | o changeset: 1:7bcbc987bcfe + |/ user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: commit_A1_change + | + | diff -r 9c17110ca844 -r 7bcbc987bcfe file-a.txt + | --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + | @@ -1,5 +1,5 @@ + | one + | -two + | +deux + | three + | four + | five + | + o changeset: 0:9c17110ca844 + user: test + date: Thu Jan 01 00:00:00 1970 +0000 + summary: commit_root + + diff -r 000000000000 -r 9c17110ca844 file-a.txt + --- /dev/null Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -0,0 +1,10 @@ + +one + +two + +three + +four + +five + +six + +seven + +eight + +nine + +ten + + +Then compare the resulting revisions: +==================================== + +A1 and B1 has the same parent, so the same output is expected. + + + $ hg diff --from 'desc("commit_A1_change")' --to 'desc("commit_B1_change")' + diff -r 7bcbc987bcfe -r 30a40f18d81e file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -7,4 +7,4 @@ + seven + eight + nine + -ten + +dix + $ hg diff --from 'desc("commit_A1_change")' --to 'desc("commit_B1_change")' --ignore-changes-from-ancestors + diff -r 7bcbc987bcfe -r 30a40f18d81e file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -7,4 +7,4 @@ + seven + eight + nine + -ten + +dix + +Skipping B1 change mean the final "ten" change is no longer part of the diff + + $ hg diff --from 'desc("commit_A1_change")' --to 'desc("commit_B2_change")' + diff -r 7bcbc987bcfe -r 1e73118ddc3a file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -five + +funf + six + seven + -eight + +acht + nine + -ten + +dix + $ hg diff --from 'desc("commit_A1_change")' --to 'desc("commit_B2_change")' --ignore-changes-from-ancestors + diff -r 1e73118ddc3a file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -five + +funf + six + seven + -eight + +acht + nine + dix + +Skipping A1 changes means the "two" changes introduced by "B1" (but also +present in A2 parent, A1) is back on the table. 
+ + $ hg diff --from 'desc("commit_A2_change")' --to 'desc("commit_B1_change")' + diff -r 37c330f02452 -r 30a40f18d81e file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -cinq + +five + six + seven + eight + nine + -ten + +dix + $ hg diff --from 'desc("commit_A2_change")' --to 'desc("commit_B1_change")' --ignore-changes-from-ancestors + diff -r 30a40f18d81e file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -1,10 +1,10 @@ + one + -two + +deux + three + four + -cinq + +five + six + seven + eight + nine + -ten + +dix + +All changes from A1 and B1 are no longer in the picture as we compare A2 and B2 + + $ hg diff --from 'desc("commit_A2_change")' --to 'desc("commit_B2_change")' + diff -r 37c330f02452 -r 1e73118ddc3a file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -cinq + +funf + six + seven + -eight + +acht + nine + -ten + +dix + $ hg diff --from 'desc("commit_A2_change")' --to 'desc("commit_B2_change")' --ignore-changes-from-ancestors + diff -r 1e73118ddc3a file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -cinq + +funf + six + seven + -eight + +acht + nine + dix + +Similar patches +--------------- + +comparing A3 and B3 patches is much more terse. focusing on the change to the +two similar patches, ignoring the rests of the changes (like comparing apples +and oranges) + + $ hg diff --from 'desc("commit_A3_change")' --to 'desc("commit_B3_change")' + diff -r 074ad64f5cd7 -r 59c9679fd24c file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -cinq + +funf + six + seven + -eight + +acht + nine + -ten + +dix + diff -r 074ad64f5cd7 -r 59c9679fd24c file-b.txt + --- a/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -1,3 +1,4 @@ + +milk + egg + salade + -orange + +apple + $ hg diff --from 'desc("commit_A3_change")' --to 'desc("commit_B3_change")' --ignore-changes-from-ancestors + diff -r 59c9679fd24c file-b.txt + --- a/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -1,3 +1,4 @@ + +milk + egg + salade + -orange + +apple + + +Conflict handling +----------------- + +Conflict should not be a big deal and its resolution should be presented to the user. 
+ + $ hg diff --from 'desc("commit_A4_change")' --to 'desc("commit_B4_change")' + diff -r e6f5655bdf2e -r 0d6b02d59faf file-a.txt + --- a/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-a.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -2,9 +2,9 @@ + deux + three + four + -cinq + +funf + six + seven + -eight + +acht + nine + -ten + +dix + diff -r e6f5655bdf2e -r 0d6b02d59faf file-b.txt + --- a/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -1,4 +1,5 @@ + butter + +milk + egg + salade + -orange + +apple + $ hg diff --from 'desc("commit_A4_change")' --to 'desc("commit_B4_change")' --ignore-changes-from-ancestors + diff -r 0d6b02d59faf file-b.txt + --- a/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + +++ b/file-b.txt Thu Jan 01 00:00:00 1970 +0000 + @@ -1,9 +1,5 @@ + -<<<<<<< from: e6f5655bdf2e - test: commit_A4_change + butter + -||||||| parent-of-from: 074ad64f5cd7 - test: commit_A3_change + -======= + milk + ->>>>>>> parent-of-to: 59c9679fd24c - test: commit_B3_change + egg + salade + apple
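Conceptually, `--ignore-changes-from-ancestors` compares what each revision changed relative to its own parent instead of comparing the two snapshots, so edits both sides inherited drop out, and genuine divergence surfaces with merge-style markers as in the last case above. A deliberately rough toy of the "diff the changes, not the contents" idea (hg's real implementation works on merged file contents; this only diffs the two patches as text):

import difflib


def patch(parent: str, rev: str) -> list:
    """One side's change, as a unified diff against its own parent."""
    return list(difflib.unified_diff(parent.splitlines(), rev.splitlines(),
                                     lineterm=""))


def compare_changes(from_parent, from_rev, to_parent, to_rev) -> str:
    """Diff the two patches instead of the two resulting snapshots."""
    return "\n".join(difflib.unified_diff(patch(from_parent, from_rev),
                                          patch(to_parent, to_rev),
                                          lineterm=""))


base = "one\ntwo\nten\n"
print(compare_changes(base, "one\ndeux\nten\n",   # A: two -> deux
                      base, "one\ndeux\ndix\n"))  # B: two -> deux, ten -> dix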
--- a/tests/test-dirstate-read-race.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-dirstate-read-race.t Fri Feb 28 23:28:10 2025 +0100 @@ -231,20 +231,16 @@ The status process should return a consistent result and not crash. -#if dirstate-v1 +(The "pre-commit" state is only visible to (any) rust variant because the pure +python implementation always rewrites, so we are never really in the "-append" +case). + $ cat $TESTTMP/status-race-lock.out + A dir/o (dirstate-v2-append pre-some-read rust !) + R dir/nested/m (dirstate-v2-append pre-some-read rust !) ? dir/n ? p ? q -#endif -#if dirstate-v2 - $ cat $TESTTMP/status-race-lock.out - A dir/o - R dir/nested/m - ? dir/n - ? p - ? q -#endif final cleanup @@ -277,10 +273,19 @@ | o 4f23db756b09 recreate a bunch of files to facilitate dirstate-v2 append +(double check the working copy location before and after the update+concurrent status) + $ hg log -T '{node|short}\n' --rev "." + 9a86dcbfb938 +(update destination) + $ hg log -T '{node|short}\n' --rev ".~1" + 4f23db756b09 $ hg $d2args update --merge ".~1" 0 files updated, 0 files merged, 6 files removed, 0 files unresolved $ touch $TESTTMP/status-race-lock $ wait +(the working copy should have been updated) + $ hg log -T '{node|short}\n' --rev "." + 4f23db756b09 $ hg log -GT '{node|short} {desc}\n' o 9a86dcbfb938 more files to have two commit |
--- a/tests/test-duplicateoptions.py Fri Feb 28 23:25:42 2025 +0100
+++ b/tests/test-duplicateoptions.py Fri Feb 28 23:28:10 2025 +0100
@@ -7,13 +7,6 @@
 
 ignore = {b'highlight', b'win32text', b'factotum', b'beautifygraph'}
 
-try:
-    import sqlite3
-
-    del sqlite3  # unused, just checking that import works
-except ImportError:
-    ignore.add(b'sqlitestore')
-
 if os.name != 'nt':
     ignore.add(b'win32mbcs')
--- a/tests/test-extension.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-extension.t Fri Feb 28 23:28:10 2025 +0100 @@ -682,28 +682,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) @@ -721,28 +723,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or 
+ debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) @@ -1034,28 +1038,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) Make sure that single '-v' option shows help and built-ins only for 'dodo' command $ hg help -v dodo @@ -1071,28 +1077,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - 
--traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) In case when extension name doesn't match any of its commands, help message should ask for '-v' to get list of built-in aliases @@ -1146,28 +1154,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or 
never) + (default: auto) $ hg help -v -e dudu dudu extension - @@ -1182,28 +1192,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) Disabled extension commands:
--- a/tests/test-filelog.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-filelog.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """ Tests the behavior of filelog w.r.t. data starting with '\1\n' """
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/test-fix-path.t Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,67 @@ + +A script that implements uppercasing of specific lines in a file. This +approximates the behavior of code formatters well enough for our tests. + + $ hg init test-repo + $ cd test-repo + + $ mkdir some + $ mkdir some/dir + $ cat > some/dir/uppercase.py <<EOF + > #!$PYTHON + > import re + > import sys + > from mercurial.utils import procutil + > procutil.setbinary(sys.stdin) + > procutil.setbinary(sys.stdout) + > stdin = getattr(sys.stdin, 'buffer', sys.stdin) + > stdout = getattr(sys.stdout, 'buffer', sys.stdout) + > def format(text): + > return re.sub(b' +', b' ', text.upper()) + > stdout.write(format(stdin.read())) + > EOF + + $ chmod +x some/dir/uppercase.py + +#if windows + $ cat > some/dir/uppercase.bat <<EOF + > @echo off + > "$PYTHON" "$TESTTMP/test-repo/some/dir/uppercase.py" + > EOF +#else + $ mv some/dir/uppercase.py some/dir/uppercase +#endif + + $ echo babar > babar.txt + $ hg add babar.txt + +Using absolute paths + + $ cat >> $HGRCPATH <<EOF + > [extensions] + > fix = + > [experimental] + > evolution.createmarkers=True + > evolution.allowunstable=True + > [fix] + > extra-bin-paths=$TESTTMP/test-repo/some/dir/ + > uppercase-whole-file:command=uppercase + > uppercase-whole-file:pattern=set:**.txt + > EOF + + $ hg fix --working-dir + $ cat babar.txt + BABAR + +Using relative paths + + $ cat >> $HGRCPATH <<EOF + > [fix] + > extra-bin-paths=./some/dir/ + > EOF + + $ echo celeste > celeste.txt + $ hg add celeste.txt + $ hg fix --working-dir + $ cat celeste.txt + CELESTE
--- a/tests/test-fix.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-fix.t Fri Feb 28 23:28:10 2025 +0100 @@ -238,6 +238,9 @@ executions that modified a file. This aggregates the same metadata previously passed to the "postfixfile" hook. + You can specify a list of directories in which to search for the tool command + using the 'fix.extra-bin-paths' configuration. + list of commands: fix rewrite file content in changesets or working directory
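The new test-fix-path.t above exercises both absolute and relative search paths end to end. As a compact reference, a minimal hgrc sketch of the configuration it relies on (the tool name, pattern, and directory are illustrative, taken from that test; 'uppercase' must be an executable found in one of the listed directories or on $PATH):

  [fix]
  # extra directories searched for tool commands, besides $PATH
  extra-bin-paths = /path/to/formatters/
  # run 'uppercase' on every .txt file selected for fixing
  uppercase-whole-file:command = uppercase
  uppercase-whole-file:pattern = set:**.txt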
--- a/tests/test-flagprocessor.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-flagprocessor.t Fri Feb 28 23:28:10 2025 +0100 @@ -233,7 +233,6 @@ $ echo '[BASE64]a-bit-longer-branching' > base64 $ hg commit -q -m branching -#if repobundlerepo $ hg bundle --base 1 bundle.hg 4 changesets found $ hg --config extensions.strip= strip -r 2 --no-backup --force -q @@ -290,7 +289,6 @@ 1 files changed, 1 insertions(+), 0 deletions(-) $ rm bundle.hg bundle-again.hg -#endif # TEST: hg status
--- a/tests/test-fncache.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-fncache.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require repofncache - An extension which will set fncache chunksize to 1 byte to make sure that logic does not break @@ -114,7 +112,7 @@ .hg/wcache/checkisexec (execbit !) .hg/wcache/checklink (symlink !) .hg/wcache/checklink-target (symlink !) - .hg/wcache/manifestfulltextcache (reporevlogstore !) + .hg/wcache/manifestfulltextcache $ cd .. Non fncache repo: @@ -156,7 +154,7 @@ .hg/wcache/checkisexec (execbit !) .hg/wcache/checklink (symlink !) .hg/wcache/checklink-target (symlink !) - .hg/wcache/manifestfulltextcache (reporevlogstore !) + .hg/wcache/manifestfulltextcache $ cd .. Encoding of reserved / long paths in the store
--- a/tests/test-gendoc-ro.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-gendoc-ro.t Fri Feb 28 23:28:10 2025 +0100 @@ -5,5 +5,5 @@ until the localization is corrected. $ $TESTDIR/check-gendoc ro checking for parse errors - gendoc.txt:58: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string. - gendoc.txt:58: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string. + gendoc.txt:61: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string. + gendoc.txt:61: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string.
--- a/tests/test-generaldelta.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-generaldelta.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - Check whether size of generaldelta revlog is not bigger than its regular equivalent. Test would fail if generaldelta was naive implementation of parentdelta: third manifest revision would be fully
--- a/tests/test-globalopts.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-globalopts.t Fri Feb 28 23:28:10 2025 +0100 @@ -140,6 +140,9 @@ $ hg --confi "foo.bar=baz" abort: option --config may not be abbreviated [10] + $ hg --config-f "foo" + abort: option --config-file may not be abbreviated + [10] $ hg --cw a tip abort: option --cwd may not be abbreviated [10]
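The check above only covers abbreviation handling of the new flag. For context, a hedged sketch of an ordinary --config-file invocation (file name and option are illustrative; per the help text above, the file is loaded like an hgrc to set/override config options):

  $ cat > extra.rc <<EOF
  > [ui]
  > verbose = True
  > EOF
  $ hg --config-file extra.rc config ui.verbose
  True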
--- a/tests/test-hardlinks.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-hardlinks.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require hardlink reporevlogstore +#require hardlink $ cat > nlinks.py <<EOF > import sys @@ -51,12 +51,12 @@ 1 r1/.hg/store/00manifest.i 1 r1/.hg/store/data/d1/f2.i 1 r1/.hg/store/data/f1.i - 1 r1/.hg/store/fncache (repofncache !) + 1 r1/.hg/store/fncache 1 r1/.hg/store/phaseroots 1 r1/.hg/store/requires 1 r1/.hg/store/undo 1 r1/.hg/store/undo.backup.00changelog.n.bck (rust !) - 1 r1/.hg/store/undo.backup.fncache.bck (repofncache !) + 1 r1/.hg/store/undo.backup.fncache.bck 1 r1/.hg/store/undo.backupfiles @@ -108,12 +108,12 @@ 2 r1/.hg/store/00manifest.i 2 r1/.hg/store/data/d1/f2.i 2 r1/.hg/store/data/f1.i - 1 r1/.hg/store/fncache (repofncache !) + 1 r1/.hg/store/fncache 1 r1/.hg/store/phaseroots 1 r1/.hg/store/requires 1 r1/.hg/store/undo 1 r1/.hg/store/undo.backup.00changelog.n.bck (rust !) - 1 r1/.hg/store/undo.backup.fncache.bck (repofncache !) + 1 r1/.hg/store/undo.backup.fncache.bck 1 r1/.hg/store/undo.backupfiles $ nlinksdir r2/.hg/store @@ -124,7 +124,7 @@ 2 r2/.hg/store/00manifest.i 2 r2/.hg/store/data/d1/f2.i 2 r2/.hg/store/data/f1.i - 1 r2/.hg/store/fncache (repofncache !) + 1 r2/.hg/store/fncache 1 r2/.hg/store/requires Repo r3 should not be hardlinked: @@ -137,7 +137,7 @@ 1 r3/.hg/store/00manifest.i 1 r3/.hg/store/data/d1/f2.i 1 r3/.hg/store/data/f1.i - 1 r3/.hg/store/fncache (repofncache !) + 1 r3/.hg/store/fncache 1 r3/.hg/store/phaseroots 1 r3/.hg/store/requires 1 r3/.hg/store/undo @@ -166,7 +166,7 @@ 1 r3/.hg/store/data/d1/f2.d 1 r3/.hg/store/data/d1/f2.i 1 r3/.hg/store/data/f1.i - 1 r3/.hg/store/fncache (repofncache !) + 1 r3/.hg/store/fncache 1 r3/.hg/store/phaseroots 1 r3/.hg/store/requires 1 r3/.hg/store/undo @@ -196,10 +196,10 @@ 1 r2/.hg/store/00manifest.i 1 r2/.hg/store/data/d1/f2.i 2 r2/.hg/store/data/f1.i - [12] r2/\.hg/store/fncache (re) (repofncache !) + [12] r2/\.hg/store/fncache (re) 1 r2/.hg/store/requires -#if hardlink-whitelisted repofncache +#if hardlink-whitelisted $ nlinksdir r2/.hg/store/fncache 1 r2/.hg/store/fncache #endif @@ -224,10 +224,10 @@ 1 r2/.hg/store/00manifest.i 1 r2/.hg/store/data/d1/f2.i 1 r2/.hg/store/data/f1.i - 1 r2/.hg/store/fncache (repofncache !) + 1 r2/.hg/store/fncache 1 r2/.hg/store/requires -#if hardlink-whitelisted repofncache +#if hardlink-whitelisted $ nlinksdir r2/.hg/store/fncache 1 r2/.hg/store/fncache #endif @@ -282,7 +282,7 @@ 2 r4/.hg/store/data/d1/f2.i 2 r4/.hg/store/data/f1.i 2 r4/.hg/store/data/f3.i - 2 r4/.hg/store/fncache (repofncache !) + 2 r4/.hg/store/fncache 2 r4/.hg/store/phaseroots 2 r4/.hg/store/requires 2 r4/.hg/store/undo @@ -294,7 +294,7 @@ 2 r4/.hg/wcache/checkisexec (execbit !) 2 r4/.hg/wcache/checklink-target (symlink !) 2 r4/.hg/wcache/checknoexec (execbit !) - 2 r4/.hg/wcache/manifestfulltextcache (reporevlogstore !) + 2 r4/.hg/wcache/manifestfulltextcache 2 r4/d1/data1 2 r4/d1/f2 2 r4/f1 @@ -347,7 +347,7 @@ 2 r4/.hg/wcache/checkisexec (execbit !) 2 r4/.hg/wcache/checklink-target (symlink !) 2 r4/.hg/wcache/checknoexec (execbit !) - 1 r4/.hg/wcache/manifestfulltextcache (reporevlogstore !) + 1 r4/.hg/wcache/manifestfulltextcache 2 r4/d1/data1 2 r4/d1/f2 1 r4/f1
--- a/tests/test-help.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-help.t Fri Feb 28 23:28:10 2025 +0100 @@ -437,28 +437,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) (use 'hg help' for the full list of commands) @@ -537,28 +539,30 @@ global options ([+] can be repeated): - -R --repository REPO repository root directory or name of overlay bundle - file - --cwd DIR change working directory - -y --noninteractive do not prompt, automatically pick the first choice for - all prompts - -q --quiet suppress output - -v --verbose enable additional output - --color TYPE when to colorize (boolean, always, auto, never, or - debug) - --config CONFIG [+] set/override config option (use 'section.name=value') - --debug enable debugging output - --debugger start debugger - --encoding ENCODE set the charset encoding (default: ascii) - --encodingmode MODE set the charset encoding mode (default: strict) - --traceback always print a traceback on exception - --time time how long the command takes - --profile print command execution profile - --version output version information and exit - -h --help display help and exit - --hidden consider hidden changesets - --pager TYPE when to paginate (boolean, always, auto, or never) - (default: auto) + -R --repository REPO repository root directory or name of overlay bundle + file + --cwd DIR change working directory + -y --noninteractive do not prompt, automatically pick the first choice + for all prompts + -q --quiet suppress output + -v --verbose enable additional output + --color TYPE when to 
colorize (boolean, always, auto, never, or + debug) + --config CONFIG [+] set/override config option (use + 'section.name=value') + --config-file HGRC [+] load config file to set/override config options + --debug enable debugging output + --debugger start debugger + --encoding ENCODE set the charset encoding (default: ascii) + --encodingmode MODE set the charset encoding mode (default: strict) + --traceback always print a traceback on exception + --time time how long the command takes + --profile print command execution profile + --version output version information and exit + -h --help display help and exit + --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) Test the textwidth config option @@ -3108,6 +3112,9 @@ <td>--config CONFIG [+]</td> <td>set/override config option (use 'section.name=value')</td></tr> <tr><td></td> + <td>--config-file HGRC [+]</td> + <td>load config file to set/override config options</td></tr> + <tr><td></td> <td>--debug</td> <td>enable debugging output</td></tr> <tr><td></td> @@ -3312,6 +3319,9 @@ <td>--config CONFIG [+]</td> <td>set/override config option (use 'section.name=value')</td></tr> <tr><td></td> + <td>--config-file HGRC [+]</td> + <td>load config file to set/override config options</td></tr> + <tr><td></td> <td>--debug</td> <td>enable debugging output</td></tr> <tr><td></td>
--- a/tests/test-hghave.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-hghave.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,7 @@ $ . "$TESTDIR/helpers-testrepo.sh" + $ . "$TESTDIR/helper-runtests.sh" + Testing that hghave does not crash when checking features $ hghave --test-features 2>/dev/null
--- a/tests/test-hgweb-bundle.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-hgweb-bundle.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require serve repobundlerepo +#require serve $ hg init server $ cd server
--- a/tests/test-highlight.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-highlight.t Fri Feb 28 23:28:10 2025 +0100 @@ -1010,7 +1010,7 @@ > EOF $ cat > unknownfile << EOF - > #!/this/helps/pygments/detect/python + > #!/this/helps/pygments/detect/python3 > def foo(): > pass > EOF
--- a/tests/test-http-bundle1.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-http-bundle1.t Fri Feb 28 23:28:10 2025 +0100 @@ -35,19 +35,15 @@ clone via stream -#if no-reposimplestore $ hg clone --stream http://localhost:$HGPORT/ copy 2>&1 streaming all changes - 7 files to transfer, 606 bytes of data (no-zstd !) - 7 files to transfer, 608 bytes of data (zstd no-rust !) - 9 files to transfer, 734 bytes of data (zstd rust !) - transferred * bytes in * seconds (*/sec) (glob) + * files to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (*/sec) (glob) searching for changes no changes found updating to branch default 4 files updated, 0 files merged, 0 files removed, 0 files unresolved $ hg verify -R copy -q -#endif try to clone via stream, should use pull instead @@ -216,18 +212,14 @@ $ hg id http://user@localhost:$HGPORT2/ 5fed3813f7f5 -#if no-reposimplestore $ hg clone http://user:pass@localhost:$HGPORT2/ dest 2>&1 streaming all changes - 8 files to transfer, 916 bytes of data (no-zstd !) - 8 files to transfer, 919 bytes of data (zstd no-rust !) - 10 files to transfer, 1.02 KB of data (zstd rust !) - transferred * in * seconds (*/sec) (glob) + * files to transfer, * of data (glob) + stream-cloned * files / * in * seconds (*/sec) (glob) searching for changes no changes found updating to branch default 5 files updated, 0 files merged, 0 files removed, 0 files unresolved -#endif --pull should override server's preferuncompressed @@ -286,16 +278,16 @@ "GET /?cmd=lookup HTTP/1.1" 200 - x-hgarg-1:key=tip x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=namespaces x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull - "GET /?cmd=capabilities HTTP/1.1" 401 - (no-reposimplestore !) - "GET /?cmd=capabilities HTTP/1.1" 200 - (no-reposimplestore !) - "GET /?cmd=branchmap HTTP/1.1" 200 - x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=stream_out HTTP/1.1" 200 - x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D5fed3813f7f5e1824344fdc9cf8f63bb662c292d x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=capabilities HTTP/1.1" 401 - (no-reposimplestore !) - "GET /?cmd=capabilities HTTP/1.1" 200 - (no-reposimplestore !) - "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) 
+ "GET /?cmd=capabilities HTTP/1.1" 401 - + "GET /?cmd=capabilities HTTP/1.1" 200 - + "GET /?cmd=branchmap HTTP/1.1" 200 - x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=stream_out HTTP/1.1" 200 - x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D5fed3813f7f5e1824344fdc9cf8f63bb662c292d x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=capabilities HTTP/1.1" 401 - + "GET /?cmd=capabilities HTTP/1.1" 200 - + "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=getbundle HTTP/1.1" 200 - x-hgarg-1:common=0000000000000000000000000000000000000000&heads=5fed3813f7f5e1824344fdc9cf8f63bb662c292d x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull @@ -373,18 +365,14 @@ server has pull-based clones disabled [100] -#if no-reposimplestore ... but keep stream clones working $ hg clone --stream --noupdate http://localhost:$HGPORT1/ test-stream-clone streaming all changes * files to transfer, * of data (glob) - transferred 1.36 KB in * seconds (* */sec) (glob) (no-zstd !) - transferred 1.38 KB in * seconds (* */sec) (glob) (zstd no-rust !) - transferred 1.56 KB in * seconds (* */sec) (glob) (zstd rust !) + stream-cloned * files / * KB in * seconds (* */sec) (glob) searching for changes no changes found -#endif ... and also keep partial clones and pulls working $ hg clone http://localhost:$HGPORT1 --rev 0 test-partial-clone
--- a/tests/test-http-proxy.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-http-proxy.t Fri Feb 28 23:28:10 2025 +0100 @@ -16,10 +16,8 @@ $ http_proxy=http://localhost:$HGPORT1/ hg --config http_proxy.always=True clone --stream http://localhost:$HGPORT/ b streaming all changes - 7 files to transfer, 412 bytes of data (reporevlogstore no-rust !) - 9 files to transfer, 538 bytes of data (reporevlogstore rust !) - 4 files to transfer, 330 bytes of data (reposimplestore !) - transferred * bytes in * seconds (*/sec) (glob) + * files to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (*/sec) (glob) updating to branch default 1 files updated, 0 files merged, 0 files removed, 0 files unresolved $ cd b
--- a/tests/test-http.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-http.t Fri Feb 28 23:28:10 2025 +0100 @@ -26,17 +26,13 @@ clone via stream -#if no-reposimplestore $ hg clone --stream http://localhost:$HGPORT/ copy 2>&1 streaming all changes - 10 files to transfer, 715 bytes of data (no-zstd !) - 10 files to transfer, 717 bytes of data (zstd no-rust !) - 12 files to transfer, 843 bytes of data (zstd rust !) - transferred * bytes in * seconds (*/sec) (glob) + * files to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (*/sec) (glob) updating to branch default 4 files updated, 0 files merged, 0 files removed, 0 files unresolved $ hg verify -R copy -q -#endif try to clone via stream, should use pull instead @@ -252,15 +248,12 @@ $ hg id http://localhost:$HGPORT2/ --config extensions.x=use_digests.py 5fed3813f7f5 -#if no-reposimplestore $ hg clone http://user:pass@localhost:$HGPORT2/ dest 2>&1 streaming all changes - 11 files to transfer, 1.01 KB of data (no-rust !) - 13 files to transfer, 1.13 KB of data (rust !) - transferred * KB in * seconds (*/sec) (glob) + * files to transfer, * KB of data (glob) + stream-cloned * files / * KB in * seconds (*/sec) (glob) updating to branch default 5 files updated, 0 files merged, 0 files removed, 0 files unresolved -#endif --pull should override server's preferuncompressed $ hg clone --pull http://user:pass@localhost:$HGPORT2/ dest-pull 2>&1 @@ -419,12 +412,12 @@ "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=namespaces x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull x-hgtest-authtype:Digest "GET /?cmd=listkeys HTTP/1.1" 401 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull x-hgtest-authtype:Digest "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull x-hgtest-authtype:Digest - "GET /?cmd=capabilities HTTP/1.1" 401 - (no-reposimplestore !) - "GET /?cmd=capabilities HTTP/1.1" 200 - (no-reposimplestore !) - "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=getbundle HTTP/1.1" 200 - x-hgarg-1:bookmarks=1&$USUAL_BUNDLE_CAPS$&cg=0&common=0000000000000000000000000000000000000000&heads=5fed3813f7f5e1824344fdc9cf8f63bb662c292d&listkeys=bookmarks&stream=1 x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull (no-reposimplestore !) - "GET /?cmd=capabilities HTTP/1.1" 401 - (no-reposimplestore !) - "GET /?cmd=capabilities HTTP/1.1" 200 - (no-reposimplestore !) 
+ "GET /?cmd=capabilities HTTP/1.1" 401 - + "GET /?cmd=capabilities HTTP/1.1" 200 - + "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=getbundle HTTP/1.1" 200 - x-hgarg-1:bookmarks=1&$USUAL_BUNDLE_CAPS$&cg=0&common=0000000000000000000000000000000000000000&heads=5fed3813f7f5e1824344fdc9cf8f63bb662c292d&listkeys=bookmarks&stream=1 x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull + "GET /?cmd=capabilities HTTP/1.1" 401 - + "GET /?cmd=capabilities HTTP/1.1" 200 - "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=getbundle HTTP/1.1" 200 - x-hgarg-1:bookmarks=1&$USUAL_BUNDLE_CAPS$&cg=1&common=0000000000000000000000000000000000000000&heads=5fed3813f7f5e1824344fdc9cf8f63bb662c292d&listkeys=bookmarks&phases=1 x-hgproto-1:0.1 0.2 comp=$USUAL_COMPRESSIONS$ partial-pull "GET /?cmd=capabilities HTTP/1.1" 401 - @@ -513,15 +506,13 @@ (remove --pull if specified or upgrade Mercurial) [100] -#if no-reposimplestore ... but keep stream clones working $ hg clone --stream --noupdate http://localhost:$HGPORT1/ test-stream-clone streaming all changes * files to transfer, * of data (glob) - transferred * in * seconds (*/sec) (glob) + stream-cloned * files / * in * seconds (*/sec) (glob) $ cat error.log -#endif ... and also keep partial clones and pulls working $ hg clone http://localhost:$HGPORT1 --rev 0 test/partial/clone
--- a/tests/test-inherit-mode.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-inherit-mode.t Fri Feb 28 23:28:10 2025 +0100 @@ -85,15 +85,9 @@ 00660 ./.hg/store/00manifest.i 00770 ./.hg/store/data/ 00770 ./.hg/store/data/dir/ - 00660 ./.hg/store/data/dir/bar.i (reporevlogstore !) - 00660 ./.hg/store/data/foo.i (reporevlogstore !) - 00770 ./.hg/store/data/dir/bar/ (reposimplestore !) - 00660 ./.hg/store/data/dir/bar/b80de5d138758541c5f05265ad144ab9fa86d1db (reposimplestore !) - 00660 ./.hg/store/data/dir/bar/index (reposimplestore !) - 00770 ./.hg/store/data/foo/ (reposimplestore !) - 00660 ./.hg/store/data/foo/b80de5d138758541c5f05265ad144ab9fa86d1db (reposimplestore !) - 00660 ./.hg/store/data/foo/index (reposimplestore !) - 00660 ./.hg/store/fncache (repofncache !) + 00660 ./.hg/store/data/dir/bar.i + 00660 ./.hg/store/data/foo.i + 00660 ./.hg/store/fncache 00660 ./.hg/store/phaseroots 00600 ./.hg/store/requires 00660 ./.hg/store/undo @@ -104,7 +98,7 @@ 00711 ./.hg/wcache/checkisexec 007.. ./.hg/wcache/checklink (re) 00600 ./.hg/wcache/checklink-target - 00660 ./.hg/wcache/manifestfulltextcache (reporevlogstore !) + 00660 ./.hg/wcache/manifestfulltextcache 00700 ./dir/ 00600 ./dir/bar 00600 ./foo @@ -147,15 +141,9 @@ 00660 ../push/.hg/store/00manifest.i 00770 ../push/.hg/store/data/ 00770 ../push/.hg/store/data/dir/ - 00660 ../push/.hg/store/data/dir/bar.i (reporevlogstore !) - 00660 ../push/.hg/store/data/foo.i (reporevlogstore !) - 00770 ../push/.hg/store/data/dir/bar/ (reposimplestore !) - 00660 ../push/.hg/store/data/dir/bar/b80de5d138758541c5f05265ad144ab9fa86d1db (reposimplestore !) - 00660 ../push/.hg/store/data/dir/bar/index (reposimplestore !) - 00770 ../push/.hg/store/data/foo/ (reposimplestore !) - 00660 ../push/.hg/store/data/foo/b80de5d138758541c5f05265ad144ab9fa86d1db (reposimplestore !) - 00660 ../push/.hg/store/data/foo/index (reposimplestore !) - 00660 ../push/.hg/store/fncache (repofncache !) + 00660 ../push/.hg/store/data/dir/bar.i + 00660 ../push/.hg/store/data/foo.i + 00660 ../push/.hg/store/fncache 00660 ../push/.hg/store/requires 00660 ../push/.hg/store/undo 00660 ../push/.hg/store/undo.backupfiles
--- a/tests/test-init.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-init.t Fri Feb 28 23:28:10 2025 +0100 @@ -28,7 +28,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) $ echo this > local/foo $ hg ci --cwd local -A -m "init" adding foo @@ -67,7 +66,6 @@ persistent-nodemap (rust !) revlog-compression-zstd (zstd !) revlogv1 - testonly-simplestore (reposimplestore !) sparserevlog creating repo with format.usefncache=false @@ -84,7 +82,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) creating repo with format.dotencode=false @@ -101,7 +98,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) creating repo with format.dotencode=false @@ -117,7 +113,6 @@ revlogv1 share-safe store - testonly-simplestore (reposimplestore !) test failure @@ -239,7 +234,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) prepare test of init of url configured from paths @@ -263,7 +257,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) verify that clone also expand urls @@ -283,7 +276,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) clone bookmarks
--- a/tests/test-keyword.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-keyword.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - Run kwdemo outside a repo $ hg -q --config extensions.keyword= --config keywordmaps.Foo="{author|user}" kwdemo [extensions]
--- a/tests/test-largefiles-update.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-largefiles-update.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - This file focuses mainly on updating largefiles in the working directory (and ".hg/largefiles/dirstate")
--- a/tests/test-lfconvert.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-lfconvert.t Fri Feb 28 23:28:10 2025 +0100 @@ -106,7 +106,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) "lfconvert" includes a newline at the end of the standin files. $ cat .hglf/large .hglf/sub/maybelarge.dat
--- a/tests/test-lfs-largefiles.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-lfs-largefiles.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require no-reposimplestore no-chg +#require no-chg This tests the interaction between the largefiles and lfs extensions, and conversion from largefiles -> lfs.
--- a/tests/test-lfs-serve-access.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-lfs-serve-access.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require serve no-reposimplestore no-chg +#require serve no-chg $ cat >> $HGRCPATH <<EOF > [extensions]
--- a/tests/test-lfs-serve.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-lfs-serve.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,5 @@ #testcases lfsremote-on lfsremote-off -#require serve no-reposimplestore no-chg +#require serve no-chg This test splits `hg serve` with and without using the extension into separate test cases. The tests are broken down as follows, where "LFS"/"No-LFS"
--- a/tests/test-lfs-test-server.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-lfs-test-server.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require no-reposimplestore no-chg +#require no-chg #testcases git-server hg-server #if git-server
--- a/tests/test-lfs.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-lfs.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require no-reposimplestore no-chg +#require no-chg $ hg init requirements $ cd requirements
--- a/tests/test-mq-pull-from-bundle.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-mq-pull-from-bundle.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require repobundlerepo - $ cat <<EOF >> $HGRCPATH > [extensions] > mq=
--- a/tests/test-narrow-clone-no-ellipsis.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-clone-no-ellipsis.t Fri Feb 28 23:28:10 2025 +0100 @@ -33,7 +33,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) $ hg tracked I path:dir/src/f10
--- a/tests/test-narrow-clone-stream.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-clone-stream.t Fri Feb 28 23:28:10 2025 +0100 @@ -51,7 +51,7 @@ $ hg clone --narrow ssh://user@dummy/master narrow --noupdate --include "dir/src/F10" --stream streaming all changes * files to transfer, * KB of data (glob) - transferred * KB in * seconds (* */sec) (glob) + stream-cloned * files / * KB in * seconds (* */sec) (glob) $ cd narrow $ ls -A
--- a/tests/test-narrow-clone.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-clone.t Fri Feb 28 23:28:10 2025 +0100 @@ -61,7 +61,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) $ hg tracked I path:dir/src/f10
--- a/tests/test-narrow-exchange.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-exchange.t Fri Feb 28 23:28:10 2025 +0100 @@ -105,8 +105,7 @@ remote: adding file changes remote: transaction abort! remote: rollback completed - remote: abort: data/inside2/f@4a1aa07735e673e20c00fae80f40dc301ee30616: unknown parent (reporevlogstore !) - remote: abort: data/inside2/f/index@4a1aa07735e6: no node (reposimplestore !) + remote: abort: data/inside2/f@4a1aa07735e673e20c00fae80f40dc301ee30616: unknown parent abort: stream ended unexpectedly (got 0 bytes, expected 4) [255]
--- a/tests/test-narrow-patterns.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-patterns.t Fri Feb 28 23:28:10 2025 +0100 @@ -190,9 +190,7 @@ comparing with ssh://user@dummy/master searching for changes looking for local changes to affected paths - deleting data/dir1/dirA/bar.i (reporevlogstore !) - deleting data/dir1/dirA/bar/0eca1d0cbdaea4651d1d04d71976a6d2d9bfaae5 (reposimplestore !) - deleting data/dir1/dirA/bar/index (reposimplestore !) + deleting data/dir1/dirA/bar.i deleting unwanted files from working copy saved backup bundle to $TESTTMP/narrow/.hg/strip-backup/*-widen.hg (glob) adding changesets @@ -247,9 +245,7 @@ comparing with ssh://user@dummy/master searching for changes looking for local changes to affected paths - deleting data/dir1/dirA/foo.i (reporevlogstore !) - deleting data/dir1/dirA/foo/162caeb3d55dceb1fee793aa631ac8c73fcb8b5e (reposimplestore !) - deleting data/dir1/dirA/foo/index (reposimplestore !) + deleting data/dir1/dirA/foo.i deleting unwanted files from working copy saved backup bundle to $TESTTMP/narrow/.hg/strip-backup/*-widen.hg (glob) adding changesets
--- a/tests/test-narrow-shallow-merges.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-shallow-merges.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - $ . "$TESTDIR/narrow-library.sh" create full repo
--- a/tests/test-narrow-shallow.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-shallow.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - $ . "$TESTDIR/narrow-library.sh" $ hg init master
--- a/tests/test-narrow-strip.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow-strip.t Fri Feb 28 23:28:10 2025 +0100 @@ -124,7 +124,6 @@ $ hg strip . 1 files updated, 0 files merged, 0 files removed, 0 files unresolved saved backup bundle to $TESTTMP/narrow/.hg/strip-backup/*-backup.hg (glob) -#if repobundlerepo $ hg pull .hg/strip-backup/*-backup.hg pulling from .hg/strip-backup/*-backup.hg (glob) searching for changes @@ -171,4 +170,3 @@ added 3 changesets with 2 changes to 1 files (+1 heads) new changesets *:* (glob) (run 'hg heads' to see heads, 'hg merge' to merge) -#endif
--- a/tests/test-narrow.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-narrow.t Fri Feb 28 23:28:10 2025 +0100 @@ -59,6 +59,12 @@ $ hg clone --narrow ssh://user@dummy/master foo --include a/./c abort: "." and ".." are not allowed in narrowspec paths [255] + $ hg clone --narrow ssh://user@dummy/master foo --include ' ' + abort: leading or trailing whitespace is not allowed in narrowspec paths + [255] + $ hg clone --narrow ssh://user@dummy/master foo --include 'a//c' + abort: empty path components are not allowed in narrowspec paths + [255] Names with '.' in them are OK. $ hg clone --narrow ./master should-work --include a/.b/c @@ -139,11 +145,8 @@ * (glob) moving unwanted changesets to backup saved backup bundle to $TESTTMP/narrow-local-changes/.hg/strip-backup/*-narrow.hg (glob) - deleting data/d0/f.i (reporevlogstore !) + deleting data/d0/f.i deleting meta/d0/00manifest.i (tree !) - deleting data/d0/f/362fef284ce2ca02aecc8de6d5e8a1c3af0556fe (reposimplestore !) - deleting data/d0/f/4374b5650fc5ae54ac857c0f0381971fdde376f7 (reposimplestore !) - deleting data/d0/f/index (reposimplestore !) deleting unwanted files from working copy $ hg log -T "{rev}: {desc} {outsidenarrow}\n" @@ -173,11 +176,8 @@ looking for local changes to affected paths moving unwanted changesets to backup saved backup bundle to $TESTTMP/narrow-local-changes/.hg/strip-backup/*-narrow.hg (glob) - deleting data/d0/f.i (reporevlogstore !) + deleting data/d0/f.i deleting meta/d0/00manifest.i (tree !) - deleting data/d0/f/362fef284ce2ca02aecc8de6d5e8a1c3af0556fe (reposimplestore !) - deleting data/d0/f/4374b5650fc5ae54ac857c0f0381971fdde376f7 (reposimplestore !) - deleting data/d0/f/index (reposimplestore !) deleting unwanted files from working copy Updates off of stripped commit if necessary @@ -194,11 +194,8 @@ 2 files updated, 0 files merged, 0 files removed, 0 files unresolved moving unwanted changesets to backup saved backup bundle to $TESTTMP/narrow-local-changes/.hg/strip-backup/*-narrow.hg (glob) - deleting data/d3/f.i (reporevlogstore !) + deleting data/d3/f.i deleting meta/d3/00manifest.i (tree !) - deleting data/d3/f/2661d26c649684b482d10f91960cc3db683c38b4 (reposimplestore !) - deleting data/d3/f/99fa7136105a15e2045ce3d9152e4837c5349e4d (reposimplestore !) - deleting data/d3/f/index (reposimplestore !) deleting unwanted files from working copy $ hg log -T '{desc}\n' -r . add d10/f @@ -219,11 +216,8 @@ 0 files updated, 0 files merged, 1 files removed, 0 files unresolved moving unwanted changesets to backup saved backup bundle to $TESTTMP/narrow-local-changes/.hg/strip-backup/*-narrow.hg (glob) - deleting data/d3/f.i (reporevlogstore !) + deleting data/d3/f.i deleting meta/d3/00manifest.i (tree !) - deleting data/d3/f/2661d26c649684b482d10f91960cc3db683c38b4 (reposimplestore !) - deleting data/d3/f/5ce0767945cbdbca3b924bb9fbf5143f72ab40ac (reposimplestore !) - deleting data/d3/f/index (reposimplestore !) deleting unwanted files from working copy $ hg id 000000000000 @@ -281,10 +275,8 @@ comparing with ssh://user@dummy/master searching for changes looking for local changes to affected paths - deleting data/d0/f.i (reporevlogstore !) + deleting data/d0/f.i deleting meta/d0/00manifest.i (tree !) - deleting data/d0/f/362fef284ce2ca02aecc8de6d5e8a1c3af0556fe (reposimplestore !) - deleting data/d0/f/index (reposimplestore !) 
deleting unwanted files from working copy $ hg tracked $ hg files @@ -342,19 +334,15 @@ comparing with ssh://user@dummy/master searching for changes looking for local changes to affected paths - deleting data/d6/f.i (reporevlogstore !) + deleting data/d6/f.i deleting meta/d6/00manifest.i (tree !) - deleting data/d6/f/7339d30678f451ac8c3f38753beeb4cf2e1655c7 (reposimplestore !) - deleting data/d6/f/index (reposimplestore !) deleting unwanted files from working copy $ hg tracked I path:d0 I path:d3 I path:d9 -#if repofncache $ hg debugrebuildfncache fncache already up to date -#endif $ find * d0 d0/f @@ -367,19 +355,15 @@ comparing with ssh://user@dummy/master searching for changes looking for local changes to affected paths - deleting data/d3/f.i (reporevlogstore !) - deleting data/d3/f/2661d26c649684b482d10f91960cc3db683c38b4 (reposimplestore !) - deleting data/d3/f/index (reposimplestore !) + deleting data/d3/f.i deleting unwanted files from working copy $ hg tracked I path:d0 I path:d3 I path:d9 X path:d3/f -#if repofncache $ hg debugrebuildfncache fncache already up to date -#endif $ find * d0 d0/f @@ -390,20 +374,16 @@ comparing with ssh://user@dummy/master searching for changes looking for local changes to affected paths - deleting data/d0/f.i (reporevlogstore !) + deleting data/d0/f.i deleting meta/d0/00manifest.i (tree !) - deleting data/d0/f/362fef284ce2ca02aecc8de6d5e8a1c3af0556fe (reposimplestore !) - deleting data/d0/f/index (reposimplestore !) deleting unwanted files from working copy $ hg tracked I path:d3 I path:d9 X path:d0 X path:d3/f -#if repofncache $ hg debugrebuildfncache fncache already up to date -#endif $ find * d9 d9/f
--- a/tests/test-obsolete.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-obsolete.t Fri Feb 28 23:28:10 2025 +0100 @@ -1406,7 +1406,6 @@ o 0:4b34ecfb0d56 (draft) [ ] A -#if repobundlerepo $ hg incoming ../repo-bundleoverlay --bundle ../bundleoverlay.hg comparing with ../repo-bundleoverlay searching for changes @@ -1419,7 +1418,6 @@ |/ o 0:4b34ecfb0d56 (draft) [ ] A -#endif #if serve @@ -1606,12 +1604,10 @@ phase-heads -- {} (mandatory: True) e008cf2834908e5d6b0f792a9d4b0e2272260fb8 draft -#if repobundlerepo $ hg pull .hg/strip-backup/e008cf283490-*-backup.hg pulling from .hg/strip-backup/e008cf283490-ede36964-backup.hg searching for changes no changes found -#endif $ hg debugobsolete e008cf2834908e5d6b0f792a9d4b0e2272260fb8 b0551702f918510f01ae838ab03a463054c67b46 0 (Thu Jan 01 00:00:00 1970 +0000) {'ef1': '8', 'operation': 'amend', 'user': 'test'} $ hg log -G
--- a/tests/test-parseindex2.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-parseindex2.py Fri Feb 28 23:28:10 2025 +0100 @@ -135,7 +135,7 @@ def parse_index2(data, inline, format=constants.REVLOGV1): - index, chunkcache = parsers.parse_index2(data, inline, format=format) + index, chunkcache = parsers.parse_index2(data, inline, False, format=format) return list(index), chunkcache @@ -226,7 +226,7 @@ def testbadargs(self): # Check that parse_index2() raises TypeError on bad arguments. with self.assertRaises(TypeError): - parse_index2(0, True) + parse_index2(0, True, False) def testparseindexfile(self): # Check parsers.parse_index2() on an index file against the @@ -241,7 +241,7 @@ got = parse_index2(data_non_inlined, False) self.assertEqual(want, got) # no inline data - ix = parsers.parse_index2(data_inlined, True)[0] + ix = parsers.parse_index2(data_inlined, True, False)[0] for i, r in enumerate(ix): if r[7] == sha1nodeconstants.nullid: i = -1 @@ -271,16 +271,16 @@ constants.COMP_MODE_INLINE, constants.RANK_UNKNOWN, ) - index, junk = parsers.parse_index2(data_inlined, True) + index, junk = parsers.parse_index2(data_inlined, True, False) got = index[-1] self.assertEqual(want, got) # inline data - index, junk = parsers.parse_index2(data_non_inlined, False) + index, junk = parsers.parse_index2(data_non_inlined, False, False) got = index[-1] self.assertEqual(want, got) # no inline data def testdelitemwithoutnodetree(self): - index, _junk = parsers.parse_index2(data_non_inlined, False) + index, _junk = parsers.parse_index2(data_non_inlined, False, False) def hexrev(rev): if rev == nullrev:
--- a/tests/test-permissions.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-permissions.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require unix-permissions no-root reporevlogstore +#require unix-permissions no-root #testcases dirstate-v1 dirstate-v2
--- a/tests/test-phases-exchange.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-phases-exchange.t Fri Feb 28 23:28:10 2025 +0100 @@ -548,7 +548,6 @@ Pulling from bundle does not alter phases of changeset not present in the bundle -#if repobundlerepo $ hg bundle --base 1 -r 6 -r 3 ../partial-bundle.hg 5 changesets found $ hg pull ../partial-bundle.hg @@ -578,7 +577,6 @@ | o 0 public a-A - 054250a37db4 -#endif Pushing to Publish=False (unknown changeset)
--- a/tests/test-push.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-push.t Fri Feb 28 23:28:10 2025 +0100 @@ -123,7 +123,6 @@ updating to branch default 2 files updated, 0 files merged, 0 files removed, 0 files unresolved -#if reporevlogstore Test spurious filelog entries: @@ -190,8 +189,6 @@ $ cd .. -#endif - Test push hook locking =====================
--- a/tests/test-relink.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-relink.t Fri Feb 28 23:28:10 2025 +0100 @@ -75,8 +75,6 @@ relink -#if no-reposimplestore - $ hg relink --debug --config progress.debug=true | fix_path relinking $TESTTMP/repo/.hg/store to $TESTTMP/clone/.hg/store tip has 2 files, estimated total number of files: 3 @@ -106,5 +104,3 @@ repo/.hg/store/data/a.i == clone/.hg/store/data/a.i $ "$PYTHON" arelinked.py repo/.hg/store/data/b.i clone/.hg/store/data/b.i repo/.hg/store/data/b.i != clone/.hg/store/data/b.i - -#endif
--- a/tests/test-remotefilelog-bgprefetch.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-bgprefetch.t Fri Feb 28 23:28:10 2025 +0100 @@ -30,11 +30,11 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes 3 files to transfer, 776 bytes of data (no-zstd !) - transferred 776 bytes in * seconds (*/sec) (glob) (no-zstd !) + stream-cloned 3 files / 776 bytes in * seconds (*/sec) (glob) (no-zstd !) 3 files to transfer, 784 bytes of data (zstd no-rust !) - transferred 784 bytes in * seconds (*/sec) (glob) (zstd no-rust !) + stream-cloned 3 files / 784 bytes in * seconds (*/sec) (glob) (zstd no-rust !) 5 files to transfer, 911 bytes of data (rust !) - transferred 911 bytes in * seconds (*/sec) (glob) (rust !) + stream-cloned 5 files / 911 bytes in * seconds (*/sec) (glob) (rust !) searching for changes no changes found
--- a/tests/test-remotefilelog-clone-tree.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-clone-tree.t Fri Feb 28 23:28:10 2025 +0100 @@ -21,9 +21,9 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes 5 files to transfer, 449 bytes of data (no-rust !) - transferred 449 bytes in * seconds (*/sec) (glob) (no-rust !) + stream-cloned 5 files / 449 bytes in * seconds (*/sec) (glob) (no-rust !) 7 files to transfer, 575 bytes of data (rust !) - transferred 575 bytes in *.* seconds (*) (glob) (rust !) + stream-cloned 7 files / 575 bytes in *.* seconds (*) (glob) (rust !) searching for changes no changes found $ cd shallow @@ -68,9 +68,9 @@ $ hgcloneshallow ssh://user@dummy/shallow shallow2 --noupdate streaming all changes 6 files to transfer, 1008 bytes of data (no-rust !) - transferred 1008 bytes in * seconds (*/sec) (glob) (no-rust !) + stream-cloned 6 files / 1008 bytes in * seconds (*/sec) (glob) (no-rust !) 8 files to transfer, 1.11 KB of data (rust !) - transferred 1.11 KB in * seconds (* */sec) (glob) (rust !) + stream-cloned 8 files / 1.11 KB in * seconds (* */sec) (glob) (rust !) searching for changes no changes found $ cd shallow2
--- a/tests/test-remotefilelog-clone.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-clone.t Fri Feb 28 23:28:10 2025 +0100 @@ -17,10 +17,8 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes - 3 files to transfer, 227 bytes of data (no-rust !) - transferred 227 bytes in * seconds (*/sec) (glob) (no-rust !) - 5 files to transfer, 353 bytes of data (rust !) - transferred 353 bytes in *.* seconds (*) (glob) (rust !) + * to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (*/sec) (glob) searching for changes no changes found $ cd shallow @@ -57,10 +55,8 @@ $ hgcloneshallow ssh://user@dummy/shallow shallow2 --noupdate streaming all changes - 4 files to transfer, 564 bytes of data (no-rust !) - transferred 564 bytes in * seconds (*/sec) (glob) (no-rust !) - 6 files to transfer, 690 bytes of data (rust !) - transferred 690 bytes in * seconds (*/sec) (glob) (rust !) + * to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (*/sec) (glob) searching for changes no changes found $ cd shallow2
--- a/tests/test-remotefilelog-datapack.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-datapack.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 import hashlib import os
--- a/tests/test-remotefilelog-histpack.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-histpack.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 import hashlib import os
--- a/tests/test-remotefilelog-log.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-log.t Fri Feb 28 23:28:10 2025 +0100 @@ -21,9 +21,9 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes 3 files to transfer, 473 bytes of data (no-rust !) - transferred 473 bytes in * seconds (*/sec) (glob) (no-rust !) + stream-cloned 3 files / 473 bytes in * seconds (*/sec) (glob) (no-rust !) 5 files to transfer, 599 bytes of data (rust !) - transferred 599 bytes in * seconds (*/sec) (glob) (rust !) + stream-cloned 5 files / 599 bytes in * seconds (*/sec) (glob) (rust !) searching for changes no changes found $ cd shallow
--- a/tests/test-remotefilelog-partial-shallow.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-partial-shallow.t Fri Feb 28 23:28:10 2025 +0100 @@ -19,11 +19,11 @@ $ hg clone --shallow ssh://user@dummy/master shallow --noupdate --config remotefilelog.includepattern=foo streaming all changes 4 files to transfer, 336 bytes of data (no-zstd !) - transferred 336 bytes in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 4 files / 336 bytes in * seconds (* */sec) (glob) (no-zstd !) 4 files to transfer, 338 bytes of data (zstd no-rust !) - transferred 338 bytes in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 4 files / 338 bytes in * seconds (* */sec) (glob) (zstd no-rust !) 6 files to transfer, 464 bytes of data (zstd rust !) - transferred 464 bytes in * seconds (*/sec) (glob) (zstd rust !) + stream-cloned 6 files / 464 bytes in * seconds (*/sec) (glob) (zstd rust !) searching for changes no changes found $ cat >> shallow/.hg/hgrc <<EOF
--- a/tests/test-remotefilelog-prefetch.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-prefetch.t Fri Feb 28 23:28:10 2025 +0100 @@ -23,11 +23,11 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes 3 files to transfer, 528 bytes of data (no-zstd !) - transferred 528 bytes in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 3 files / 528 bytes in * seconds (* */sec) (glob) (no-zstd !) 3 files to transfer, 532 bytes of data (zstd no-rust !) - transferred 532 bytes in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 3 files / 532 bytes in * seconds (* */sec) (glob) (zstd no-rust !) 5 files to transfer, 659 bytes of data (zstd rust !) - transferred 659 bytes in * seconds (*/sec) (glob) (zstd rust !) + stream-cloned 5 files / 659 bytes in * seconds (*/sec) (glob) (zstd rust !) searching for changes no changes found $ cd shallow @@ -169,11 +169,11 @@ $ hgcloneshallow ssh://user@dummy/master shallow2 streaming all changes 3 files to transfer, 528 bytes of data (no-zstd !) - transferred 528 bytes in * seconds * (glob) (no-zstd !) + stream-cloned 3 files / 528 bytes in * seconds * (glob) (no-zstd !) 3 files to transfer, 532 bytes of data (zstd no-rust !) - transferred 532 bytes in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 3 files / 532 bytes in * seconds (* */sec) (glob) (zstd no-rust !) 5 files to transfer, 659 bytes of data (zstd rust !) - transferred 659 bytes in * seconds (*/sec) (glob) (zstd rust !) + stream-cloned 5 files / 659 bytes in * seconds (*/sec) (glob) (zstd rust !) searching for changes no changes found updating to branch default
--- a/tests/test-remotefilelog-sparse.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-sparse.t Fri Feb 28 23:28:10 2025 +0100 @@ -23,11 +23,11 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes 3 files to transfer, 527 bytes of data (no-zstd !) - transferred 527 bytes in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 3 files / 527 bytes in * seconds (* */sec) (glob) (no-zstd !) 3 files to transfer, 534 bytes of data (zstd no-rust !) - transferred 534 bytes in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 3 files / 534 bytes in * seconds (* */sec) (glob) (zstd no-rust !) 5 files to transfer, 660 bytes of data (zstd rust !) - transferred 660 bytes in * seconds (*/sec) (glob) (zstd rust !) + stream-cloned 5 files / 660 bytes in * seconds (*/sec) (glob) (zstd rust !) searching for changes no changes found $ cd shallow @@ -77,12 +77,8 @@ $ hgcloneshallow ssh://user@dummy/master shallow2 streaming all changes - 3 files to transfer, 527 bytes of data (no-zstd !) - transferred 527 bytes in * seconds (*) (glob) (no-zstd !) - 3 files to transfer, 534 bytes of data (zstd no-rust !) - transferred 534 bytes in * seconds (* */sec) (glob) (zstd no-rust !) - 5 files to transfer, 660 bytes of data (zstd rust !) - transferred 660 bytes in * seconds (*/sec) (glob) (zstd rust !) + * files to transfer, * bytes of data (glob) + stream-cloned * files / * bytes in * seconds (*) (glob) searching for changes no changes found updating to branch default
--- a/tests/test-remotefilelog-tags.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-tags.t Fri Feb 28 23:28:10 2025 +0100 @@ -19,11 +19,11 @@ $ hg clone --shallow ssh://user@dummy/master shallow --noupdate --config remotefilelog.excludepattern=.hgtags streaming all changes 4 files to transfer, 662 bytes of data (no-zstd !) - transferred 662 bytes in * seconds (* */sec) (glob) (no-zstd !) + stream-cloned 4 files / 662 bytes in * seconds (* */sec) (glob) (no-zstd !) 4 files to transfer, 665 bytes of data (zstd no-rust !) - transferred 665 bytes in * seconds (* */sec) (glob) (zstd no-rust !) + stream-cloned 4 files / 665 bytes in * seconds (* */sec) (glob) (zstd no-rust !) 6 files to transfer, 791 bytes of data (zstd rust !) - transferred 791 bytes in * seconds (*/sec) (glob) (zstd rust !) + stream-cloned 6 files / 791 bytes in * seconds (*/sec) (glob) (zstd rust !) searching for changes no changes found $ cat >> shallow/.hg/hgrc <<EOF
--- a/tests/test-remotefilelog-wireproto.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-remotefilelog-wireproto.t Fri Feb 28 23:28:10 2025 +0100 @@ -26,9 +26,9 @@ $ hgcloneshallow ssh://user@dummy/master shallow --noupdate streaming all changes 3 files to transfer, 908 bytes of data (no-rust !) - transferred 908 bytes in * seconds (*/sec) (glob) (no-rust !) + stream-cloned 3 files / 908 bytes in * seconds (*/sec) (glob) (no-rust !) 5 files to transfer, 1.01 KB of data (rust !) - transferred 1.01 KB in * seconds (* */sec) (glob) (rust !) + stream-cloned 5 files / 1.01 KB in * seconds (* */sec) (glob) (rust !) searching for changes no changes found $ cd shallow
--- a/tests/test-repair-strip.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-repair-strip.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#require unix-permissions no-root reporevlogstore +#require unix-permissions no-root $ cat > $TESTTMP/dumpjournal.py <<EOF > import sys
--- a/tests/test-revlog-v2.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-revlog-v2.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require reporevlogstore - A repo with unknown revlogv2 requirement string cannot be opened $ hg init invalidreq
--- a/tests/test-rhg.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-rhg.t Fri Feb 28 23:28:10 2025 +0100 @@ -193,6 +193,15 @@ $ $NO_FALLBACK rhg cat -r 1 copy_of_original original content +Annotate files + $ $NO_FALLBACK rhg annotate original + 0: original content + $ $NO_FALLBACK rhg annotate --rev . --user --file --date --number --changeset \ + > --line-number --text --no-follow --ignore-all-space --ignore-space-change \ + > --ignore-blank-lines --ignore-space-at-eol original + test 0 1c9e69808da7 Thu Jan 01 00:00:00 1970 +0000 original:1: original content + $ $NO_FALLBACK rhg blame -r . -ufdnclawbBZ --no-follow original + test 0 1c9e69808da7 Thu Jan 01 00:00:00 1970 +0000 original:1: original content Fallback to Python $ $NO_FALLBACK rhg cat original --exclude="*.rs" @@ -456,3 +465,9 @@ $ echo "ignored-extensions=*" >> $HGRCPATH $ $NO_FALLBACK rhg files a + +Latin-1 is not supported yet + + $ $NO_FALLBACK HGENCODING=latin-1 rhg root + unsupported feature: HGENCODING value 'latin-1' is not supported + [252]
--- a/tests/test-rust-ancestor.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-rust-ancestor.py Fri Feb 28 23:28:10 2025 +0100 @@ -5,12 +5,11 @@ from mercurial.testing import revlog as revlogtesting try: - from mercurial import pyo3_rustext, rustext + from mercurial import pyo3_rustext - rustext.__name__ # trigger immediate actual import pyo3_rustext.__name__ except ImportError: - rustext = pyo3_rustext = None + pyo3_rustext = None try: from mercurial.cext import parsers as cparsers @@ -43,15 +42,15 @@ @classmethod def ancestors_mod(cls): - return cls.rustext_pkg.ancestor + return pyo3_rustext.ancestor @classmethod def dagop_mod(cls): - return cls.rustext_pkg.dagop + return pyo3_rustext.dagop @classmethod def graph_error(cls): - return cls.rustext_pkg.GraphError + return pyo3_rustext.GraphError def testiteratorrevlist(self): AncestorsIterator = self.ancestors_mod().AncestorsIterator @@ -193,18 +192,6 @@ missanc.removeancestorsfrom(revs) self.assertEqual(revs, {2, 3}) - -class RustCPythonAncestorsTest( - revlogtesting.RustRevlogBasedTestBase, RustAncestorsTestMixin -): - rustext_pkg = rustext - - -class PyO3AncestorsTest( - revlogtesting.RustRevlogBasedTestBase, RustAncestorsTestMixin -): - rustext_pkg = pyo3_rustext - def test_rank(self): dagop = self.dagop_mod()
--- a/tests/test-rust-discovery.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-rust-discovery.py Fri Feb 28 23:28:10 2025 +0100 @@ -3,7 +3,9 @@ from mercurial import policy from mercurial.testing import revlog as revlogtesting -PartialDiscovery = policy.importrust('discovery', member='PartialDiscovery') +PartialDiscovery = policy.importrust( + 'discovery', member='PartialDiscovery', pyo3=True +) try: from mercurial.cext import parsers as cparsers @@ -48,7 +50,7 @@ "rustext or the C Extension parsers module " "discovery relies on is not available", ) -class rustdiscoverytest(revlogtesting.RustRevlogBasedTestBase): +class rustdiscoverytest(revlogtesting.PyO3RevlogBasedTestBase): """Test the correctness of binding to Rust code. This test is merely for the binding to Rust itself: extraction of
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/test-rust-revlog.py Fri Feb 28 23:28:10 2025 +0100 @@ -0,0 +1,210 @@ +import struct + +from mercurial.node import ( + bin as node_bin, + hex, +) +from mercurial import error + +try: + from mercurial import rustext + + rustext.__name__ # trigger immediate actual import +except ImportError: + rustext = None +else: + # this would fail already without appropriate ancestor.__package__ + from mercurial.rustext.ancestor import LazyAncestors + +from mercurial.testing import revlog as revlogtesting + +header = struct.unpack(">I", revlogtesting.data_non_inlined[:4])[0] + + +class RustInnerRevlogTestMixin: + """Common tests for both Rust Python bindings.""" + + node_hex0 = b'd1f4bbb0befc13bd8cd39d0fcdd93b8c078c4a2f' + node0 = node_bin(node_hex0) + bogus_node_hex = b'cafe' * 10 + bogus_node = node_bin(bogus_node_hex) + node_hex2 = b"020a0ec626a192ae360b0269fe2de5ba6f05d1e7" + node2 = node_bin(node_hex2) + + def test_index_nodemap(self): + idx = self.parserustindex() + self.assertTrue(idx.has_node(self.node0)) + self.assertFalse(idx.has_node(self.bogus_node)) + + self.assertEqual(idx.get_rev(self.node0), 0) + self.assertEqual(idx.get_rev(self.node0), 0) + + self.assertEqual(idx.rev(self.node0), 0) + with self.assertRaises(error.RevlogError) as exc_info: + idx.rev(self.bogus_node) + self.assertEqual(exc_info.exception.args, (None,)) + + self.assertEqual(idx.partialmatch(self.node_hex0[:3]), self.node0) + self.assertIsNone(idx.partialmatch(self.bogus_node_hex[:3])) + self.assertEqual(idx.shortest(self.node0), 1) + + def test_len(self): + idx = self.parserustindex() + self.assertEqual(len(idx), 4) + + def test_getitem(self): + idx = self.parserustindex() + as_tuple = (0, 82969, 484626, 0, 0, -1, -1, self.node0, 0, 0, 2, 2, -1) + self.assertEqual(idx[0], as_tuple) + self.assertEqual(idx[self.node0], 0) + + def test_heads(self): + idx = self.parserustindex() + self.assertEqual(idx.headrevs(), [3]) + + def test_index_append(self): + idx = self.parserustindex(data=b'') + self.assertEqual(len(idx), 0) + self.assertIsNone(idx.get_rev(self.node0)) + + non_empty_index = self.parserustindex() + idx.append(non_empty_index[0]) + self.assertEqual(len(idx), 1) + self.assertEqual(idx.get_rev(self.node0), 0) + + def test_index_delitem_single(self): + idx = self.parserustindex() + del idx[2] + self.assertEqual(len(idx), 2) + + # the nodetree is consistent + self.assertEqual(idx.get_rev(self.node0), 0) + self.assertIsNone(idx.get_rev(self.node2)) + + # not an error and does nothing + del idx[-1] + self.assertEqual(len(idx), 2) + + for bogus in (-2, 17): + try: + del idx[bogus] + except ValueError as exc: + # this underlines that we should do better with this message + assert exc.args[0] == ( + f"Inconsistency: Revision {bogus} found in nodemap " + "is not in revlog index" + ) + else: + raise AssertionError( + f"an exception was expected for `del idx[{bogus}]`" + ) + + def test_index_delitem_slice(self): + idx = self.parserustindex() + del idx[2:3] + self.assertEqual(len(idx), 2) + + # not an error and not equivalent to `del idx[0::]` but to + # `del idx[-1]` instead and thus does nothing. 
+ del idx[-1::] + self.assertEqual(len(idx), 2) + + for start, stop in ( + (-2, None), + (17, None), + ): + try: + del idx[start:stop] + except ValueError as exc: + # this underlines that we should do better with this message + assert exc.args[0] == ( + f"Inconsistency: Revision {start} found in nodemap " + "is not in revlog index" + ) + else: + raise AssertionError( + f"an exception was expected for `del idx[{start}:{stop}]`" + ) + + # although the upper bound is way too big, this is not an error: + del idx[0::17] + self.assertEqual(len(idx), 0) + + def test_standalone_nodetree(self): + idx = self.parserustindex() + nt = self.nodetree(idx) + for i in range(4): + nt.insert(i) + + # invalidation is upon mutation *of the index* + self.assertFalse(nt.is_invalidated()) + + bin_nodes = [entry[7] for entry in idx] + hex_nodes = [hex(n) for n in bin_nodes] + + for i, node in enumerate(hex_nodes): + self.assertEqual(nt.prefix_rev_lookup(node), i) + self.assertEqual(nt.prefix_rev_lookup(node[:5]), i) + + # all 4 revisions in idx (standard data set) have different + # first nybbles in their Node IDs, + # hence `nt.shortest()` should return 1 for them, except when + # the leading nybble is 0 (ambiguity with NULL_NODE) + for i, (bin_node, hex_node) in enumerate(zip(bin_nodes, hex_nodes)): + shortest = nt.shortest(bin_node) + expected = 2 if hex_node[0] == ord('0') else 1 + self.assertEqual(shortest, expected) + self.assertEqual(nt.prefix_rev_lookup(hex_node[:shortest]), i) + + # test invalidation (generation poisoning) detection + del idx[3] + self.assertTrue(nt.is_invalidated()) + + def test_reading_context_manager(self): + irl = self.make_inner_revlog() + try: + with irl.reading(): + # not much to do yet + pass + except error.RevlogError as exc: + # well our data file does not even exist + self.assertTrue(b"when reading Just a path/test.d" in exc.args[0]) + + +# Conditional skipping done by the base class +class RustInnerRevlogTest( + revlogtesting.RustRevlogBasedTestBase, RustInnerRevlogTestMixin +): + """For reference""" + + def test_ancestors(self): + rustidx = self.parserustindex() + lazy = LazyAncestors(rustidx, [3], 0, True) + # we have two more references to the index: + # - in its inner iterator for __contains__ and __bool__ + # - in the LazyAncestors instance itself (to spawn new iterators) + self.assertTrue(2 in lazy) + self.assertTrue(bool(lazy)) + self.assertEqual(list(lazy), [3, 2, 1, 0]) + # a second time to validate that we spawn new iterators + self.assertEqual(list(lazy), [3, 2, 1, 0]) + + # let's check bool for an empty one + self.assertFalse(LazyAncestors(rustidx, [0], 0, False)) + + def test_canonical_index_file(self): + irl = self.make_inner_revlog() + self.assertEqual(irl.canonical_index_file, b'test.i') + + +# Conditional skipping done by the base class +class PyO3InnerRevlogTest( + revlogtesting.PyO3RevlogBasedTestBase, RustInnerRevlogTestMixin +): + """Testing new PyO3 bindings, by comparison with rust-cpython bindings.""" + + +if __name__ == '__main__': + import silenttestrunner + + silenttestrunner.main(__name__)
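Aside: the index deletion semantics asserted by `test_index_delitem_single` and `test_index_delitem_slice` above can be condensed into a short sketch (assuming `idx` is the four-entry index built by `parserustindex()`; this only recaps what the tests assert, it is not additional API documentation):

    del idx[2:3]    # truncates from rev 2 onward, leaving 2 entries
    del idx[-1::]   # behaves like `del idx[-1]`: a no-op here
    del idx[0::17]  # the step is ignored, so the index is emptied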
--- a/tests/test-share.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-share.t Fri Feb 28 23:28:10 2025 +0100 @@ -49,7 +49,7 @@ checkisexec (execbit !) checklink (symlink no-rust !) checklink-target (symlink no-rust !) - manifestfulltextcache (reporevlogstore !) + manifestfulltextcache $ ls -1 ../repo1/.hg/cache branch2-served rbc-names-v2
--- a/tests/test-shelve2.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-shelve2.t Fri Feb 28 23:28:10 2025 +0100 @@ -207,12 +207,10 @@ $ hg shelve shelved as default 0 files updated, 0 files merged, 1 files removed, 0 files unresolved -#if repobundlerepo $ hg log -G --template '{rev} {desc|firstline} {author}' -R bundle://.hg/shelved/default.hg -r 'bundle()' --hidden o [48] changes to: commit stuff shelve@localhost (re) | ~ -#endif $ hg log -G --template '{rev} {desc|firstline} {author}' @ [37] commit stuff test (re) |
--- a/tests/test-sparse-requirement.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-sparse-requirement.t Fri Feb 28 23:28:10 2025 +0100 @@ -27,7 +27,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) $ hg debugsparse --config extensions.sparse= --enable-profile frontend.sparse $ ls -A @@ -49,7 +48,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) Client without sparse enabled reacts properly @@ -72,7 +70,6 @@ share-safe sparserevlog store - testonly-simplestore (reposimplestore !) And client without sparse can access
--- a/tests/test-sparse.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-sparse.t Fri Feb 28 23:28:10 2025 +0100 @@ -51,6 +51,14 @@ .hg hide +Test that status --rev --rev and --change ignore sparse rules. + $ hg status --rev null --rev 0 + A hide + A show + $ hg status --change 0 + A hide + A show + Absolute paths outside the repo should just be rejected #if no-windows
--- a/tests/test-sqlitestore.t Fri Feb 28 23:25:42 2025 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,135 +0,0 @@ -#require sqlite no-chg - -The sqlitestore backend leaves transactions around when used with chg. -Since this backend is primarily intended as proof-of-concept for -alternative storage backends, disable it for chg test runs to avoid -the instability. - - $ cat >> $HGRCPATH <<EOF - > [extensions] - > sqlitestore = - > EOF - -New repo should not use SQLite by default - - $ hg init empty-no-sqlite - $ hg debugrequires -R empty-no-sqlite - dotencode - dirstate-v2 (dirstate-v2 !) - fncache - generaldelta - persistent-nodemap (rust !) - revlog-compression-zstd (zstd !) - revlogv1 - share-safe - sparserevlog - store - -storage.new-repo-backend=sqlite is recognized - - $ hg --config storage.new-repo-backend=sqlite init empty-sqlite - $ hg debugrequires -R empty-sqlite - dotencode - dirstate-v2 (dirstate-v2 !) - exp-sqlite-001 - exp-sqlite-comp-001=zstd (zstd !) - exp-sqlite-comp-001=$BUNDLE2_COMPRESSIONS$ (no-zstd !) - fncache - generaldelta - persistent-nodemap (rust !) - revlog-compression-zstd (zstd !) - revlogv1 - share-safe - sparserevlog - store - - $ cat >> $HGRCPATH << EOF - > [storage] - > new-repo-backend = sqlite - > EOF - -Can force compression to zlib - - $ hg --config storage.sqlite.compression=zlib init empty-zlib - $ hg debugrequires -R empty-zlib - dotencode - dirstate-v2 (dirstate-v2 !) - exp-sqlite-001 - exp-sqlite-comp-001=$BUNDLE2_COMPRESSIONS$ - fncache - generaldelta - persistent-nodemap (rust !) - revlog-compression-zstd (zstd !) - revlogv1 - share-safe - sparserevlog - store - -Can force compression to none - - $ hg --config storage.sqlite.compression=none init empty-none - $ hg debugrequires -R empty-none - dotencode - dirstate-v2 (dirstate-v2 !) - exp-sqlite-001 - exp-sqlite-comp-001=none - fncache - generaldelta - persistent-nodemap (rust !) - revlog-compression-zstd (zstd !) - revlogv1 - share-safe - sparserevlog - store - -Can make a local commit - - $ hg init local-commit - $ cd local-commit - $ echo 0 > foo - $ hg commit -A -m initial - adding foo - -That results in a row being inserted into various tables - - $ sqlite3 .hg/store/db.sqlite -init /dev/null << EOF - > SELECT * FROM filepath; - > EOF - 1|foo - - $ sqlite3 .hg/store/db.sqlite -init /dev/null << EOF - > SELECT * FROM fileindex; - > EOF - 1|1|0|-1|-1|0|0|1||6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xaf\x05V\xfe (esc) - - $ sqlite3 .hg/store/db.sqlite -init /dev/null << EOF - > SELECT * FROM delta; - > EOF - 1|1| \xd2\xaf\x8d\xd2"\x01\xdd\x8dH\xe5\xdc\xfc\xae\xd2\x81\xff\x94"\xc7|0 (esc) - - -Tracking multiple files works - - $ echo 1 > bar - $ hg commit -A -m 'add bar' - adding bar - - $ sqlite3 .hg/store/db.sqlite -init /dev/null << EOF - > SELECT * FROM filedata ORDER BY id ASC; - > EOF - 1|1|foo|0|6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xaf\x05V\xfe|-1|-1|0|0|1| (esc) - 2|2|bar|0|\xb8\xe0/d3s\x80!\xa0e\xf9Au\xc7\xcd#\xdb_\x05\xbe|-1|-1|1|0|2| (esc) - -Multiple revisions of a file works - - $ echo a >> foo - $ hg commit -m 'modify foo' - - $ sqlite3 .hg/store/db.sqlite -init /dev/null << EOF - > SELECT * FROM filedata ORDER BY id ASC; - > EOF - 1|1|foo|0|6/\xef(L\xe2\xca\x02\xae\xcc\x8d\xe6\xd5\xe8\xa1\xc3\xaf\x05V\xfe|-1|-1|0|0|1| (esc) - 2|2|bar|0|\xb8\xe0/d3s\x80!\xa0e\xf9Au\xc7\xcd#\xdb_\x05\xbe|-1|-1|1|0|2| (esc) - 3|1|foo|1|\xdd\xb3V\xcd\xde1p@\xf7\x8e\x90\xb8*\x8b,\xe9\x0e\xd6j+|0|-1|2|0|3|1 (esc) - - $ cd ..
--- a/tests/test-ssh-bundle1.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-ssh-bundle1.t Fri Feb 28 23:28:10 2025 +0100 @@ -58,16 +58,10 @@ clone remote via stream -#if no-reposimplestore - $ hg clone --stream ssh://user@dummy/remote local-stream streaming all changes - 5 files to transfer, 602 bytes of data (no-zstd !) - transferred 602 bytes in * seconds (*) (glob) (no-zstd !) - 5 files to transfer, 621 bytes of data (zstd no-rust !) - transferred 621 bytes in * seconds (* */sec) (glob) (zstd no-rust !) - 7 files to transfer, 747 bytes of data (zstd rust !) - transferred 747 bytes in * seconds (*/sec) (glob) (zstd rust !) + * files to transfer, * bytes of data (glob) + stream-cloned * files * bytes in * seconds (*/sec) (glob) searching for changes no changes found updating to branch default @@ -83,12 +77,8 @@ $ hg -R local-stream book mybook $ hg clone --stream ssh://user@dummy/local-stream stream2 streaming all changes - 5 files to transfer, 602 bytes of data (no-zstd !) - transferred 602 bytes in * seconds (*) (glob) (no-zstd !) - 5 files to transfer, 621 bytes of data (zstd no-rust !) - transferred 621 bytes in * seconds (* */sec) (glob) (zstd no-rust !) - 7 files to transfer, 747 bytes of data (zstd rust !) - transferred 747 bytes in * seconds (*/sec) (glob) (zstd rust !) + * files to transfer, * bytes of data (glob) + stream-cloned * files * bytes in * seconds (*/sec) (glob) searching for changes no changes found updating to branch default @@ -99,8 +89,6 @@ $ cd $TESTTMP $ rm -rf local-stream stream2 -#endif - clone remote via pull $ hg clone ssh://user@dummy/remote local @@ -483,9 +471,9 @@ Got arguments 1:user@dummy 2:hg -R nonexistent serve --stdio Got arguments 1:user@dummy 2:hg -R /$TESTTMP/nonexistent serve --stdio (no-msys !) Got arguments 1:user@dummy 2:hg -R remote serve --stdio - Got arguments 1:user@dummy 2:hg -R local-stream serve --stdio (no-reposimplestore !) - Got arguments 1:user@dummy 2:hg -R remote serve --stdio (no-reposimplestore !) - Got arguments 1:user@dummy 2:hg -R remote serve --stdio (no-reposimplestore !) + Got arguments 1:user@dummy 2:hg -R local-stream serve --stdio + Got arguments 1:user@dummy 2:hg -R remote serve --stdio + Got arguments 1:user@dummy 2:hg -R remote serve --stdio Got arguments 1:user@dummy 2:hg -R doesnotexist serve --stdio Got arguments 1:user@dummy 2:hg -R remote serve --stdio Got arguments 1:user@dummy 2:hg -R local serve --stdio
--- a/tests/test-ssh.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-ssh.t Fri Feb 28 23:28:10 2025 +0100 @@ -50,15 +50,10 @@ clone remote via stream -#if no-reposimplestore - $ hg clone --stream ssh://user@dummy/remote local-stream streaming all changes - 9 files to transfer, 827 bytes of data (no-zstd !) - transferred 827 bytes in * seconds (*) (glob) (no-zstd !) - 9 files to transfer, 846 bytes of data (zstd no-rust !) - 11 files to transfer, 972 bytes of data (zstd rust !) - transferred * bytes in * seconds (* */sec) (glob) (zstd !) + * files to transfer, * bytes of data (glob) + stream-cloned * bytes in * seconds (* */sec) (glob) updating to branch default 2 files updated, 0 files merged, 0 files removed, 0 files unresolved $ cd local-stream @@ -72,9 +67,8 @@ $ hg -R local-stream book mybook $ hg clone --stream ssh://user@dummy/local-stream stream2 streaming all changes - 12 files to transfer, * of data (glob) (no-rust !) - 14 files to transfer, * of data (glob) (rust !) - transferred * in * seconds (*) (glob) + * files to transfer, * of data (glob) + stream-cloned * files / * in * seconds (*) (glob) updating to branch default 2 files updated, 0 files merged, 0 files removed, 0 files unresolved $ cd stream2 @@ -83,8 +77,6 @@ $ cd $TESTTMP $ rm -rf local-stream stream2 -#endif - clone remote via pull $ hg clone ssh://user@dummy/remote local @@ -555,9 +547,9 @@ Got arguments 1:user@dummy 2:hg -R nonexistent serve --stdio Got arguments 1:user@dummy 2:hg -R $TESTTMP/nonexistent serve --stdio Got arguments 1:user@dummy 2:hg -R remote serve --stdio - Got arguments 1:user@dummy 2:hg -R local-stream serve --stdio (no-reposimplestore !) - Got arguments 1:user@dummy 2:hg -R remote serve --stdio (no-reposimplestore !) - Got arguments 1:user@dummy 2:hg -R remote serve --stdio (no-reposimplestore !) + Got arguments 1:user@dummy 2:hg -R local-stream serve --stdio + Got arguments 1:user@dummy 2:hg -R remote serve --stdio + Got arguments 1:user@dummy 2:hg -R remote serve --stdio Got arguments 1:user@dummy 2:hg -R doesnotexist serve --stdio Got arguments 1:user@dummy 2:hg -R remote serve --stdio Got arguments 1:user@dummy 2:hg -R local serve --stdio
--- a/tests/test-static-http.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-static-http.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - $ hg clone http://localhost:$HGPORT/ copy abort: * (glob) [100]
--- a/tests/test-status-inprocess.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-status-inprocess.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 import sys
--- a/tests/test-stdio.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-stdio.py Fri Feb 28 23:28:10 2025 +0100 @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """ Tests the buffering behavior of stdio streams in `mercurial.utils.procutil`. """
--- a/tests/test-storage.py Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-storage.py Fri Feb 28 23:28:10 2025 +0100 @@ -14,30 +14,6 @@ from mercurial.testing import storage as storagetesting -try: - from mercurial import rustext - - rustext.__name__ - # Does not pass with pure Rust index - import sys - - sys.exit(80) -except ImportError: - pass - -try: - from hgext import sqlitestore -except ImportError: - sqlitestore = None - -try: - import sqlite3 - - if sqlite3.sqlite_version_info < (3, 8, 3): - # WITH clause not supported - sqlitestore = None -except ImportError: - pass try: from mercurial import zstd @@ -117,62 +93,5 @@ makefilefn, maketransaction, addrawrevision ) - -def makesqlitefile(self): - path = STATE['vfs'].join(b'db-%d.db' % STATE['lastindex']) - STATE['lastindex'] += 1 - - db = sqlitestore.makedb(path) - - compression = b'zstd' if zstd else b'zlib' - - return sqlitestore.sqlitefilestore(db, b'dummy-path', compression) - - -def addrawrevisionsqlite( - self, - fl, - tr, - node, - p1, - p2, - linkrev, - rawtext=None, - delta=None, - censored=False, - ellipsis=False, - extstored=False, -): - flags = 0 - - if censored: - flags |= sqlitestore.FLAG_CENSORED - - if ellipsis | extstored: - raise error.Abort( - b'support for ellipsis and extstored flags not ' b'supported' - ) - - if rawtext is not None: - fl._addrawrevision(node, rawtext, tr, linkrev, p1, p2, flags=flags) - elif delta is not None: - fl._addrawrevision( - node, rawtext, tr, linkrev, p1, p2, storedelta=delta, flags=flags - ) - else: - raise error.Abort(b'must supply rawtext or delta arguments') - - -if sqlitestore is not None: - sqlitefileindextests = storagetesting.makeifileindextests( - makesqlitefile, maketransaction, addrawrevisionsqlite - ) - sqlitefiledatatests = storagetesting.makeifiledatatests( - makesqlitefile, maketransaction, addrawrevisionsqlite - ) - sqlitefilemutationtests = storagetesting.makeifilemutationtests( - makesqlitefile, maketransaction, addrawrevisionsqlite - ) - if __name__ == '__main__': silenttestrunner.main(__name__)
--- a/tests/test-stream-bundle-v2.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-stream-bundle-v2.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,6 +1,18 @@ -#require no-reposimplestore +#testcases stream-v2 stream-v3 +#testcases threaded sequential -#testcases stream-v2 stream-v3 +#if threaded + $ cat << EOF >> $HGRCPATH + > [worker] + > parallel-stream-bundle-processing = yes + > parallel-stream-bundle-processing.num-writer = 2 + > EOF +#else + $ cat << EOF >> $HGRCPATH + > [worker] + > parallel-stream-bundle-processing = no + > EOF +#endif #if stream-v2 $ bundle_format="streamv2" @@ -91,20 +103,44 @@ none-v2;stream=v3-exp;requirements%3Dgeneraldelta%2Crevlog-compression-zstd%2Crevlogv1%2Csparserevlog (stream-v3 zstd no-rust !) none-v2;stream=v3-exp;requirements%3Dgeneraldelta%2Crevlog-compression-zstd%2Crevlogv1%2Csparserevlog (stream-v3 rust !) -Test that we can apply the bundle as a stream clone bundle - - $ cat > .hg/clonebundles.manifest << EOF - > http://localhost:$HGPORT1/bundle.hg BUNDLESPEC=`hg debugbundle --spec bundle.hg` - > EOF - $ hg serve -d -p $HGPORT --pid-file hg.pid --accesslog access.log $ cat hg.pid >> $DAEMON_PIDS $ "$PYTHON" $TESTDIR/dumbhttp.py -p $HGPORT1 --pid http.pid $ cat http.pid >> $DAEMON_PIDS +Stream bundle spec with unknown requirements should be filtered out + +#if stream-v2 + $ cat > .hg/clonebundles.manifest << EOF + > http://localhost:$HGPORT1/bundle.hg BUNDLESPEC=none-v2;stream=v2;requirements%3Drevlogv42 + > EOF +#endif +#if stream-v3 + $ cat > .hg/clonebundles.manifest << EOF + > http://localhost:$HGPORT1/bundle.hg BUNDLESPEC=none-v2;stream=v3-exp;requirements%3Drevlogv42 + > EOF +#endif + $ cd .. + $ hg clone -U http://localhost:$HGPORT stream-clone-unsupported-requirements + no compatible clone bundles available on server; falling back to regular clone + (you may want to report this to the server operator) + requesting all changes + adding changesets + adding manifests + adding file changes + added 5 changesets with 5 changes to 5 files + new changesets 426bada5c675:9bc730a19041 (5 drafts) + +Test that we can apply the bundle as a stream clone bundle + + $ cat > main/.hg/clonebundles.manifest << EOF + > http://localhost:$HGPORT1/bundle.hg BUNDLESPEC=`hg debugbundle --spec main/bundle.hg` + > EOF + + #if stream-v2 $ hg clone http://localhost:$HGPORT stream-clone-implicit --debug using http://localhost:$HGPORT/ @@ -132,10 +168,10 @@ adding [c] branch2-served (94 bytes) adding [c] rbc-names-v2 (7 bytes) adding [c] rbc-revs-v2 (40 bytes) - transferred 1.65 KB in * seconds (* */sec) (glob) (no-rust !) bundle2-input-part: total payload size 1857 (no-rust !) - transferred 1.78 KB in * seconds (* */sec) (glob) (rust !) bundle2-input-part: total payload size 2025 (rust !) + stream-cloned 12 files / 1.65 KB in * seconds (* */sec) (glob) (no-rust !) + stream-cloned 14 files / 1.78 KB in * seconds (* */sec) (glob) (rust !) 
 bundle2-input-bundle: 1 parts total
 updating the branch cache
 finished applying clone bundle
@@ -169,7 +205,12 @@
 updating the branch cache
 (sent 4 HTTP requests and * bytes; received * bytes in responses) (glob)
- $ hg clone --stream http://localhost:$HGPORT stream-clone-explicit --debug
+Test explicit stream request
+
+(also test unlimited memory usage code path)
+
+ $ hg clone --stream http://localhost:$HGPORT stream-clone-explicit --debug \
+ > --config worker.parallel-stream-bundle-processing.memory-target=-1
 using http://localhost:$HGPORT/
 sending capabilities command
 sending clonebundles_manifest command
@@ -195,10 +236,10 @@
 adding [c] branch2-served (94 bytes)
 adding [c] rbc-names-v2 (7 bytes)
 adding [c] rbc-revs-v2 (40 bytes)
- transferred 1.65 KB in * seconds (* */sec) (glob) (no-rust !)
 bundle2-input-part: total payload size 1857 (no-rust !)
- transferred 1.78 KB in * seconds (* */sec) (glob) (rust !)
 bundle2-input-part: total payload size 2025 (rust !)
+ stream-cloned 12 files / 1.65 KB in * seconds (* */sec) (glob) (no-rust !)
+ stream-cloned 14 files / 1.78 KB in * seconds (* */sec) (glob) (rust !)
 bundle2-input-bundle: 1 parts total
 updating the branch cache
 finished applying clone bundle
@@ -260,9 +301,9 @@
 adding [c] branch2-served (94 bytes)
 adding [c] rbc-names-v2 (7 bytes)
 adding [c] rbc-revs-v2 (40 bytes)
- transferred 1.65 KB in * seconds (* */sec) (glob) (no-rust !)
+ stream-cloned 12 files / 1.65 KB in * seconds (* */sec) (glob) (no-rust !)
 bundle2-input-part: total payload size 1869 (no-rust !)
- transferred 1.78 KB in * seconds (* */sec) (glob) (rust !)
+ stream-cloned 14 files / 1.78 KB in * seconds (* */sec) (glob) (rust !)
 bundle2-input-part: total payload size 2037 (rust !)
 bundle2-input-bundle: 1 parts total
 updating the branch cache
@@ -322,9 +363,9 @@
 adding [c] branch2-served (94 bytes)
 adding [c] rbc-names-v2 (7 bytes)
 adding [c] rbc-revs-v2 (40 bytes)
- transferred 1.65 KB in * seconds (* */sec) (glob) (no-rust !)
+ stream-cloned 12 files / 1.65 KB in * seconds (* */sec) (glob) (no-rust !)
 bundle2-input-part: total payload size 1869 (no-rust !)
- transferred 1.78 KB in * seconds (* */sec) (glob) (rust !)
+ stream-cloned 14 files / 1.78 KB in * seconds (* */sec) (glob) (rust !)
 bundle2-input-part: total payload size 2037 (rust !)
 bundle2-input-bundle: 1 parts total
 updating the branch cache
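For reference, the parallel stream-bundle application toggled by the `threaded` testcase above can also be requested per invocation. A hypothetical example (the URL and destination are placeholders; the configuration knobs are the ones exercised in this diff):

  $ hg clone --stream http://example.com/repo dst \
  >   --config worker.parallel-stream-bundle-processing=yes \
  >   --config worker.parallel-stream-bundle-processing.num-writer=2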
--- a/tests/test-strip.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-strip.t Fri Feb 28 23:28:10 2025 +0100 @@ -492,19 +492,15 @@ $ touch a $ hg ci -qAm a -#if repofncache $ cat .hg/store/fncache | sort data/a.i data/bar.i -#endif $ hg strip tip 0 files updated, 0 files merged, 1 files removed, 0 files unresolved saved backup bundle to $TESTTMP/test/.hg/strip-backup/*-backup.hg (glob) -#if repofncache $ cat .hg/store/fncache data/bar.i -#endif stripping an empty revset @@ -799,14 +795,12 @@ saved backup bundle to $TESTTMP/doublebundle/.hg/strip-backup/3903775176ed-e68910bd-backup.hg $ ls .hg/strip-backup 3903775176ed-e68910bd-backup.hg -#if repobundlerepo $ hg pull -q -r 3903775176ed .hg/strip-backup/3903775176ed-e68910bd-backup.hg $ hg strip -r 0 saved backup bundle to $TESTTMP/doublebundle/.hg/strip-backup/3903775176ed-54390173-backup.hg $ ls .hg/strip-backup 3903775176ed-54390173-backup.hg 3903775176ed-e68910bd-backup.hg -#endif $ cd .. Test that we only bundle the stripped changesets (issue4736) @@ -872,7 +866,6 @@ $ hg bundle -r 'desc(mergeCD)' --base 'desc(commitC)' ../issue4736.hg 2 changesets found -#if repobundlerepo $ hg log -r 'bundle()' -R ../issue4736.hg changeset: 3:6625a5168474 parent: 1:eca11cf91c71 @@ -888,7 +881,6 @@ date: Thu Jan 01 00:00:00 1970 +0000 summary: mergeCD -#endif check strip behavior @@ -934,7 +926,6 @@ strip backup content -#if repobundlerepo $ hg log -r 'bundle()' -R .hg/strip-backup/6625a5168474-*-backup.hg changeset: 3:6625a5168474 parent: 1:eca11cf91c71 @@ -951,8 +942,6 @@ summary: mergeCD -#endif - Check that the phase cache is properly invalidated after a strip with bookmark. $ cat > ../stripstalephasecache.py << EOF
--- a/tests/test-subrepo.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-subrepo.t Fri Feb 28 23:28:10 2025 +0100 @@ -1258,7 +1258,7 @@ ../shared/subrepo-2/.hg/wcache/checkisexec (execbit !) ../shared/subrepo-2/.hg/wcache/checklink (symlink no-rust !) ../shared/subrepo-2/.hg/wcache/checklink-target (symlink no-rust !) - ../shared/subrepo-2/.hg/wcache/manifestfulltextcache (reporevlogstore !) + ../shared/subrepo-2/.hg/wcache/manifestfulltextcache ../shared/subrepo-2/file $ hg -R ../shared in abort: repository default not found
--- a/tests/test-treemanifest.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-treemanifest.t Fri Feb 28 23:28:10 2025 +0100 @@ -322,7 +322,6 @@ 0 4 064927a0648a 000000000000 000000000000 1 5 25ecb8cb8618 000000000000 000000000000 -#if repobundlerepo $ hg incoming .hg/strip-backup/* comparing with .hg/strip-backup/*-backup.hg (glob) searching for changes @@ -332,7 +331,6 @@ date: Thu Jan 01 00:00:00 1970 +0000 summary: modify dir1/a -#endif $ hg unbundle .hg/strip-backup/* adding changesets @@ -464,12 +462,7 @@ Test files for a subdirectory. -#if reporevlogstore $ rm -r .hg/store/meta/~2e_a -#endif -#if reposimplestore - $ rm -r .hg/store/meta/._a -#endif $ hg files -r . b b/bar/fruits.txt b/bar/orange/fly/gnat.py @@ -485,12 +478,7 @@ Test files with just includes and excludes. -#if reporevlogstore $ rm -r .hg/store/meta/~2e_a -#endif -#if reposimplestore - $ rm -r .hg/store/meta/._a -#endif $ rm -r .hg/store/meta/b/bar/orange/fly $ rm -r .hg/store/meta/b/foo/apple/bees $ hg files -r . -I path:b/bar -X path:b/bar/orange/fly -I path:b/foo -X path:b/foo/apple/bees @@ -502,12 +490,7 @@ Test files for a subdirectory, excluding a directory within it. -#if reporevlogstore $ rm -r .hg/store/meta/~2e_a -#endif -#if reposimplestore - $ rm -r .hg/store/meta/._a -#endif $ rm -r .hg/store/meta/b/foo $ hg files -r . -X path:b/foo b b/bar/fruits.txt @@ -523,12 +506,7 @@ Test files for a sub directory, including only a directory within it, and including an unrelated directory. -#if reporevlogstore $ rm -r .hg/store/meta/~2e_a -#endif -#if reposimplestore - $ rm -r .hg/store/meta/._a -#endif $ rm -r .hg/store/meta/b/foo $ hg files -r . -I path:b/bar/orange -I path:a b b/bar/orange/fly/gnat.py @@ -542,12 +520,7 @@ Test files for a pattern, including a directory, and excluding a directory within that. -#if reporevlogstore $ rm -r .hg/store/meta/~2e_a -#endif -#if reposimplestore - $ rm -r .hg/store/meta/._a -#endif $ rm -r .hg/store/meta/b/foo $ rm -r .hg/store/meta/b/bar/orange $ hg files -r . glob:**.txt -I path:b/bar -X path:b/bar/orange @@ -566,7 +539,6 @@ Verify works $ hg verify -q -#if repofncache Dirlogs are included in fncache $ grep meta/.A/00manifest.i .hg/store/fncache meta/.A/00manifest.i @@ -591,7 +563,6 @@ adding meta/b/foo/apple/00manifest.i adding meta/b/foo/apple/bees/00manifest.i 16 items added, 0 removed from fncache -#endif Finish first server $ killdaemons.py @@ -610,12 +581,12 @@ b/@1: parent-directory manifest refers to unknown revision f065da70369e b/@2: parent-directory manifest refers to unknown revision ac0d30948e0b b/@3: parent-directory manifest refers to unknown revision 367152e6af28 - warning: orphan data file 'meta/b/bar/00manifest.i' (reporevlogstore !) - warning: orphan data file 'meta/b/bar/orange/00manifest.i' (reporevlogstore !) - warning: orphan data file 'meta/b/bar/orange/fly/00manifest.i' (reporevlogstore !) - warning: orphan data file 'meta/b/foo/00manifest.i' (reporevlogstore !) - warning: orphan data file 'meta/b/foo/apple/00manifest.i' (reporevlogstore !) - warning: orphan data file 'meta/b/foo/apple/bees/00manifest.i' (reporevlogstore !) 
+ warning: orphan data file 'meta/b/bar/00manifest.i' + warning: orphan data file 'meta/b/bar/orange/00manifest.i' + warning: orphan data file 'meta/b/bar/orange/fly/00manifest.i' + warning: orphan data file 'meta/b/foo/00manifest.i' + warning: orphan data file 'meta/b/foo/apple/00manifest.i' + warning: orphan data file 'meta/b/foo/apple/bees/00manifest.i' crosschecking files in changesets and manifests b/bar/fruits.txt@0: in changeset but not in manifest b/bar/orange/fly/gnat.py@0: in changeset but not in manifest @@ -624,7 +595,7 @@ checking files not checking dirstate because of previous errors checked 4 changesets with 18 changes to 8 files - 6 warnings encountered! (reporevlogstore !) + 6 warnings encountered! 9 integrity errors encountered! (first damaged changeset appears to be 0) [1] @@ -684,8 +655,6 @@ Tree manifest revlogs exist. $ find deepclone/.hg/store/meta | sort deepclone/.hg/store/meta - deepclone/.hg/store/meta/._a (reposimplestore !) - deepclone/.hg/store/meta/._a/00manifest.i (reposimplestore !) deepclone/.hg/store/meta/b deepclone/.hg/store/meta/b/00manifest.i deepclone/.hg/store/meta/b/bar @@ -700,14 +669,13 @@ deepclone/.hg/store/meta/b/foo/apple/00manifest.i deepclone/.hg/store/meta/b/foo/apple/bees deepclone/.hg/store/meta/b/foo/apple/bees/00manifest.i - deepclone/.hg/store/meta/~2e_a (reporevlogstore !) - deepclone/.hg/store/meta/~2e_a/00manifest.i (reporevlogstore !) + deepclone/.hg/store/meta/~2e_a + deepclone/.hg/store/meta/~2e_a/00manifest.i Verify passes. $ cd deepclone $ hg verify -q $ cd .. -#if reporevlogstore Create clones using old repo formats to use in later tests $ hg clone --config format.usestore=False \ > --config experimental.changegroup3=True \ @@ -761,9 +729,8 @@ $ hg clone --config experimental.changegroup3=True --stream -U \ > http://localhost:$HGPORT1 stream-clone-basicstore streaming all changes - 24 files to transfer, * of data (glob) (no-rust !) - 26 files to transfer, * of data (glob) (rust !) - transferred * in * seconds (*) (glob) + * files to transfer, * of data (glob) + stream-cloned * files / * in * seconds (*) (glob) $ hg -R stream-clone-basicstore verify -q $ cat port-1-errors.log @@ -771,9 +738,8 @@ $ hg clone --config experimental.changegroup3=True --stream -U \ > http://localhost:$HGPORT2 stream-clone-encodedstore streaming all changes - 24 files to transfer, * of data (glob) (no-rust !) - 26 files to transfer, * of data (glob) (rust !) - transferred * in * seconds (*) (glob) + * files to transfer, * of data (glob) + stream-cloned * files / * in * seconds (*) (glob) $ hg -R stream-clone-encodedstore verify -q $ cat port-2-errors.log @@ -781,9 +747,8 @@ $ hg clone --config experimental.changegroup3=True --stream -U \ > http://localhost:$HGPORT stream-clone-fncachestore streaming all changes - 23 files to transfer, * of data (glob) (no-rust !) - 25 files to transfer, * of data (glob) (rust !) - transferred * in * seconds (*) (glob) + * files to transfer, * of data (glob) + stream-cloned * files / * in * seconds (*) (glob) $ hg -R stream-clone-fncachestore verify -q $ cat port-0-errors.log @@ -796,8 +761,6 @@ $ hg debugbundle --spec repo-packed.hg none-packed1;requirements%3D(.*%2C)?treemanifest(%2C.*)? (re) -#endif - Bundle with changegroup2 is not supported $ hg -R deeprepo bundle --all -t v2 deeprepo.bundle
--- a/tests/test-unionrepo.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-unionrepo.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - Test unionrepo functionality Create one repository
--- a/tests/test-upgrade-repo.t Fri Feb 28 23:25:42 2025 +0100 +++ b/tests/test-upgrade-repo.t Fri Feb 28 23:28:10 2025 +0100 @@ -1,5 +1,3 @@ -#require no-reposimplestore - $ cat >> $HGRCPATH << EOF > [extensions] > share =