diff hgext/automv.py @ 28183:e07daee83029
automv: use 95 as the default similarity threshold
The motivation for the change from 100 to 95 is included in a comment.
* Updated the tests to include a change to a moved file that still should be
caught as a move.
* Use ui.configint() to handle non-integer configuration entries more
gracefully. Also complain if a similarity outside of the acceptable range is set.
author | Martijn Pieters <mjpieters@fb.com> |
---|---|
date | Tue, 16 Feb 2016 15:58:32 +0000 |
parents | 5ec1ce8fdf0a |
children | a0939666b836 |
--- a/hgext/automv.py Fri Feb 19 22:28:09 2016 +0100
+++ b/hgext/automv.py Tue Feb 16 15:58:32 2016 +0000
@@ -11,14 +11,25 @@
 
 The threshold at which a file is considered a move can be set with the
 ``automv.similarity`` config option. This option takes a percentage between 0
-(disabled) and 100 (files must be identical), the default is 100.
+(disabled) and 100 (files must be identical), the default is 95.
 
 """
+
+# Using 95 as a default similarity is based on an analysis of the mercurial
+# repositories of the cpython, mozilla-central & mercurial repositories, as
+# well as 2 very large facebook repositories. At 95 50% of all potential
+# missed moves would be caught, as well as correspond with 87% of all
+# explicitly marked moves. Together, 80% of moved files are 95% similar or
+# more.
+#
+# See http://markmail.org/thread/5pxnljesvufvom57 for context.
+
 from __future__ import absolute_import
 
 from mercurial import (
     commands,
     copies,
+    error,
     extensions,
     scmutil,
     similar
@@ -37,7 +48,9 @@
     renames = None
     disabled = opts.pop('no_automv', False)
     if not disabled:
-        threshold = float(ui.config('automv', 'similarity', '100'))
+        threshold = ui.configint('automv', 'similarity', 95)
+        if not 0 <= threshold <= 100:
+            raise error.Abort(_('automv.similarity must be between 0 and 100'))
         if threshold > 0:
             match = scmutil.match(repo[None], pats, opts)
             added, removed = _interestingfiles(repo, match)
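
The threshold handling in the second hunk can be sketched outside of Mercurial as follows. This is only an illustration of the intended behaviour (default of 95, integer parsing, range check), not the extension's code: `parse_similarity` and `ConfigError` are hypothetical names, and the wording of the non-integer error is an assumption. Only the range-check message is taken from the diff.

```python
# Standalone sketch of the automv.similarity handling shown in the diff.
# parse_similarity and ConfigError are hypothetical names, not Mercurial APIs.


class ConfigError(Exception):
    """Raised when the configured similarity cannot be used."""


def parse_similarity(raw, default=95):
    """Return the similarity threshold as an integer between 0 and 100.

    raw is the configured string value, or None when the option is unset.
    """
    if raw is None:
        # Option not set: fall back to the default, as configint would.
        return default
    try:
        threshold = int(raw)
    except ValueError:
        # Assumed wording; the real message comes from Mercurial itself.
        raise ConfigError('automv.similarity is not an integer (%r)' % raw)
    if not 0 <= threshold <= 100:
        # Same range check and message as in the diff.
        raise ConfigError('automv.similarity must be between 0 and 100')
    return threshold
```

With this sketch, an unset option yields 95, `'0'` disables move detection, and values such as `'abc'` or `'101'` are rejected instead of silently being accepted.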