diff tests/test-fileset.t @ 38865:899b4c74209c
fileset: combine union of basic patterns into single matcher
This appears to improve query performance in a big repository more than I expected.
With less Python executed in the hot loop, the computation gets faster.
$ hg files --cwd mozilla-central --time 'set:a* + b* + c* + d* + e*'
(orig) time: real 0.670 secs (user 0.640+0.000 sys 0.030+0.000)
(new) time: real 0.210 secs (user 0.180+0.000 sys 0.020+0.000)
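The speedup described above comes from replacing several per-pattern matchers with one combined regular expression. The following is a minimal sketch of that idea in plain Python, not Mercurial's actual matcher code: the helper names (`fragment`, `combined_pattern`, `union_of_matchers`, `single_matcher`) are hypothetical, and the regex shape only imitates the `(?:a1$|a2$)` form seen in the test output.

```python
import re


def fragment(name):
    # Hypothetical helper: turn a literal file name into a regex fragment
    # anchored at the end, imitating the 'a1$' shape in the test output.
    return re.escape(name) + "$"


def combined_pattern(names):
    # Join all fragments with '|' inside one non-capturing group.
    return "(?:%s)" % "|".join(fragment(n) for n in names)


def union_of_matchers(names):
    # One compiled regex per pattern; the "or" is a Python-level loop
    # that runs for every candidate path.
    regexes = [re.compile("(?:%s)" % fragment(n)) for n in names]
    return lambda path: any(r.match(path) for r in regexes)


def single_matcher(names):
    # One compiled regex for all patterns; the alternation is handled
    # inside the regex engine instead of in Python.
    regex = re.compile(combined_pattern(names))
    return lambda path: regex.match(path) is not None


names = ["a1", "a2"]
files = ["a1", "a2", "a10", "b1"]
slow = union_of_matchers(names)
fast = single_matcher(names)
selected_slow = [f for f in files if slow(f)]
selected_fast = [f for f in files if fast(f)]
```

Both matchers select the same files; the difference is only in how much work is done per path in interpreted Python.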
author | Yuya Nishihara <yuya@tcha.org> |
---|---|
date | Sat, 21 Jul 2018 17:19:12 +0900 |
parents | 73731fa8d1bd |
children | e79a69af1593 |
--- a/tests/test-fileset.t	Sat Jul 21 17:13:34 2018 +0900
+++ b/tests/test-fileset.t	Sat Jul 21 17:19:12 2018 +0900
@@ -53,9 +53,7 @@
       (symbol 'glob')
       (symbol 'b?')))
   * matcher:
-  <unionmatcher matchers=[
-    <patternmatcher patterns='(?:a1(?:/|$))'>,
-    <patternmatcher patterns='(?:b.$)'>]>
+  <patternmatcher patterns='(?:a1(?:/|$)|b.$)'>
   a1
   b1
   b2
@@ -182,8 +180,9 @@
         None)))
   * optimized:
   (or
-    (symbol 'a1')
-    (symbol 'a2')
+    (patterns
+      (symbol 'a1')
+      (symbol 'a2'))
     (and
       (func
         (symbol 'clean')
@@ -193,8 +192,7 @@
         (string 'b'))))
   * matcher:
   <unionmatcher matchers=[
-    <patternmatcher patterns='(?:a1$)'>,
-    <patternmatcher patterns='(?:a2$)'>,
+    <patternmatcher patterns='(?:a1$|a2$)'>,
     <intersectionmatcher
       m1=<predicatenmatcher pred=clean>,
       m2=<predicatenmatcher pred=grep('b')>>]>
@@ -203,13 +201,30 @@
   b1
   b2
 
+Union of basic patterns:
+
+  $ fileset -p optimized -s -r. 'a1 or a2 or path:b1'
+  * optimized:
+  (patterns
+    (symbol 'a1')
+    (symbol 'a2')
+    (kindpat
+      (symbol 'path')
+      (symbol 'b1')))
+  * matcher:
+  <patternmatcher patterns='(?:a1$|a2$|b1(?:/|$))'>
+  a1
+  a2
+  b1
+
 OR expression should be reordered by weight:
 
   $ fileset -p optimized -s -r. 'grep("a") or a1 or grep("b") or b2'
   * optimized:
   (or
-    (symbol 'a1')
-    (symbol 'b2')
+    (patterns
+      (symbol 'a1')
+      (symbol 'b2'))
     (func
       (symbol 'grep')
       (string 'a'))
@@ -218,8 +233,7 @@
       (string 'b')))
   * matcher:
   <unionmatcher matchers=[
-    <patternmatcher patterns='(?:a1$)'>,
-    <patternmatcher patterns='(?:b2$)'>,
+    <patternmatcher patterns='(?:a1$|b2$)'>,
     <predicatenmatcher pred=grep('a')>,
     <predicatenmatcher pred=grep('b')>]>
   a1
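The updated test expectations show `or` nodes whose basic-pattern children are gathered under a single `patterns` node, and an `or` made up only of basic patterns collapsing into `patterns` entirely. The sketch below reproduces that tree shape under stated assumptions: trees are plain tuples, `symbol` and `kindpat` are treated as the only basic patterns, and `group_patterns` is a hypothetical name. It places the grouped patterns first rather than implementing the weight-based ordering the real optimizer uses.

```python
# Node kinds treated as "basic patterns" in this sketch (an assumption).
BASIC = ("symbol", "kindpat")


def group_patterns(tree):
    # Hypothetical rewrite: gather the basic-pattern children of an 'or'
    # node into one 'patterns' node so they can share a single matcher.
    if tree[0] != "or":
        return tree
    children = tree[1:]
    basic = tuple(t for t in children if t[0] in BASIC)
    others = tuple(t for t in children if t[0] not in BASIC)
    if len(basic) < 2:
        # Nothing to combine; leave the tree unchanged.
        return tree
    grouped = ("patterns",) + basic
    if not others:
        # Every child was a basic pattern: the 'or' disappears.
        return grouped
    # Patterns go first here; the real ordering is decided by weight.
    return ("or", grouped) + others


a1 = ("symbol", "a1")
a2 = ("symbol", "a2")
b2 = ("symbol", "b2")
path_b1 = ("kindpat", ("symbol", "path"), ("symbol", "b1"))
grep_a = ("func", ("symbol", "grep"), ("string", "a"))
grep_b = ("func", ("symbol", "grep"), ("string", "b"))

all_basic = group_patterns(("or", a1, a2, path_b1))
mixed = group_patterns(("or", grep_a, a1, grep_b, b2))
```

Here `all_basic` has the same shape as the optimized tree for `'a1 or a2 or path:b1'`, and `mixed` has the shape shown for `'grep("a") or a1 or grep("b") or b2'`.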