annotate mercurial/minifileset.py @ 35616:706aa203b396

fileset: add a lightweight file filtering language This patch was inspired by one that Jun Wu authored for the fb-experimental repo, to avoid using matcher for efficiency[1]. We want a way to specify what files will be converted to LFS at commit time. And per discussion, we also want to specify what files to skip, text diff, or merge in another config option. The current `lfs.threshold` config option could not satisfy complex needs. I'm putting it in a core package because Augie floated the idea of also using it for narrow and sparse. Yuya suggested farming out to fileset.parse(), which added support for more symbols. The only fileset element not supported here is 'negate'. (List isn't supported by filesets either.) I also changed the 'always' token to the 'all()' predicate for consistency, and introduced 'none()' to improve readability in a future tracked file based config. The extension operator was changed from '.' to '**', to match how recursive path globs are specified. Finally, I changed the path matcher from '/' to 'path:' at Yuya's suggestion, for consistency with matcher. Unfortunately, ':' is currently reserved in filesets, so this has to be quoted to be processed as a string instead of a symbol[2]. We should probably revisit that, because it's seriously ugly. But it's only used by an experimental extension, and I think using a file based config for LFS may drive some more tweaks, so I'm settling for this for now. I reserved all of the glob characters in fileset except '.' and '_' for the extension test because those are likely valid extension characters. Sample filter settings: all() # everything size(">20MB") # larger than 20MB !**.txt # except for .txt files **.zip | **.tar.gz | **.7z # some types of compressed files "path:bin" # files under "bin" in the project root [1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-December/109387.html [2] https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-January/109729.html
author Matt Harbison <matt_harbison@yahoo.com>
date Wed, 10 Jan 2018 22:23:34 -0500
parents
children 735f47b41521
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
35616
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
1 # minifileset.py - a simple language to select files
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
2 #
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
3 # Copyright 2017 Facebook, Inc.
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
4 #
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
7
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
8 from __future__ import absolute_import
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
9
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
10 from .i18n import _
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
11 from . import (
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
12 error,
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
13 fileset,
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
14 )
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
15
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
16 def _compile(tree):
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
17 if not tree:
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
18 raise error.ParseError(_("missing argument"))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
19 op = tree[0]
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
20 if op == 'symbol':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
21 name = fileset.getstring(tree, _('invalid file pattern'))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
22 if name.startswith('**'): # file extension test, ex. "**.tar.gz"
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
23 ext = name[2:]
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
24 for c in ext:
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
25 if c in '*{}[]?/\\':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
26 raise error.ParseError(_('reserved character: %s') % c)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
27 return lambda n, s: n.endswith(ext)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
28 raise error.ParseError(_('invalid symbol: %s') % name)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
29 elif op == 'string':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
30 # TODO: teach fileset about 'path:', so that this can be a symbol and
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
31 # not require quoting.
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
32 name = fileset.getstring(tree, _('invalid path literal'))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
33 if name.startswith('path:'): # directory or full path test
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
34 p = name[5:] # prefix
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
35 pl = len(p)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
36 f = lambda n, s: n.startswith(p) and (len(n) == pl or n[pl] == '/')
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
37 return f
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
38 raise error.ParseError(_("invalid string"),
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
39 hint=_('paths must be prefixed with "path:"'))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
40 elif op == 'or':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
41 func1 = _compile(tree[1])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
42 func2 = _compile(tree[2])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
43 return lambda n, s: func1(n, s) or func2(n, s)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
44 elif op == 'and':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
45 func1 = _compile(tree[1])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
46 func2 = _compile(tree[2])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
47 return lambda n, s: func1(n, s) and func2(n, s)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
48 elif op == 'not':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
49 return lambda n, s: not _compile(tree[1])(n, s)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
50 elif op == 'group':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
51 return _compile(tree[1])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
52 elif op == 'func':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
53 symbols = {
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
54 'all': lambda n, s: True,
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
55 'none': lambda n, s: False,
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
56 'size': lambda n, s: fileset.sizematcher(tree[2])(s),
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
57 }
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
58
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
59 x = tree[1]
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
60 name = x[1]
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
61 if x[0] == 'symbol' and name in symbols:
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
62 return symbols[name]
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
63
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
64 raise error.UnknownIdentifier(name, symbols.keys())
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
65 elif op == 'minus': # equivalent to 'x and not y'
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
66 func1 = _compile(tree[1])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
67 func2 = _compile(tree[2])
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
68 return lambda n, s: func1(n, s) and not func2(n, s)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
69 elif op == 'negate':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
70 raise error.ParseError(_("can't use negate operator in this context"))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
71 elif op == 'list':
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
72 raise error.ParseError(_("can't use a list in this context"),
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
73 hint=_('see hg help "filesets.x or y"'))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
74 raise error.ProgrammingError('illegal tree: %r' % (tree,))
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
75
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
76 def compile(text):
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
77 """generate a function (path, size) -> bool from filter specification.
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
78
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
79 "text" could contain the operators defined by the fileset language for
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
80 common logic operations, and parenthesis for grouping. The supported path
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
81 tests are '**.extname' for file extension test, and '"path:dir/subdir"'
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
82 for prefix test. The ``size()`` predicate is borrowed from filesets to test
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
83 file size. The predicates ``all()`` and ``none()`` are also supported.
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
84
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
85 '(**.php & size(">10MB")) | **.zip | ("path:bin" & !"path:bin/README")' for
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
86 example, will catch all php files whose size is greater than 10 MB, all
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
87 files whose name ends with ".zip", and all files under "bin" in the repo
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
88 root except for "bin/README".
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
89 """
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
90 tree = fileset.parse(text)
706aa203b396 fileset: add a lightweight file filtering language
Matt Harbison <matt_harbison@yahoo.com>
parents:
diff changeset
91 return _compile(tree)