comparison mercurial/util.py @ 16943:8d08a28aa63e

matcher: use re2 bindings if available

There are two sets of Python re2 bindings available on the internet; this code works with both.

Using re2 can greatly improve "hg status" performance when a .hgignore file becomes even modestly complex.

Example: "hg status" on a clean tree with 134K files, where "hg debugignore" reports a regexp 4256 bytes in size.

  no .hgignore: 1.76 sec
  Python re:    2.79
  re2:          1.82

The overhead of regexp matching drops from 1.03 seconds with stock re to 0.06 with re2.

(For comparison, a git repo with the same contents and .gitignore file runs "git status -s" in 1.71 seconds, i.e. only slightly faster than hg with re2.)
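The overhead figures in the message are the measured run times minus the run time with no .hgignore. A small sketch that rechecks the arithmetic; the dictionary and variable names are illustrative and do not come from Mercurial:

```python
# Wall-clock timings for "hg status" in seconds, as quoted in the commit message.
timings = {"no .hgignore": 1.76, "Python re": 2.79, "re2": 1.82}

# Time spent with no ignore patterns at all; everything above this is
# attributed to regexp matching.
baseline = timings["no .hgignore"]

overhead = {
    name: round(seconds - baseline, 2)
    for name, seconds in timings.items()
    if name != "no .hgignore"
}

print(overhead)  # {'Python re': 1.03, 're2': 0.06}
```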
author Bryan O'Sullivan <bryano@fb.com>
date Fri, 01 Jun 2012 15:26:20 -0700
parents 37e081609828
children 0cb55b5c19a3
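The change in this changeset follows a common "optional accelerator" pattern: try the fast engine first and fall back to the standard library when the engine is missing or rejects the pattern. The following is a minimal standalone sketch of that pattern, not the Mercurial code itself. The name compile_with_fallback is hypothetical, and the sketch leaves out the `_re2 is None` probe that the changeset uses, which appears to guard against imports that are resolved lazily.

```python
import re

try:
    import re2  # optional third-party binding; may not be installed
except ImportError:
    re2 = None


def compile_with_fallback(pat):
    """Compile pat with re2 when possible, otherwise with the stdlib re.

    Hypothetical helper for illustration only.
    """
    if re2 is not None:
        try:
            return re2.compile(pat)
        except re2.error:
            # re2 rejected the pattern (for example, it uses a feature
            # re2 does not support); fall through to the stdlib engine.
            pass
    return re.compile(pat)


# An ignore-style pattern that both engines accept.
matcher = compile_with_fallback(r'(?:.*\.pyc$|^build/)')
print(bool(matcher.match('pkg/mod.pyc')))  # True
print(bool(matcher.match('src/main.py')))  # False

# A backreference is not supported by re2, so this pattern is compiled
# by the stdlib re module whether or not re2 is installed.
backref = compile_with_fallback(r'(a)\1')
print(bool(backref.match('aa')))  # True
```

Because the fallback is silent, a pattern that uses features re2 lacks still works but runs at stock re speed, which is why the docstring in the changeset recommends sticking to re2-compatible regexp features.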
16942:87882c8753d4 16943:8d08a28aa63e
627 return False 627 return False
628 return True 628 return True
629 except OSError: 629 except OSError:
630 return True 630 return True
631 631
632 try:
633     import re2
634     _re2 = None
635 except ImportError:
636     _re2 = False
637
638 def compilere(pat):
639     '''Compile a regular expression, using re2 if possible
640
641     For best performance, use only re2-compatible regexp features.'''
642     global _re2
643     if _re2 is None:
644         try:
645             re2.compile
646             _re2 = True
647         except ImportError:
648             _re2 = False
649     if _re2:
650         try:
651             return re2.compile(pat)
652         except re2.error:
653             pass
654     return re.compile(pat)
655
632 _fspathcache = {} 656 _fspathcache = {}
633 def fspath(name, root): 657 def fspath(name, root):
634 '''Get name in the case stored in the filesystem 658 '''Get name in the case stored in the filesystem
635 659
636 The name should be relative to root, and be normcase-ed for efficiency. 660 The name should be relative to root, and be normcase-ed for efficiency.