comparison mercurial/pure/parsers.py @ 21809:e250b8300e6e
parsers: inline fields of dirstate values in C version
Previously, while unpacking the dirstate we'd create 3-4 new CPython objects
for most dirstate values:
- the state is a single character string, which is pooled by CPython
- the mode is a new object unless it is 0, which it is for files in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object unless it is -1, which it is for files in the lookup set
- the tuple to contain them all
In some cases such as regular hg status, we actually look at all the objects.
In other cases like hg add, hg status for a subdirectory, or hg status with the
third-party hgwatchman enabled, we look at almost none of the objects.
This patch eliminates most object creation in the latter cases by defining a custom
C struct that is exposed to Python with an interface similar to a tuple. Only
when tuple elements are actually requested are the respective objects created.
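The lazy behaviour described above can be illustrated in pure Python. This is only a rough analogue, not the C code from the patch: the class name `LazyDirstateEntry`, its buffer-plus-offset layout, and the `_FIELDS` table are hypothetical, and the record layout is assumed to match the `">cllll"` format used elsewhere in this module.

```python
import struct

# (format, byte offset) of each field inside one ">cllll" record.
# The fifth field (filename length) is not part of the tuple interface.
_FIELDS = ((">c", 0), (">l", 1), (">l", 5), (">l", 9))


class LazyDirstateEntry(object):
    """Hypothetical tuple-like view over one packed dirstate record.

    No Python object is created for state, mode, size or mtime until
    the corresponding index is requested.
    """

    __slots__ = ("_buf", "_off")

    def __init__(self, buf, off=0):
        self._buf = buf
        self._off = off

    def __len__(self):
        return len(_FIELDS)

    def __getitem__(self, i):
        # Indexing _FIELDS raises IndexError for out-of-range indices,
        # which also lets tuple(entry) and iteration terminate.
        fmt, rel = _FIELDS[i]
        return struct.unpack_from(fmt, self._buf, self._off + rel)[0]


record = struct.pack(">cllll", b"n", 0o100644, 12, 1401226061, 5)
entry = LazyDirstateEntry(record)
print(entry[0], entry[3])
```

Code that only checks `entry[0]` never pays for the other three fields, which is the case the commit message describes for hg add or a subdirectory status.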
The gains, where they're expected, are significant. The following tests are run
against a working copy with over 270,000 files.
parse_dirstate becomes significantly faster:
$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after: wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)
and as a result, several commands benefit:
$ time hg status # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after: 0.34s user 0.12s system 99% cpu 0.471 total
$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after: 0.76s user 0.17s system 99% cpu 0.931 total
There is a slight regression in regular status performance, but this is fixed
in an upcoming patch.
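For a quick sense of scale, the figures quoted above can be turned into ratios. The sketch below only does arithmetic on the numbers reported in this message; the dictionary keys and helper names are made up for illustration.

```python
# (before, after) in seconds, copied from the measurements above.
timings = {
    "hg perfdirstate (wall)": (0.186437, 0.093158),
    "hg status with hgwatchman (total)": (0.563, 0.471),
    "hg add new-file (total)": (1.033, 0.931),
}


def speedup(before, after):
    """How many times faster the 'after' run is."""
    return before / after


def percent_saved(before, after):
    """Share of the original running time that was eliminated."""
    return (1.0 - after / before) * 100.0


for name, (before, after) in sorted(timings.items()):
    print("%-36s %.2fx faster, %.1f%% less time"
          % (name, speedup(before, after), percent_saved(before, after)))
```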
author | Siddharth Agarwal <sid0@fb.com> |
---|---|
date | Tue, 27 May 2014 14:27:41 -0700 |
parents | 187bf2dde7c1 |
children | feddc5284724 |
comparison
21808:7537e57f5dbd | 21809:e250b8300e6e |
---|---|
12 _pack = struct.pack | 12 _pack = struct.pack |
13 _unpack = struct.unpack | 13 _unpack = struct.unpack |
14 _compress = zlib.compress | 14 _compress = zlib.compress |
15 _decompress = zlib.decompress | 15 _decompress = zlib.decompress |
16 _sha = util.sha1 | 16 _sha = util.sha1 |
| | 17 |
| | 18 # Some code below makes tuples directly because it's more convenient. However, |
| | 19 # code outside this module should always use dirstatetuple. |
| | 20 def dirstatetuple(*x): |
| | 21 # x is a tuple |
| | 22 return x |
17 | 23 |
18 def parse_manifest(mfdict, fdict, lines): | 24 def parse_manifest(mfdict, fdict, lines): |
19 for l in lines.splitlines(): | 25 for l in lines.splitlines(): |
20 f, n = l.split('\0') | 26 f, n = l.split('\0') |
21 if len(n) > 40: | 27 if len(n) > 40: |
102 # The user could change the file without changing its size | 108 # The user could change the file without changing its size |
103 # within the same second. Invalidate the file's mtime in | 109 # within the same second. Invalidate the file's mtime in |
104 # dirstate, forcing future 'status' calls to compare the | 110 # dirstate, forcing future 'status' calls to compare the |
105 # contents of the file if the size is the same. This prevents | 111 # contents of the file if the size is the same. This prevents |
106 # mistakenly treating such files as clean. | 112 # mistakenly treating such files as clean. |
107 e = (e[0], e[1], e[2], -1) | 113 e = dirstatetuple(e[0], e[1], e[2], -1) |
108 dmap[f] = e | 114 dmap[f] = e |
109 | 115 |
110 if f in copymap: | 116 if f in copymap: |
111 f = "%s\0%s" % (f, copymap[f]) | 117 f = "%s\0%s" % (f, copymap[f]) |
112 e = _pack(">cllll", e[0], e[1], e[2], e[3], len(f)) | 118 e = _pack(">cllll", e[0], e[1], e[2], e[3], len(f)) |
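The packing step shown in the last rows of the comparison can be sketched as a standalone function. This is a simplified illustration rather than the module's actual `pack_dirstate`: the helper name `pack_entry` is hypothetical, the `if` condition guarding the mtime reset is not visible in the excerpt and is assumed here, and appending the filename to the returned header is likewise an assumption about how the record is written out.

```python
import struct


def dirstatetuple(*x):
    # x is a tuple (pure-Python version, as in the diff above)
    return x


def pack_entry(dmap, copymap, f, now):
    """Pack one dirstate entry; hypothetical helper for illustration."""
    e = dmap[f]
    # Assumed condition: a normal file whose mtime equals the current
    # second may still change without its size or mtime changing.
    if e[0] == b"n" and e[3] == now:
        # Invalidate the mtime so a later status compares contents.
        e = dirstatetuple(e[0], e[1], e[2], -1)
        dmap[f] = e
    if f in copymap:
        # The copy source is stored after the filename, NUL-separated.
        f = f + b"\0" + copymap[f]
    header = struct.pack(">cllll", e[0], e[1], e[2], e[3], len(f))
    return header + f


dmap = {b"a.txt": dirstatetuple(b"n", 0o100644, 3, 100)}
packed = pack_entry(dmap, {}, b"a.txt", now=100)
print(struct.unpack(">cllll", packed[:17]), packed[17:])
```

Going through `dirstatetuple` instead of building a literal tuple means the same code keeps working if the function is later replaced by a constructor for a different tuple-like type.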