annotate mercurial/hgweb/request.py @ 36861:2cdf47e14c30

hgweb: refactor the request draining code

The previous code for draining was only invoked in a few places in the wire protocol. Behavior wasn't consistent. Furthermore, it was difficult to reason about.

With us converting the input stream to a capped reader, it is now safe to always drain the input stream when its size is known because we can never overrun the input and read into the next HTTP request. The only question is "should we?"

This commit changes the draining code so every request is examined. Draining now kicks in for a few requests where it wouldn't before. But I think the code is sufficiently restricted so the behavior is safe. Possibly the most dangerous part of this code is the issuing of Connection: close for POST and PUT requests that don't have a Content-Length. I don't think there are any such uses in our WSGI application, so this should be safe.

In the near future, I plan to significantly refactor the WSGI response handling. I anticipate this code evolving a bit. So any minor regressions around draining or connection closing behavior might be fixed as a result of that work.

All tests pass with this change. That scares me a bit because it means we are lacking low-level tests for the HTTP protocol.

Differential Revision: https://phab.mercurial-scm.org/D2769
author Gregory Szorc <gregory.szorc@gmail.com>
date Sat, 10 Mar 2018 11:03:45 -0800
parents 290fc4c3d1e0
children 1f7d9024674c
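
The draining rules described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this changeset: `CappedReader`, `drainpolicy`, and `drain` are hypothetical names, and the policy only encodes the two cases the message spells out (drain when the body size is known; close the connection for POST and PUT without a Content-Length).

```python
import io


class CappedReader(object):
    """File-like wrapper that refuses to read past ``limit`` bytes.

    Hypothetical stand-in for the "capped reader" the commit message
    mentions; it is not Mercurial's implementation.
    """

    def __init__(self, fh, limit):
        self._fh = fh
        self._left = limit

    def read(self, size=-1):
        if self._left <= 0:
            return b''
        if size < 0 or size > self._left:
            size = self._left
        data = self._fh.read(size)
        self._left -= len(data)
        return data


def drainpolicy(method, contentlength):
    """Return a (drain, close) pair for one request (illustrative rules)."""
    if contentlength is not None:
        # Size is known, so a capped reader cannot overrun the body.
        return True, False
    if method in ('POST', 'PUT'):
        # Body of unknown size: close the connection instead of draining.
        return False, True
    return False, False


def drain(reader, chunksize=32768):
    """Read and discard what is left on ``reader``; return the byte count."""
    total = 0
    while True:
        chunk = reader.read(chunksize)
        if not chunk:
            return total
        total += len(chunk)
```

For example, wrapping `io.BytesIO(b'bodyNEXT')` in `CappedReader(stream, 4)` and calling `drain` consumes only the four body bytes, so the bytes of the following request stay unread on the underlying stream.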
rev   line source
2391
d351a3be3371 Fixing up comment headers for split up code.
Eric Hopper <hopper@omnifarious.org>
parents: 2355
1 # hgweb/request.py - An http request from either CGI or the standalone server.
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
2 #
238
3b92f8fe47ae hgweb.py: kill #! line, clean up copyright notice
mpm@selenic.com
parents: 222
3 # Copyright 21 May 2005 - (c) 2005 Jake Edge <jake@edge2.net>
2859
345bac2bc4ec update copyrights.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents: 2535
4 # Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
5 #
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7742
6 # This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 10261
7 # GNU General Public License version 2 or any later version.
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
8
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
9 from __future__ import absolute_import
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
10
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
11 import cgi
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
12 import errno
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
13 import socket
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
14 import wsgiref.headers as wsgiheaders
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
15 #import wsgiref.validate
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
16
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
17 from .common import (
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
18 ErrorResponse,
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
19 HTTP_NOT_MODIFIED,
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
20 statusmessage,
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
21 )
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
22
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
23 from ..thirdparty import (
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
24 attr,
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
25 )
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
26 from .. import (
34514
528b21b853aa request: coerce content-type to native str
Augie Fackler <augie@google.com>
parents: 34513
27 pycompat,
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
28 util,
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
29 )
138
c77a679e9cfa Revamped templated hgweb
mpm@selenic.com
parents: 137
30
6774
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
31 shortcuts = {
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
32 'cl': [('cmd', ['changelog']), ('rev', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
33 'sl': [('cmd', ['shortlog']), ('rev', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
34 'cs': [('cmd', ['changeset']), ('node', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
35 'f': [('cmd', ['file']), ('filenode', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
36 'fl': [('cmd', ['filelog']), ('filenode', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
37 'fd': [('cmd', ['filediff']), ('node', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
38 'fa': [('cmd', ['annotate']), ('filenode', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
39 'mf': [('cmd', ['manifest']), ('manifest', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
40 'ca': [('cmd', ['archive']), ('node', None)],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
41 'tags': [('cmd', ['tags'])],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
42 'tip': [('cmd', ['changeset']), ('node', ['tip'])],
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
43 'static': [('cmd', ['static']), ('file', None)]
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
44 }
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
45
10261
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
46 def normalize(form):
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
47 # first expand the shortcuts
34513
34fcb0f66837 request: use trivial iterator over dictionary keys
Augie Fackler <augie@google.com>
parents: 34512
48 for k in shortcuts:
6774
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
49 if k in form:
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
50 for name, value in shortcuts[k]:
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
51 if value is None:
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
52 value = form[k]
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
53 form[name] = value
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
54 del form[k]
10261
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
55 # And strip the values
36736
2442927cdd96 hgweb: convert req.form to bytes for all keys and values
Augie Fackler <augie@google.com>
parents: 36291
56 bytesform = {}
10261
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
57 for k, v in form.iteritems():
36736
2442927cdd96 hgweb: convert req.form to bytes for all keys and values
Augie Fackler <augie@google.com>
parents: 36291
58 bytesform[pycompat.bytesurl(k)] = [
2442927cdd96 hgweb: convert req.form to bytes for all keys and values
Augie Fackler <augie@google.com>
parents: 36291
59 pycompat.bytesurl(i.strip()) for i in v]
2442927cdd96 hgweb: convert req.form to bytes for all keys and values
Augie Fackler <augie@google.com>
parents: 36291
60 return bytesform
6774
0dbb56e90a71 hgweb: move shortcut expansion to request instantiation
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
61
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
62 @attr.s(frozen=True)
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
63 class parsedrequest(object):
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
64 """Represents a parsed WSGI request / static HTTP request parameters."""
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
65
36854
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36853
66 # Request method.
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36853
67 method = attr.ib()
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
68 # Full URL for this request.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
69 url = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
70 # URL without any path components. Just <proto>://<host><port>.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
71 baseurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
72 # Advertised URL. Like ``url`` and ``baseurl`` but uses SERVER_NAME instead
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
73 # of HTTP: Host header for hostname. This is likely what clients used.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
74 advertisedurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
75 advertisedbaseurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
76 # WSGI application path.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
77 apppath = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
78 # List of path parts to be used for dispatch.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
79 dispatchparts = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
80 # URL path component (no query string) used for dispatch.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
81 dispatchpath = attr.ib()
36819
cfb9ef24968c hgweb: use parsed request to construct query parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36817
82 # Whether there is a path component to this request. This can be true
cfb9ef24968c hgweb: use parsed request to construct query parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36817
83 # when ``dispatchpath`` is empty due to REPO_NAME muckery.
cfb9ef24968c hgweb: use parsed request to construct query parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36817
84 havepathinfo = attr.ib()
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
85 # Raw query string (part after "?" in URL).
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
86 querystring = attr.ib()
36817
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
87 # List of 2-tuples of query string arguments.
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
88 querystringlist = attr.ib()
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
89 # Dict of query string arguments. Values are lists with at least 1 item.
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
90 querystringdict = attr.ib()
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
91 # wsgiref.headers.Headers instance. Operates like a dict with case
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
92 # insensitive keys.
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
93 headers = attr.ib()
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
94
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
95 def parserequestfromenv(env):
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
96 """Parse URL components from environment variables.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
97
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
98 WSGI defines request attributes via environment variables. This function
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
99 parses the environment variables into a data structure.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
100 """
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
101 # PEP-0333 defines the WSGI spec and is a useful reference for this code.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
102
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
103 # We first validate that the incoming object conforms with the WSGI spec.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
104 # We only want to be dealing with spec-conforming WSGI implementations.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
105 # TODO enable this once we fix internal violations.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
106 #wsgiref.validate.check_environ(env)
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
107
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
108 # PEP-0333 states that environment keys and values are native strings
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
109 # (bytes on Python 2 and str on Python 3). The code points for the Unicode
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
110 # strings on Python 3 must be between \00000-\000FF. We deal with bytes
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
111 # in Mercurial, so mass convert string keys and values to bytes.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
112 if pycompat.ispy3:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
113 env = {k.encode('latin-1'): v for k, v in env.iteritems()}
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
114 env = {k: v.encode('latin-1') if isinstance(v, str) else v
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
115 for k, v in env.iteritems()}
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
116
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
117 # https://www.python.org/dev/peps/pep-0333/#environ-variables defines
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
118 # the environment variables.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
119 # https://www.python.org/dev/peps/pep-0333/#url-reconstruction defines
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
120 # how URLs are reconstructed.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
121 fullurl = env['wsgi.url_scheme'] + '://'
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
122 advertisedfullurl = fullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
123
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
124 def addport(s):
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
125 if env['wsgi.url_scheme'] == 'https':
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
126 if env['SERVER_PORT'] != '443':
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
127 s += ':' + env['SERVER_PORT']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
128 else:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
129 if env['SERVER_PORT'] != '80':
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
130 s += ':' + env['SERVER_PORT']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
131
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
132 return s
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
133
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
134 if env.get('HTTP_HOST'):
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
135 fullurl += env['HTTP_HOST']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
136 else:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
137 fullurl += env['SERVER_NAME']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
138 fullurl = addport(fullurl)
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
139
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
140 advertisedfullurl += env['SERVER_NAME']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
141 advertisedfullurl = addport(advertisedfullurl)
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
142
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
143 baseurl = fullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
144 advertisedbaseurl = advertisedfullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
145
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
146 fullurl += util.urlreq.quote(env.get('SCRIPT_NAME', ''))
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
147 advertisedfullurl += util.urlreq.quote(env.get('SCRIPT_NAME', ''))
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
148 fullurl += util.urlreq.quote(env.get('PATH_INFO', ''))
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
149 advertisedfullurl += util.urlreq.quote(env.get('PATH_INFO', ''))
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
150
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
151 if env.get('QUERY_STRING'):
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
152 fullurl += '?' + env['QUERY_STRING']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
153 advertisedfullurl += '?' + env['QUERY_STRING']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
154
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
155 # When dispatching requests, we look at the URL components (PATH_INFO
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
156 # and QUERY_STRING) after the application root (SCRIPT_NAME). But hgwebdir
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
157 # has the concept of "virtual" repositories. This is defined via REPO_NAME.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
158 # If REPO_NAME is defined, we append it to SCRIPT_NAME to form a new app
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
159 # root. We also exclude its path components from PATH_INFO when resolving
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
160 # the dispatch path.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
161
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
162 apppath = env['SCRIPT_NAME']
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
163
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
164 if env.get('REPO_NAME'):
36816
0031e972ded2 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
165 if not apppath.endswith('/'):
0031e972ded2 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
166 apppath += '/'
0031e972ded2 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
167
0031e972ded2 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
168 apppath += env.get('REPO_NAME')
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
169
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
170 if 'PATH_INFO' in env:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
171 dispatchparts = env['PATH_INFO'].strip('/').split('/')
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
172
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
173 # Strip out repo parts.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
174 repoparts = env.get('REPO_NAME', '').split('/')
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
175 if dispatchparts[:len(repoparts)] == repoparts:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
176 dispatchparts = dispatchparts[len(repoparts):]
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
177 else:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
178 dispatchparts = []
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
179
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
180 dispatchpath = '/'.join(dispatchparts)
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
181
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
182 querystring = env.get('QUERY_STRING', '')
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
183
36817
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
184 # We store as a list so we have ordering information. We also store as
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
185 # a dict to facilitate fast lookup.
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
186 querystringlist = util.urlreq.parseqsl(querystring, keep_blank_values=True)
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
187
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
188 querystringdict = {}
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
189 for k, v in querystringlist:
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
190 if k in querystringdict:
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
191 querystringdict[k].append(v)
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
192 else:
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
193 querystringdict[k] = [v]
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
194
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
195 # HTTP_* keys contain HTTP request headers. The Headers structure should
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
196 # perform case normalization for us. We just rewrite underscore to dash
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
197 # so keys match what likely went over the wire.
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
198 headers = []
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
199 for k, v in env.iteritems():
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
200 if k.startswith('HTTP_'):
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
201 headers.append((k[len('HTTP_'):].replace('_', '-'), v))
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
202
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
203 headers = wsgiheaders.Headers(headers)
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
204
36853
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
205 # This is kind of a lie because the HTTP header wasn't explicitly
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
206 # sent. But for all intents and purposes it should be OK to lie about
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
207 # this, since a consumer will use either value to determine how many
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
208 # bytes are available to read.
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
209 if 'CONTENT_LENGTH' in env and 'HTTP_CONTENT_LENGTH' not in env:
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
210 headers['Content-Length'] = env['CONTENT_LENGTH']
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
211
36854
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36853
diff changeset
212 return parsedrequest(method=env['REQUEST_METHOD'],
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36853
diff changeset
213 url=fullurl, baseurl=baseurl,
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
214 advertisedurl=advertisedfullurl,
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
215 advertisedbaseurl=advertisedbaseurl,
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
216 apppath=apppath,
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
217 dispatchparts=dispatchparts, dispatchpath=dispatchpath,
36819
cfb9ef24968c hgweb: use parsed request to construct query parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36817
diff changeset
218 havepathinfo='PATH_INFO' in env,
36817
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
219 querystring=querystring,
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
220 querystringlist=querystringlist,
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
221 querystringdict=querystringdict,
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
222 headers=headers)
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
223
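The query-string and header handling in source lines 182-210 can be sketched in isolation as follows. This is a minimal Python 3 sketch, not the code in this revision: it assumes the standard library's `urllib.parse.parse_qsl` in place of `util.urlreq.parseqsl`, returns a plain list of header tuples instead of a `wsgiheaders.Headers` object (so no case normalization happens), and the function name `parse_query_and_headers` is hypothetical.

```python
from urllib.parse import parse_qsl


def parse_query_and_headers(env):
    """Return (querystringlist, querystringdict, headers) for a WSGI environ.

    Hypothetical helper for illustration only.
    """
    querystring = env.get('QUERY_STRING', '')

    # The list keeps ordering information; blank values are preserved.
    querystringlist = parse_qsl(querystring, keep_blank_values=True)

    # The dict maps each key to every value seen for it, for fast lookup.
    querystringdict = {}
    for k, v in querystringlist:
        querystringdict.setdefault(k, []).append(v)

    # HTTP_* keys carry request headers; rewrite underscores to dashes so
    # the names resemble what went over the wire.
    headers = []
    for k, v in env.items():
        if k.startswith('HTTP_'):
            headers.append((k[len('HTTP_'):].replace('_', '-'), v))

    # CONTENT_LENGTH is a CGI variable, not an HTTP_* key, so it is added
    # explicitly when no HTTP_CONTENT_LENGTH is present.
    if 'CONTENT_LENGTH' in env and 'HTTP_CONTENT_LENGTH' not in env:
        headers.append(('CONTENT-LENGTH', env['CONTENT_LENGTH']))

    return querystringlist, querystringdict, headers
```

Keeping both the list and the dict costs a little memory but lets callers choose between ordered iteration and keyed lookup without re-parsing.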
5566
d74fc8dec2b4 Less indirection in the WSGI web interface. This simplifies some code, and makes it more compliant with WSGI.
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5563
diff changeset
224 class wsgirequest(object):
26132
9df8c729e2e7 hgweb: add some documentation
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25660
diff changeset
225 """Higher-level API for a WSGI request.
9df8c729e2e7 hgweb: add some documentation
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25660
diff changeset
226
9df8c729e2e7 hgweb: add some documentation
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25660
diff changeset
227 WSGI applications are invoked with 2 arguments. They are used to
9df8c729e2e7 hgweb: add some documentation
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25660
diff changeset
228 instantiate instances of this class, which provides higher-level APIs
9df8c729e2e7 hgweb: add some documentation
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25660
diff changeset
229 for obtaining request parameters, writing HTTP output, etc.
9df8c729e2e7 hgweb: add some documentation
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25660
diff changeset
230 """
5566
d74fc8dec2b4 Less indirection in the WSGI web interface. This simplifies some code, and makes it more compliant with WSGI.
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5563
diff changeset
231 def __init__(self, wsgienv, start_response):
34512
482d6f6dba91 hgweb: when constructing or adding to a wsgi environ dict, use native strs
Augie Fackler <augie@google.com>
parents: 27046
diff changeset
232 version = wsgienv[r'wsgi.version']
3673
eb0b4a2d70a9 white space and line break cleanups
Thomas Arendsen Hein <thomas@intevation.de>
parents: 2859
diff changeset
233 if (version < (1, 0)) or (version >= (2, 0)):
4633
ff7253a0d1da Cleanup of whitespace, indentation and line continuation.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4250
diff changeset
234 raise RuntimeError("Unknown and unsupported WSGI version %d.%d"
2506
d0db3462d568 This patch make several WSGI related alterations.
Eric Hopper <hopper@omnifarious.org>
parents: 2466
diff changeset
235 % version)
34512
482d6f6dba91 hgweb: when constructing or adding to a wsgi environ dict, use native strs
Augie Fackler <augie@google.com>
parents: 27046
diff changeset
236 self.inp = wsgienv[r'wsgi.input']
36860
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
237
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
238 if r'HTTP_CONTENT_LENGTH' in wsgienv:
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
239 self.inp = util.cappedreader(self.inp,
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
240 int(wsgienv[r'HTTP_CONTENT_LENGTH']))
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
241 elif r'CONTENT_LENGTH' in wsgienv:
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
242 self.inp = util.cappedreader(self.inp,
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
243 int(wsgienv[r'CONTENT_LENGTH']))
290fc4c3d1e0 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36858
diff changeset
244
34512
482d6f6dba91 hgweb: when constructing or adding to a wsgi environ dict, use native strs
Augie Fackler <augie@google.com>
parents: 27046
diff changeset
245 self.err = wsgienv[r'wsgi.errors']
482d6f6dba91 hgweb: when constructing or adding to a wsgi environ dict, use native strs
Augie Fackler <augie@google.com>
parents: 27046
diff changeset
246 self.threaded = wsgienv[r'wsgi.multithread']
482d6f6dba91 hgweb: when constructing or adding to a wsgi environ dict, use native strs
Augie Fackler <augie@google.com>
parents: 27046
diff changeset
247 self.multiprocess = wsgienv[r'wsgi.multiprocess']
482d6f6dba91 hgweb: when constructing or adding to a wsgi environ dict, use native strs
Augie Fackler <augie@google.com>
parents: 27046
diff changeset
248 self.run_once = wsgienv[r'wsgi.run_once']
2506
d0db3462d568 This patch make several WSGI related alterations.
Eric Hopper <hopper@omnifarious.org>
parents: 2466
diff changeset
249 self.env = wsgienv
10261
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
diff changeset
250 self.form = normalize(cgi.parse(self.inp,
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
diff changeset
251 self.env,
5eae671c0b57 hgweb: request: strip() form values
Nicolas Dumazet <nicdumz.commits@gmail.com>
parents: 9694
diff changeset
252 keep_blank_values=1))
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
253 self._start_response = start_response
5993
948a41e77902 hgweb: explicit response status
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5930
diff changeset
254 self.server_write = None
2506
d0db3462d568 This patch make several WSGI related alterations.
Eric Hopper <hopper@omnifarious.org>
parents: 2466
diff changeset
255 self.headers = []
d0db3462d568 This patch make several WSGI related alterations.
Eric Hopper <hopper@omnifarious.org>
parents: 2466
diff changeset
256
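The constructor above wraps `wsgi.input` in `util.cappedreader` whenever a content length is known. The idea of a capped reader can be sketched as below. This is an assumption-laden illustration, not Mercurial's implementation: the class name `CappedReader` and its attributes are hypothetical, and only `read()` is modeled.

```python
import io


class CappedReader:
    """Hypothetical stand-in: never reads more than `limit` bytes from `fh`."""

    def __init__(self, fh, limit):
        self._fh = fh
        self._left = limit

    def read(self, n=-1):
        # Once the cap is exhausted, behave as if the stream hit EOF.
        if self._left <= 0:
            return b''
        # Clamp the requested size to the bytes remaining under the cap.
        if n < 0 or n > self._left:
            n = self._left
        data = self._fh.read(n)
        self._left -= len(data)
        return data


# Usage: a 5-byte body followed by bytes belonging to the next request.
stream = io.BytesIO(b'helloNEXT')
reader = CappedReader(stream, 5)
body = reader.read(32768)
```

Because the wrapper reports EOF at the declared length, reading until an empty chunk cannot consume bytes that belong to a following request on the same connection.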
18352
e33b9b92a200 hgweb: pass the actual response body to request.response, not just the length
Mads Kiilerich <mads@kiilerich.com>
parents: 18351
diff changeset
257 def respond(self, status, type, filename=None, body=None):
34514
528b21b853aa request: coerce content-type to native str
Augie Fackler <augie@google.com>
parents: 34513
diff changeset
258 if not isinstance(type, str):
528b21b853aa request: coerce content-type to native str
Augie Fackler <augie@google.com>
parents: 34513
diff changeset
259 type = pycompat.sysstr(type)
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
260 if self._start_response is not None:
34722
95be8928d6b2 hgweb: fill in content-type and content-length as native strings
Augie Fackler <augie@google.com>
parents: 34514
diff changeset
261 self.headers.append((r'Content-Type', type))
18348
764a758780b6 hgweb: simplify wsgirequest header handling
Mads Kiilerich <mads@kiilerich.com>
parents: 18347
diff changeset
262 if filename:
26846
7c1b4840c2cd hgweb: replace some str.split() calls by str.partition() or str.rpartition()
Anton Shestakov <av6@dwimlabs.net>
parents: 26200
diff changeset
263 filename = (filename.rpartition('/')[-1]
18348
764a758780b6 hgweb: simplify wsgirequest header handling
Mads Kiilerich <mads@kiilerich.com>
parents: 18347
diff changeset
264 .replace('\\', '\\\\').replace('"', '\\"'))
764a758780b6 hgweb: simplify wsgirequest header handling
Mads Kiilerich <mads@kiilerich.com>
parents: 18347
diff changeset
265 self.headers.append(('Content-Disposition',
764a758780b6 hgweb: simplify wsgirequest header handling
Mads Kiilerich <mads@kiilerich.com>
parents: 18347
diff changeset
266 'inline; filename="%s"' % filename))
18352
e33b9b92a200 hgweb: pass the actual response body to request.response, not just the length
Mads Kiilerich <mads@kiilerich.com>
parents: 18351
diff changeset
267 if body is not None:
34722
95be8928d6b2 hgweb: fill in content-type and content-length as native strings
Augie Fackler <augie@google.com>
parents: 34514
diff changeset
268 self.headers.append((r'Content-Length', str(len(body))))
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
269
5926
15ef6b9c1f2f hgweb: be sure to send a valid content-type for raw files
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5922
diff changeset
270 for k, v in self.headers:
15ef6b9c1f2f hgweb: be sure to send a valid content-type for raw files
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5922
diff changeset
271 if not isinstance(v, str):
18348
764a758780b6 hgweb: simplify wsgirequest header handling
Mads Kiilerich <mads@kiilerich.com>
parents: 18347
diff changeset
272 raise TypeError('header value must be string: %r' % (v,))
5926
15ef6b9c1f2f hgweb: be sure to send a valid content-type for raw files
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5922
diff changeset
273
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
274 if isinstance(status, ErrorResponse):
18348
764a758780b6 hgweb: simplify wsgirequest header handling
Mads Kiilerich <mads@kiilerich.com>
parents: 18347
diff changeset
275 self.headers.extend(status.headers)
12739
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
276 if status.code == HTTP_NOT_MODIFIED:
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
277 # RFC 2616 Section 10.3.5: 304 Not Modified has cases where
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
278 # it MUST NOT include any headers other than these and no
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
279 # body
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
280 self.headers = [(k, v) for (k, v) in self.headers if
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
281 k in ('Date', 'ETag', 'Expires',
8dcd3203a261 hgweb: don't send a body or illegal headers during 304 response
Augie Fackler <durin42@gmail.com>
parents: 10263
diff changeset
282 'Cache-Control', 'Vary')]
36288
a0a004b29a51 hgweb: correctly bytes-ify status, not string-ify
Augie Fackler <augie@google.com>
parents: 34722
diff changeset
283 status = statusmessage(status.code, pycompat.bytestr(status))
5993
948a41e77902 hgweb: explicit response status
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5930
diff changeset
284 elif status == 200:
948a41e77902 hgweb: explicit response status
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5930
diff changeset
285 status = '200 Script output follows'
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
286 elif isinstance(status, int):
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
287 status = statusmessage(status)
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
288
36861
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
289 # Various HTTP clients (notably httplib) won't read the HTTP
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
290 # response until the HTTP request has been sent in full. If servers
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
291 # (us) send a response before the HTTP request has been fully sent,
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
292 # the connection may deadlock because neither end is reading.
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
293 #
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
294 # We work around this by "draining" the request data before
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
295 # sending any response in some conditions.
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
296 drain = False
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
297 close = False
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
298
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
299 # If the client sent Expect: 100-continue, we assume it is smart
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
300 # enough to deal with the server sending a response before reading
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
301 # the request. (httplib doesn't do this.)
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
302 if self.env.get(r'HTTP_EXPECT', r'').lower() == r'100-continue':
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
303 pass
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
304 # Only tend to request methods that have bodies. Strictly speaking,
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
305 # we should sniff for a body. But this is fine for our existing
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
306 # WSGI applications.
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
307 elif self.env[r'REQUEST_METHOD'] not in (r'POST', r'PUT'):
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
308 pass
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
309 else:
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
310 # If we don't know how much data to read, there's no guarantee
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
311 # that we can drain the request responsibly. The WSGI
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
312 # specification only says that servers *should* ensure the
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
313 # input stream doesn't overrun the actual request. So there's
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
314 # no guarantee that reading until EOF won't corrupt the stream
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
315 # state.
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
316 if not isinstance(self.inp, util.cappedreader):
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
317 close = True
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
318 else:
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
319 # We /could/ only drain certain HTTP response codes. But 200
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
320 # and non-200 wire protocol responses both require draining.
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
321 # Since we have a capped reader in place for all situations
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
322 # where we drain, it is safe to read from that stream. We'll
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
323 # either do a drain or no-op if we're already at EOF.
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
324 drain = True
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
325
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
326 if close:
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
327 self.headers.append((r'Connection', r'Close'))
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
328
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
329 if drain:
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
330 assert isinstance(self.inp, util.cappedreader)
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
331 while True:
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
332 chunk = self.inp.read(32768)
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
333 if not chunk:
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
334 break
2cdf47e14c30 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36860
diff changeset
335
36291
af0a19d8812b py3: get bytes-repr of network errors portably
Augie Fackler <augie@google.com>
parents: 36288
diff changeset
336 self.server_write = self._start_response(
af0a19d8812b py3: get bytes-repr of network errors portably
Augie Fackler <augie@google.com>
parents: 36288
diff changeset
337 pycompat.sysstr(status), self.headers)
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
338 self._start_response = None
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
339 self.headers = []
18352
e33b9b92a200 hgweb: pass the actual response body to request.response, not just the length
Mads Kiilerich <mads@kiilerich.com>
parents: 18351
diff changeset
340 if body is not None:
e33b9b92a200 hgweb: pass the actual response body to request.response, not just the length
Mads Kiilerich <mads@kiilerich.com>
parents: 18351
diff changeset
341 self.write(body)
e33b9b92a200 hgweb: pass the actual response body to request.response, not just the length
Mads Kiilerich <mads@kiilerich.com>
parents: 18351
diff changeset
342 self.server_write = None
5888
956afc025c0f hgweb: separate out start_response() calling
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5887
diff changeset
343
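The drain and close decision made in `respond()` (source lines 296-334) can be expressed as a small pure function plus a read loop. The sketch below is for illustration under stated assumptions: `drain_policy` and `drain` are hypothetical names, and the boolean `inp_is_capped` stands in for the `isinstance(self.inp, util.cappedreader)` check.

```python
import io


def drain_policy(env, inp_is_capped):
    """Return a (drain, close) pair for a WSGI environ. Hypothetical helper."""
    # Clients that sent Expect: 100-continue are assumed able to handle a
    # response that arrives before the request body has been read.
    if env.get('HTTP_EXPECT', '').lower() == '100-continue':
        return False, False
    # Only methods expected to carry a body are considered.
    if env['REQUEST_METHOD'] not in ('POST', 'PUT'):
        return False, False
    # Unknown body size: draining is not safe, so close the connection.
    if not inp_is_capped:
        return False, True
    # Known body size: draining cannot overrun into the next request.
    return True, False


def drain(inp, chunksize=32768):
    """Read `inp` until EOF and return the number of bytes discarded."""
    total = 0
    while True:
        chunk = inp.read(chunksize)
        if not chunk:
            break
        total += len(chunk)
    return total
```

When `close` is true the original code appends a `Connection: Close` response header rather than reading from the input stream.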
5993
948a41e77902 hgweb: explicit response status
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5930
diff changeset
344 def write(self, thing):
18351
3fbdbeab38cc hgweb: don't pass empty response chunks on
Mads Kiilerich <mads@kiilerich.com>
parents: 18350
diff changeset
345 if thing:
3fbdbeab38cc hgweb: don't pass empty response chunks on
Mads Kiilerich <mads@kiilerich.com>
parents: 18350
diff changeset
346 try:
3fbdbeab38cc hgweb: don't pass empty response chunks on
Mads Kiilerich <mads@kiilerich.com>
parents: 18350
diff changeset
347 self.server_write(thing)
25660
328739ea70c3 global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents: 18352
diff changeset
348 except socket.error as inst:
18351
3fbdbeab38cc hgweb: don't pass empty response chunks on
Mads Kiilerich <mads@kiilerich.com>
parents: 18350
diff changeset
349 if inst[0] != errno.ECONNRESET:
3fbdbeab38cc hgweb: don't pass empty response chunks on
Mads Kiilerich <mads@kiilerich.com>
parents: 18350
diff changeset
350 raise
1159
b6f5a947e62e Change use of global sys.stdout, sys.stdin os.environ to a hgrequest object.
Vincent Wagelaar <vincent@ricardis.tudelft.nl>
parents: 1143
diff changeset
351
4246
cc81c512a531 avoid _wsgioutputfile <-> _wsgirequest circular reference
Alexis S. L. Carvalho <alexis@cecm.usp.br>
parents: 3673
diff changeset
352 def flush(self):
cc81c512a531 avoid _wsgioutputfile <-> _wsgirequest circular reference
Alexis S. L. Carvalho <alexis@cecm.usp.br>
parents: 3673
diff changeset
353 return None
cc81c512a531 avoid _wsgioutputfile <-> _wsgirequest circular reference
Alexis S. L. Carvalho <alexis@cecm.usp.br>
parents: 3673
diff changeset
354
5566
d74fc8dec2b4 Less indirection in the WSGI web interface. This simplifies some code, and makes it more compliant with WSGI.
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5563
diff changeset
355 def wsgiapplication(app_maker):
5887
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
356 '''For compatibility with old CGI scripts. A plain hgweb() or hgwebdir()
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
357 can and should now be used as a WSGI application.'''
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
358 application = app_maker()
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
359 def run_wsgi(env, respond):
5887
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
360 return application(env, respond)
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
361 return run_wsgi
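A short usage sketch of the `wsgiapplication()` compatibility wrapper follows. The wrapper is reproduced so the block runs on its own; `make_app`, `app`, and `fake_start_response` are hypothetical names used only for this illustration and do not appear in Mercurial.

```python
def wsgiapplication(app_maker):
    # Build the application once, then expose it as a WSGI callable.
    application = app_maker()

    def run_wsgi(env, respond):
        return application(env, respond)
    return run_wsgi


def make_app():
    # Hypothetical factory standing in for one that returns hgweb()/hgwebdir().
    def app(env, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello\n']
    return app


calls = []


def fake_start_response(status, headers):
    # Record what the application passed instead of talking to a server.
    calls.append((status, headers))


wrapped = wsgiapplication(make_app)
result = wrapped({'REQUEST_METHOD': 'GET'}, fake_start_response)
```

As the docstring notes, new deployments can pass a plain `hgweb()` or `hgwebdir()` object to the WSGI server directly; the wrapper exists for older CGI scripts that supplied a factory.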