拥有一个用于网络协议调试的脚本甚至是应用程序的子系统,需要查看究竟是什么请求-响应对,包括有效的 URL、标头、有效负载和状态。而且在各处检测单个请求通常是不切实际的。同时有性能考虑建议使用单个(或少数专门的)requests.Session,因此以下假设遵循the suggestion。
requests 支持所谓的event hooks(从 2.23 开始,实际上只有 response 挂钩)。它基本上是一个事件侦听器,在从requests.request 返回控制之前发出事件。此时请求和响应都已完全定义,因此可以记录。
import logging
import requests
logger = logging.getLogger('httplogger')
def logRoundtrip(response, *args, **kwargs):
extra = {'req': response.request, 'res': response}
logger.debug('HTTP roundtrip', extra=extra)
session = requests.Session()
session.hooks['response'].append(logRoundtrip)
这基本上就是如何记录会话的所有 HTTP 往返行程。
格式化 HTTP 往返日志记录
为了使上面的日志记录有用,可以有专门的logging formatter 来理解日志记录中的req 和res 额外内容。它可能看起来像这样:
import textwrap
class HttpFormatter(logging.Formatter):
def _formatHeaders(self, d):
return '\n'.join(f'{k}: {v}' for k, v in d.items())
def formatMessage(self, record):
result = super().formatMessage(record)
if record.name == 'httplogger':
result += textwrap.dedent('''
---------------- request ----------------
{req.method} {req.url}
{reqhdrs}
{req.body}
---------------- response ----------------
{res.status_code} {res.reason} {res.url}
{reshdrs}
{res.text}
''').format(
req=record.req,
res=record.res,
reqhdrs=self._formatHeaders(record.req.headers),
reshdrs=self._formatHeaders(record.res.headers),
)
return result
formatter = HttpFormatter('{asctime} {levelname} {name} {message}', style='{')
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logging.basicConfig(level=logging.DEBUG, handlers=[handler])
现在,如果您使用session 执行一些请求,例如:
session.get('https://httpbin.org/user-agent')
session.get('https://httpbin.org/status/200')
stderr 的输出如下所示。
2020-05-14 22:10:13,224 DEBUG urllib3.connectionpool Starting new HTTPS connection (1): httpbin.org:443
2020-05-14 22:10:13,695 DEBUG urllib3.connectionpool https://httpbin.org:443 "GET /user-agent HTTP/1.1" 200 45
2020-05-14 22:10:13,698 DEBUG httplogger HTTP roundtrip
---------------- request ----------------
GET https://httpbin.org/user-agent
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
None
---------------- response ----------------
200 OK https://httpbin.org/user-agent
Date: Thu, 14 May 2020 20:10:13 GMT
Content-Type: application/json
Content-Length: 45
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
{
"user-agent": "python-requests/2.23.0"
}
2020-05-14 22:10:13,814 DEBUG urllib3.connectionpool https://httpbin.org:443 "GET /status/200 HTTP/1.1" 200 0
2020-05-14 22:10:13,818 DEBUG httplogger HTTP roundtrip
---------------- request ----------------
GET https://httpbin.org/status/200
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
None
---------------- response ----------------
200 OK https://httpbin.org/status/200
Date: Thu, 14 May 2020 20:10:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
GUI方式
当您有大量查询时,拥有简单的 UI 和过滤记录的方法会派上用场。我将展示为此使用Chronologer(我是它的作者)。
首先,钩子已被重写,以产生logging 在通过网络发送时可以序列化的记录。它可能看起来像这样:
def logRoundtrip(response, *args, **kwargs):
extra = {
'req': {
'method': response.request.method,
'url': response.request.url,
'headers': response.request.headers,
'body': response.request.body,
},
'res': {
'code': response.status_code,
'reason': response.reason,
'url': response.url,
'headers': response.headers,
'body': response.text
},
}
logger.debug('HTTP roundtrip', extra=extra)
session = requests.Session()
session.hooks['response'].append(logRoundtrip)
其次,日志配置必须适应使用logging.handlers.HTTPHandler(Chronologer 理解)。
import logging.handlers
chrono = logging.handlers.HTTPHandler(
'localhost:8080', '/api/v1/record', 'POST', credentials=('logger', ''))
handlers = [logging.StreamHandler(), chrono]
logging.basicConfig(level=logging.DEBUG, handlers=handlers)
最后,运行 Chronologer 实例。例如使用 Docker:
docker run --rm -it -p 8080:8080 -v /tmp/db \
-e CHRONOLOGER_STORAGE_DSN=sqlite:////tmp/db/chrono.sqlite \
-e CHRONOLOGER_SECRET=example \
-e CHRONOLOGER_ROLES="basic-reader query-reader writer" \
saaj/chronologer \
python -m chronologer -e production serve -u www-data -g www-data -m
然后再次运行请求:
session.get('https://httpbin.org/user-agent')
session.get('https://httpbin.org/status/200')
流处理程序将产生:
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): httpbin.org:443
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "GET /user-agent HTTP/1.1" 200 45
DEBUG:httplogger:HTTP roundtrip
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "GET /status/200 HTTP/1.1" 200 0
DEBUG:httplogger:HTTP roundtrip
现在,如果您打开 http://localhost:8080/(使用“logger”作为用户名,使用空密码作为基本身份验证弹出窗口)并单击“Open”按钮,您应该会看到类似以下内容: