【Posted】: 2021-05-28 09:01:39
【Question】:
Background and code
I have the following function to handle rate limits in the Twitter V2 API based on HTTP status codes.
from datetime import datetime
from osometweet.utils import pause_until

def manage_rate_limits(response):
    """Manage Twitter V2 Rate Limits

    This method takes in a `requests` response object after querying
    Twitter and uses the headers["x-rate-limit-remaining"] and
    headers["x-rate-limit-reset"] headers objects to manage Twitter's
    most common, time-dependent HTTP errors.

    Wiki Reference: https://github.com/osome-iu/osometweet/wiki/Info:-HTTP-Status-Codes-and-Errors
    Twitter Reference: https://developer.twitter.com/en/support/twitter-api/error-troubleshooting
    """
    while True:
        # The x-rate-limit-remaining parameter is not always present.
        # If it is, we want to use it.
        try:
            # Get number of requests left with our tokens
            remaining_requests = int(response.headers["x-rate-limit-remaining"])
            # If that number is one, we get the reset-time
            # and wait until then, plus 15 seconds (you're welcome, Twitter).
            # The regular 429 exception is caught below as well,
            # however, we want to program defensively, where possible.
            if remaining_requests == 1:
                buffer_wait_time = 15
                resume_time = datetime.fromtimestamp( int(response.headers["x-rate-limit-reset"]) + buffer_wait_time )
                print(f"One request from being rate limited. Waiting on Twitter.\n\tResume Time: {resume_time}")
                pause_until(resume_time)
        except Exception as e:
            print("An x-rate-limit-* parameter is likely missing...")
            print(e)
        # Explicitly checking for time dependent errors.
        # Most of these errors can be solved simply by waiting
        # a little while and pinging Twitter again - so that's what we do.
        if response.status_code != 200:
            # Too many requests error
            if response.status_code == 429:
                buffer_wait_time = 15
                resume_time = datetime.fromtimestamp( int(response.headers["x-rate-limit-reset"]) + buffer_wait_time )
                print(f"Too many requests. Waiting on Twitter.\n\tResume Time: {resume_time}")
                pause_until(resume_time)
            # Twitter internal server error
            elif response.status_code == 500:
                # Twitter needs a break, so we wait 30 seconds
                resume_time = datetime.now().timestamp() + 30
                print(f"Internal server error @ Twitter. Giving Twitter a break...\n\tResume Time: {resume_time}")
                pause_until(resume_time)
            # Twitter service unavailable error
            elif response.status_code == 503:
                # Twitter needs a break, so we wait 30 seconds
                resume_time = datetime.now().timestamp() + 30
                print(f"Twitter service unavailable. Giving Twitter a break...\n\tResume Time: {resume_time}")
                pause_until(resume_time)
            # If we get this far, we've done something wrong and should exit
            raise Exception(
                "Request returned an error: {} {}".format(
                    response.status_code, response.text
                )
            )
        # Each time we get a 200 response, exit the function and return the response object
        if response.ok:
            return response
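The wait computation appears twice in the function above, so here it is isolated as a small helper to make the header handling easier to inspect. This is only a sketch: `resume_time_from_headers` is a hypothetical name (it is not part of osometweet), and it assumes, as the function above already does, that `x-rate-limit-reset` holds a Unix timestamp in seconds.

```python
from datetime import datetime


def resume_time_from_headers(headers, buffer_seconds=15):
    """Return the local datetime at which requests may resume.

    Hypothetical helper for illustration only.
    Assumes headers["x-rate-limit-reset"] is a Unix timestamp in seconds.
    """
    # Header values arrive as strings, so convert before doing arithmetic.
    reset_epoch = int(headers["x-rate-limit-reset"])
    # Add a small safety buffer on top of the reset time.
    return datetime.fromtimestamp(reset_epoch + buffer_seconds)
```

Calling it with `response.headers` would give the same `resume_time` value that the 429 branch passes to `pause_until`.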
This function takes the response object from a requests call, as shown below:
response = requests.get(
    url,
    headers=self._header,
    params=payload
)
response = manage_rate_limits(response)
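In the call above, the parameters are as follows:
- url = Twitter's base endpoint URL (in this case, the full-archive academic search)
- params/payload = the combination of search operators for the endpoint (these shouldn't matter, but I can include them if necessary)
- headers/self._bearer_token = the user's bearer_token, in the proper header format shown here: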
self._header = {"Authorization": f"Bearer {MY_BEARER_TOKEN}"}
Problem and error:
With the code above, I have a long-running script that returns the following error from the rate_limit_manager function.
Traceback (most recent call last):
  File "/scratch/mdeverna/Superspreaders/src/get_rts_of_user.py", line 218, in get_rts_of_user
    full_archive_search = True
  File "/nfs/nfs5/home/scratch/mdeverna/osometweet/osometweet/api.py", line 248, in search
    response = self._oauth.make_request(url, payload)
  File "/nfs/nfs5/home/scratch/mdeverna/osometweet/osometweet/oauth.py", line 181, in make_request
    response = manage_rate_limits(response)
  File "/nfs/nfs5/home/scratch/mdeverna/osometweet/osometweet/rate_limit_manager.py", line 67, in manage_rate_limits
    response.status_code, response.text
Exception: Request returned an error: 429 {"title":"Too Many Requests","type":"about:blank","status":429,"detail":"Too Many Requests"}
What I don't understand is that the lines that raise this exception are...
# If we get this far, we've done something wrong and should exit
raise Exception(
    "Request returned an error: {} {}".format(
        response.status_code, response.text
    )
)
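...which shows that response.status_code prints as (equals) 429, yet a condition earlier in this function checks for exactly this status code and seems to miss it. It looks as if the condition checking whether the status code == 429 is skipped, only for the status code to be printed as 429 just below?
What is going on here?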
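To narrow this down, the control flow can be reproduced without Twitter at all. The sketch below is a stripped-down stand-in, not the real function: `classify` is a hypothetical name, the `handled.append(...)` calls stand in for `pause_until(...)`, and it assumes the `raise` sits at the same indentation level as the if/elif chain inside the `!= 200` block.

```python
def classify(status_code):
    """Mimic an if/elif chain followed by a raise that is NOT in an else."""
    handled = []
    if status_code != 200:
        if status_code == 429:
            handled.append("429 branch ran")  # stands in for pause_until(...)
        elif status_code == 500:
            handled.append("500 branch ran")
        elif status_code == 503:
            handled.append("503 branch ran")
        # This statement follows the chain rather than belonging to it,
        # so it executes after whichever branch (if any) matched.
        raise RuntimeError(
            f"Request returned an error: {status_code} (after: {handled})"
        )
    return "ok"
```

With this layout, `classify(429)` enters the 429 branch and then still reaches the `raise`, so the branch would not be skipped; the exception would fire after it. If the real file is laid out the same way, that would be consistent with the traceback above.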
【Discussion】: