Get Twitter username and number of followers with Twitter API

【Question title】: Get Twitter username and number of followers with Twitter API
【Posted】: 2021-11-25 17:33:26
【Question description】:

I want to scrape Twitter by keyword using the Twitter API. I am using the Twitter search API.

import requests

# BEARER_TOKEN must already hold your app's bearer token.
query = 'football'
tweet_fields = "author_id,created_at,text,public_metrics,possibly_sensitive,source,lang"
max_results = "50"

# Build the authorization header
headers = {"Authorization": "Bearer {}".format(BEARER_TOKEN)}

url = "https://api.twitter.com/2/tweets/search/recent?query={}&tweet.fields={}&max_results={}".format(query, tweet_fields, max_results)
response = requests.request("GET", url, headers=headers)

status_code = response.status_code
print("Response Status Code:", status_code)

if response.status_code != 200:
    raise Exception(response.status_code, response.text)

#print(response.json())
twitter_search_data = response.json()['data']

twitter_response = []
for data in twitter_search_data:
    print(data)

I get good results, but I would also like to get author_username. Currently I can only get author_id.

I tried adding this to my API URL, but I did not get those results:

expansions=author_id&user.fields={}
user_fields = "description,username"

url = "https://api.twitter.com/2/tweets/search/recent?query={}&tweet.fields={}&expansions=author_id&user.fields={}&max_results={}".format(query, tweet_fields, user_fields, max_results)

Here is a sample result:

{'possibly_sensitive': False, 'source': 'Twitter for Android', 'lang': 'en', 'public_metrics': {'retweet_count': 1, 'reply_count': 0, 'like_count': 0, 'quote_count': 0}, 'created_at': '2021-10-05T12:23:05.000Z', 'id': '1445363916457005058', 'text': 'RT @COiNSTANTIN1: @MEXC_Global @PolkaExOfficial Check out @MiniFootballBsc We are bringing together the football and crypto community.\n⚽️Fa…', 'author_id': '1444275133854715912'}

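Is there something I can add to my Twitter API request so that I get: 1. the author's username, 2. the author's name, 3. the author's follower count, 4. the author's following count?

【Question discussion】:

Tags: python api twitter

【Solution 1】: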

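Your code is close to what you need, but the user information you request via the expansion is delivered in a second top-level object called includes (the user objects are listed under includes.users); you are missing it because your code only prints each value in the data array.

If you want the metrics (each user's follower and following counts), you need to add one more user field to the query: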

user_fields = "description,username,public_metrics"

Then you can list includes separately, or do some matching to join the user objects with the corresponding Tweets. The simplest thing to do is:

print(response.json()['data'])
print(response.json()['includes'])

You can match the user data to the Tweet data by comparing the author_id in the Tweet object with the id value in the user object.
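As a rough illustration of that matching step, the sketch below builds a lookup from user id to user object and merges the four requested author fields into each Tweet. The function name attach_authors and the sample payload are hypothetical and only mimic the shape of a v2 search response; in real use you would pass response.json() instead.

```python
# Made-up payload shaped like a v2 recent-search response (illustrative only).
payload = {
    "data": [
        {"id": "1", "text": "first tweet", "author_id": "100"},
        {"id": "2", "text": "second tweet", "author_id": "200"},
        {"id": "3", "text": "third tweet", "author_id": "100"},
    ],
    "includes": {
        "users": [
            {"id": "100", "username": "alice", "name": "Alice",
             "public_metrics": {"followers_count": 10, "following_count": 5}},
            {"id": "200", "username": "bob", "name": "Bob",
             "public_metrics": {"followers_count": 3, "following_count": 7}},
        ]
    },
}


def attach_authors(payload):
    """Return one flat dict per Tweet with the author's details merged in."""
    # Index users by id so each Tweet can find its author regardless of order.
    users = payload.get("includes", {}).get("users", [])
    users_by_id = {user["id"]: user for user in users}

    rows = []
    for tweet in payload.get("data", []):
        user = users_by_id.get(tweet["author_id"], {})
        metrics = user.get("public_metrics", {})
        rows.append({
            "tweet_id": tweet["id"],
            "text": tweet["text"],
            "author_username": user.get("username"),
            "author_name": user.get("name"),
            "followers_count": metrics.get("followers_count"),
            "following_count": metrics.get("following_count"),
        })
    return rows


for row in attach_authors(payload):
    print(row)
```

Note that the sample has three Tweets but only two users, since one author posted twice; looking users up by id handles this, whereas pairing the two lists by position would not.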

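There are also tools and libraries that can do this for you automatically; for example, recent versions of twarc can "flatten" this data into single objects.

【Comments】: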

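Thanks! Do you know whether there is a way to collect Tweets from only one location? For example, if my keyword is football I get Tweets from all over the world; is there a way to set the location to, say, France, so that I only receive Tweets from France containing "football"?
If you look at the list of operators in Twitter API v2, you will see that there are geo filters, but these are "advanced" operators and are only available in the Academic Research access track. Also, note that most Tweets do not have geo information (users can opt in to adding a geotag to their Tweets). If you are looking for French-language Tweets, though, you might consider lang:fr, which is available among the core operators.
One question about your answer: I noticed that in some cases the author_id from tweet.fields and the id from author.fields do not match? Also, when I take the author_id from tweet.fields and convert it to a Twitter username, it differs from the username I get from author.fields.
The only thing I can think of is that you might be looking at Retweets or quoted Tweets? Or are you matching by array position? The API may not return the same number of user objects as Tweet objects; it returns each user in the Tweet array only once. If a user posted several Tweets, the user object appears only once in the includes value, so do not try to match data[n] with includes[n] by array position, because they may not line up.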
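The lang:fr suggestion from the comments above could be sketched roughly as follows. The helper name build_search_url is hypothetical, and the field lists are just the ones used in this thread; the sketch only builds the request URL with the standard library and does not call the API.

```python
from urllib.parse import quote, urlencode

SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_url(keyword, lang=None, max_results=50):
    """Build a recent-search URL, optionally restricted to one language."""
    # Operators are space-separated inside the query, e.g. "football lang:fr".
    query = keyword if lang is None else "{} lang:{}".format(keyword, lang)
    params = {
        "query": query,
        "tweet.fields": "author_id,created_at,text,public_metrics,lang",
        "expansions": "author_id",
        "user.fields": "description,username,public_metrics",
        "max_results": max_results,
    }
    # quote_via=quote percent-encodes spaces as %20 instead of "+".
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)


url = build_search_url("football", lang="fr")
print(url)
```

Letting urlencode escape the query avoids problems with spaces and colons that plain str.format string building would leave unescaped in the URL.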