【Question title】: Stream a Twitter list using Tweepy
【Posted】: 2021-03-13 10:17:09
【Question】:

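I'm having trouble streaming a specific public Twitter list with Tweepy. I can stream a single user, but the filter's follow parameter does not work for a list. I have a long set of accounts that I want to stream for further analysis, so I created a Twitter list containing all of them. Does anyone know how to handle this?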

My code is below:

import tweepy
import sys

class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.id_str)
        # if "retweeted_status" attribute exists, flag this tweet as a retweet.
        is_retweet = hasattr(status, "retweeted_status")

        # check if text has been truncated
        if hasattr(status,"extended_tweet"):
            text = status.extended_tweet["full_text"]
        else:
            text = status.text

        # check if this is a quote tweet.
        is_quote = hasattr(status, "quoted_status")
        quoted_text = ""
        if is_quote:
            # check if quoted tweet's text has been truncated before recording it
            if hasattr(status.quoted_status,"extended_tweet"):
                quoted_text = status.quoted_status.extended_tweet["full_text"]
            else:
                quoted_text = status.quoted_status.text

        # remove characters that might cause problems with csv encoding
        remove_characters = [",","\n"]
        for c in remove_characters:
            # str.replace returns a new string, so the result must be reassigned
            text = text.replace(c, " ")
            quoted_text = quoted_text.replace(c, " ")

        with open("out.csv", "a", encoding='utf-8') as f:
            f.write("%s,%s,%s,%s,%s,%s\n" % (status.created_at,status.user.screen_name,is_retweet,is_quote,text,quoted_text))

    def on_error(self, status_code):
        print("Encountered streaming error (", status_code, ")")
        sys.exit()

consumer_key = "..."
consumer_secret = "..."
access_token = "..."
access_token_secret = "..."


auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

if (not api):
    print("Authentication failed!")
    sys.exit(-1)

myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener,tweet_mode='extended')
with open("out.csv", "w", encoding='utf-8') as f:
        f.write("date,user,is_retweet,is_quote,text,quoted_text\n")
myStream.filter(follow=['52286608'])

【Comments】:

    Tags: python twitter streaming tweepy


    【Solution 1】:

    You should be able to use the follow parameter with a comma-separated list of user IDs. From the Twitter API documentation:

    follow
    A comma-separated list of user IDs, indicating the users whose Tweets should be delivered on the stream. Following protected users is not supported. For each user specified, the stream will contain:
    
    - Tweets created by the user.
    - Tweets which are retweeted by the user.
    - Replies to any Tweet created by the user.
    - Retweets of any Tweet created by the user.
    - Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).
    
    The stream will not contain:
    
    - Tweets mentioning the user (e.g. “Hello @twitterapi!”).
    - Manual Retweets created without pressing a Retweet button (e.g. “RT @twitterapi The API is great”).
    - Tweets by protected users.
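
Because the follow stream also delivers replies to and retweets of the followed accounts, a listener that should record only tweets written by the list members has to check the author itself. A minimal sketch, assuming the status object exposes `user.id_str` the way the question's code uses `status.user`; the helper name `is_from_followed` and the `SimpleNamespace` stand-ins are hypothetical and only illustrate the check:

```python
from types import SimpleNamespace


def is_from_followed(status, followed_ids):
    """Return True when the tweet's author is one of the followed user IDs."""
    return status.user.id_str in followed_ids


# Stand-in objects shaped like streamed statuses, for illustration only.
followed = {"52286608"}
own_tweet = SimpleNamespace(user=SimpleNamespace(id_str="52286608"))
reply_from_other = SimpleNamespace(user=SimpleNamespace(id_str="999"))

print(is_from_followed(own_tweet, followed))         # True
print(is_from_followed(reply_from_other, followed))  # False
```

Inside `on_status`, a check like this could return early before the row is written to the CSV file.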
    

    You can follow up to 5,000 IDs this way.
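
One way to prepare the `follow` argument is to normalize the IDs up front. The sketch below is only an illustration: `build_follow_list` and `MAX_FOLLOW_IDS` are hypothetical names, and the second ID is a made-up sample value. It deduplicates the IDs, converts them to strings as in the question's `follow=['52286608']`, and refuses a list that exceeds the limit mentioned above.

```python
MAX_FOLLOW_IDS = 5000  # limit stated in the answer above


def build_follow_list(user_ids, limit=MAX_FOLLOW_IDS):
    """Deduplicate IDs (keeping order), convert to strings, enforce the limit."""
    unique = list(dict.fromkeys(str(uid) for uid in user_ids))
    if len(unique) > limit:
        raise ValueError(f"{len(unique)} IDs exceed the follow limit of {limit}")
    return unique


follow = build_follow_list([52286608, "783214", 52286608])
print(follow)  # ['52286608', '783214']
```

The resulting list would then be passed to the stream in place of the single ID, for example `myStream.filter(follow=follow)` in the question's code.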

    Note that the API you are connecting to has been superseded by the v2 filtered stream API, which Tweepy does not currently support.

    【Discussion】:

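    • Thanks for the suggestion. I considered it, but since I have a lot of accounts to stream, I was trying to avoid this option. I would have trouble matching IDs to the users I have already added to the list. Do you know how I could do that quickly? That would help. Thanks
    • These are hard limits, so unfortunately this is the only thing you can do. The users/lookup endpoint supports multiple accounts (up to 100) in a single request: developer.twitter.com/en/docs/twitter-api/v1/accounts-and-users/…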
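
The batching that the last comment describes could be sketched as follows. This is an illustration under assumptions, not the commenter's code: `chunked`, `resolve_ids` and `fake_lookup` are hypothetical names, and `lookup` is a caller-supplied function so that the sketch does not depend on a particular Tweepy version. With Tweepy it might wrap `api.lookup_users`, whose keyword for screen names differs between versions, so check the installed one.

```python
from types import SimpleNamespace


def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def resolve_ids(screen_names, lookup, batch_size=100):
    """Map screen names to user ID strings, one lookup call per batch.

    `lookup` takes a list of screen names and returns user objects that
    have `screen_name` and `id_str` attributes.
    """
    ids = {}
    for batch in chunked(list(screen_names), batch_size):
        for user in lookup(batch):
            ids[user.screen_name] = user.id_str
    return ids


# Fake lookup standing in for the real API call, for illustration only.
calls = []


def fake_lookup(batch):
    calls.append(len(batch))
    return [SimpleNamespace(screen_name=n, id_str=n.replace("user", "")) for n in batch]


names = [f"user{i}" for i in range(250)]
ids = resolve_ids(names, fake_lookup)
print(calls)  # [100, 100, 50]
```

With 250 names this makes three requests instead of 250, and the values of the returned mapping can be used as the `follow` list.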