tweepy api.user_timeline: count limited to 200

【Question title】: tweepy api.user_timeline: count limited to 200
【Posted】: 2018-03-25 20:50:51
【Question】:

It seems that with tweepy I can only get 200 tweets using the user_timeline method.

class Twitter_User():
    def __init__(self,id,count=200):
        self.id = id
        self.count = count
        self.data = None
    def get_tweets(self):
        store_tweets = api.user_timeline(self.id, count=self.count)
        simple_list = []
        for status in store_tweets:
            array = [
                status._json["text"].strip(),
                status._json["favorite_count"],
                status._json["created_at"],
                status._json["retweet_count"],
                [h["text"] for h in status._json["entities"]["hashtags"]],
            ]
            simple_list.append(array)
        self.data = pd.DataFrame(simple_list, columns=["Text", "Like", "Created at","Retweet","Hashtags"])
        self.data = self.data[~self.data["Text"].str.startswith('RT')]
        return self.data
    def __repr__(self):
        id = api.get_user(self.id)
        return id.screen_name

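If I pass a number greater than 200 as self.count, I always get a DataFrame with 200 rows; if I pass a smaller number, I get the expected number of rows. Is there a hard limit, or do I have to use a different method?

【Comments】:

Tags: python twitter tweepy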

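【Answer 1】:

According to the Twitter API docs, the most records you can retrieve from /statuses/user_timeline/ in a single request is 200.

From the definition of the count parameter:

Specifies the number of Tweets to try and retrieve, up to a maximum of 200 per distinct request. The value of count is best thought of as a limit to the number of Tweets to return because suspended or deleted content is removed after the count has been applied. We include retweets in the count, even if include_rts is not supplied. It is recommended you always send include_rts=1 when using this API method.

From the tweepy source code, in api.py at line 114: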

@property
def user_timeline(self):
    """ :reference: https://dev.twitter.com/rest/reference/get/statuses/user_timeline
        :allowed_param:'id', 'user_id', 'screen_name', 'since_id', 'max_id', 'count', 'include_rts'
    """
    return bind_api(
        api=self,
        path='/statuses/user_timeline.json',
        payload_type='status', payload_list=True,
        allowed_param=['id', 'user_id', 'screen_name', 'since_id',
                       'max_id', 'count', 'include_rts']
    )

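【Comments】:

【Answer 2】:

You can get at most 200 tweets in a single request. However, you can make successive requests for older tweets. The maximum number of tweets you can retrieve from a timeline this way is 3200. The reference is here.

You can do this with tweepy, but you need to use tweepy's Cursor to fetch these successive pages of tweets. Take a look at this to help you get started.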
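
The idea of successive requests for older tweets can also be sketched by hand. This is only a rough illustration and not necessarily how tweepy's Cursor is implemented: `fetch_timeline` and `fetch_page` are hypothetical names, and the sketch assumes `fetch_page` accepts `count` and `max_id` keyword arguments (both appear in the `allowed_param` list quoted in Answer 1) and returns status objects with an integer `id` attribute.

```python
from typing import Any, Callable, List, Optional


def fetch_timeline(fetch_page: Callable[..., List[Any]],
                   page_size: int = 200,
                   limit: int = 3200) -> List[Any]:
    """Collect up to `limit` statuses by repeatedly requesting older pages.

    `fetch_page` is a hypothetical callable (an assumption of this sketch)
    that takes `count` and optionally `max_id` and returns a list of
    status objects, each with an integer `id` attribute.
    """
    collected: List[Any] = []
    max_id: Optional[int] = None
    while len(collected) < limit:
        # Never ask for more than the per-request cap or the remaining budget.
        kwargs = {"count": min(page_size, limit - len(collected))}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = fetch_page(**kwargs)
        if not page:
            break  # nothing older is available
        collected.extend(page)
        # max_id is inclusive, so step just below the oldest id seen so far.
        max_id = min(status.id for status in page) - 1
    return collected
```

With an authenticated `api` object as in the other answers, a call might look like `fetch_timeline(lambda **kw: api.user_timeline(screen_name='MuniLima', **kw))`. Because deleted or suspended tweets are removed after the count is applied, a page can come back shorter than requested, which is why the loop stops only on an empty page.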

【Comments】:

【Answer 3】:

To get more than 200, you need to use a Cursor on user_timeline and then iterate over the pages.

import tweepy

# Consumer keys and access tokens, used for OAuth
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''

# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Creation of the actual interface, using authentication
api = tweepy.API(auth)

# Each page is a list of up to 200 statuses; replace 'id' with the target user's id
for page in tweepy.Cursor(api.user_timeline, id='id', count=200).pages():
    print(page)
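
To connect the pages back to the DataFrame built in the question, the pages can be flattened into rows first. The following is a minimal sketch: `rows_from_pages` and `COLUMNS` are hypothetical names introduced here, and it assumes each status exposes the same `_json` dictionary the question's code reads from.

```python
from typing import Any, Iterable, List

# Column names taken from the question's DataFrame.
COLUMNS = ["Text", "Like", "Created at", "Retweet", "Hashtags"]


def rows_from_pages(pages: Iterable[Iterable[Any]]) -> List[list]:
    """Flatten pages of statuses into rows, skipping retweets.

    Assumes each status has a `_json` dict with the keys used in the
    question ("text", "favorite_count", "created_at", "retweet_count",
    "entities").
    """
    rows = []
    for page in pages:
        for status in page:
            data = status._json
            text = data["text"].strip()
            # Same filter as the question: drop texts that start with 'RT'.
            if text.startswith("RT"):
                continue
            hashtags = [h["text"] for h in data["entities"]["hashtags"]]
            rows.append([text, data["favorite_count"], data["created_at"],
                         data["retweet_count"], hashtags])
    return rows
```

The result could then be passed to pandas, for example `pd.DataFrame(rows_from_pages(tweepy.Cursor(api.user_timeline, id='id', count=200).pages()), columns=COLUMNS)`.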

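【Comments】:

【Answer 4】:

Use the tweepy Cursor. Here MuniLima is the Twitter account. The lists start out empty and are filled inside the for loop, which stores these values from each tweet: 'created_at', 'favorite_count', 'text'.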

tweeteo=[]
likes=[]
time = []
for tuit in tweepy.Cursor(api.user_timeline,screen_name='MuniLima').items(2870):
    time.append(tuit.created_at)
    likes.append(tuit.favorite_count)
    tweeteo.append(tuit.text)

【Comments】: