Is there a way to index a tweet with Tweepy? (Answers)

【Question title】: Is there a way to index a tweet with Tweepy?
【Posted】: 2020-11-06 17:19:13
【Question description】:

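I'm trying to write a Twitter bot script that responds to mentions containing an equation. First, I got mentions working (the bot replies to anyone who mentions it). Then I tried to add the math function, which uses a regular expression (I have already written it; it is just a matter of integrating it into the main bot program).

Mention code: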

import mathbotcreds as mtc
import logging
import re
import tweepy
from time import sleep as wait

auth = tweepy.OAuthHandler(mtc.CONSUMER_KEY, mtc.CONSUMER_SECRET)
auth.set_access_token(mtc.ACCESS_TOKEN, mtc.ACCESS_SECRET)

api = tweepy.API(auth, wait_on_rate_limit=True,
                 wait_on_rate_limit_notify=True,
                 retry_count=2)

try:
    api.verify_credentials()
    print("Authentication Successful!")
except:
    print("Error during authentication! :(")
    
mentions = api.mentions_timeline()
pattern = r'([0-9]+.*[-+*/%].*[0-9]+)+' 

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

def check_mentions(api, since_id):
    logger.info("Collecting mentions... ")
    new_since_id = since_id
    for tweet in tweepy.Cursor(api.mentions_timeline, since_id=since_id).items():
        new_since_id = max(tweet.id, new_since_id)
        
        if tweet.in_reply_to_status_id is not None:
            continue
        
        # Reply only to top-level mentions (replies were skipped above).
        api.update_status(
            status=f"Hello! \n\nIt worked! \nYay! ^-^ \n\n (You said: \"{tweet.text}\".)",
            in_reply_to_status_id=tweet.id)
        
    return new_since_id

def main():
    since_id = 1
    while True:
        since_id = check_mentions(api, since_id)
        logger.info("Waiting... ")
        wait(15) 
        
if __name__ == "__main__":
    logger.info("Running script... ")
    wait(1) 
    main()
    
# for m in mentions:
#     api.update_status(f"@{m.user.screen_name} Hello! \nYou said: \n{m.text}", m.id)
#     wait(15)

Mention code with the equation function:

import mathbotcreds as mtc
import logging
import re
import tweepy
from time import sleep as wait

auth = tweepy.OAuthHandler(mtc.CONSUMER_KEY, mtc.CONSUMER_SECRET)
auth.set_access_token(mtc.ACCESS_TOKEN, mtc.ACCESS_SECRET)

api = tweepy.API(auth, wait_on_rate_limit=True,
                 wait_on_rate_limit_notify=True,
                 retry_count=2)

try:
    api.verify_credentials()
    print("Authentication Successful!")
except:
    print("Error during authentication! :(")
    
mentions = api.mentions_timeline()
pattern = r'([0-9]+.*[-+*/%].*[0-9]+)+' 

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

def check_mentions(api, since_id):
    logger.info("Collecting mentions... ")
    new_since_id = since_id
    for tweet in tweepy.Cursor(api.mentions_timeline, since_id=since_id).items():
        match = re.search(pattern, tweet.text)
        equation = tweet.text[match.start():match.end()] 
        new_since_id = max(tweet.id, new_since_id)
        
        if tweet.in_reply_to_status_id is not None:
            continue
        
        if match:
            ans = eval(tweet.text[match.start():match.end()]) 
            api.update_status(
                status=f"The answer to {str(equation)} is {ans}. ",
                in_reply_to_status_id=tweet.id)
        elif not match:
            api.update_status(
                status=f"Hello! \n\nIt worked! \nYay! ^-^ \n\n (You said: \"{tweet.text}\".)",
                in_reply_to_status_id=tweet.id) 
        
    return new_since_id

def main():
    since_id = 1
    while True:
        since_id = check_mentions(api, since_id)
        logger.info("Waiting... ")
        wait(15) 
        
if __name__ == "__main__":
    logger.info("Running script... ")
    wait(1) 
    main()
    
# for m in mentions:
#     api.update_status(f"@{m.user.screen_name} Hello! \nYou said: \n{m.text}", m.id)
#     wait(15)

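When I run it, I get an error saying AttributeError: 'NoneType' object has no attribute 'start', reported for the eval() function (equation = tweet.text[match.start():match.end()]). I have researched this error and how to index tweet text (with Tweepy). I'm confused about why I get a NoneType error when there is an if statement right above the eval() call. Shouldn't that catch it? Why is this happening?

Thanks!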

【Question comments】:

Tags: python regex twitter tweepy twitterapi-python

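【Solution 1】:

re.search returns None (whose type is NoneType) when it finds no match. Calling match.start() on None is what raises the AttributeError, so check the return value before using it, like this: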

match = re.search(pattern, tweet.text)
if match:
    equation = tweet.text[match.start():match.end()]
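
To show how that check could fit into the bot's reply logic, here is a minimal sketch that keeps the Twitter calls out of the picture. `build_reply` and `safe_eval` are hypothetical helpers, not part of Tweepy or of the question's script. `safe_eval` is swapped in for the question's eval() on the assumption that evaluating arbitrary tweet text is undesirable, and it only handles +, -, *, / and % on numeric literals.

```python
import ast
import operator
import re

# Same pattern as in the question.
PATTERN = r'([0-9]+.*[-+*/%].*[0-9]+)+'

# Hypothetical table: the only binary operators this sketch evaluates.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}


def safe_eval(expr):
    """Evaluate simple arithmetic; raise ValueError for anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError as exc:
        raise ValueError("not a valid expression") from exc
    return walk(tree)


def build_reply(text):
    """Return the reply text for one mention."""
    match = re.search(PATTERN, text)
    if match is None:
        # No equation in this tweet: fall back to the plain greeting.
        return f'Hello! (You said: "{text}".)'
    equation = match.group(0)  # same text as text[match.start():match.end()]
    try:
        ans = safe_eval(equation)
    except (ValueError, ZeroDivisionError):
        return f"Sorry, I could not evaluate {equation}."
    return f"The answer to {equation} is {ans}."


print(build_reply("@mathbot 2 + 3"))  # The answer to 2 + 3 is 5.
print(build_reply("@mathbot hello"))  # Hello! (You said: "@mathbot hello".)
```

Inside check_mentions, the string returned by build_reply(tweet.text) would then be passed as status to api.update_status, as in the question's code.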

【Comments】:

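I thought it returned False when there was no match and True when there was one. Now that I understand that, I'm still not sure what to do, because this script (basically a copy of Real Python's script) is checking my mentions timeline. So should I write a function or an if statement to handle this error? My thinking is that if there is no match I shouldn't get any error, and the script should just carry on. Can you help me understand why it doesn't? Thanks!
To be sure, check the documentation for the method, here: docs.python.org/3/library/re.html#re.search. In this case, re.search returns None if it finds no match. You can verify this by: (1) setting a breakpoint on that line, (2) inspecting the contents of pattern, (3) inspecting the contents of tweet.text, (4) deciding whether they match, (5) stepping over the statement to execute it, (6) inspecting the contents of match, which will hold either a match object or None.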
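
The same check can also be done without a debugger. The following is a small sketch, assuming two made-up tweet texts, that prints what re.search returns for the question's pattern in each case:

```python
import re

# Same pattern as in the question.
PATTERN = r'([0-9]+.*[-+*/%].*[0-9]+)+'

# Made-up example texts: one with an equation, one without.
hit = re.search(PATTERN, "@mathbot 7 * 6")
miss = re.search(PATTERN, "@mathbot hello there")

print(hit.group(0))           # 7 * 6
print(miss)                   # None
print(bool(hit), bool(miss))  # True False
```

A match object is truthy and None is falsy, which is why `if match:` works as a guard even though re.search never returns True or False itself. On Python 3.7 or later, placing the built-in breakpoint() on the line after re.search pauses the script in pdb, where `p match` prints the value and `c` resumes execution.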
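OK! Thanks! I forgot about those "breakpoint" things... what do they do again (I never really understood them, which is why I never really used them)? Also, if it finds None, I need it to do something. Not every tweet that mentions me will contain a match, so I need to make sure the script does something instead of raising an error when that happens. I still need to know what breakpoints do (I have searched for it but I'm still a bit confused; I know a breakpoint pauses the program, but does it return anything? Does the program continue after a while?)...
What IDE are you using? Check its documentation to learn how to debug. Stepping through the code with a debugger and inspecting variables is a good way to understand it. Also, if you want to improve your code quality, look into unit testing.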
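
As an illustration of the unit-testing suggestion, here is a minimal sketch using the standard unittest module. `find_equation` is a hypothetical helper that wraps the question's regex lookup so it can be tested without calling the Twitter API; the tweet texts are made up.

```python
import re
import unittest

# Same pattern as in the question.
PATTERN = r'([0-9]+.*[-+*/%].*[0-9]+)+'


def find_equation(text):
    """Return the matched equation text, or None when the tweet has none."""
    match = re.search(PATTERN, text)
    return match.group(0) if match else None


class FindEquationTests(unittest.TestCase):
    def test_tweet_with_equation(self):
        self.assertEqual(find_equation("@mathbot 12 / 4"), "12 / 4")

    def test_tweet_without_equation(self):
        # This is the case that raised AttributeError in the question.
        self.assertIsNone(find_equation("@mathbot hi!"))
```

Saved to a file, the tests can be run with `python -m unittest <module name>`.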