【Posted】: 2017-06-25 07:24:54
【Question】:
I am trying to mine Twitter data with tweepy and load the data into a JSON file, but the following code:
import tweepy
from tweepy import OAuthHandler
import json

consumer_key = '****'
consumer_secret = '****'
access_token = '****'
access_secret = '****'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

f = open('twitterdata.json', 'a+')
for status in tweepy.Cursor(api.home_timeline).items(10):
    json.dump(status._json, f)

line = f.readline()
tweet = json.loads(line)
print json.dumps(tweet, indent=4)
produces the following error:
Traceback (most recent call last):
  File "mytwittermine.py", line 21, in <module>
    tweet = json.loads(line)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
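For reference, the first error can probably be reproduced without any Twitter access. The following is only a minimal sketch under assumptions: the scratch file path, the helper name read_after_append, and the sample records are all made up for illustration. It appends records in 'a+' mode and then calls readline() without seeking back, which is what the code above does.

```python
import json
import os
import tempfile


def read_after_append(path, records):
    """Append records as JSON, then read without seeking back to the start."""
    with open(path, 'a+') as f:
        for record in records:
            json.dump(record, f)
        # After the writes the file position is at the end of the file,
        # so there is nothing left for readline() to return.
        return f.readline()


# Hypothetical scratch file and sample records, not real tweet data.
path = os.path.join(tempfile.mkdtemp(), 'sample.json')
line = read_after_append(path, [{'id': 1}, {'id': 2}])

try:
    json.loads(line)
    decoded = True
except ValueError:  # json.JSONDecodeError is a ValueError subclass on Python 3
    decoded = False
```

If this matches what happens in the real script, line is an empty string and json.loads has nothing to decode, which would explain the ValueError.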
Update
As one of the answers suggested, I now write a newline on every iteration of the for loop, so the code is now:
import tweepy
from tweepy import OAuthHandler
import json

consumer_key = '****'
consumer_secret = '****'
access_token = '****'
access_secret = '****'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

f = open('twitterdata.json', 'a+')
for status in tweepy.Cursor(api.home_timeline).items(10):
    json.dump(status._json, f)
    f.write('\n')

f.seek(0)
line = f.readline()
tweet = json.loads(line)
print json.dumps(tweet, indent=4)
The code above raises a ValueError:
Traceback (most recent call last):
  File "mytwittermine.py", line 23, in <module>
    tweet = json.loads(line)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 367, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 3115 - line 2 column 1 (char 3114 - 301245)
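One possible reading of this error, offered only as an assumption: because the file is opened in 'a+' mode, its first line may still hold several JSON objects written back to back, without newlines, by earlier runs. The sketch below simulates that situation with made-up contents and a hypothetical helper, iter_concatenated_json, which uses json.JSONDecoder.raw_decode to pull the objects out one at a time.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string holding several values back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Simulated file contents (not real tweet data): two objects written without
# a newline by an earlier run, then one newline-terminated object.
contents = '{"id": 1}{"id": 2}{"id": 3}\n'
first_line = contents.splitlines()[0]

try:
    json.loads(first_line)
    extra_data_error = False
except ValueError:  # "Extra data" is reported as a ValueError
    extra_data_error = True

tweets = list(iter_concatenated_json(contents))
```

If the assumption holds, deleting the old twitterdata.json (or opening it in 'w+' mode) before rerunning should leave exactly one object per line, and json.loads on the first line should then succeed.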
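【Comments】:
- So, what is the first thing that line reads?
- Can you provide the file that was written?
- It is a fairly large file, @StamKaly. I can copy its first line (it has 10 lines, corresponding to the 10 iterations).
- OK, the first line is enough.
- @StamKaly The first line has 301425 characters, and apparently the question details cannot be edited.
Tags: python json python-2.7 tweepy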