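【Posted】: 2015-06-15 23:43:55
【Question】:
I am trying to collect some geolocated tweets and store them on my hard drive as a dictionary of tweets. In some iterations of the fetchsamples function, the saved dictionary is forced into an empty state, even though data is added to the dictionary during the for loop (see the output below).
I tried different encodings and passing the "w" or "wb" flag when saving, but neither helped.
I tried to reproduce this with random strings (so that people could check my code more easily), but I could not. I am not sure what in the tweet structure or in my code causes this behavior.
Note: I added a code snippet that catches the moment the dictionary is forced into an empty state, for debugging.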
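For context on that debug check: an empty dictionary serializes to `{}`, which is two bytes, so a file smaller than 10 bytes signals that an empty dictionary was dumped. The sketch below illustrates this with a temporary file; the helper name `dumped_size` and the sample payload are hypothetical and not part of the original code.

```python
import json
import os
import tempfile

def dumped_size(obj):
    """Dump obj as JSON to a temporary file and return the file size in bytes."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as fp:
            json.dump(obj, fp)
        return os.path.getsize(path)
    finally:
        os.remove(path)  # clean up the temporary file

print(dumped_size({}))                             # 2, the two bytes of "{}"
print(dumped_size({"1": {"coordinates": None}}))   # well above the 10-byte threshold
```

The size check therefore reports what was in the dictionary at dump time; the file mode ("w" or "wb") does not change that.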
import oauth2 as oauth
import urllib2 as urllib
import json
import pickle
import os
api_key = "Insert api_key here"
api_secret = "Insert api_secret here"
access_token_key = "Insert access_token_key"
access_token_secret = "Insert access_token_secret"
_debug = 0
oauth_token = oauth.Token(key=access_token_key, secret=access_token_secret)
oauth_consumer = oauth.Consumer(key=api_key, secret=api_secret)
signature_method_hmac_sha1 = oauth.SignatureMethod_HMAC_SHA1()
http_method = "GET"
http_handler = urllib.HTTPHandler(debuglevel=_debug)
https_handler = urllib.HTTPSHandler(debuglevel=_debug)
def twitterreq(url, method, parameters):
    req = oauth.Request.from_consumer_and_token(oauth_consumer,
                                                token=oauth_token,
                                                http_method=http_method,
                                                http_url=url,
                                                parameters=parameters)
    req.sign_request(signature_method_hmac_sha1, oauth_consumer, oauth_token)
    headers = req.to_header()
    if http_method == "POST":
        encoded_post_data = req.to_postdata()
    else:
        encoded_post_data = None
        url = req.to_url()
    opener = urllib.OpenerDirector()
    opener.add_handler(http_handler)
    opener.add_handler(https_handler)
    response = opener.open(url, encoded_post_data)
    return response
def fetchsamples():
    url = "https://stream.twitter.com/1/statuses/sample.json"
    url = "https://stream.twitter.com/1/statuses/filter.json?locations=-0.489,51.28,0.236,51.686"
    parameters = []
    response = twitterreq(url, "GET", parameters)
    data = {}
    count = 1
    for line in response:
        try:
            strip = json.loads(line.strip())
            if strip['coordinates'] is not None:
                data[count] = strip
                count += 1
                if count % 10 == 0:
                    print count, len(data.keys())
        except Exception as e:
            # Print error and store in a log file
            print e
            with open("/Temp/Data/error.log","w") as log:
                log.write(str(e))
        # If 100 tweets have passed save the file
        if count % 100 == 0:
            print "Before saving: ", len(data.keys())
            fp = open("/Temp/Data/"+str(count/100)+".json","w")
            json.dump(data,fp,encoding="latin-1")
            fp.close()
            # Debug check: catch the case where the dictionary
            # has been forced into an empty state
            if os.path.getsize("/Temp/Data/"+str(count/100)+".json") < 10:
                print "After saving: ", len(data.keys())
                return data
            else:
                data = {}

data = fetchsamples()
This produces the following output, with no errors. The data dictionary is empty.
100 99
Before saving: 99
110 10
120 20
130 30
140 40
150 50
160 60
170 70
180 80
190 90
200 100
Before saving: 100
Before saving: 0
After saving: 0
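One possible reading of this output, offered as a sketch rather than a confirmed diagnosis: the `count % 100 == 0` check appears to run for every line of the stream, including lines that do not change `count` (for example, a tweet without coordinates). If such a line arrives right after a save, `count` is still a multiple of 100 and the freshly emptied dictionary is dumped again. The simulation below uses made-up tweets and a hypothetical helper, `batch_sizes`, to show the effect and one way to avoid it.

```python
import json

def batch_sizes(lines, batch=100, check_only_after_store=False):
    """Return the size of every batch that would be written to disk."""
    data = {}
    count = 1
    saved = []
    for line in lines:
        stored = False
        tweet = json.loads(line)
        if tweet.get("coordinates") is not None:
            data[count] = tweet
            count += 1
            stored = True
        # Without the flag, this check also fires on lines that left
        # count unchanged, so an already emptied dict can be saved again.
        if count % batch == 0 and (stored or not check_only_after_store):
            saved.append(len(data))
            data = {}
    return saved

# Hypothetical stream: 99 geotagged tweets, then one without coordinates.
geo = json.dumps({"coordinates": {"type": "Point", "coordinates": [0.0, 51.5]}})
no_geo = json.dumps({"coordinates": None})
lines = [geo] * 99 + [no_geo]

print(batch_sizes(lines))                               # [99, 0]
print(batch_sizes(lines, check_only_after_store=True))  # [99]
```

If this reading is right, moving the save check inside the branch that increments `count` would prevent the empty dump.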
【Discussion】:
Tags: python json python-2.7 twitter dictionary