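[Posted]: 2021-08-07 12:53:34
[Question]:
I am using "select_object_content" to read data from an S3 bucket, and it works fine. I can get the results from the JSON file in S3.
But after getting the results, I checked the records and their type is printed as a string (i.e.
Code sample
Sample JSON file in S3 (attached)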
query = "SELECT * FROM s3object[*]['domain'][*] r where r.id > " + str(start) + " and r.id <= " + str(stop) + " limit " + str(pagesize)
r = s3.select_object_content(
    Bucket=cache,
    Key=key + '.json',
    ExpressionType='SQL',
    Expression=query,
    InputSerialization={'JSON': {"Type": "Lines"}},
    OutputSerialization={'JSON': {}},
)
for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(type(records))  # <class 'str'>
        print(records)  # see the printed records in the example below
        print(records['hostname'])  # throws an error: 3 records are printed together, so the first record cannot be accessed
The records are printed like this, and I want to access the values inside these objects
{"id":6,"hostname":"amt.in.","subtype":"NS","value":"ns-529.awsdns-02.net.","passive_dns_count":"7"}
{"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}
{"id":8,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"7"}
I tried parsing the string into an object as shown below, but that also throws an error
parsed_json = (json.loads(records))
print(parsed_json.hostname)
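Two things likely go wrong in the attempt above: `json.loads` expects exactly one JSON document, while the decoded payload holds one object per line, and a parsed object is a dict, so a field is read with `["hostname"]` rather than `.hostname`. A minimal sketch, assuming the payload is newline-delimited as in the printed output; `parse_json_lines` is a hypothetical helper name, not part of boto3:

```python
import json

# Sample text copied from the records printed in the question.
records = (
    '{"id":6,"hostname":"amt.in.","subtype":"NS","value":"ns-529.awsdns-02.net.","passive_dns_count":"7"}\n'
    '{"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}\n'
    '{"id":8,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"7"}\n'
)


def parse_json_lines(text):
    """Parse newline-delimited JSON into a list of dicts (hypothetical helper)."""
    # Skip blank lines, such as the empty string after the trailing newline.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


rows = parse_json_lines(records)
print(rows[0]["hostname"])  # dict access uses brackets, not attribute syntax
```

Each element of `rows` is an ordinary dict, so `rows[0]["value"]` and the other fields can be read the same way.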
Any help is much appreciated. Thanks.
I also tried removing the utf-8 decoding step, and I still get an error.
Now the record type is printed as bytes
<class 'bytes'>
The records are printed like this
b'{"id":6,"hostname":"amt.in.","subtype":"NS","value":"ns-529.awsdns-02.net.","passive_dns_count":"7"}\n{"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}\n{"id":8,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"7"}\n{"id":9,"hostname":"amt.in.","subtype":"NS","value":"ns-1983.awsdns-55.co.uk.","passive_dns_count":"6"}\n{"id":10,"hostname":"amt.in.","subtype":"NS","value":"ns-1983.awsdns-55.co.uk.","passive_dns_count":"7"}\n'
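The bytes output shows the records separated by `\n`. As far as I understand, a `Records` event is not guaranteed to end on a record boundary, so it may be safer to buffer the bytes across events and parse only complete lines. A minimal sketch under that assumption; `iter_select_records` and `fake_events` are hypothetical names, with the fake events standing in for `r['Payload']`:

```python
import json


def iter_select_records(events):
    """Yield one dict per JSON line from S3 Select style events (hypothetical helper)."""
    buffer = b""
    for event in events:
        if "Records" not in event:
            continue  # ignore events without record data, such as Stats or End
        buffer += event["Records"]["Payload"]
        # Everything before the last newline is complete; keep the remainder.
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            if line.strip():
                yield json.loads(line.decode("utf-8"))
    # Parse whatever is left if the stream did not end with a newline.
    if buffer.strip():
        yield json.loads(buffer.decode("utf-8"))


# Stand-in for r['Payload']: the second record is split across two events.
fake_events = [
    {"Records": {"Payload": b'{"id":6,"hostname":"amt.in."}\n{"id":7,"host'}},
    {"Stats": {}},
    {"Records": {"Payload": b'name":"amt.in."}\n'}},
    {"End": {}},
]

rows = list(iter_select_records(fake_events))
print([row["id"] for row in rows])
```

Splitting on the newline byte before decoding also avoids decoding a chunk that ends in the middle of a multi-byte UTF-8 character.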
[Discussion]:
-
Can you share some sample contents of this file?
-
@amitd The Google Drive link to the sample JSON is attached, please check.
Tags: python json amazon-web-services amazon-s3 boto3