【Question title】: Reading JSON file parse objects error in Python
【Posted】: 2021-08-07 12:53:34
【Question description】:

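I am using `select_object_content` to read data from an S3 bucket, and that part works: I can get results from the JSON file in S3. However, when I inspect the returned records, their type prints as a string (i.e. <class 'str'>), and I cannot access the values inside the object; trying to do so throws an error.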

Code sample

A sample JSON file from S3 is attached

query = "SELECT * FROM s3object[*]['domain'][*] r where r.id > " + str(start) + " and r.id <= " + str(stop) + " limit " + str(pagesize)
r = s3.select_object_content(
    Bucket=cache,
    Key=key + '.json',
    ExpressionType='SQL',
    Expression=query,
    InputSerialization={'JSON': {"Type": "Lines"}},
    OutputSerialization={'JSON': {}},
)
for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(type(records))  # <class 'str'>
        print(records)  # see the printed records in the example below
        print(records['hostname'])  # throws an error: 3 records are printed together, so the first record cannot be accessed

The records print like this, and I want to access the values inside each object:

{"id":6,"hostname":"amt.in.","subtype":"NS","value":"ns-529.awsdns-02.net.","passive_dns_count":"7"}
{"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}
{"id":8,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"7"}

I tried parsing the string into an object as shown below, but that also throws an error:

parsed_json = (json.loads(records))
print(parsed_json.hostname) 
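
For reference, a minimal sketch of why this attempt fails, using two of the printed records as a hard-coded string in place of the real S3 response. The payload is newline-delimited JSON (one object per line), which `json.loads` rejects when given the whole string, and a parsed record is a dict, so a field is read as `row["hostname"]` rather than through attribute access.

```python
import json

# Two of the records printed above, joined by newlines as in the payload.
records = (
    '{"id":6,"hostname":"amt.in.","subtype":"NS","value":"ns-529.awsdns-02.net.","passive_dns_count":"7"}\n'
    '{"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}\n'
)

# Parsing the whole string fails: it holds several JSON documents, not one.
try:
    json.loads(records)
    whole_parse_error = None
except json.JSONDecodeError as exc:
    whole_parse_error = exc

# Parsing one line at a time works; each line is a complete JSON object.
rows = [json.loads(line) for line in records.splitlines() if line.strip()]
first_hostname = rows[0]["hostname"]  # dict access, not rows[0].hostname
print(first_hostname)
```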

Any help would be much appreciated. Thanks.

I also tried removing the utf-8 decoding, and now I get an error that looks valid.

The record type now prints as bytes:

<class 'bytes'>

The records print like this:

b'{"id":6,"hostname":"amt.in.","subtype":"NS","value":"ns-529.awsdns-02.net.","passive_dns_count":"7"}\n{"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}\n{"id":8,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"7"}\n{"id":9,"hostname":"amt.in.","subtype":"NS","value":"ns-1983.awsdns-55.co.uk.","passive_dns_count":"6"}\n{"id":10,"hostname":"amt.in.","subtype":"NS","value":"ns-1983.awsdns-55.co.uk.","passive_dns_count":"7"}\n'

【Question discussion】:

  • Could you share sample contents of this file?
  • @amitd A Google Drive link to the sample JSON has been attached, please check.

Tags: python json amazon-web-services amazon-s3 boto3


【Solution 1】:

You can try eval. Starting from your snippet:

records = event['Records']['Payload'].decode('utf-8')
records = eval(records)

print(type(records))  ## <class 'dict'>

Update:

To iterate over the records:

records = records.split()
for doc in records:
    doc = eval(doc)
    print(doc)

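【Discussion】:

  • I tried it and got an error on this line: records = eval(records)
  • records = eval(records) File "", line 2 {"id":7,"hostname":"amt.in.","subtype":"NS","value":"ns-1288.awsdns-33.org.","passive_dns_count":"6"}
  • Could you describe the error in more detail? You could also try removing .decode('utf-8').
  • I removed utf-8 and the record type now prints as bytes. I have edited my question to include the bytes data, please check.
  • Don't remove .decode('utf-8'). Also, call split before using eval; that turns the payload into a list of strings. Then iterate over the list and apply eval to each item.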
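
A hedged side note on the `split()` plus `eval` approach, as a small sketch with a made-up payload (the `note` and `extra` fields are hypothetical and do not appear in the question's data). It works for the sample records because they contain no spaces and no JSON-only literals, but `str.split()` breaks on any whitespace and `eval` does not understand `null`, `true`, or `false`. `splitlines()` with `json.loads` avoids both problems and does not execute the payload as Python code.

```python
import json

# Hypothetical payload (not from the question): one value contains a space,
# and one field holds the JSON literal null.
payload = (
    '{"id": 1, "note": "two words", "extra": null}\n'
    '{"id": 2, "note": "x", "extra": null}\n'
)

# str.split() breaks on every whitespace run, including the one inside "two words".
whitespace_pieces = payload.split()

# eval() does not know JSON literals such as null, true, and false.
try:
    eval('{"extra": null}')
    eval_error = None
except NameError as exc:
    eval_error = exc

# splitlines() keeps each record whole, and json.loads maps null to None.
docs = [json.loads(line) for line in payload.splitlines() if line]
print(docs[0]["note"], docs[0]["extra"])
```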
【Solution 2】:

Below is a code snippet that prints the hostname of each record in the query response:

    for event in r['Payload']:
        if 'Records' in event:
            records = event['Records']['Payload'].decode('utf-8').split('\n')
            for record in records:
                if len(record) > 0:
                    row=json.loads(record)
                    print(row["hostname"])
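
One caveat, offered as a hedged sketch rather than as part of the answer: the S3 Select event stream may deliver a record split across two `Records` events, so splitting each event's payload on its own can hit a truncated line. The helper below (the name `iter_select_rows` is hypothetical) carries the incomplete tail over to the next event. It is exercised here with hand-made event dicts, not a real `select_object_content` response.

```python
import json

def iter_select_rows(events):
    """Yield one dict per JSON line from an iterable of S3 Select style events."""
    pending = b""  # bytes of a record whose end has not arrived yet
    for event in events:
        if 'Records' not in event:
            continue  # skip events that carry no record payload (e.g. Stats)
        pending += event['Records']['Payload']
        # Everything before the last newline is complete; the rest is carried over.
        *lines, pending = pending.split(b'\n')
        for line in lines:
            if line.strip():
                yield json.loads(line)
    if pending.strip():
        yield json.loads(pending)  # final record without a trailing newline

# Hand-made events: the second record is cut in the middle on purpose.
fake_events = [
    {'Records': {'Payload': b'{"id":6,"hostname":"amt.in."}\n{"id":7,"host'}},
    {'Records': {'Payload': b'name":"amt.in."}\n'}},
    {'Stats': {}},
]
rows = list(iter_select_rows(fake_events))
print([row["id"] for row in rows])
```

With a real response, the same generator would presumably be called as `iter_select_rows(r['Payload'])`. Buffering raw bytes and decoding only complete lines also avoids cutting a multi-byte UTF-8 character in half.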

【Discussion】:
