【Question title】: Test and convert Avro schema (.avsc) to .avro: AttributeError, array and encoding
【Posted】: 2017-11-07 10:23:26
【Question】:

I am just getting started with Hadoop and I am using Avro (fastavro).

1- I want to validate a schema and convert it to an .avro file.

{
 "type": "record",
 "name": "Node",
 "fields": [
    {
        "name": "nom",
        "type": "string"
    },
    {
        "name": "zone",
        "type": {
            "type": "map",
            "values": "string"
        }
    },
    {
        "name": "price",
        "type": "float"
    },
    {
        "name": "type",
        "type": "string"
    }
  ]
}

My test file (to validate the schema):

#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import json
import fastavro

schema = json.load(open("myschema.avsc"))
records = [
    {
        "nom": "blabla",
        "zone": ["north", "south", "east"],
        "prix": 4.0,
        "type": "geoloc"
    }
]

fastavro.writer(open("myschema.avro", "wb"), schema, records)

I get this error:

Traceback (most recent call last):
  File "test-schema.py", line 17, in <module>
    fastavro.writer(open("myschema.avro", "wb"), schema, records)
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 614, in writer
    output.write(record)
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 537, in write
    write_data(self.io, record, self.schema)
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 432, in write_data
    return fn(fo, datum, schema)
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 363, in write_record
    name, field.get('default')), field['type'])
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 432, in write_data
    return fn(fo, datum, schema)
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 232, in write_map
    for key, val in iteritems(datum):
  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/six.py", line 27, in py3_iteritems
    return obj.items()
AttributeError: 'list' object has no attribute 'items'

2- Also, if I add an array:

{
    "name": "ingredients", 
    "type": ["string"]
},

Error:

  File "/var/www/data-machine/HDFS/env/lib/python3.5/site-packages/fastavro/writer.py", line 345, in write_union
    raise ValueError(msg)
ValueError: ["north", "south", "east"] (type <class 'list'>) do not match ['string']

Finally, I would like to make the "zone" field optional...

Thanks :) Fabrice

【Question comments】:

    Tags: hadoop avro


    【Solution 1】:

    The data in your record does not match the map type declared for "zone". A map field expects something like

    "zone":{"key1":"val1","key2":"val2","key3":"val3"},
    

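    That is a map (string keys to values), not an array. If you want data shaped like your example (a list of strings), declare the field as an array instead of a map, i.e. {"type": "array", "items": "string"}. Note that "type": ["string"] in your second attempt is a union with a single "string" branch, not an array, which is why the list is rejected.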
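The fix above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: the names `SCHEMA`, `RECORDS` and `round_trip` are made up for the example, and it writes to an in-memory buffer rather than to myschema.avro. It assumes that "zone" should be an array of strings and makes it optional by wrapping it in a union with "null" and giving it a default of null. It also assumes the record key should be "price", as in the schema, rather than "prix" as in the test file.

```python
import io

# Corrected schema (a sketch): "zone" is an array of strings, wrapped in a
# union with "null" so that the field becomes optional.
SCHEMA = {
    "type": "record",
    "name": "Node",
    "fields": [
        {"name": "nom", "type": "string"},
        {
            "name": "zone",
            # "null" is listed first so that the default of None is valid.
            "type": ["null", {"type": "array", "items": "string"}],
            "default": None,
        },
        {"name": "price", "type": "float"},
        {"name": "type", "type": "string"},
    ],
}

RECORDS = [
    # The key is "price" (as in the schema), not "prix".
    {"nom": "blabla", "zone": ["north", "south", "east"],
     "price": 4.0, "type": "geoloc"},
    # "zone" is optional, so None is accepted here.
    {"nom": "no-zone", "zone": None, "price": 2.5, "type": "geoloc"},
]


def round_trip(schema, records):
    """Write records to an in-memory Avro container and read them back."""
    import fastavro  # third-party package: pip install fastavro

    buf = io.BytesIO()
    fastavro.writer(buf, fastavro.parse_schema(schema), records)
    buf.seek(0)
    return list(fastavro.reader(buf))
```

Calling `round_trip(SCHEMA, RECORDS)` should give back the two records as dicts. To produce a file instead, the same `fastavro.writer` call can be pointed at `open("myschema.avro", "wb")`, as in the question.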

    【Comments】:
