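[Posted]: 2014-06-09 23:27:27
[Question]:
Avro ships with a tool called Avro-Tools that can convert between JSON, Avro schema (.avsc), and binary formats. However, it does not work with circular references.
We have two files:
circular.avsc (generated by Avro)
circular.json (generated by Jackson, because the data contains a circular reference, which Avro does not like).
circular.avsc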
{
"type":"record",
"name":"Parent",
"namespace":"bigdata.example.avro",
"fields":[
{
"name":"name",
"type":[
"null",
"string"
],
"default":null
},
{
"name":"child",
"type":[
"null",
{
"type":"record",
"name":"Child",
"fields":[
{
"name":"name",
"type":[
"null",
"string"
],
"default":null
},
{
"name":"parent",
"type":[
"null",
"Parent"
],
"default":null
}
]
}
],
"default":null
}
]
}
circular.json
{
"@class":"bigdata.example.avro.Parent",
"@circle_ref_id":1,
"name":"parent",
"child":{
"@class":"bigdata.example.avro.DerivedChild",
"@circle_ref_id":2,
"name":"hello",
"parent":1
}
}
Command used to run avro-tools on the files above
java -jar avro-tools-1.7.6.jar fromjson --schema-file circular.avsc circular.json
Output
2014-06-09 14:29:17.759 java[55860:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"child","type":["null",{"type":"record","name":"Child","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"parent","type":["null","Parent"],"default":null}]}],"default":null}]}?'???K?jH!??Ė?
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
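A note on the "Expected start-union. Got VALUE_STRING" message: Avro's JSON encoding writes a non-null union value as a single-key object naming the chosen branch, for example {"string": "parent"} rather than a bare "parent". The Python sketch below is only an illustration of that wrapping, not part of Avro; `to_avro_json`, `branch_label`, and `qualify` are hypothetical helper names, the sketch assumes every union is a nullable union of the form ["null", X], and it assumes that record branches are labelled with their full name. It handles a `parent` value of null only; the numeric back-reference produced by Jackson is not something it can encode.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}

# First schema from the question, compacted.
SCHEMA = json.loads("""
{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[
  {"name":"name","type":["null","string"],"default":null},
  {"name":"child","type":["null",
    {"type":"record","name":"Child","fields":[
      {"name":"name","type":["null","string"],"default":null},
      {"name":"parent","type":["null","Parent"],"default":null}]}],
   "default":null}]}
""")


def qualify(name, namespace):
    """Full name of a named type, given the enclosing namespace."""
    if "." in name or not namespace:
        return name
    return namespace + "." + name


def branch_label(schema, namespace):
    """Key used for the single-entry object that wraps a union value."""
    if isinstance(schema, str):
        return schema if schema in PRIMITIVES else qualify(schema, namespace)
    if schema["type"] == "record":
        return qualify(schema["name"], schema.get("namespace", namespace))
    return schema["type"]


def to_avro_json(value, schema, names, namespace=None):
    """Wrap a plain JSON value so that union values carry their branch label."""
    if isinstance(schema, list):  # union
        if value is None and "null" in schema:
            return None
        # Sketch assumption: the first non-null branch is the intended one.
        branch = next(s for s in schema if s != "null")
        label = branch_label(branch, namespace)
        return {label: to_avro_json(value, branch, names, namespace)}
    if isinstance(schema, str):
        if schema in PRIMITIVES:
            return value
        # Reference to a record defined earlier in the schema.
        schema, namespace = names[qualify(schema, namespace)]
    if schema["type"] in PRIMITIVES:
        return value
    if schema["type"] == "record":
        namespace = schema.get("namespace", namespace)
        names[qualify(schema["name"], namespace)] = (schema, namespace)
        # Keys that are not schema fields (such as "@class") are dropped.
        return {f["name"]: to_avro_json(value.get(f["name"]), f["type"], names, namespace)
                for f in schema["fields"]}
    raise NotImplementedError(schema["type"])


# "JSON 1" from the question: the back-reference to the parent is null.
plain = {"name": "parent", "child": {"name": "hello", "parent": None}}
encoded = to_avro_json(plain, SCHEMA, {})
print(json.dumps(encoded))
```

Even in this wrapped form the `parent` field inside the child has to stay null, since the JSON encoding carries values only and has no notion of object identity.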
Some other JSON values I tried with the same schema, without success
JSON 1
{
"name":"parent",
"child":{
"name":"hello",
"parent":null
}
}
JSON 2
{
"name":"parent",
"child":{
"name":"hello",
}
}
JSON 3
{
"@class":"bigdata.example.avro.Parent",
"@circle_ref_id":1,
"name":"parent",
"child":{
"@class":"bigdata.example.avro.DerivedChild",
"@circle_ref_id":2,
"name":"hello",
"parent":null
}
}
After removing some of the "optional" elements:
circular.avsc
{
"type":"record",
"name":"Parent",
"namespace":"bigdata.example.avro",
"fields":[
{
"name":"name",
"type":
"string",
"default":null
},
{
"name":"child",
"type":
{
"type":"record",
"name":"Child",
"fields":[
{
"name":"name",
"type":
"string",
"default":null
},
{
"name":"parent",
"type":
"Parent",
"default":null
}
]
},
"default":null
}
]
}
circular.json
{
"@class":"bigdata.example.avro.Parent",
"@circle_ref_id":1,
"name":"parent",
"child":{
"@class":"bigdata.example.avro.DerivedChild",
"@circle_ref_id":2,
"name":"hello",
"parent":1
}
}
Output
2014-06-09 15:30:53.716 java[56261:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":"string","default":null},{"name":"child","type":{"type":"record","name":"Child","fields":[{"name":"name","type":"string","default":null},{"name":"parent","type":"Parent","default":null}]},"default":null}]}?x?N??O"?M?`Ab
Exception in thread "main" java.lang.StackOverflowError
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:212)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
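One plausible reading of this StackOverflowError is that, without the unions, a Parent must always contain a Child, which must always contain a Parent, so the record has no finite instance and the parser's size computation never bottoms out. The Python sketch below checks a schema for that kind of unconditional cycle. It is an illustration under stated assumptions, not Avro code: `has_unconditional_cycle` and `qualify` are hypothetical names, and the sketch treats every union, array, and map as a point where the recursion can stop. The two embedded schemas are the ones from this question with the "default" entries omitted.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}

# Second schema from the question (no unions).
STRICT = json.loads("""
{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[
  {"name":"name","type":"string"},
  {"name":"child","type":
    {"type":"record","name":"Child","fields":[
      {"name":"name","type":"string"},
      {"name":"parent","type":"Parent"}]}}]}
""")

# First schema from the question (every field is a nullable union).
NULLABLE = json.loads("""
{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[
  {"name":"name","type":["null","string"]},
  {"name":"child","type":["null",
    {"type":"record","name":"Child","fields":[
      {"name":"name","type":["null","string"]},
      {"name":"parent","type":["null","Parent"]}]}]}]}
""")


def qualify(name, namespace):
    """Full name of a named type, given the enclosing namespace."""
    if "." in name or not namespace:
        return name
    return namespace + "." + name


def has_unconditional_cycle(schema, namespace=None, required=frozenset()):
    """True if some record must contain itself with no union, array or map between.

    `required` holds the full names of the records that enclose the current
    position through required (non-union, non-collection) fields only.
    """
    if isinstance(schema, list):
        # Sketch assumption: a union always offers a way out, so reset the path.
        return any(has_unconditional_cycle(s, namespace, frozenset()) for s in schema)
    if isinstance(schema, str):
        if schema in PRIMITIVES:
            return False
        # A by-name reference closes a cycle only if it points at an enclosing record.
        return qualify(schema, namespace) in required
    kind = schema["type"]
    if kind == "record":
        namespace = schema.get("namespace", namespace)
        required = required | {qualify(schema["name"], namespace)}
        return any(has_unconditional_cycle(f["type"], namespace, required)
                   for f in schema["fields"])
    if kind == "array":
        return has_unconditional_cycle(schema["items"], namespace, frozenset())
    if kind == "map":
        return has_unconditional_cycle(schema["values"], namespace, frozenset())
    return False


strict_result = has_unconditional_cycle(STRICT)
nullable_result = has_unconditional_cycle(NULLABLE)
print(strict_result, nullable_result)
```

Under these assumptions the check flags the union-free schema and accepts the nullable one, which matches the observation that only the second attempt overflows the stack.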
Does anyone know how I can make circular references work with Avro?
[Comments]:
Tags: json apache hadoop jackson avro