【Question title】: Avro Schema array without name
【Posted】: 2021-05-27 13:40:48
【Question】:

Question 1

I would like to know whether the following is a valid Avro schema. Note that the first object in the fields array has no name.

{
  "name": "AgentRecommendationList",
  "type": "record",
  "fields": [
      {
          "type": {
              "type": "array",
              "items": {
                  "name": "friend",
                  "type": "record",
                  "fields": [
                      {
                          "name": "Name",
                          "type": "string"
                      },
                      {
                          "name": "phoneNumber",
                          "type": "string"
                      },
                      {
                          "name": "email",
                          "type": "string"
                      }
                  ]
              }
          }
      }
  ]
}

It was actually designed for data of the following shape:

[
        {
            "Name": "1",
            "phoneNumber": "2",
            "email": "3"
        },
        {
            "Name": "1",
            "phoneNumber": "2",
            "email": "3"
        },
        {
            "Name": "1",
            "phoneNumber": "2",
            "email": "3"
        }
 ]

According to the references below, an array without a name like this does not seem to be allowed:

Avro Schema failure

There is no way to define an Avro schema with an array without a field name.

https://avro.apache.org/docs/current/spec.html#schema_complex

name: a JSON string providing the name of the field (required), and

I suspect the following is the correct schema:

{
  "name": "AgentRecommendationList",
  "type": "record",
  "fields": [
      {
          "name": "friends",
          "type": {
              "type": "array",
              "items": {
                  "name": "friend",
                  "type": "record",
                  "fields": [
                      {
                          "name": "Name",
                          "type": "string"
                      },
                      {
                          "name": "phoneNumber",
                          "type": "string"
                      },
                      {
                          "name": "email",
                          "type": "string"
                      }
                  ]
              }
          }
      }
  ]
}

and the data should look like this for the Avro conversion to succeed:

{
  "friends": [
      {
          "Name": "1",
          "phoneNumber": "2",
          "email": "3"
      },
      {
          "Name": "1",
          "phoneNumber": "2",
          "email": "3"
      },
      {
          "Name": "1",
          "phoneNumber": "2",
          "email": "3"
      }
  ]
}
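
The difference between the two data shapes can be sketched in a few lines of Python. This is only an illustration using the standard `json` module, not an Avro library call; the helper name `wrap_friends` is hypothetical, and the field name `friends` is taken from the corrected schema above.

```python
import json


def wrap_friends(bare_list, field_name="friends"):
    """Wrap a top-level JSON array in an object keyed by the record field name.

    Hypothetical helper: it only reshapes the data so that it matches a record
    schema with a single array field. It does not perform Avro encoding.
    """
    if not isinstance(bare_list, list):
        raise TypeError("expected a JSON array at the top level")
    return {field_name: bare_list}


# The original, unnamed array (one element shown for brevity).
bare = json.loads('[{"Name": "1", "phoneNumber": "2", "email": "3"}]')

# The shape expected by the record schema with a "friends" field.
wrapped = wrap_friends(bare)
print(json.dumps(wrapped, indent=2))
```

The record schema describes an object with one named field, so the bare array has to be nested under that field name before it fits.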

Question 2

Is the following a valid schema? It targets the unnamed array from the first example...

{
  "name": "AgentRecommendationList",
  "type": "array",
  "items": {
      "name": "friend",
      "type": "record",
      "fields": [
          {
              "name": "Name",
              "type": "string"
          },
          {
              "name": "phoneNumber",
              "type": "string"
          },
          {
              "name": "email",
              "type": "string"
          }
      ]
   }
}

I would appreciate it if someone could confirm my understanding. Thanks!

【Question comments】:

    Tags: avro


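    【Answer 1】:

    For Question 1...

    Everything you wrote is correct. As you mentioned, the first schema is invalid because every field in a record must have a name. The corrected schema is valid, and the corrected data fits the updated schema.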

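    For Question 2...

    The schema in Question 2 is valid, but the name AgentRecommendationList will be ignored. Arrays do not have names. That may sound odd after looking at the examples in Question 1, but in those examples the name is part of the field specification, not of the array.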
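
In the Avro specification only records, enums, and fixed types are named types. A minimal sketch of that distinction follows, assuming the schema is held as a plain Python dict; `effective_name` is a hypothetical helper, not part of any Avro library.

```python
# Per the Avro specification, only these complex types carry a name.
NAMED_TYPES = {"record", "enum", "fixed"}


def effective_name(schema):
    """Return the schema's name if its type is a named type, otherwise None."""
    if isinstance(schema, dict) and schema.get("type") in NAMED_TYPES:
        return schema.get("name")
    return None


# The schema from Question 2, with a "name" attribute on the array.
array_schema = {
    "name": "AgentRecommendationList",
    "type": "array",
    "items": {
        "name": "friend",
        "type": "record",
        "fields": [
            {"name": "Name", "type": "string"},
            {"name": "phoneNumber", "type": "string"},
            {"name": "email", "type": "string"},
        ],
    },
}

print(effective_name(array_schema))           # None: the array's name carries no meaning
print(effective_name(array_schema["items"]))  # friend: the record is a named type
```

A record field's `name` belongs to the field entry that wraps the type, which is why the corrected schema in Question 1 puts `friends` next to the array type and not inside it.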

    【Comments】:
