【Question title】: JSON data flattening on Snowflake
【Posted】: 2021-05-02 15:25:03
【Question】:

I am trying to flatten JSON data in Snowflake:

JSON data:

 {
    "empDetails": [
        {
            "kind": "person",
            "fullName": "John Doe",
            "age": 22,
            "gender": "Male",
            "phoneNumber": {
                "areaCode": "206",
                "number": "1234567"
            },
            "children": [
                {
                    "name": "Jane",
                    "gender": "Female",
                    "age": "6"
                },
                {
                    "name": "John",
                    "gender": "Male",
                    "age": "15"
                }
            ],
            "citiesLived": [
                {
                    "place": "Seattle",
                    "yearsLived": [
                        "1995"
                    ]
                },
                {
                    "place": "Stockholm",
                    "yearsLived": [
                        "2005"
                    ]
                }
            ]
        },
        {
            "kind": "person",
            "fullName": "Mike Jones",
            "age": 35,
            "gender": "Male",
            "phoneNumber": {
                "areaCode": "622",
                "number": "1567845"
            },
            "children": [
                {
                    "name": "Earl",
                    "gender": "Male",
                    "age": "10"
                },
                {
                    "name": "Sam",
                    "gender": "Male",
                    "age": "6"
                },
                {
                    "name": "Kit",
                    "gender": "Male",
                    "age": "8"
                }
            ],
            "citiesLived": [
                {
                    "place": "Los Angeles",
                    "yearsLived": [
                        "1989",
                        "1993",
                        "1998",
                        "2002"
                    ]
                },
                {
                    "place": "Washington DC",
                    "yearsLived": [
                        "1990",
                        "1993",
                        "1998",
                        "2008"
                    ]
                },
                {
                    "place": "Portland",
                    "yearsLived": [
                        "1993",
                        "1998",
                        "2003",
                        "2005"
                    ]
                },
                {
                    "place": "Austin",
                    "yearsLived": [
                        "1973",
                        "1998",
                        "2001",
                        "2005"
                    ]
                }
            ]
        },
        {
            "kind": "person",
            "fullName": "Anna Karenina",
            "age": 45,
            "gender": "Female",
            "phoneNumber": {
                "areaCode": "425",
                "number": "1984783"
            },
            "citiesLived": [
                {
                    "place": "Stockholm",
                    "yearsLived": [
                        "1992",
                        "1998",
                        "2000",
                        "2010"
                    ]
                },
                {
                    "place": "Russia",
                    "yearsLived": [
                        "1998",
                        "2001",
                        ""
                    ]
                },
                {
                    "place": "Austin",
                    "yearsLived": [
                        "1995",
                        "1999"
                    ]
                }
            ]
        }
    ]
}

I am able to flatten most of the data, except for the yearsLived array: for that last column I get NULL values.

Here is what I have tried so far:

  select empd.value:kind,
  empd.value:fullName,
  empd.value:age,
  empd.value:gender,   
  empd.value:phoneNumber,
  empd.value:phoneNumber.areaCode, 
  empd.value:phoneNumber.number ,
  empd.value:children, 
  chldrn.value:name,
  chldrn.value:gender,
  chldrn.value:age,
  city.value:place,
  yr.value:yearsLived
  from my_json emp,
  lateral flatten(input=>emp.Json_data:empDetails) empd , 
  lateral flatten(input=>empd.value:children, OUTER => TRUE) chldrn,   
  lateral flatten(input=>empd.value:citiesLived) city,
  lateral flatten(input=>city.value:yearsLived) yr -- not getting data for this array

Can someone help me understand why I am getting NULL values for the yearsLived array? I am not sure whether I am missing something here.

【Comments】:

    Tags: snowflake-cloud-data-platform


    【Solution 1】:

    Your query returns the column

    yr.value:yearsLived
    

    as if yr.value were an object with a yearsLived field.

    But you have already expanded the yearsLived field into rows with the line

    lateral flatten(input=>city.value:yearsLived) yr 
    

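    So yr.value is actually just a VARIANT containing the year itself, and looking up a yearsLived field on it returns NULL. You can select yr.value as is, or wrap it in TO_NUMBER or TO_VARCHAR to get a more precise type.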
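Applied to the query from the question, the fix could look roughly like the sketch below. It assumes the same table `my_json` and VARIANT column `Json_data` as in the question; the column aliases and the trimmed-down select list are illustrative choices, not part of the original query.

```sql
-- Sketch only: same joins as in the question, but the last column selects
-- yr.value directly instead of yr.value:yearsLived.
select
    empd.value:fullName::string             as full_name,
    empd.value:age::number                  as age,
    empd.value:phoneNumber.areaCode::string as area_code,
    chldrn.value:name::string               as child_name,
    city.value:place::string                as place,
    yr.value::string                        as year_lived  -- the array element itself
from my_json emp,
    lateral flatten(input => emp.Json_data:empDetails) empd,
    lateral flatten(input => empd.value:children, outer => true) chldrn,
    lateral flatten(input => empd.value:citiesLived) city,
    lateral flatten(input => city.value:yearsLived) yr;
```

Two things to keep in mind: because children and citiesLived are flattened independently, each person produces one row per (child, city, year) combination. Also, the sample data contains an empty string in one yearsLived array, so a plain numeric cast may fail on that row; casting to a string as above, or using something like TRY_TO_NUMBER(yr.value::string), is likely the safer choice.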

    【Discussion】:

      【Solution 2】:

      Why not try this?

      create or replace table json_tab as
      select parse_json('{ "place": "Austin","yearsLived": [ "1995","1999"]}') as years;

      select years:yearsLived[0]::int from json_tab;
      

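      Since yearsLived in your JSON data is an array, you need to access its elements by index if you want a specific value, or use an array function such as FLATTEN to break it apart.

      With the FLATTEN function: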

      select years, v.value::string 
      from json_tab, 
      lateral flatten(input =>years:yearsLived ) v;
      

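      【Discussion】:

      • I can flatten yearsLived on its own... what I am looking for is yearsLived flattened together with the other elements and arrays.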