【Question title】:In BigQuery, how do I check if two ARRAY of STRUCTs are equal
【Posted】:2020-12-22 17:18:15
【Question description】:

I have a query that outputs two arrays of structs:

SELECT modelId, oldClassCounts, newClassCounts
FROM `xyz`
GROUP BY 1

How can I create another column that is TRUE when oldClassCounts = newClassCounts?

Here is a sample result in JSON format:

[
  {
    "modelId": "FBF21609-65F8-4076-9B22-D6E277F1B36A",
    "oldClassCounts": [
      {
        "id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
        "count": "33"
      },
      {
        "id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
        "count": "82"
      }
    ],
    "newClassCounts": [
      {
        "id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
        "count": "33"
      },
      {
        "id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
        "count": "82"
      }
    ]
  }
]

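If oldClassCounts and newClassCounts are exactly identical, as in the output above, I want the equal column to be TRUE.

Anything else should be FALSE.

【Question comments】:

    Tags: sql google-bigquery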


    【Solution 1】:

    I would go with this solution:

    #standardSQL
    WITH xyz AS (
      SELECT "FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId, 
          [STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
          STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)] AS oldClassCounts,
          [STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
          STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)] as newClassCounts),
    
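    -- Flatten each array into one row per element, carrying the array length along
    o AS (SELECT modelId, id, count, ARRAY_LENGTH(oldClassCounts) AS len FROM xyz, UNNEST(oldClassCounts) AS old_c),
    n AS (SELECT modelId, id, count, ARRAY_LENGTH(newClassCounts) AS len FROM xyz, UNNEST(newClassCounts) AS new_c),
    -- Rows found on one side only, checked in both directions so that an empty
    -- or shorter old array cannot hide extra elements in the new one
    uneq AS (
      (SELECT * FROM o EXCEPT DISTINCT SELECT * FROM n)
      UNION ALL
      (SELECT * FROM n EXCEPT DISTINCT SELECT * FROM o)
    )
    -- A model counts as equal when it has no unmatched rows
    SELECT xyz.*, uneq.modelId IS NULL AS equal
    FROM xyz
    LEFT JOIN (SELECT DISTINCT modelId FROM uneq) uneq ON xyz.modelId = uneq.modelId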
    

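    It works regardless of the order of the elements. The idea is to treat each array as a separate temporary table, look for elements that exist in one but not in the other (using EXCEPT DISTINCT), and additionally compare the array lengths. Because this is a comparison of distinct elements plus a length check, two arrays of the same length that repeat the same elements a different number of times (say [A, A, B] versus [A, B, B]) would still be reported as equal. The length check does catch the simpler case where one array just contains an extra duplicate, for example: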

    "FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId, 
          [STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
          STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count),
          STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)]
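
The same idea (distinct elements plus a length check) can also be sketched per row, without the join. This is only an illustration under assumptions: the short ids "m1", "m2", "a" and "b" are made-up sample values, not data from the question, and the query shares the limitation described above for duplicates repeated a different number of times.

```sql
#standardSQL
-- Hypothetical sample rows: m1 differs only in order, m2 has an extra duplicate.
WITH xyz AS (
  SELECT "m1" AS modelId,
    [STRUCT("a" AS id, "33" AS count), STRUCT("b" AS id, "82" AS count)] AS oldClassCounts,
    [STRUCT("b" AS id, "82" AS count), STRUCT("a" AS id, "33" AS count)] AS newClassCounts
  UNION ALL
  SELECT "m2",
    [STRUCT("a" AS id, "33" AS count), STRUCT("b" AS id, "82" AS count)],
    [STRUCT("a" AS id, "33" AS count), STRUCT("b" AS id, "82" AS count),
     STRUCT("b" AS id, "82" AS count)]
)
SELECT
  modelId,
  -- Same length, and no element found on one side only (both directions)
  ARRAY_LENGTH(oldClassCounts) = ARRAY_LENGTH(newClassCounts)
    AND NOT EXISTS (
      SELECT id, count FROM UNNEST(oldClassCounts)
      EXCEPT DISTINCT
      SELECT id, count FROM UNNEST(newClassCounts))
    AND NOT EXISTS (
      SELECT id, count FROM UNNEST(newClassCounts)
      EXCEPT DISTINCT
      SELECT id, count FROM UNNEST(oldClassCounts)) AS equal
FROM xyz
```

With these sample rows, m1 should come out as equal (only the order differs) and m2 as not equal (the lengths differ).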
    

    【Comments】:

      【Solution 2】:

      I would consider comparing the results of the TO_JSON_STRING function applied to the two arrays.

      In a query, that would be done as follows:

      SELECT modelId, 
             oldClassCounts, 
             newClassCounts, 
             CASE WHEN TO_JSON_STRING(oldClassCounts) = TO_JSON_STRING(newClassCounts) 
                 THEN true 
                 ELSE false 
             END AS equal
      FROM `xyz`;
      

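      I am not sure about the GROUP BY 1 part, since no field is being grouped or aggregated.

      This will not work if the elements appear in a different order in the two arrays. The solution is not perfect, but it works for the data you provided.

      【Comments】: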
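
One possible way around the ordering limitation is to sort both arrays before serializing them. The sketch below assumes the struct fields are named id and count as in the question; the sample values "m1", "a" and "b" are made up for illustration. Since the sorted copies keep duplicates, this compares the arrays as multisets. Note that under this approach a NULL array and an empty array would be treated as equal, because the ARRAY subquery yields an empty array in both cases.

```sql
#standardSQL
-- Hypothetical sample row: same elements, different order.
WITH xyz AS (
  SELECT "m1" AS modelId,
    [STRUCT("a" AS id, "33" AS count), STRUCT("b" AS id, "82" AS count)] AS oldClassCounts,
    [STRUCT("b" AS id, "82" AS count), STRUCT("a" AS id, "33" AS count)] AS newClassCounts
)
SELECT
  modelId,
  -- Rebuild each array in a canonical order, then compare the JSON text
  TO_JSON_STRING(ARRAY(
    SELECT c FROM UNNEST(oldClassCounts) AS c ORDER BY c.id, c.count))
  = TO_JSON_STRING(ARRAY(
    SELECT c FROM UNNEST(newClassCounts) AS c ORDER BY c.id, c.count)) AS equal
FROM xyz
```

For the sample row above, both sorted arrays serialize to the same JSON string, so equal should be TRUE.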
