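【Posted】: 2021-08-16 03:06:00
【Question】:
I am creating a new table (my_new_table) from another table (my_existing_table), which has 4 columns. Two of them, PRODUCT and MONTHLY_BUDGETS, hold nested values that I am trying to extract:
The PRODUCT column is a dictionary like this: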
{"name": "Display", "full_name": "Ad Bundle"}
MONTHLY_BUDGETS is a list containing several dictionaries. The column looks like this:
[{"id": 123, "quantity_booked": "23", "budget_booked": "0.0", "budget_booked_loc": "0.0"},
{"id": 234, "quantity_booked": "34", "budget_booked": "0.0", "budget_booked_loc": "0.0"},
{"id": 455, "quantity_booked": "44", "budget_booked": "0.0", "budget_booked_loc": "0.0"}]
Here is how I create the new table and unnest the values from the other table:
CREATE OR REPLACE TABLE my_new_table as (
    with og_table as (
        select
            id,
            parse_json(product) as PRODUCT,
            IO_NAME,
            parse_json(MONTHLY_BUDGETS) as MONTHLY_BUDGETS
        from my_existing_table
    )
    select
        id,
        PRODUCT:name::string as product_name,
        PRODUCT:full_name::string as product_full_name,
        IO_NAME,
        MONTHLY_BUDGETS:id::integer as monthly_budgets_id,
        MONTHLY_BUDGETS:quantity_booked::float as monthly_budgets_quantity_booked,
        MONTHLY_BUDGETS:budget_booked_loc::float as monthly_budgets_budget_booked_loc
    from og_table,
        lateral flatten( input => PRODUCT) as PRODUCT,
        lateral flatten( input => MONTHLY_BUDGETS) as MONTHLY_BUDGETS);
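However, once my new table has been created and I run this against it: select distinct id, count(*) from my_new_table where id = '123' group by 1;
I see 18 in the count(*) column, whereas I should only have 1. So it looks like there are a lot of duplicates, but why? And how can I prevent this from happening?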
【Discussion】:
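One likely cause, offered as a guess because the real data is not shown: each LATERAL FLATTEN emits one output row per element of its input, and two flattens in the same FROM clause multiply each other. With the sample values, flattening PRODUCT (2 keys) and MONTHLY_BUDGETS (3 elements) would give 2 x 3 = 6 rows per source row, so a count of 18 would fit, for example, three source rows with id = '123' or a longer budget list. Below is a minimal sketch that flattens only the array and reads the fields from the flattened element's VALUE column. The aliases og and mb are introduced here for illustration and are not part of the original query.

```sql
-- Sketch only: assumes PRODUCT and MONTHLY_BUDGETS are stored as JSON text,
-- as in the question. The aliases og and mb are hypothetical.
create or replace table my_new_table as
with og_table as (
    select
        id,
        parse_json(product) as product,
        io_name,
        parse_json(monthly_budgets) as monthly_budgets
    from my_existing_table
)
select
    og.id,
    -- PRODUCT is a single object, so it can be read by path without FLATTEN.
    og.product:name::string           as product_name,
    og.product:full_name::string      as product_full_name,
    og.io_name,
    -- Each flattened array element is exposed through the VALUE column.
    mb.value:id::integer              as monthly_budgets_id,
    mb.value:quantity_booked::float   as monthly_budgets_quantity_booked,
    mb.value:budget_booked_loc::float as monthly_budgets_budget_booked_loc
from og_table og,
     lateral flatten(input => og.monthly_budgets) mb;
```

Note that this would still produce one row per budget entry (3 for the sample list), not 1 row per id. If exactly one row per id is required, the array would have to stay unflattened or be aggregated back afterwards.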
Tags: snowflake-cloud-data-platform