【Posted】: 2021-11-27 05:13:49
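【Problem description】:
This is my table. My query currently returns two rows per partition, one with NUM_FILES filled in and one with TOTAL_SIZE filled in. I want to merge those two rows into a single row, like this:
"TABLE_NAME","TBL_ID","PART_ID","TABLE_TYPE","TABLE_LOCATION","TABLE_OWNER","DATABASE_NAME","NUM_FILES","TOTAL_SIZE"
products_partitioned,2,2,EXTERNAL_TABLE,hdfs://sandbox-hdp.hortonworks.com:8020/HIVE_ROVER_IT/bikestores/products,hive,rovertesting,"3",4563
My full query is: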
SELECT DISTINCT
    tbl.tbl_name TABLE_NAME,
    tbl.TBL_ID TBL_ID,
    pp.PART_ID,
    tbl.tbl_type TABLE_TYPE,
    sds.location TABLE_LOCATION,
    tbl.OWNER TABLE_OWNER,
    --tbl.LAST_ACCESS_TIME ASSET_DATE_LAST_MODIFIED,
    dbs.name DATABASE_NAME,
    CASE pp.PARAM_KEY
        WHEN 'numFiles' THEN pp.PARAM_VALUE
    END AS NUM_FILES,
    CASE pp.PARAM_KEY
        WHEN 'totalSize' THEN pp.PARAM_VALUE
    END AS TOTAL_SIZE
FROM TBLS tbl
INNER JOIN SDS ON tbl.tbl_id = sds.cd_id
INNER JOIN DBS ON dbs.db_id = tbl.db_id
LEFT JOIN PARTITIONS ON tbl.TBL_ID = PARTITIONS.TBL_ID
INNER JOIN PARTITION_PARAMS pp ON pp.PART_ID = PARTITIONS.PART_ID
WHERE pp.PARAM_KEY IN ('totalSize', 'numFiles')
  AND tbl.tbl_type IN ('MANAGED_TABLE', 'EXTERNAL_TABLE')
GROUP BY (tbl.tbl_name, tbl.TBL_ID, pp.PART_ID, tbl.tbl_type, sds.location, tbl.OWNER, dbs.name, pp.PARAM_KEY, pp.PARAM_VALUE)
ORDER BY TBL_ID, PART_ID;
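
For reference, the query above yields two rows per partition because pp.PARAM_KEY and pp.PARAM_VALUE are part of the GROUP BY, so the 'numFiles' and 'totalSize' entries can never fall into the same group. One common way to collapse them is conditional aggregation: wrap each CASE in an aggregate such as MAX and group only by the remaining columns. The following is a minimal sketch of that idea, not a tested fix. It assumes each partition has at most one value per parameter key, and it copies the joins and filters from the query above unchanged, without checking them against the metastore schema.

```sql
-- Sketch: one row per (table, partition), with numFiles and totalSize
-- pivoted into columns. Joins and WHERE are copied from the question as-is.
SELECT
    tbl.tbl_name AS TABLE_NAME,
    tbl.TBL_ID   AS TBL_ID,
    pp.PART_ID,
    tbl.tbl_type AS TABLE_TYPE,
    sds.location AS TABLE_LOCATION,
    tbl.OWNER    AS TABLE_OWNER,
    dbs.name     AS DATABASE_NAME,
    -- Each CASE is NULL for the non-matching key; MAX ignores NULLs,
    -- so it returns the single matching value within the group.
    MAX(CASE WHEN pp.PARAM_KEY = 'numFiles'  THEN pp.PARAM_VALUE END) AS NUM_FILES,
    MAX(CASE WHEN pp.PARAM_KEY = 'totalSize' THEN pp.PARAM_VALUE END) AS TOTAL_SIZE
FROM TBLS tbl
INNER JOIN SDS ON tbl.tbl_id = sds.cd_id
INNER JOIN DBS ON dbs.db_id = tbl.db_id
LEFT JOIN PARTITIONS ON tbl.TBL_ID = PARTITIONS.TBL_ID
INNER JOIN PARTITION_PARAMS pp ON pp.PART_ID = PARTITIONS.PART_ID
WHERE pp.PARAM_KEY IN ('totalSize', 'numFiles')
  AND tbl.tbl_type IN ('MANAGED_TABLE', 'EXTERNAL_TABLE')
-- PARAM_KEY and PARAM_VALUE are deliberately left out of the GROUP BY.
GROUP BY tbl.tbl_name, tbl.TBL_ID, pp.PART_ID, tbl.tbl_type,
         sds.location, tbl.OWNER, dbs.name
ORDER BY tbl.TBL_ID, pp.PART_ID;
```

With the GROUP BY reduced this way, SELECT DISTINCT is no longer needed, since each group already produces exactly one row. PARAM_VALUE is stored as text, so MAX compares strings here. That is harmless when there is only one value per key, but the columns would need a cast before any numeric summing or sorting.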
【Discussion】: