[Posted]: 2020-05-09 23:21:39
[Question]:
I have the following query:
SELECT
fullVisitorId,
COUNT(fullVisitorId) as id_count,
ARRAY_AGG(trafficSource.medium) AS trafic_medium
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`
GROUP BY
fullVisitorId
ORDER BY
id_count DESC
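For each value in the trafic_medium column (e.g. cpc, referral, organic), I am trying to find out how often it occurs in the array. Ideally I would add a new column "count" that shows how often each value occurs, like this: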
+-----------+---------+------+
| array_agg | medium | count|
+-----------+---------+------+
| 123 | cpc | 2 |
+-----------+---------+------+
| | organic | 1 |
+-----------+---------+------+
| | cpc | 2 |
+-----------+---------+------+
| 456 | organic | 2 |
+-----------+---------+------+
| | organic | 2 |
+-----------+---------+------+
| | cpc | 1 |
+-----------+---------+------+
I am new to SQL, so I am quite confused.
This is what I have tried so far:
WITH medium AS
(
SELECT
fullVisitorId,
COUNT(fullVisitorId) as id_count,
ARRAY_AGG(trafficSource.medium) AS trafic_medium
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`
GROUP BY
fullVisitorId
ORDER BY
id_count DESC
)
SELECT
fullVisitorId,
trafic_medium,
(SELECT AS STRUCT Any_Value(trafic_medium) AS name, COUNT(*) AS count
FROM
UNNEST(trafic_medium) AS trafic_medium) AS trafic_medium_2,
FROM
medium
This is based on the thread: How to count frequency of elements in a bigquery array field
However, it only shows the count for the single value picked by Any_Value, not for every distinct value.
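One possible direction, sketched under assumptions and not verified against the full dataset: group the unnested values inside the subquery and wrap the result in ARRAY(...), so that each distinct medium gets its own count. The inline `medium` CTE below is hypothetical sample data standing in for the aggregated table, and the names `m`, `occurrences`, and `medium_counts` are made up for illustration.

```sql
-- Hypothetical inline sample standing in for the aggregated sessions table.
WITH medium AS (
  SELECT '123' AS fullVisitorId, ['cpc', 'organic', 'cpc'] AS trafic_medium
  UNION ALL
  SELECT '456', ['organic', 'organic', 'cpc']
)
SELECT
  fullVisitorId,
  trafic_medium,
  -- One STRUCT per distinct medium, with the number of times it occurs in the array.
  ARRAY(
    SELECT AS STRUCT
      m AS name,
      COUNT(*) AS occurrences
    FROM UNNEST(trafic_medium) AS m
    GROUP BY m
    ORDER BY occurrences DESC
  ) AS medium_counts
FROM
  medium
```

Unlike the Any_Value attempt, the GROUP BY here should yield one row per distinct value inside each visitor's array, rather than a single row for the whole array.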
I would appreciate any help!
P.S. I am running this in BigQuery on the `bigquery-public-data.google_analytics_sample` dataset.
[Comments]:
Tags: sql arrays count google-bigquery