【Posted】: 2017-05-03 10:22:12
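【Question】:
I'm having trouble with the following query in BigQuery for Google Analytics. For some reason I can't get the user count to be calculated as unique users; it is essentially counting rows, so the numbers come out very close to the session counts. I also tried EXACT_COUNT_DISTINCT(), but it gave the same answer.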
SELECT
date AS Day,
MAX(CASE
WHEN hits.sourcePropertyInfo.sourcePropertyTrackingId CONTAINS '778****' THEN 'MUG'
WHEN hits.sourcePropertyInfo.sourcePropertyTrackingId = 'Social' THEN 'Social'
ELSE 'Website' END) AS Property,
geoNetwork.country AS Country,
SUM(totals.visits) AS visits,
COUNT (DISTINCT(fullVisitorId), 1000000) AS Users,
SUM(IFNULL(totals.newVisits,0)) AS NEW,
(SUM(IFNULL(totals.screenviews,0))+SUM(IFNULL(totals.pageviews,0))) AS PAGEVIEWS,
IFNULL(SUM(CASE
WHEN totals.screenviews = 1 THEN SUM(IFNULL(totals.screenviews,0))
ELSE 0 END)+ SUM(IFNULL(totals.bounces,0)),0) AS BOUNCES,
SUM(CASE
WHEN REGEXP_MATCH(hits.eventInfo.eventAction,'register$|registersuccess|new registration|account signup|registro') THEN 1
ELSE 0 END) AS NewRegistrations,
SUM(CASE
WHEN REGEXP_MATCH(hits.eventInfo.eventAction, 'add to cart|add to bag|click to buy|ass to basket|comprar') OR hits.eventInfo.eventAction CONTAINS 'addtobasket::' THEN 1
ELSE 0 END) AS ClickToBuy,
SUM(IFNULL(totals.transactions,0)) AS Transactions,
SUM(IFNULL(totals.transactionRevenue,0))/1000000 AS Revenue
FROM (TABLE_DATE_RANGE([****.ga_sessions_], TIMESTAMP('2017-03-15'), TIMESTAMP('2017-03-31'))),
GROUP BY
Day,
Country,
geoNetwork.country,
totals.screenviews;
【Comments】:
-
Why are you grouping by screenviews?
-
@ElliottBrossard I think that is where the problem is. I tried leaving it out, but the query keeps forcing me to group by it.
-
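I think the problem is that you have some nested aggregation, namely a SUM inside a SUM. If you fix that logic, the query should work. That said, I'd really recommend using standard SQL for your analysis. You may also be interested in the migration guide.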
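To make the nested-aggregation point concrete, here is a minimal sketch of how the BOUNCES expression from the question could be flattened in legacy SQL. It assumes the intent was to count each session with exactly one screenview as one bounce; that reading is a guess at the asker's logic, not something confirmed in the thread.

```sql
-- Legacy SQL fragment: a possible replacement for the BOUNCES expression
-- in the question's SELECT list.
-- Assumption: a session with exactly one screenview counts as one bounce.
-- The inner SUM(...) is replaced by a plain value, so there is no
-- aggregate nested inside another aggregate.
SUM(CASE
      WHEN totals.screenviews = 1 THEN 1
      ELSE 0 END)
  + SUM(IFNULL(totals.bounces, 0)) AS BOUNCES,
```

With totals.screenviews referenced only inside an aggregate, it should no longer need to appear in GROUP BY. That matters for the user count, because each extra grouping column splits the same visitor across several result rows.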
-
Thanks @elliot, will give it a try. Would subqueries in the SELECT list be the biggest advantage here?
-
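A couple of others are that COUNT(DISTINCT ...) gives exact results (and is generally faster than EXACT_COUNT_DISTINCT), and that the handling of repeated fields, which matters for the ga_sessions_ tables, is much more sensible, though you may find there is a learning curve. The Working with Arrays topic is a good introduction.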
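As an illustration of those points, the following is a rough standard SQL sketch of a trimmed-down version of the question's query. The table path `my-project.my_dataset` is a placeholder (the original dataset name is masked), only a few of the original columns are shown, and the use of REGEXP_CONTAINS as a stand-in for the legacy REGEXP_MATCH is an assumption about the intended matching behaviour.

```sql
#standardSQL
-- Sketch only: `my-project.my_dataset` is a hypothetical placeholder.
SELECT
  date AS Day,
  geoNetwork.country AS Country,
  SUM(totals.visits) AS visits,
  -- Exact distinct count per Day/Country group in standard SQL.
  COUNT(DISTINCT fullVisitorId) AS Users,
  SUM(IFNULL(totals.newVisits, 0)) AS NewUsers,
  -- Hit-level logic lives in a subquery over the repeated hits field,
  -- so session rows are not multiplied by the number of hits.
  SUM((
    SELECT COUNT(*)
    FROM UNNEST(hits) AS h
    WHERE REGEXP_CONTAINS(
      h.eventInfo.eventAction,
      r'register$|registersuccess|new registration|account signup|registro')
  )) AS NewRegistrations
FROM `my-project.my_dataset.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20170315' AND '20170331'
GROUP BY Day, Country;
```

Note that Users here is distinct per day and country; adding the daily values together would still count a returning visitor once for each day they appear.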
Tags: google-analytics google-bigquery