【Posted】: 2019-06-29 16:47:40
【Question】:
I want to run a query like this one, but using BigQuery Standard SQL:
SELECT package, COUNT(*) count
FROM (
SELECT REGEXP_EXTRACT(line, r' ([a-z0-9\._]*)\.') package, id
FROM (
SELECT SPLIT(content, '\n') line, id
FROM [github-groovy-files:github.contents]
WHERE content CONTAINS 'import'
HAVING LEFT(line, 6)='import' )
GROUP BY package, id
)
GROUP BY 1
ORDER BY count DESC
LIMIT 30;
I can't get any further than something like this (it runs, but I can't GROUP or COUNT on the result):
with lines as
(SELECT SPLIT(c.content, '\n') line, c.id as id
FROM `<dataset>.contents` c, `<dataset>.files` f
WHERE c.id = f.id AND f.path LIKE '%.groovy')
select
array(select REGEXP_REPLACE(l, r'import |;', '') AS class from unnest(line) as l where l like 'import %') imports, id
from lines;
LEFT() is not available in Standard SQL, and there doesn't seem to be any function that accepts an array type.
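One direction that might work, as an untested sketch rather than a confirmed solution: flatten the array produced by SPLIT() with UNNEST so that each line becomes its own row, after which ordinary GROUP BY and COUNT apply. STARTS_WITH() stands in for LEFT(line, 6) = 'import'. The table names are the `<dataset>` placeholders from the query above, and the `n` alias is just an illustrative name.

```sql
#standardSQL
WITH lines AS (
  -- One row per import line of each .groovy file (array flattened via UNNEST)
  SELECT line, c.id AS id
  FROM `<dataset>.contents` c
  JOIN `<dataset>.files` f ON c.id = f.id
  CROSS JOIN UNNEST(SPLIT(c.content, '\n')) AS line
  WHERE f.path LIKE '%.groovy'
    AND STARTS_WITH(line, 'import')
)
SELECT package, COUNT(*) AS n
FROM (
  -- Deduplicate so each package is counted at most once per file id
  SELECT REGEXP_EXTRACT(line, r' ([a-z0-9\._]*)\.') AS package, id
  FROM lines
  GROUP BY package, id
)
GROUP BY package
ORDER BY n DESC
LIMIT 30;
```

Lines where the regular expression does not match yield a NULL package, so adding `WHERE package IS NOT NULL` to the outer query may be desirable.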
【Discussion】:
Tags: google-bigquery bigquery-standard-sql