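【Posted】:2020-10-29 06:04:06
【Question】:
I am trying to run a query in BigQuery using the Cloud Dataflow engine. The query uses a custom function written in JavaScript (Levenshtein Distance). I run into the same problem with other functions such as ST_GeogPoint or ARRAY_AGG.
I get the error Function not found: ST_GeogPoint. If I remove the column that uses that function, I get the same error for LevenshteinDistance, then for ARRAY_AGG, and so on.
The query looks like this: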
WITH
directory AS(
SELECT
TRIM(dir) AS street,
lat,
lon
FROM
bigquery.table.`project-id`.`dataset-name`.`table-name_1`),
cruza AS (
SELECT
name,
TRIM(p.dir) AS dir,
TRIM(directory.dir) AS street,
directory.lat AS lat,
directory.lon AS lon,
ST_GeogPoint(lat,lon) AS latlon,
CAST(FLOOR(DATE_DIFF(CURRENT_DATE(),birth_day,DAY)/362.25) AS int64) AS age,
dataset-name.LevenshteinDistance(TRIM(dir),TRIM(directory.dir)) AS lv_score
FROM
bigquery.table.`project-id`.`dataset-name`.`table-name_2` AS p,
directory
WHERE
p.com = 'my_com' and name is not null)
SELECT
AS value ARRAY_AGG(c ORDER BY lv_score LIMIT 1)[OFFSET(0)] AS col
FROM
cruza c
WHERE
lv_score <= 10
GROUP BY
dir
ORDER BY
col.lv_score
How can I use these functions?
【Comments】:
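- This question has been closed, but it is not a duplicate, and the answers on the linked question are incomplete. The simple answer is that Dataflow SQL does not support UDFs.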
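If the UDF cannot run inside Dataflow SQL, one possible workaround is to compute the distance outside SQL, for example in a step of an Apache Beam pipeline. The following is only a minimal sketch of the distance function in plain Python. The name `levenshtein_distance` is hypothetical, it is not the asker's JavaScript UDF, and wiring it into a pipeline is left out.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Hypothetical street names, trimmed as in the query above.
    score = levenshtein_distance("Main Street".strip(), "Main Stret".strip())
    print(score <= 10)  # same threshold as lv_score <= 10 in the query
```

The function keeps only two rows of the edit-distance table, so memory grows with the shorter string, not with the product of both lengths.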
Tags: google-bigquery google-cloud-dataflow user-defined-functions