【Posted】:2021-07-27 12:24:25
【Question】:
"Java UDFs return a scalar result. Java UDTFs are not currently supported." (reference)
That said, I created a Java UDF as follows:
CREATE OR replace function MAP_COUNT(colValue String)
returns OBJECT
language java
handler='Frequency.calculate'
target_path='@~/Frequency.jar'
as
$$
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
class Frequency {
    Map<String, Integer> frequencies = new HashMap<>();
    public Map<String, Integer> calculate(String colValue) {
        frequencies.putIfAbsent(colValue, 0);
        frequencies.computeIfPresent(colValue, (key, value) -> value + 1);
        return frequencies;
    }
}
$$;
I use the MAP_COUNT UDF in a query like this:
with temp_1 as
(
SELECT 'John' AS my_col, 27 as age
UNION ALL
SELECT 'John' AS my_col, 28 as age
UNION ALL
SELECT 'doe' AS my_col, 27 as age
UNION ALL
SELECT 'doe' AS my_col, 28 as age
)
select MAP_COUNT(a.my_col) from temp_1 a;
I get the following result:
|MAP_COUNT(A.MY_COL) |
|-------------------------------|
|{ "John": "1" } |
|{ "John": "2" } |
|{ "John": "2", "doe": "1" } |
|{ "John": "2", "doe": "2"} |
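The running counts above are consistent with the same Frequency instance being reused across rows while calculate is called once per row. The following is a minimal plain-Java sketch of that behavior, run outside Snowflake; FrequencyDemo and runRows are hypothetical names introduced only for illustration, and how Snowflake actually instantiates handlers is an assumption here, not something the output alone proves.

```java
import java.util.HashMap;
import java.util.Map;

public class FrequencyDemo {
    // Same logic as the handler in the question: the map is instance state.
    static class Frequency {
        Map<String, Integer> frequencies = new HashMap<>();

        public Map<String, Integer> calculate(String colValue) {
            frequencies.putIfAbsent(colValue, 0);
            frequencies.computeIfPresent(colValue, (key, value) -> value + 1);
            return frequencies;
        }
    }

    // Hypothetical helper: calls calculate once per row on ONE shared instance
    // and returns a copy of the map as it looked after the last call.
    static Map<String, Integer> runRows(String... rows) {
        Frequency handler = new Frequency();
        Map<String, Integer> last = new HashMap<>();
        for (String row : rows) {
            // Copy the map, because calculate returns the same mutable object each time.
            last = new HashMap<>(handler.calculate(row));
            System.out.println(last);
        }
        return last;
    }

    public static void main(String[] args) {
        // One printed line per input row, each showing the counts seen so far.
        runRows("John", "John", "doe", "doe");
    }
}
```

Each call returns the counts accumulated so far, so a scalar function emits one partial map per input row rather than a single final map.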
The result I expect from the UDF is:
|MAP_COUNT(A.MY_COL) |
|-------------------------------|
|{ "John": "2", "doe": "2"} |
Is this possible in Snowflake?
What if I have a query like the following?
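For comparison, the expected single row corresponds to aggregate semantics: the function sees every value in the column and returns one map at the end. Below is a rough plain-Java sketch of that computation; AggregateSketch and countAll are hypothetical names, and this is not a Snowflake API, since a scalar UDF handler receives one value per call rather than a list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class AggregateSketch {
    // Hypothetical helper: takes the whole column at once and returns
    // ONE map of value -> number of occurrences.
    static Map<String, Long> countAll(List<String> column) {
        return column.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Prints a single map covering all four input values.
        System.out.println(countAll(Arrays.asList("John", "John", "doe", "doe")));
    }
}
```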
with temp_1 as
(
SELECT 'John' AS my_col, 27 as age
UNION ALL
SELECT 'John' AS my_col, 28 as age
UNION ALL
SELECT 'doe' AS my_col, 27 as age
UNION ALL
SELECT 'doe' AS my_col, 28 as age
)
select MAP_COUNT(a.my_col) as names, MAP_COUNT(a.age) as ages from temp_1 a;
The result I expect from the UDF is:
|NAMES                          |AGES                           |
|-------------------------------|-------------------------------|
|{ "John": "2", "doe": "2"}     |{ "27": "2", "28": "2"}        |
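There are ways to achieve this by simply restructuring the query, but I would like to know whether it can be done with a MAP_COUNT function used in the select clause, similar to the OBJECT_AGG function.
【Comments】:
- On Snowflake Ideas there is an entry called "Feature request: stored aggregate functions".
Tags: snowflake-cloud-data-platform user-defined-functions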