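[Posted]: 2020-08-25 14:57:18
[Question]:
ClickHouse (CH) does not actually support LEFT JOIN on a partial match (substring matching and the like), so I tried to build the query with a SELECT subquery in the expression list, but it doesn't work. Maybe there is an approach that is entirely new to me; I'm only looking for a clue on how to do this.
The error is "Missing columns: 'DomainName' while processing query"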
select NumberInTypes,
       DomainName,
       Url,
       (select aa.group_name
        from (select t1.id, t1.url_part, ugu.name as group_name
              from Url t1
              any left join (select id, urlgroup_id, url_id, ug.name
                             from UrlGroupUrl t2
                             any left join (select id, name
                                            from UrlGroup t3
                             ) ug on t2.urlgroup_id = ug.id
              ) ugu on t1.id = ugu.url_id) aa
        where t1.Url like '%' || aa.url_part || '%'
       ) as UrlGroup,
       KeywordId,
       ResultId,
       HashedContent,
       SearchEngine,
       client_name,
       project_name,
       group_name,
       DateParsed
from PositionNew t1
any left join (
    select id as KeywordId, trimBoth(keyword) as keyword, groupid, group_name, project_name, client_name
    from Keyword
    any left join (
        select keywordgroup_id as groupid, keyword_id as KeywordId, group_name, project_name, client_name
        from KeywordGroupKeyword
        any left join (
            select id as groupid, name as group_name, project_id, project_name, client_name
            from KeywordGroup
            any left join (
                select id as project_id, name as project_name, client_id, client_name
                from Project
                any left join (
                    select id as client_id, name as client_name from Client
                ) client using client_id
            ) project using project_id
        ) kgroup using groupid
    ) keywordgroup using KeywordId
) keyword using KeywordId
where DateParsed between '2020-07-13' and '2020-08-02'
  and PositionType in (1, 3)
  and client_name like '%ClientName%'
ORDER BY ResultId,
         DomainName,
         NumberInType
LIMIT 1 BY ResultId, DomainName;
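Update: Apparently, you can't use columns from the outer query inside a correlated subquery in ClickHouse. So I'm completely out of options and starting to think this is impossible.
A simplified example that reproduces the problem:
The first table contains URLs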
+------------------------------------+
| Url |
+------------------------------------+
| https://example.com/cat/page1.html |
+------------------------------------+
| https://example.com/cat/page2.html |
+------------------------------------+
| https://example2.com/page.html |
+------------------------------------+
The second table contains UrlGroups
+-----------------+-----------+
| UrlPart | GroupName |
+-----------------+-----------+
| example.com/cat | DomainCat |
+-----------------+-----------+
| example2.com | Domain2 |
+-----------------+-----------+
What I want to achieve is:
+------------------------------------+-----------+
| Url | GroupName |
+------------------------------------+-----------+
| https://example.com/cat/page1.html | DomainCat |
+------------------------------------+-----------+
| https://example.com/cat/page2.html | DomainCat |
+------------------------------------+-----------+
| https://example2.com/page.html | Domain2 |
+------------------------------------+-----------+
ALL LEFT JOIN - doesn't work, because it requires an exact match
SUBQUERY - doesn't work, because you can't use columns from the outer query to filter its results
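For anyone who wants to try this locally, here is a minimal sketch of the simplified example. The table names `Urls` and `UrlGroups` and the `Memory` engine are assumptions picked only for this repro and are not my real schema. The last query uses `position` in place of `LIKE` to express the substring match, and it is only meant to illustrate the correlated-subquery attempt that gets rejected.

```sql
-- Hypothetical table names and engine, used only for this reproduction.
CREATE TABLE Urls (Url String) ENGINE = Memory;
CREATE TABLE UrlGroups (UrlPart String, GroupName String) ENGINE = Memory;

-- Sample rows copied from the tables above.
INSERT INTO Urls VALUES
    ('https://example.com/cat/page1.html'),
    ('https://example.com/cat/page2.html'),
    ('https://example2.com/page.html');

INSERT INTO UrlGroups VALUES
    ('example.com/cat', 'DomainCat'),
    ('example2.com', 'Domain2');

-- Correlated-subquery attempt: the inner WHERE refers to u.Url,
-- a column of the outer query, which the subquery cannot see.
SELECT
    u.Url,
    (SELECT g.GroupName
     FROM UrlGroups AS g
     WHERE position(u.Url, g.UrlPart) > 0) AS GroupName
FROM Urls AS u;
```

The desired result is one row per `Url`, carrying the `GroupName` whose `UrlPart` occurs inside that `Url`.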
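[Comments]:
-
It's hard to reproduce your problem. Can you provide the schema of all the tables used, or a simple example that reproduces the issue?
-
@vladimir I have updated the question with a simplified example
Tags: sql clickhouse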