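【Posted】: 2022-01-19 13:43:52
【Question】:
I have 3 tables and I need to build a result table from them.
Scenario:
- The "incoming_sentences" table contains a stream of sentences flowing into the database.
- The "tagged_sentences" table contains sentences from "incoming_sentences" that have been tagged by an editor. Sometimes, if the editor makes a mistake while tagging, an admin overrides the tag. Data tagged by the admin is final and is considered correct.
- The "accounts" table contains account-level information for users.
Below are the tables with sample data.
incoming_sentences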
| id | sentence | market | model_identified_intent | tagged_at |
|---|---|---|---|---|
| 1 | abcd | en_in | alphabets | 12/12/2021 |
| 2 | 1234 | en_in | numeric | 11/13/2021 |
| 3 | a1b2 | en_in | alphaNumeric | 10/14/2021 |
| 4 | efgh | en_in | alphabets | 10/15/2021 |
| 5 | e5f6 | en_in | alphaNumeric | 11/16/2021 |
tagged_sentences
| id | tagger_id | sentence_id | tagger_tagged_intent |
|---|---|---|---|
| 1 | 32 | 1 | alphabets |
| 2 | 32 | 2 | alphabets |
| 3 | 32 | 3 | Numeric |
| 4 | 33 | 2 | Numeric |
| 5 | 33 | 3 | alphaNumeric |
accounts
| id | user_role | name | |
|---|---|---|---|
| 32 | editor | editor@editor.com | editor123 |
| 33 | admin | admin@admin.com | admin456 |
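Expected output:
I want the result to show the total number of sentences tagged per month in one column and the total number of corrections made by the admin per month in another. The error rate can then be derived from these two numbers.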
| year-month | total_tagged | Total Error (Corrected by admin) |
|---|---|---|
| 2021-10 | 2 | 1 |
| 2021-11 | 2 | 1 |
| 2021-12 | 1 | 0 |
Please help me solve this. I tried the code below, but it does not work as expected.
WITH cte1 AS (SELECT tggs.id id,
tggs.sentence AS sentence,
tggs.market AS market,
tggs.prod_identified_intent AS prod_identified_intent,
tggs.tagged_at AS tagged_at,
ROW_NUMBER() OVER (PARTITION BY tagged_at) AS rn
FROM tagging_sentences tggs),
cte2 AS (SELECT tgds.sentence_id_id AS sentence_id,
tgds.tagger_id_id AS tagger_id,
tgds.tagged_intent AS tagged_intent
FROM tagged_sentences tgds),
cte3 AS (SELECT acts.id AS account_id, acts.email AS email, acts.role AS role FROM accounts AS acts),
cte4 AS (SELECT tggs.tagged_at, COUNT(*) AS count, ROW_NUMBER() OVER (PARTITION BY count(*)) AS rn
FROM tagging_sentences AS tggs
JOIN tagged_sentences AS tgds ON tggs.id = tgds.sentence_id_id
JOIN accounts acts ON tgds.tagger_id_id = acts.id
WHERE tgds.tagger_id_id = 33
AND tgds.sentence_id_id IN (SELECT tagging_sentences.id
FROM tagging_sentences,
tagged_sentences
WHERE tagged_sentences.tagger_id_id = 32) GROUP BY tagged_at)
SELECT TO_CHAR(cte1.tagged_at, 'YYYY-MM'),
COUNT(cte1.sentence), cte4.count
FROM cte1
JOIN cte2 ON cte1.id = cte2.sentence_id
JOIN cte3 ON cte2.tagger_id = cte3.account_id
JOIN cte4 ON cte1.rn = cte4.rn
GROUP BY TO_CHAR(cte1.tagged_at, 'YYYY-MM'), TO_CHAR(cte4.tagged_at, 'YYYY-MM'), cte4.count;
【Comments】:
-
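Please review your data and expected output. The data you provided cannot produce the expected output. Your output shows values for 2021-10, 2021-11, and 2021-12, but your input only contains data for 2021-12. You need to revise the expected output, revise or add input data, or perhaps a data column is missing.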
-
@Belayer, my bad! I have edited the dataset to match the expected output. Please take a look and help!
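One possible approach, offered as a sketch rather than a definitive answer, is to aggregate once per month with conditional counting instead of joining several CTEs on `ROW_NUMBER()`. The example uses the table and column names from the sample tables (`incoming_sentences`, `sentence_id`, `tagger_id`, `user_role`), not the ones in the attempted query (`tagging_sentences`, `sentence_id_id`, `role`), so the names may need adjusting. It rests on three assumptions that the question does not confirm: `tagged_at` is a `date` column, every incoming sentence counts toward `total_tagged` in the month of its `tagged_at`, and a sentence counts as an error when it has at least one tag row written by an account whose `user_role` is `admin`.

```sql
-- Sample data copied from the tables in the question.
-- Assumption: tagged_at is stored as a DATE (the question shows MM/DD/YYYY text).
CREATE TEMP TABLE incoming_sentences (
    id                      int PRIMARY KEY,
    sentence                text,
    market                  text,
    model_identified_intent text,
    tagged_at               date
);

CREATE TEMP TABLE accounts (
    id        int PRIMARY KEY,
    user_role text  -- remaining account columns omitted; not needed for this query
);

CREATE TEMP TABLE tagged_sentences (
    id                   int PRIMARY KEY,
    tagger_id            int,
    sentence_id          int,
    tagger_tagged_intent text
);

INSERT INTO incoming_sentences VALUES
    (1, 'abcd', 'en_in', 'alphabets',    DATE '2021-12-12'),
    (2, '1234', 'en_in', 'numeric',      DATE '2021-11-13'),
    (3, 'a1b2', 'en_in', 'alphaNumeric', DATE '2021-10-14'),
    (4, 'efgh', 'en_in', 'alphabets',    DATE '2021-10-15'),
    (5, 'e5f6', 'en_in', 'alphaNumeric', DATE '2021-11-16');

INSERT INTO accounts VALUES
    (32, 'editor'),
    (33, 'admin');

INSERT INTO tagged_sentences VALUES
    (1, 32, 1, 'alphabets'),
    (2, 32, 2, 'alphabets'),
    (3, 32, 3, 'Numeric'),
    (4, 33, 2, 'Numeric'),
    (5, 33, 3, 'alphaNumeric');

-- One row per month.
-- LEFT JOINs keep sentences that have no tag rows (ids 4 and 5 in the sample).
-- DISTINCT prevents a sentence tagged by both editor and admin from counting twice.
SELECT to_char(s.tagged_at, 'YYYY-MM')                         AS year_month,
       count(DISTINCT s.id)                                    AS total_tagged,
       count(DISTINCT s.id) FILTER (WHERE a.user_role = 'admin') AS total_error
FROM incoming_sentences AS s
LEFT JOIN tagged_sentences AS t ON t.sentence_id = s.id
LEFT JOIN accounts         AS a ON a.id = t.tagger_id
GROUP BY 1
ORDER BY 1;
```

With the sample data this yields 2021-10 (2, 1), 2021-11 (2, 1), and 2021-12 (1, 0), which matches the expected output in the question. Note that the sample data fits a second reading equally well, in which `total_tagged` is the number of rows in `tagged_sentences` per month; if that is the intended meaning, `count(t.id)` would replace the first `count(DISTINCT s.id)`. If an admin row should only count as a correction when its intent differs from the editor's, the `FILTER` condition would need an extra comparison against the editor's tag.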
Tags: sql postgresql