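【Posted】: 2020-05-29 12:57:58
【Question】:
I wrote a query to find the median of a value for each month. That was hard enough, because MySQL has no built-in median function, so I really had to think outside the box with my intermediate SQL skills. The problem now is that the query takes a long time to run (sometimes 1 or 2 minutes). Is there a way to optimize this query? Or should I write a Python script that computes the medians and pushes them to the database through a connector?
Here is the query: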
SET @row_num_pos := 0;
SET @median_group_pos := '';
SET @row_num_neg := 0;
SET @median_group_neg := '';

SELECT
    p.month_num AS 'month_num',
    CASE
        WHEN p.month_num = 1 THEN 'Jan'
        WHEN p.month_num = 2 THEN 'Feb'
        WHEN p.month_num = 3 THEN 'Mar'
        WHEN p.month_num = 4 THEN 'Apr'
        WHEN p.month_num = 5 THEN 'May'
        WHEN p.month_num = 6 THEN 'Jun'
        WHEN p.month_num = 7 THEN 'Jul'
        WHEN p.month_num = 8 THEN 'Aug'
        WHEN p.month_num = 9 THEN 'Sep'
        WHEN p.month_num = 10 THEN 'Oct'
        WHEN p.month_num = 11 THEN 'Nov'
        WHEN p.month_num = 12 THEN 'Dec'
    END AS 'Timeline',
    p.ck_pos_median AS 'CK+ Median',
    n.ck_neg_median AS 'CK- Median'
FROM
    (SELECT
        s.median_month_pos AS 'month_num',
        ROUND(AVG(ck_pos), 1) AS 'ck_pos_median'
    FROM
        (SELECT
            @row_num_pos:=CASE
                WHEN @median_group_pos = q.month_num THEN @row_num_pos + 1
                ELSE 1
            END AS 'count_of_group',
            @median_group_pos:=q.month_num AS 'median_month_pos',
            q.month_num,
            q.ck_pos,
            (SELECT
                COUNT(*)
            FROM
                Biocept_DB.result_management_report
            WHERE
                ck_pos IS NOT NULL
                AND MONTH(order_date) = q.month_num) AS total_month
        FROM
            (SELECT
                MONTH(order_date) AS 'month_num', ck_pos
            FROM
                Biocept_DB.result_management_report
            WHERE
                ck_pos IS NOT NULL
            ORDER BY MONTH(order_date), ck_pos ASC) AS q) AS s
    WHERE
        s.count_of_group BETWEEN (s.total_month / 2.0) AND (s.total_month / 2.0 + 1)
    GROUP BY s.median_month_pos) AS p
        JOIN
    (SELECT
        s.median_month_neg AS 'month_num',
        ROUND(AVG(ck_neg), 1) AS 'ck_neg_median'
    FROM
        (SELECT
            @row_num_neg:=CASE
                WHEN @median_group_neg = q.month_num THEN @row_num_neg + 1
                ELSE 1
            END AS 'count_of_group',
            @median_group_neg:=q.month_num AS 'median_month_neg',
            q.month_num,
            q.ck_neg,
            (SELECT
                COUNT(*)
            FROM
                Biocept_DB.result_management_report
            WHERE
                ck_neg IS NOT NULL
                AND MONTH(order_date) = q.month_num) AS total_month
        FROM
            (SELECT
                MONTH(order_date) AS 'month_num', ck_neg
            FROM
                Biocept_DB.result_management_report
            WHERE
                ck_neg IS NOT NULL
            ORDER BY MONTH(order_date), ck_neg ASC) AS q) AS s
    WHERE
        s.count_of_group BETWEEN (s.total_month / 2.0) AND (s.total_month / 2.0 + 1)
    GROUP BY s.median_month_neg) AS n ON p.month_num = n.month_num
ORDER BY p.month_num;

SET @row_num_pos := NULL;
SET @median_group_pos := NULL;
SET @row_num_neg := NULL;
SET @median_group_neg := NULL;
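For comparison, the Python route mentioned in the question could look roughly like the sketch below. It is only an illustration under stated assumptions, not a tested replacement for the query: it assumes the rows have already been fetched as (order_date, ck_pos, ck_neg) tuples with None standing in for NULL, and the function name monthly_medians and the sample rows are hypothetical. Fetching the rows and writing the results back through a connector are left out.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def monthly_medians(rows):
    """Return {month_num: (ck_pos_median, ck_neg_median)}, rounded to 1 decimal.

    `rows` is an iterable of (order_date, ck_pos, ck_neg) tuples (an assumed
    shape); None values are skipped, mirroring the IS NOT NULL filters.
    """
    pos_by_month = defaultdict(list)
    neg_by_month = defaultdict(list)
    for order_date, ck_pos, ck_neg in rows:
        if ck_pos is not None:
            pos_by_month[order_date.month].append(ck_pos)
        if ck_neg is not None:
            neg_by_month[order_date.month].append(ck_neg)

    # Like the JOIN in the query, keep only months that have both medians.
    months = sorted(pos_by_month.keys() & neg_by_month.keys())
    return {
        m: (round(median(pos_by_month[m]), 1), round(median(neg_by_month[m]), 1))
        for m in months
    }


# Hypothetical sample rows, for illustration only.
sample_rows = [
    (date(2020, 1, 5), 1, 10),
    (date(2020, 1, 9), 3, None),
    (date(2020, 1, 20), 8, 30),
    (date(2020, 2, 2), 4, 7),
    (date(2020, 2, 14), 6, 9),
    (date(2020, 3, 1), 5, None),
]
medians = monthly_medians(sample_rows)
```

Like the query, this groups by month number only, so the same month from different years lands in one group. Whether it would actually be faster than the SQL depends on how many rows have to be transferred, which has not been measured here. Python's round may also treat exact ties differently from MySQL's ROUND, so results could differ in the last digit.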
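【Comments】:
-
MariaDB has a MEDIAN function; see mariadb.com/kb/en/median. I'm not sure whether the MySQL implementation has it too.
-
There is no MEDIAN function in MySQL. That is why I have been considering making the switch.
-
Could you provide some raw data and the table definition (CREATE TABLE) so I can test, for example on pastebin?
-
Could I get the data from you? I'm almost certain I can optimize it.
-
Sure, I'll export it and send it to you.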