【Posted】: 2016-12-22 04:03:37
【Question】:
The table structure is simple:
CREATE TABLE `trade` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`account` int(11) NOT NULL,
`date` date NOT NULL,
`amount` double DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `all_idx` (`date`,`account`,`amount`) USING BTREE
) ENGINE=InnoDB;
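The table holds roughly 5 million rows.
The requirements are:
- a date range is given
- for each account, find the FIRST MAXIMUM trade amount within that date range
- find the MINIMUM trade amount AFTER that maximum
- compute the DIFFERENCE between the two amounts (it may be 0)
This is how I wrote the SQL: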
-- step 1: find the max amount, took about 0.6s
select account, max(amount) max_amount
from trade
where date between '20160101' and '20161220'
group by account;
-- step 2: find the first date, took about 1s
drop temporary table if exists tmp_max_amount;
create temporary table tmp_max_amount
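-- amount is aliased as max_amount because step 4 reads tmp_max_amount.max_amount
select t1.account, min(t1.date) date, t1.amount max_amount
from trade t1, (
select account, max(amount) max_amount
from trade
where date between '20160101' and '20161220'
group by account
) t2
-- the derived table t2 exposes max_amount, not amount
where t1.account = t2.account and t1.amount = t2.max_amount
group by t1.account, t1.amount;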
-- step 3: find the min amount, took about 50s
drop temporary table if exists tmp_min_amount;
create temporary table tmp_min_amount
select t1.account, min(t1.amount) min_amount
from trade t1, tmp_max_amount t2
where t1.account = t2.account and t1.date >= t2.date
group by t1.account;
-- step 4: calculate the difference, took about 0.8s
select x.account, (max_amount - min_amount) diff
from tmp_max_amount x, tmp_min_amount n
where x.account = n.account;
The SQL in step 3 takes about 50 seconds. Is there any way to speed it up?
Sample data:
id | account | date | amount
------|---------|----------|---------
1 | 1000 | 20151001 | 1000 <- not in range
2 | 3000 | 20151002 | 100 <- not in range
3 | 1000 | 20160105 | 800 <- max of 1000
4 | 2000 | 20160110 | 200 <- max of 2000
5 | 2000 | 20160115 | 100 <- min of 2000
6 | 3000 | 20160201 | 1200
....
10000 | 2000 | 20161210 | 200 <- not the first max
10001 | 3000 | 20161210 | 500
10002 | 3000 | 20161212 | 1500 <- max & min of 3000
10003 | 1000 | 20161213 | 300 <- min of 1000
Expected result:
account | diff
--------|------
1000 | 500 <- (800 - 300)
2000 | 100 <- (200 - 100)
3000 | 0 <- (1500 - 1500)
...
【Comments】:
-
Maybe the temporary tables can be avoided! Could you post some sample data and the expected output?
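A minimal sketch of what avoiding the temporary tables could look like, folding the four steps into nested derived tables. It is untested against the real data, and whether it is any faster depends on the execution plan. The aliases `m`, `f`, `t` and the column name `first_max_date` are illustrative only. Repeating the date bounds at every level is an assumption about the intent, since steps 2 and 3 of the original do not bound the dates.

```sql
-- Sketch only: the same four steps as one statement, no temporary tables.
SELECT f.account,
       f.max_amount - MIN(t.amount) AS diff
FROM (
    -- first date on which each account reached its maximum amount in the range
    SELECT t1.account,
           t1.amount    AS max_amount,
           MIN(t1.date) AS first_max_date
    FROM trade t1
    JOIN (
        -- maximum amount per account in the range (step 1 of the question)
        SELECT account, MAX(amount) AS max_amount
        FROM trade
        WHERE date BETWEEN '20160101' AND '20161220'
        GROUP BY account
    ) m ON m.account = t1.account
       AND m.max_amount = t1.amount
    -- assumption: only rows inside the range may count as the "first maximum"
    WHERE t1.date BETWEEN '20160101' AND '20161220'
    GROUP BY t1.account, t1.amount
) f
JOIN trade t
  ON t.account = f.account
  -- assumption: the minimum is searched from the first-max date to the range end
 AND t.date BETWEEN f.first_max_date AND '20161220'
GROUP BY f.account, f.max_amount;
```

On the sample rows shown in the question this yields 500, 100 and 0 for accounts 1000, 2000 and 3000, matching the expected result.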
-
@e4c5 Thanks for the reply, I have just added the sample data as requested.
-
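Try adding an index on tmp_max_amount, just as you would on any other table. Also, be sure to run EXPLAIN on the third query to check how indexes are used on both the trade table and the temporary table.
Tags: mysql performance group-by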
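As a rough sketch of that suggestion: the statements below index the temporary table and then inspect the step 3 join with EXPLAIN. The index names are made up for illustration. The second index, on `trade`, goes beyond what the comment proposes and is an assumption: the existing `all_idx` leads with `date`, so a lookup by `account` cannot use its leftmost prefix, and an index led by `account` may suit this join better. Neither change has been measured here.

```sql
-- Index the temporary table created in step 2 (index name is hypothetical).
ALTER TABLE tmp_max_amount ADD INDEX account_date_idx (account, date);

-- Assumption, not part of the original schema: an index on trade led by account,
-- so the join can seek by account and then range-scan on date.
ALTER TABLE trade ADD INDEX account_date_amount_idx (account, date, amount);

-- Check which indexes the step 3 join actually uses.
EXPLAIN
SELECT t1.account, MIN(t1.amount) AS min_amount
FROM trade t1
JOIN tmp_max_amount t2
  ON t1.account = t2.account
 AND t1.date >= t2.date
GROUP BY t1.account;
```

In the EXPLAIN output, the `key` and `rows` columns for `t1` are the ones to watch: a `key` of NULL or a row estimate close to the full table size would suggest the join is still scanning `trade`.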