【Posted】: 2010-04-22 12:54:26
【Question】:
I have a very large table that stores the words contained in email messages:
mysql> explain t_message_words;
+----------------+---------+------+-----+---------+----------------+
| Field          | Type    | Null | Key | Default | Extra          |
+----------------+---------+------+-----+---------+----------------+
| mwr_key        | int(11) | NO   | PRI | NULL    | auto_increment |
| mwr_message_id | int(11) | NO   | MUL | NULL    |                |
| mwr_word_id    | int(11) | NO   | MUL | NULL    |                |
| mwr_count      | int(11) | NO   |     | 0       |                |
+----------------+---------+------+-----+---------+----------------+
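The table contains about 100M rows.
mwr_message_id is an FK to the messages table.
mwr_word_id is an FK to the words table.
mwr_count is the number of occurrences of word mwr_word_id in message mwr_message_id.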
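For reference, the table definition would look roughly like the sketch below, reconstructed from the DESCRIBE output above. This is an approximation rather than the actual DDL: the index name IDX_t_message_words1 is hypothetical, the assumption that IDX_t_message_words2 covers mwr_word_id is inferred only from the MUL key flag, and the foreign key constraints are left out because the names of the referenced tables are not given here.

```sql
-- Approximate definition of t_message_words, rebuilt from DESCRIBE.
-- Assumptions are marked; the real DDL may differ.
CREATE TABLE t_message_words (
    mwr_key        INT(11) NOT NULL AUTO_INCREMENT,  -- surrogate primary key
    mwr_message_id INT(11) NOT NULL,                 -- FK to the messages table
    mwr_word_id    INT(11) NOT NULL,                 -- FK to the words table
    mwr_count      INT(11) NOT NULL DEFAULT 0,       -- occurrences of the word in the message
    PRIMARY KEY (mwr_key),
    KEY IDX_t_message_words1 (mwr_message_id),       -- hypothetical index name
    KEY IDX_t_message_words2 (mwr_word_id)           -- assumed to be the index on mwr_word_id
);
```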
To compute the most frequently used words, I use the following query:
SELECT SUM(mwr_count) AS word_count, mwr_word_id
FROM t_message_words
GROUP BY mwr_word_id
ORDER BY word_count DESC
LIMIT 100;
It runs almost forever (more than half an hour on the test server):
mysql> show processlist;
+----+------+----------------+--------+---------+------+----------------------+-----------------------------------------------------
| Id | User | Host           | db     | Command | Time | State                | Info
+----+------+----------------+--------+---------+------+----------------------+-----------------------------------------------------
| 41 | root | localhost:3148 | tst_db | Query   | 1955 | Copying to tmp table | SELECT SUM(mwr_count) AS word_count, mwr_word_id
FROM t_message_words
GROUP BY mwr_word_id |
+----+------+----------------+--------+---------+------+----------------------+-----------------------------------------------------
3 rows in set (0.00 sec)
What can I do to speed up the query (other than adding more memory, more CPU, or faster disks)?
Thanks in advance,
Stefano
P.S. EXPLAIN output:
mysql> EXPLAIN SELECT SUM(mwr_count) AS word_count, mwr_word_id
-> FROM t_message_words
-> GROUP BY mwr_word_id
-> ORDER BY word_count DESC
-> LIMIT 100;
+----+-------------+-----------------+-------+---------------+----------------------+---------+------+----------+---------------------------------+
| id | select_type | table           | type  | possible_keys | key                  | key_len | ref  | rows     | Extra                           |
+----+-------------+-----------------+-------+---------------+----------------------+---------+------+----------+---------------------------------+
|  1 | SIMPLE      | t_message_words | index | NULL          | IDX_t_message_words2 | 4       | NULL | 94823285 | Using temporary; Using filesort |
+----+-------------+-----------------+-------+---------------+----------------------+---------+------+----------+---------------------------------+
1 row in set (0.01 sec)
【Comments】:
-
This has nothing to do with bigtable.
Tags: mysql performance group-by bigtable