Using index on inner join table in MySQL: answers

【Question title】: Using index on inner join table in MySQL
【Posted】: 2013-06-09 14:58:57
【Question】:

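I have a table Foo with 200 million records and a table Bar with 1000 records; they are joined many-to-one. The columns Foo.someTime and Bar.someField are both indexed. In Bar, 900 records have someField = 1 and 100 records have someField = 2.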

(1) This query executes instantly:

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime     between '2008-08-14' and '2018-08-14' and b.someField = 1 limit 20;
...
20 rows in set (0.00 sec)

(2) This one takes forever (the only change is b.someField = 2):

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime     between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;

(3) But if I drop the where clause on someTime, it also executes instantly:

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)

(4) I can also speed it up by forcing the index:

mysql> select * from Foo f inner join Bar b force index(someField) on f.table_id = b.table_id where f.someTime     between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)

Here is the EXPLAIN output for query (2) (the one that takes forever):

+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type   | possible_keys                 | key       | key_len | ref                      | rows     | Extra       |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
|  1 | SIMPLE      | g     | range  | bar_id,bar_id_2,someTime      | someTime  | 4       | NULL                     | 95022220 | Using where |
|  1 | SIMPLE      | t     | eq_ref | PRIMARY,someField,bar_id      | PRIMARY   | 4       | db.f.bar_id              |        1 | Using where |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+

Here is the EXPLAIN output for query (4) (with force index):

+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type | possible_keys                 | key       | key_len | ref                      | rows     | Extra       |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
|  1 | SIMPLE      | t     | ref  | someField                     | someField | 1       |   const                  |       92 |             |
|  1 | SIMPLE      | g     | ref  | bar_id,bar_id_2,someTime      | bar_id    | 4       | db.f.foo_id              | 10558024 | Using where |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+

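So the question is: how do I teach MySQL to use the right index? The query is generated by an ORM and is not limited to these two fields. I would also prefer to avoid changing the query too much (although I am not sure an inner join is appropriate here).

UPDATE: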

mysql> create index index_name on Foo (bar_id, someTime);

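After that, query (2) executes in 0.00 sec.

【Comments】:

Never do SELECT * if your SELECT has any joins. Instead, specify which star you mean. For example, SELECT f.* FROM foo f JOIN bar b ... would be fine. Otherwise it is unclear which fields your * fetches, and it makes the query slower.
I used SELECT * only for the example; in the real DB the ORM generates queries without *.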

Tags: mysql indexing inner-join large-data

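【Solution 1】:

If you create a composite index on foo(table_id, sometime), it should help a lot. This is because the server will be able to narrow the result set first by table_id and then by sometime.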
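
A minimal sketch of this suggestion, assuming the anonymized names used in the question's queries (Foo, table_id, someTime). The index name idx_foo_table_time is made up for illustration, and whether the optimizer actually picks the new index depends on the table's statistics, so the plan should be checked rather than assumed:

```sql
-- Hypothetical index name; columns follow the question's anonymized queries.
CREATE INDEX idx_foo_table_time ON Foo (table_id, someTime);

-- Re-check the plan for the slow query (2) and inspect the "key" column
-- of the output to see which index is chosen for Foo.
EXPLAIN
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
LIMIT 20;
```

With such an index, each Bar row matching someField = 2 can be joined to Foo through a lookup on table_id that is already restricted to the requested someTime range.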

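Note that when you use LIMIT, the server does not guarantee which rows will be returned if many rows satisfy your WHERE constraints. Technically, each execution may give you slightly different results. If you want to avoid that ambiguity, you should always use ORDER BY together with LIMIT. However, this also means you should be more careful when creating appropriate indexes.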
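
For illustration only, a deterministic variant of query (2) might look like the following. The ordering columns are an assumption: f.id stands for a hypothetical unique key on Foo, which the question does not show, and it is used here purely as a tie-breaker.

```sql
SELECT f.*
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
ORDER BY f.someTime, f.id  -- f.id: assumed unique column, makes the order total
LIMIT 20;
```

Adding the sort can change which index is cheapest, so the EXPLAIN output is worth re-checking after such a change.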

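【Discussion】:

Currently I have 6 columns in Foo and 3 columns in Bar that can appear in the where clause in any possible combination. Should I drop the current indexes foo(field1), foo(field2), etc. and replace them with foo(bar_id, field1), etc.?
A composite index (a,b) works for searching on a alone and on (a,b) (when both a and b are known), but not for searching on b alone: that requires an index on (b). If (a,b) already exists, you do not need an index on (b,a). Also, you should put the columns of (a,b) in a specific order: the most selective column first.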
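
The leftmost-prefix behaviour described in this comment can be tried on a small throwaway table. Everything below (table t, columns a, b, c, index idx_a_b) is hypothetical and only meant as a sketch; the actual plans depend on the MySQL version and the data.

```sql
-- Hypothetical table with a composite index on (a, b).
CREATE TABLE t (
  id INT PRIMARY KEY,
  a  INT NOT NULL,
  b  INT NOT NULL,
  c  INT NOT NULL,
  INDEX idx_a_b (a, b)
);

-- Can use idx_a_b: a is the leftmost column of the index.
EXPLAIN SELECT * FROM t WHERE a = 1;

-- Can use idx_a_b: both indexed columns are constrained.
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;

-- Cannot seek into idx_a_b: b alone is not a leftmost prefix,
-- so this typically falls back to scanning the table.
EXPLAIN SELECT * FROM t WHERE b = 2;
```

Comparing the key column across the three EXPLAIN outputs shows where a separate index on (b) would be needed.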