Using a datetime index in a where clause in MySQL

【Question title】: Using datetime index in where clause MySQL
【Posted】: 2020-09-16 11:32:56
【Question】:

I have a table with 200 million rows, with an index on the `created_at` column, which has the datetime data type.

Output of show create table [tablename]:

 create table `table`
 (`created_at` datetime NOT NULL)
 PRIMARY KEY (`id`)
 KEY `created_at_index` (`created_at`)
 ENGINE=InnoDB AUTO_INCREMENT=208512112 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci'

created_at ranges from 2020-04-01 to 2020-05-28.

I only want to get the rows after 2020-05-15 23:00:00.

When I run:

EXPLAIN SELECT created_at
          FROM table
         where created_at >= '2020-05-15 23:00:00';

it outputs:

rows       Extra
200mil   Using Where

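My understanding is that in an RDBMS the rows are unsorted when there is no index, but an index created on a column is kept in sorted order, so after locating '2020-05-15 23:00:00' it could simply return all rows after that point.

Also, since the index cardinality is 7 million, I thought using the index would be better than a full table scan.

Is it because I entered the date as a string? But when I tried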

 where created_at >= date('2020-05-15 23:00:00');

the result is still the same.

And

 where created_at >= datetime('2020-05-15 23:00:00');

gives a syntax error.

Did MySQL simply decide that a full table scan would be more efficient?

EDIT:

Using the equality operator

EXPLAIN SELECT created_at
          FROM table
         where created_at = '2020-05-15';

outputs:

key_len    ref     rows     Extra
  5        const    51

If I change the string in the where clause to date('2020-05-15'), it outputs:

key_len    ref     rows     Extra
  5        const    51      Using index condition

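Does this mean the first equality query did not use the index?

【Comments】:

Please share the output of show create table for your table.
@GMB What do you mean? How I created my table?
Run show create table [tablename]

Tags: mysql sql date query-optimization where-clause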

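【Solution 1】:

All of your queries would take advantage of an index on column created_at. MySQL always uses an index when it matches the predicate(s) of the where clause.

The output of your explains does indicate that you do not have this index, and the output of your create table confirms it.

Just create the index and your database will use it.

Here is a demo: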

-- sample table, without the index
create table mytable(id int, created_at datetime);

--  the query does a full scan, as no index is available
explain select created_at from mytable where created_at >= '2020-05-15 23:00:00';

id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra
-: | :---------- | :------ | :--------- | :--- | :------------ | :--- | :------ | :--- | ---: | -------: | :----------
 1 | SIMPLE      | mytable | null       | ALL  | null          | null | null    | null |    1 |   100.00 | Using where

-- now add the index
create index idx_mytable_created_at on mytable(created_at);

-- the query uses the index
explain select created_at from mytable where created_at >= '2020-05-15 23:00:00';

id | select_type | table   | partitions | type  | possible_keys          | key                    | key_len | ref  | rows | filtered | Extra
-: | :---------- | :------ | :--------- | :---- | :--------------------- | :--------------------- | :------ | :--- | ---: | -------: | :-----------------------
 1 | SIMPLE      | mytable | null       | index | idx_mytable_created_at | idx_mytable_created_at | 6       | null |    1 |   100.00 | Using where; Using index
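
Before creating an index on the real table, it may be worth checking which indexes already exist. The following is a minimal sketch, assuming the table and index names shown in the question (`table` and created_at_index); adjust them to the actual schema.

```sql
-- List every index defined on the table from the question.
-- `table` is a reserved word in MySQL, so it must be quoted with backticks.
SHOW INDEX FROM `table`;

-- Refresh the index statistics the optimizer relies on,
-- then inspect the Cardinality column for the created_at index.
ANALYZE TABLE `table`;
SHOW INDEX FROM `table` WHERE Key_name = 'created_at_index';
```

If created_at_index appears in the output, the index exists, and the choice between the index and a full scan is up to the optimizer.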

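【Comments】:

Thanks for the answer. I had left out the last part of the show create table output. The table does appear to have an index; see the edited version above.
I think MySQL uses a full table scan when it determines that this will be more efficient than using the index.

【Solution 2】:

If the values are evenly distributed, about 25% of the rows are >= '2020-05-15 23:00:00'. Yes, MySQL will prefer a full table scan over using the index when you need such a large percentage of the table.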

See "Why does MySQL not always use index for select query?"
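
The 25% figure is an estimate based on an even distribution. One way to check it, and to see what the plan looks like when the index is forced, is sketched below. It assumes the table and index names from the question (`table`, created_at_index). Note that the first statement has to read every row (or every index entry), so it is not cheap on 200 million rows.

```sql
-- Measure the fraction of rows the predicate really selects.
-- In MySQL a comparison evaluates to 1 or 0, so SUM counts the matching rows.
SELECT SUM(created_at >= '2020-05-15 23:00:00') / COUNT(*) AS fraction_selected
FROM `table`;

-- Plan chosen by the optimizer on its own.
EXPLAIN SELECT created_at
FROM `table`
WHERE created_at >= '2020-05-15 23:00:00';

-- Same query, with the optimizer told to use the index if it is usable at all.
EXPLAIN SELECT created_at
FROM `table` FORCE INDEX (created_at_index)
WHERE created_at >= '2020-05-15 23:00:00';
```

Comparing the rows and Extra columns of the two EXPLAIN outputs, and timing both variants, shows whether the optimizer's preference for a scan is justified for this data.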

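In a DATE context, date('2020-05-15 23:00:00') is the same as '2020-05-15'.

In a DATETIME context, the string '2020-05-15 23:00:00' is already treated as a datetime value. MySQL has no datetime() function, which is why that attempt raised a syntax error; CAST('2020-05-15 23:00:00' AS DATETIME) is the explicit equivalent.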
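
The conversions can be checked directly in a MySQL session. This is a small sketch using only built-in functions; the last query assumes the table name from the question.

```sql
-- DATE() drops the time part.
SELECT DATE('2020-05-15 23:00:00');              -- 2020-05-15

-- An explicit conversion to DATETIME keeps the time part.
SELECT CAST('2020-05-15 23:00:00' AS DATETIME);  -- 2020-05-15 23:00:00

-- Consequence for the question: compared with a DATETIME column, a DATE value
-- is treated as midnight, so this predicate also matches rows from
-- 2020-05-15 00:00:00 up to 22:59:59.
SELECT COUNT(*)
FROM `table`
WHERE created_at >= DATE('2020-05-15 23:00:00');
```

So wrapping the literal in date() does not help with index usage and also changes which rows match.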

Using index means the INDEX is "covering", which means the entire query can be performed entirely within the index's BTree, without touching the data's BTree.
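
A covering and a non-covering query can be compared on the demo table from Solution 1. This is a sketch that assumes mytable and idx_mytable_created_at exist exactly as created there; the Extra values actually reported depend on the MySQL version and the data.

```sql
-- Only the indexed column is read, so the index alone can answer the query.
EXPLAIN SELECT created_at
FROM mytable
WHERE created_at >= '2020-05-15 23:00:00';

-- id is not part of idx_mytable_created_at (and mytable has no primary key),
-- so the row data must be read as well and the index is no longer covering.
EXPLAIN SELECT id, created_at
FROM mytable
WHERE created_at >= '2020-05-15 23:00:00';
```

With InnoDB, a secondary index also stores the primary key columns, so on a table where id is the primary key the second query could still be answered from the index alone.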

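Using index condition means something entirely different: it relates to a minor optimization involving the two layers in MySQL's design (the "Handler" and the "Engine"). (More details can be found under "ICP", also known as "Index Condition Pushdown".)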
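
To see whether ICP is responsible for a Using index condition entry, the optimization can be switched off for the current session and the plan compared. This is a sketch on the demo table from Solution 1, assuming MySQL 5.6 or later, where the index_condition_pushdown flag of optimizer_switch exists and is on by default.

```sql
-- Show the current optimizer flags, including index_condition_pushdown.
SELECT @@optimizer_switch;

-- Plan with ICP disabled for this session.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM mytable WHERE created_at = DATE('2020-05-15');

-- Plan with ICP enabled again.
SET optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT * FROM mytable WHERE created_at = DATE('2020-05-15');
```

SELECT * is used here because ICP only matters when full rows have to be read; whether the Extra column actually differs between the two plans depends on the query and the server version.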

【Comments】: