【Question title】: MySQL OR vs IN performance
【Posted】: 2025-12-18 08:05:02
【Question】:

I am wondering whether there is any performance difference between the following:

SELECT ... FROM ... WHERE someFIELD IN(1,2,3,4)

SELECT ... FROM ... WHERE someFIELD between  0 AND 5

SELECT ... FROM ... WHERE someFIELD = 1 OR someFIELD = 2 OR someFIELD = 3 ... 

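Or will MySQL optimize the SQL the same way compilers optimize code?

Edit: changed the ANDs to ORs, for the reason stated in the comments.

【Comments】:

  • I am also researching this, but contrary to some statements that IN is converted into a row of ORs, I would say it can also be converted into UNIONs, which are recommended as a replacement for ORs when optimizing a query.
  • There have been a few optimization changes in this area, so some of the answers below may be out of date.
  • In particular: the number of items may matter. How "clumped" the numbers are may matter (BETWEEN 1 AND 4 matches exactly, and may be faster). The version of MySQL/MariaDB may matter.

Tags: mysql sql performance optimization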


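【Answer 1】:

As others have explained, IN is a better choice than OR with respect to query performance.

Queries with OR conditions may take longer to execute in the following cases:

  1. If the MySQL optimizer picks some other index that it believes is more efficient (a false-positive case).
  2. If the number of records is larger (as Jacob clearly stated).

【Comments】: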

    【Answer 2】:

    2018: IN (...) is faster. But >= && <= is even faster than IN.

    Here is my benchmark.

    【Comments】:

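      【Answer 3】:

      The accepted answer doesn't explain the reason.

      The following is quoted from High Performance MySQL, 3rd Edition.

      In many database servers, IN() is just a synonym for multiple OR clauses, because the two are logically equivalent. Not so in MySQL, which sorts the values in the IN() list and uses a fast binary search to see whether a value is in the list. This is O(log n) in the size of the list, whereas an equivalent series of OR clauses is O(n) in the size of the list (i.e., much slower for large lists).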
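
The difference the quote describes can be sketched outside the database. The following is only an illustration in Python of the two lookup strategies (a linear chain of comparisons versus a binary search over a pre-sorted constant list); the function names are hypothetical and this is not MySQL's actual implementation.

```python
from bisect import bisect_left


def matches_or_chain(value, constants):
    """Linear scan, like `col = c1 OR col = c2 OR ...`: O(n) comparisons."""
    for c in constants:
        if value == c:
            return True
    return False


def make_in_list(constants):
    """Sort the constants once, as the quote says is done for IN()."""
    return sorted(constants)


def matches_in_list(value, sorted_constants):
    """Binary search over the pre-sorted list: O(log n) comparisons."""
    i = bisect_left(sorted_constants, value)
    return i < len(sorted_constants) and sorted_constants[i] == value


constants = [296172, 295093, 293626, 295573, 297148, 296127, 295588, 295810]
in_list = make_in_list(constants)

# Both strategies must agree on membership; only the cost per lookup differs.
print(matches_in_list(295588, in_list), matches_or_chain(295588, constants))
print(matches_in_list(1, in_list), matches_or_chain(1, constants))
```

Note that this cost applies to evaluating the predicate against a row. When an index on the column is used, the number of index lookups matters as well, so the asymptotic difference is only part of the picture.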

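      【Comments】:

      • A fantastic reference to the database-specific reason. Nice!
      【Answer 4】:

      Just when you thought it was safe...

      What is your value of eq_range_index_dive_limit? In particular, do you have more or fewer items than that in the IN clause?

      This will not include a benchmark, but it will peer into the inner workings a little. Let's use a tool to see what is going on: Optimizer Trace.

      The query: SELECT * FROM canada WHERE id ...

      With an OR of 3 values, parts of the trace look like this: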

             "condition_processing": {
                "condition": "WHERE",
                "original_condition": "((`canada`.`id` = 296172) or (`canada`.`id` = 295093) or (`canada`.`id` = 293626))",
                "steps": [
                  {
                    "transformation": "equality_propagation",
                    "resulting_condition": "(multiple equal(296172, `canada`.`id`) or multiple equal(295093, `canada`.`id`) or multiple equal(293626, `canada`.`id`))"
                  },
      

      ...

                    "analyzing_range_alternatives": {
                      "range_scan_alternatives": [
                        {
                          "index": "id",
                          "ranges": [
                            "293626 <= id <= 293626",
                            "295093 <= id <= 295093",
                            "296172 <= id <= 296172"
                          ],
                          "index_dives_for_eq_ranges": true,
                          "chosen": true
      

      ...

              "refine_plan": [
                {
                  "table": "`canada`",
                  "pushed_index_condition": "((`canada`.`id` = 296172) or (`canada`.`id` = 295093) or (`canada`.`id` = 293626))",
                  "table_condition_attached": null,
                  "access_type": "range"
                }
              ]
      

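      Note how ICP is being given ORs. This implies that OR is not turned into IN, and InnoDB will be performing a bunch of = tests through ICP. (I do not feel it is worth considering MyISAM.)

      (This is Percona's 5.6.22-71.0-log; id is a secondary index.)

      Now for IN() with a few values

      eq_range_index_dive_limit = 10; there are 8 values.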

              "condition_processing": {
                "condition": "WHERE",
                "original_condition": "(`canada`.`id` in (296172,295093,293626,295573,297148,296127,295588,295810))",
                "steps": [
                  {
                    "transformation": "equality_propagation",
                    "resulting_condition": "(`canada`.`id` in (296172,295093,293626,295573,297148,296127,295588,295810))"
                  },
      

      ...

                    "analyzing_range_alternatives": {
                      "range_scan_alternatives": [
                        {
                          "index": "id",
                          "ranges": [
                            "293626 <= id <= 293626",
                            "295093 <= id <= 295093",
                            "295573 <= id <= 295573",
                            "295588 <= id <= 295588",
                            "295810 <= id <= 295810",
                            "296127 <= id <= 296127",
                            "296172 <= id <= 296172",
                            "297148 <= id <= 297148"
                          ],
                          "index_dives_for_eq_ranges": true,
                          "chosen": true
      

      ...

              "refine_plan": [
                {
                  "table": "`canada`",
                  "pushed_index_condition": "(`canada`.`id` in (296172,295093,293626,295573,297148,296127,295588,295810))",
                  "table_condition_attached": null,
                  "access_type": "range"
                }
              ]
      

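      Note that IN does not seem to be turned into OR.

      A side note: notice that the constant values were sorted. This can be beneficial in two ways:

      • By jumping around less, there may be better caching and less I/O to get to all the values.
      • If two similar queries come from separate connections, and they are in transactions, overlapping lists are more likely to cause a delay than a deadlock.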
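
As a small illustration of the sorting noted above, the sketch below rebuilds the `ranges` entries of the 8-value trace from the unsorted IN list. It is written in Python purely for illustration; `equality_ranges` is a hypothetical helper, not a MySQL function.

```python
def equality_ranges(column, constants):
    """Return one single-point range per constant, in ascending order."""
    return [f"{v} <= {column} <= {v}" for v in sorted(constants)]


# The IN list exactly as written in the query from the trace.
in_list = [296172, 295093, 293626, 295573, 297148, 296127, 295588, 295810]

for r in equality_ranges("id", in_list):
    print(r)
```

The printed lines come out in the same order as the `ranges` array of the trace, starting at 293626 and ending at 297148.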

      Finally, IN() with a lot of values

            {
              "condition_processing": {
                "condition": "WHERE",
                "original_condition": "(`canada`.`id` in (293831,292259,292881,293440,292558,295792,292293,292593,294337,295430,295034,297060,293811,295587,294651,295559,293213,295742,292605,296018,294529,296711,293919,294732,294689,295540,293000,296916,294433,297112,293815,292522,296816,293320,293232,295369,291894,293700,291839,293049,292738,294895,294473,294023,294173,293019,291976,294923,294797,296958,294075,293450,296952,297185,295351,295736,296312,294330,292717,294638,294713,297176,295896,295137,296573,292236,294966,296642,296073,295903,293057,294628,292639,293803,294470,295353,297196,291752,296118,296964,296185,295338,295956,296064,295039,297201,297136,295206,295986,292172,294803,294480,294706,296975,296604,294493,293181,292526,293354,292374,292344,293744,294165,295082,296203,291918,295211,294289,294877,293120,295387))",
                "steps": [
                  {
                    "transformation": "equality_propagation",
                    "resulting_condition": "(`canada`.`id` in (293831,292259,292881,293440,292558,295792,292293,292593,294337,295430,295034,297060,293811,295587,294651,295559,293213,295742,292605,296018,294529,296711,293919,294732,294689,295540,293000,296916,294433,297112,293815,292522,296816,293320,293232,295369,291894,293700,291839,293049,292738,294895,294473,294023,294173,293019,291976,294923,294797,296958,294075,293450,296952,297185,295351,295736,296312,294330,292717,294638,294713,297176,295896,295137,296573,292236,294966,296642,296073,295903,293057,294628,292639,293803,294470,295353,297196,291752,296118,296964,296185,295338,295956,296064,295039,297201,297136,295206,295986,292172,294803,294480,294706,296975,296604,294493,293181,292526,293354,292374,292344,293744,294165,295082,296203,291918,295211,294289,294877,293120,295387))"
                  },
      

      ...

                    "analyzing_range_alternatives": {
                      "range_scan_alternatives": [
                        {
                          "index": "id",
                          "ranges": [
                            "291752 <= id <= 291752",
                            "291839 <= id <= 291839",
                            ...
                            "297196 <= id <= 297196",
                            "297201 <= id <= 297201"
                          ],
                          "index_dives_for_eq_ranges": false,
                          "rows": 111,
                          "chosen": true
      

      ...

              "refine_plan": [
                {
                  "table": "`canada`",
                  "pushed_index_condition": "(`canada`.`id` in (293831,292259,292881,293440,292558,295792,292293,292593,294337,295430,295034,297060,293811,295587,294651,295559,293213,295742,292605,296018,294529,296711,293919,294732,294689,295540,293000,296916,294433,297112,293815,292522,296816,293320,293232,295369,291894,293700,291839,293049,292738,294895,294473,294023,294173,293019,291976,294923,294797,296958,294075,293450,296952,297185,295351,295736,296312,294330,292717,294638,294713,297176,295896,295137,296573,292236,294966,296642,296073,295903,293057,294628,292639,293803,294470,295353,297196,291752,296118,296964,296185,295338,295956,296064,295039,297201,297136,295206,295986,292172,294803,294480,294706,296975,296604,294493,293181,292526,293354,292374,292344,293744,294165,295082,296203,291918,295211,294289,294877,293120,295387))",
                  "table_condition_attached": null,
                  "access_type": "range"
                }
              ]
      

      Side note: due to the bulkiness of the trace, I needed this:

      SET @@global.optimizer_trace_max_mem_size = 32222;
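
The `index_dives_for_eq_ranges` flag in the traces is `true` for the short lists and `false` for the long one. The MySQL reference manual describes the rule as: index dives are permitted for up to N equality ranges when `eq_range_index_dive_limit` is N + 1, and a value of 0 means dives are always used. A minimal Python sketch of that rule follows; the function name is hypothetical, and 111 is used for the long list on the assumption that it matches the `"rows": 111` estimate in the trace.

```python
def uses_index_dives(num_equality_ranges, eq_range_index_dive_limit):
    """Mirror the documented rule for choosing index dives over statistics."""
    if eq_range_index_dive_limit == 0:
        # 0 disables the statistics shortcut: always dive.
        return True
    # Dives are allowed for up to (limit - 1) equality ranges.
    return num_equality_ranges < eq_range_index_dive_limit


limit = 10  # the value used in this answer
for n in (3, 8, 111):
    print(n, uses_index_dives(n, limit))
```

When dives are not used, the row estimate comes from index statistics instead, which is cheaper to compute but can be less accurate.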
      

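      【Comments】:

        【Answer 5】:

        Below are details of 6 queries run on MySQL 5.6 @SQLFiddle

        In summary, the 6 queries cover independently indexed columns, with 2 queries per data type. All queries resulted in an index being used, whether IN() or ORs were used.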

                |   ORs      |   IN()
        integer | uses index | uses index
        date    | uses index | uses index
        varchar | uses index | uses index
        

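        I really just wanted to debunk the claim that OR means no index can be used. This isn't true. Indexes can be used in queries using OR, as the 6 queries in the following examples show.

        It also seems to me that many have ignored the fact that IN() is a syntax shortcut for a set of ORs. At small scale, the performance difference between IN() and OR is extremely (infinitesimally) marginal.

        While at larger scale IN() is certainly more convenient, it is still logically equivalent to a set of OR conditions. Circumstances change for each query, so testing your query on your own tables is always best.

        Summary of the 6 explain plans, all "Using index condition" (scroll right)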

          Query               select_type    table    type    possible_keys      key      key_len   ref   rows   filtered           Extra          
                              ------------- --------- ------- --------------- ----------- --------- ----- ------ ---------- ----------------------- 
          Integers using OR   SIMPLE        mytable   range   aNum_idx        aNum_idx    4               10     100.00     Using index condition  
          Integers using IN   SIMPLE        mytable   range   aNum_idx        aNum_idx    4               10     100.00     Using index condition  
          Dates using OR      SIMPLE        mytable   range   aDate_idx       aDate_idx   6               7      100.00     Using index condition  
          Dates using IN      SIMPLE        mytable   range   aDate_idx       aDate_idx   6               7      100.00     Using index condition  
          Varchar using OR    SIMPLE        mytable   range   aName_idx       aName_idx   768             10     100.00     Using index condition  
          Varchar using IN    SIMPLE        mytable   range   aName_idx       aName_idx   768             10     100.00     Using index condition  
        

        SQL Fiddle

        MySQL 5.6 Schema Setup

        CREATE TABLE `myTable` (
          `id` mediumint(8) unsigned NOT NULL auto_increment,
          `aName` varchar(255) default NULL,
          `aDate` datetime,
          `aNum`  mediumint(8),
          PRIMARY KEY (`id`)
        ) AUTO_INCREMENT=1;
        
        ALTER TABLE `myTable` ADD INDEX `aName_idx` (`aName`);
        ALTER TABLE `myTable` ADD INDEX `aDate_idx` (`aDate`);
        ALTER TABLE `myTable` ADD INDEX `aNum_idx` (`aNum`);
        
        INSERT INTO `myTable` (`aName`,`aDate`)
         VALUES 
         ("Daniel","2017-09-19 01:22:31")
        ,("Quentin","2017-06-03 01:06:45")
        ,("Chester","2017-06-14 17:49:36")
        ,("Lev","2017-08-30 06:27:59")
        ,("Garrett","2018-10-04 02:40:37")
        ,("Lane","2017-01-22 17:11:21")
        ,("Chaim","2017-09-20 11:13:46")
        ,("Kieran","2018-03-10 18:37:26")
        ,("Cedric","2017-05-20 16:25:10")
        ,("Conan","2018-07-10 06:29:39")
        ,("Rudyard","2017-07-14 00:04:00")
        ,("Chadwick","2018-08-18 08:54:08")
        ,("Darius","2018-10-02 06:55:56")
        ,("Joseph","2017-06-19 13:20:33")
        ,("Wayne","2017-04-02 23:20:25")
        ,("Hall","2017-10-13 00:17:24")
        ,("Craig","2016-12-04 08:15:22")
        ,("Keane","2018-03-12 04:21:46")
        ,("Russell","2017-07-14 17:21:58")
        ,("Seth","2018-07-25 05:51:30")
        ,("Cole","2018-06-09 15:32:53")
        ,("Donovan","2017-08-12 05:21:35")
        ,("Damon","2017-06-27 03:44:19")
        ,("Brian","2017-02-01 23:35:20")
        ,("Harper","2017-08-25 04:29:27")
        ,("Chandler","2017-09-30 23:54:06")
        ,("Edward","2018-07-30 12:18:07")
        ,("Curran","2018-05-23 09:31:53")
        ,("Uriel","2017-05-08 03:31:43")
        ,("Honorato","2018-04-07 14:57:53")
        ,("Griffin","2017-01-07 23:35:31")
        ,("Hasad","2017-05-15 05:32:41")
        ,("Burke","2017-07-04 01:11:19")
        ,("Hyatt","2017-03-14 17:12:28")
        ,("Brenden","2017-10-17 05:16:14")
        ,("Ryan","2018-10-10 08:07:55")
        ,("Giacomo","2018-10-06 14:21:21")
        ,("James","2018-02-06 02:45:59")
        ,("Colt","2017-10-10 08:11:26")
        ,("Kermit","2017-09-18 16:57:16")
        ,("Drake","2018-05-20 22:08:36")
        ,("Berk","2017-04-16 17:39:32")
        ,("Alan","2018-09-01 05:33:05")
        ,("Deacon","2017-04-20 07:03:05")
        ,("Omar","2018-03-02 15:04:32")
        ,("Thaddeus","2017-09-19 04:07:54")
        ,("Troy","2016-12-13 04:24:08")
        ,("Rogan","2017-11-02 00:03:25")
        ,("Grant","2017-08-21 01:45:16")
        ,("Walker","2016-11-26 15:54:52")
        ,("Clarke","2017-07-20 02:26:56")
        ,("Clayton","2018-08-16 05:09:29")
        ,("Denton","2018-08-11 05:26:05")
        ,("Nicholas","2018-07-19 09:29:55")
        ,("Hashim","2018-08-10 20:38:06")
        ,("Todd","2016-10-25 01:01:36")
        ,("Xenos","2017-05-11 22:50:35")
        ,("Bert","2017-06-17 18:08:21")
        ,("Oleg","2018-01-03 13:10:32")
        ,("Hall","2018-06-04 01:53:45")
        ,("Evan","2017-01-16 01:04:25")
        ,("Mohammad","2016-11-18 05:42:52")
        ,("Armand","2016-12-18 06:57:57")
        ,("Kaseem","2018-06-12 23:09:57")
        ,("Colin","2017-06-29 05:25:52")
        ,("Arthur","2016-12-29 04:38:13")
        ,("Xander","2016-11-14 19:35:32")
        ,("Dante","2016-12-01 09:01:04")
        ,("Zahir","2018-02-17 14:44:53")
        ,("Raymond","2017-03-09 05:33:06")
        ,("Giacomo","2017-04-17 06:12:52")
        ,("Fulton","2017-06-04 00:41:57")
        ,("Chase","2018-01-14 03:03:57")
        ,("William","2017-05-08 09:44:59")
        ,("Fuller","2017-03-31 20:35:20")
        ,("Jarrod","2017-02-15 02:45:29")
        ,("Nissim","2018-03-11 14:19:25")
        ,("Chester","2017-11-05 00:14:27")
        ,("Perry","2017-12-24 11:58:04")
        ,("Theodore","2017-06-26 12:34:12")
        ,("Mason","2017-10-02 03:53:49")
        ,("Brenden","2018-10-08 10:09:47")
        ,("Jerome","2017-11-05 20:34:25")
        ,("Keaton","2018-08-18 00:55:56")
        ,("Tiger","2017-05-21 16:59:07")
        ,("Benjamin","2018-04-10 14:46:36")
        ,("John","2018-09-05 18:53:03")
        ,("Jakeem","2018-10-11 00:17:38")
        ,("Kenyon","2017-12-18 22:19:29")
        ,("Ferris","2017-03-29 06:59:13")
        ,("Hoyt","2017-01-03 03:48:56")
        ,("Fitzgerald","2017-07-27 11:27:52")
        ,("Forrest","2017-10-05 23:14:21")
        ,("Jordan","2017-01-11 03:48:09")
        ,("Lev","2017-05-25 08:03:39")
        ,("Chase","2017-06-18 19:09:23")
        ,("Ryder","2016-12-13 12:50:50")
        ,("Malik","2017-11-19 15:15:55")
        ,("Zeph","2018-04-04 11:22:12")
        ,("Amala","2017-01-29 07:52:17")
        ;
        

        .

        update MyTable
        set aNum = id
        ;
        

        Query 1

        select 'aNum by OR' q, mytable.*
        from mytable
        where aNum = 12
        OR aNum = 22
        OR aNum = 27
        OR aNum = 32
        OR aNum = 42
        OR aNum = 52
        OR aNum = 62
        OR aNum = 65
        OR aNum = 72
        OR aNum = 82
        

        Results

        |          q | id |    aName |                aDate | aNum |
        |------------|----|----------|----------------------|------|
        | aNum by OR | 12 | Chadwick | 2018-08-18T08:54:08Z |   12 |
        | aNum by OR | 22 |  Donovan | 2017-08-12T05:21:35Z |   22 |
        | aNum by OR | 27 |   Edward | 2018-07-30T12:18:07Z |   27 |
        | aNum by OR | 32 |    Hasad | 2017-05-15T05:32:41Z |   32 |
        | aNum by OR | 42 |     Berk | 2017-04-16T17:39:32Z |   42 |
        | aNum by OR | 52 |  Clayton | 2018-08-16T05:09:29Z |   52 |
        | aNum by OR | 62 | Mohammad | 2016-11-18T05:42:52Z |   62 |
        | aNum by OR | 65 |    Colin | 2017-06-29T05:25:52Z |   65 |
        | aNum by OR | 72 |   Fulton | 2017-06-04T00:41:57Z |   72 |
        | aNum by OR | 82 |  Brenden | 2018-10-08T10:09:47Z |   82 |
        

        Query 2

        select 'aNum by IN' q, mytable.*
        from mytable
        where aNum IN (
                    12
                  , 22
                  , 27
                  , 32
                  , 42
                  , 52
                  , 62
                  , 65
                  , 72
                  , 82
                  )
        

        Results

        |          q | id |    aName |                aDate | aNum |
        |------------|----|----------|----------------------|------|
        | aNum by IN | 12 | Chadwick | 2018-08-18T08:54:08Z |   12 |
        | aNum by IN | 22 |  Donovan | 2017-08-12T05:21:35Z |   22 |
        | aNum by IN | 27 |   Edward | 2018-07-30T12:18:07Z |   27 |
        | aNum by IN | 32 |    Hasad | 2017-05-15T05:32:41Z |   32 |
        | aNum by IN | 42 |     Berk | 2017-04-16T17:39:32Z |   42 |
        | aNum by IN | 52 |  Clayton | 2018-08-16T05:09:29Z |   52 |
        | aNum by IN | 62 | Mohammad | 2016-11-18T05:42:52Z |   62 |
        | aNum by IN | 65 |    Colin | 2017-06-29T05:25:52Z |   65 |
        | aNum by IN | 72 |   Fulton | 2017-06-04T00:41:57Z |   72 |
        | aNum by IN | 82 |  Brenden | 2018-10-08T10:09:47Z |   82 |
        

        Query 3

        select 'adate by OR' q, mytable.*
        from mytable
        where aDate= str_to_date("2017-02-15 02:45:29",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2018-03-10 18:37:26",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2017-05-20 16:25:10",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2018-07-10 06:29:39",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2017-07-14 00:04:00",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2018-08-18 08:54:08",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2018-10-02 06:55:56",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2017-04-20 07:03:05",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2018-03-02 15:04:32",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2017-09-19 04:07:54",'%Y-%m-%d %h:%i:%s')
        OR aDate = str_to_date("2016-12-13 04:24:08",'%Y-%m-%d %h:%i:%s')
        

        Results

        |           q | id |    aName |                aDate | aNum |
        |-------------|----|----------|----------------------|------|
        | adate by OR | 47 |     Troy | 2016-12-13T04:24:08Z |   47 |
        | adate by OR | 76 |   Jarrod | 2017-02-15T02:45:29Z |   76 |
        | adate by OR | 44 |   Deacon | 2017-04-20T07:03:05Z |   44 |
        | adate by OR | 46 | Thaddeus | 2017-09-19T04:07:54Z |   46 |
        | adate by OR | 10 |    Conan | 2018-07-10T06:29:39Z |   10 |
        | adate by OR | 12 | Chadwick | 2018-08-18T08:54:08Z |   12 |
        | adate by OR | 13 |   Darius | 2018-10-02T06:55:56Z |   13 |
        

        Query 4

        select 'adate by IN' q, mytable.*
        from mytable
        where aDate IN (
                  str_to_date("2017-02-15 02:45:29",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2018-03-10 18:37:26",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2017-05-20 16:25:10",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2018-07-10 06:29:39",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2017-07-14 00:04:00",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2018-08-18 08:54:08",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2018-10-02 06:55:56",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2017-04-20 07:03:05",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2018-03-02 15:04:32",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2017-09-19 04:07:54",'%Y-%m-%d %h:%i:%s')
                , str_to_date("2016-12-13 04:24:08",'%Y-%m-%d %h:%i:%s')
                )
        

        Results

        |           q | id |    aName |                aDate | aNum |
        |-------------|----|----------|----------------------|------|
        | adate by IN | 47 |     Troy | 2016-12-13T04:24:08Z |   47 |
        | adate by IN | 76 |   Jarrod | 2017-02-15T02:45:29Z |   76 |
        | adate by IN | 44 |   Deacon | 2017-04-20T07:03:05Z |   44 |
        | adate by IN | 46 | Thaddeus | 2017-09-19T04:07:54Z |   46 |
        | adate by IN | 10 |    Conan | 2018-07-10T06:29:39Z |   10 |
        | adate by IN | 12 | Chadwick | 2018-08-18T08:54:08Z |   12 |
        | adate by IN | 13 |   Darius | 2018-10-02T06:55:56Z |   13 |
        

        Query 5

        select 'name by  OR' q, mytable.*
        from mytable
        where aname = 'Alan'
        OR aname = 'Brian'
        OR aname = 'Chandler'
        OR aname = 'Darius'
        OR aname = 'Evan'
        OR aname = 'Ferris'
        OR aname = 'Giacomo'
        OR aname = 'Hall'
        OR aname = 'James'
        OR aname = 'Jarrod'
        

        Results

        |           q | id |    aName |                aDate | aNum |
        |-------------|----|----------|----------------------|------|
        | name by  OR | 43 |     Alan | 2018-09-01T05:33:05Z |   43 |
        | name by  OR | 24 |    Brian | 2017-02-01T23:35:20Z |   24 |
        | name by  OR | 26 | Chandler | 2017-09-30T23:54:06Z |   26 |
        | name by  OR | 13 |   Darius | 2018-10-02T06:55:56Z |   13 |
        | name by  OR | 61 |     Evan | 2017-01-16T01:04:25Z |   61 |
        | name by  OR | 90 |   Ferris | 2017-03-29T06:59:13Z |   90 |
        | name by  OR | 37 |  Giacomo | 2018-10-06T14:21:21Z |   37 |
        | name by  OR | 71 |  Giacomo | 2017-04-17T06:12:52Z |   71 |
        | name by  OR | 16 |     Hall | 2017-10-13T00:17:24Z |   16 |
        | name by  OR | 60 |     Hall | 2018-06-04T01:53:45Z |   60 |
        | name by  OR | 38 |    James | 2018-02-06T02:45:59Z |   38 |
        | name by  OR | 76 |   Jarrod | 2017-02-15T02:45:29Z |   76 |
        

        Query 6

        select 'name by IN' q, mytable.*
        from mytable
        where aname IN (
              'Alan'
             ,'Brian'
             ,'Chandler'
             , 'Darius'
             , 'Evan'
             , 'Ferris'
             , 'Giacomo'
             , 'Hall'
             , 'James'
             , 'Jarrod'
             )
        

        Results

        |          q | id |    aName |                aDate | aNum |
        |------------|----|----------|----------------------|------|
        | name by IN | 43 |     Alan | 2018-09-01T05:33:05Z |   43 |
        | name by IN | 24 |    Brian | 2017-02-01T23:35:20Z |   24 |
        | name by IN | 26 | Chandler | 2017-09-30T23:54:06Z |   26 |
        | name by IN | 13 |   Darius | 2018-10-02T06:55:56Z |   13 |
        | name by IN | 61 |     Evan | 2017-01-16T01:04:25Z |   61 |
        | name by IN | 90 |   Ferris | 2017-03-29T06:59:13Z |   90 |
        | name by IN | 37 |  Giacomo | 2018-10-06T14:21:21Z |   37 |
        | name by IN | 71 |  Giacomo | 2017-04-17T06:12:52Z |   71 |
        | name by IN | 16 |     Hall | 2017-10-13T00:17:24Z |   16 |
        | name by IN | 60 |     Hall | 2018-06-04T01:53:45Z |   60 |
        | name by IN | 38 |    James | 2018-02-06T02:45:59Z |   38 |
        | name by IN | 76 |   Jarrod | 2017-02-15T02:45:29Z |   76 |
        

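        【Comments】:

          【Answer 6】:

          I needed to know this for sure, so I benchmarked both methods. I consistently found IN to be much faster than using OR.

          Do not believe people who give their "opinion"; science is all about testing and evidence.

          I ran a loop of 1000x the equivalent queries (for consistency, I used sql_no_cache):

          IN: 2.34969592094s

          OR: 5.83781504631s

          Update:
          (I no longer have the source code for the original test, as it was 6 years ago, though it returned results in the same range as this test.)

          In response to a request for some sample code to test this, here is the simplest possible use case. It uses Eloquent for syntax simplicity; the raw SQL equivalent executes the same.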

          $t = microtime(true); 
          for($i=0; $i<10000; $i++):
          $q = DB::table('users')->where('id',1)
              ->orWhere('id',2)
              ->orWhere('id',3)
              ->orWhere('id',4)
              ->orWhere('id',5)
              ->orWhere('id',6)
              ->orWhere('id',7)
              ->orWhere('id',8)
              ->orWhere('id',9)
              ->orWhere('id',10)
              ->orWhere('id',11)
              ->orWhere('id',12)
              ->orWhere('id',13)
              ->orWhere('id',14)
              ->orWhere('id',15)
              ->orWhere('id',16)
              ->orWhere('id',17)
              ->orWhere('id',18)
              ->orWhere('id',19)
              ->orWhere('id',20)->get();
          endfor;
          $t2 = microtime(true); 
          echo $t."\n".$t2."\n".($t2-$t)."\n";
          

          1482080514.3635
          1482080517.3713
          3.0078368186951

          $t = microtime(true); 
          for($i=0; $i<10000; $i++): 
          $q = DB::table('users')->whereIn('id',[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])->get(); 
          endfor; 
          $t2 = microtime(true); 
          echo $t."\n".$t2."\n".($t2-$t)."\n";
          

          1482080534.0185
          1482080536.178
          2.1595389842987

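          【Comments】:

          • Which indexes were used in these tests?
          • I was also optimizing queries and found that the IN statement was about 30% faster than OR.
          • "Do not believe people who give their 'opinion'": you are 100% right, and unfortunately Stack Overflow is full of them.
          • The performance reason (quoting the documentation of MariaDB, a newer free fork of MySQL): Returns 1 if expr is equal to any of the values in the IN list, else returns 0. If all values are constants, they are evaluated according to the type of expr and sorted. The search for the item then is done using a binary search. This means IN is very quick if the IN value list consists entirely of constants. Otherwise, type conversion takes place according to the rules described at Type Conversion, but applied to all the arguments. => If your column is an integer, pass integers to IN as well...
          • As a corollary of "Do not believe people who give their 'opinion'": providing performance figures without including the scripts, tables and indexes used to obtain them makes them unverifiable. As such, the figures are as good as an "opinion".
          【Answer 7】:

          I think BETWEEN would be faster, since it should be converted into: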

          Field >= 0 AND Field <= 5
          

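          It is my understanding that an IN will be converted to a bunch of OR statements anyway. The value of IN is its ease of use. (You do not have to type each column name multiple times, and it is easier to combine with existing logic: you don't have to worry about AND/OR precedence, because the IN is a single condition. With a bunch of OR statements, you have to surround them with parentheses to make sure they are evaluated as one condition.)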
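
The logical points above (BETWEEN as a pair of inclusive comparisons, and the AND/OR precedence trap) can be checked with a tiny script. The sketch below uses SQLite from Python's standard library purely as a stand-in, so it only demonstrates the result sets and says nothing about how MySQL plans or times these queries. The table `t`, the `flag` column and the helper `ids` are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (someField INTEGER, flag INTEGER)")
# Values -2..8; flag is 1 for odd numbers and 0 for even numbers.
conn.executemany("INSERT INTO t VALUES (?, ?)", [(n, n % 2) for n in range(-2, 9)])


def ids(where):
    """Return the matching someField values for a WHERE clause."""
    sql = f"SELECT someField FROM t WHERE {where} ORDER BY someField"
    return [row[0] for row in conn.execute(sql)]


# BETWEEN is inclusive, so it matches the same rows as the two comparisons.
between = ids("someField BETWEEN 0 AND 5")
explicit = ids("someField >= 0 AND someField <= 5")

# IN and the OR chain select the same rows.
in_list = ids("someField IN (1, 2, 3, 4)")
or_chain = ids("someField = 1 OR someField = 2 OR someField = 3 OR someField = 4")

# AND binds tighter than OR, so an unparenthesized OR chain changes meaning.
with_in = ids("flag = 1 AND someField IN (1, 2, 3, 4)")
bare_or = ids("flag = 1 AND someField = 1 OR someField = 2 OR someField = 3 OR someField = 4")
wrapped_or = ids("flag = 1 AND (someField = 1 OR someField = 2 OR someField = 3 OR someField = 4)")

print(between, explicit)
print(in_list, or_chain)
print(with_in, bare_or, wrapped_or)
```

One detail worth noticing: `BETWEEN 0 AND 5` from the question also matches 0 and 5, so it is not the same predicate as `IN(1,2,3,4)` unless those values cannot occur in the column.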

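          The only real answer to your question is PROFILE YOUR QUERIES. Then you will know what works best in your particular situation.

          【Comments】:

          • Statistically, BETWEEN has a chance of triggering a range index. IN() doesn't have this privilege. But yes, beach is right: you NEED to profile your request to know whether an index is used, and which one. It is really hard to predict what the MySQL optimizer will choose.
          • "It is my understanding that an IN will be converted to a bunch of OR statements anyway." Where did you read this? I would expect it to put the values in a hashmap to make O(1) lookups.
          • IN being converted to ORs is how SQL Server handles it (or at least it did; that might have changed by now, I haven't used it in years). I have been unable to find any evidence that MySQL does this.
          • This answer is correct; BETWEEN is converted to "1 … tocker.ca/2015/05/25/…
          【Answer 8】:

          I also did a test for future Googlers. The total number of returned results is 7264 out of 10000.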

          SELECT * FROM item WHERE id = 1 OR id = 2 ... id = 10000
          

          This query took 0.1239

          SELECT * FROM item WHERE id IN (1,2,3,...10000)
          

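          This query took 0.0433

          IN is 3 times faster than OR

          【Comments】:

          • Which MySQL engine was it, and did you clear the MySQL buffers and OS file caches between the two queries?
          • Your test is a narrow use case. The query returns 72% of the data, so it is unlikely to benefit from indexes.
          • I bet most of that time was spent receiving the query, parsing it, and planning it. That is certainly a consideration: if you are going to have 10k OR terms, you will have a lot of redundant text just from expressing it with OR; it is best to use the most compact expression possible.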
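
The point about redundant query text in the last comment can be made concrete by generating both statements and comparing their sizes. This is only a sketch in Python; `or_query` and `in_query` are hypothetical helpers that build the two forms of the query from this answer for ids 1..n.

```python
def or_query(n):
    """Build `id = 1 OR id = 2 OR ... OR id = n`."""
    terms = " OR ".join(f"id = {i}" for i in range(1, n + 1))
    return f"SELECT * FROM item WHERE {terms}"


def in_query(n):
    """Build `id IN (1,2,...,n)`."""
    values = ",".join(str(i) for i in range(1, n + 1))
    return f"SELECT * FROM item WHERE id IN ({values})"


n = 10000
# Number of characters the server has to receive and parse in each case.
print(len(or_query(n)), len(in_query(n)))
```

The OR form repeats the column name and operator for every value, so its text grows considerably faster than the IN form. How much of the measured gap that accounts for would still have to be profiled.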
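          【Answer 9】:

          I think one explanation for sunseeker's observation is that MySQL actually sorts the values in the IN statement if they are all static values and then uses binary search, which is more efficient than the plain OR alternative. I can't remember where I read that, but sunseeker's result seems to be a proof of it.

          【Comments】:

          • I have also heard that the values are sorted.
          【Answer 10】:

          It depends on what you are doing: how wide the range is and what the data type is (I know your example uses a numeric data type, but your question can also apply to many different data types).

          This is a case where you want to write the query both ways, get it working, and then use EXPLAIN to figure out the execution differences.

          I'm sure there is a concrete answer to this, but in practice this is how I would work out the answer to my own question.

          This might be of some help: http://forge.mysql.com/wiki/Top10SQLPerformanceTips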

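          Regards,
          Frank

          【Comments】:

          • This should be the selected answer.
          • The link is stale. I think this may be the equivalent? wikis.oracle.com/pages/viewpage.action?pageId=27263381 (thanks, Oracle ;-P)
          • On the equivalent page, it says: "Avoid using IN(...) when selecting on indexed fields, it will kill the performance of SELECT query." Any idea why that is?
          • The URL has expired.
          【Answer 11】:

          I do know that, as long as you have an index on Field, BETWEEN will use it to quickly find one end and then traverse to the other. This is the most efficient approach.

          Every EXPLAIN I've seen shows that "IN (...)" and "... OR ..." are interchangeable and equally (in)efficient. That is what you would expect, as the optimizer has no way of knowing whether or not they comprise an interval. It is also equivalent to a UNION ALL SELECT on the individual values.

          【Comments】:

            【Answer 12】:

            I'll bet they are the same. You can run a test by doing the following:

            Loop over the "in (1,2,3,4)" version 500 times and see how long it takes. Loop over the "=1 or =2 or =3..." version 500 times and see how long it runs.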
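
A loop of this kind could look like the sketch below. It is written in Python against an in-memory SQLite table purely so that it runs without a server; the timings it prints say nothing about MySQL, and the table `t`, the index name and the helper `time_query` are assumptions invented for this example. To test MySQL itself, the same loop would have to be pointed at a MySQL connection and the real table.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (someField INTEGER)")
conn.execute("CREATE INDEX someField_idx ON t (someField)")
conn.executemany("INSERT INTO t VALUES (?)", [(n,) for n in range(1000)])

QUERIES = {
    "IN": "SELECT * FROM t WHERE someField IN (1, 2, 3, 4)",
    "OR": "SELECT * FROM t WHERE someField = 1 OR someField = 2 "
          "OR someField = 3 OR someField = 4",
}


def time_query(sql, loops=500):
    """Run the query `loops` times; return elapsed seconds and the last rows."""
    start = time.perf_counter()
    for _ in range(loops):
        rows = conn.execute(sql).fetchall()
    return time.perf_counter() - start, rows


results = {name: time_query(sql) for name, sql in QUERIES.items()}
for name, (elapsed, rows) in results.items():
    print(f"{name}: {elapsed:.4f}s for 500 runs, {len(rows)} rows per run")
```

Checking that both variants return the same rows before comparing times guards against accidentally benchmarking two different queries.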

            You could also try a join approach; if someField is indexed and your table is big, it could be faster...

            SELECT ... 
                FROM ... 
                    INNER JOIN (SELECT 1 as newField UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) dt ON someFIELD =newField
            

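            I just tried the join method above on my SQL Server, and it is nearly the same as the in (1,2,3,4) version; both result in a clustered index seek. I'm not sure how MySQL will handle them.

            【Comments】:

              【Answer 13】:

              From what I understand about the way the compiler optimizes these types of queries, using the IN clause is more efficient than multiple OR clauses. If you have values where the BETWEEN clause can be used, that is more efficient still.

              【Comments】:

                【Answer 14】:

                OR will be the slowest. Whether IN or BETWEEN is faster depends on your data, but I would expect BETWEEN to be faster normally, as it can simply take a range from an index (assuming someField is indexed).

                【Comments】: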