MySQL: Stored Procedure Slow

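【Question title】: MySQL: Stored Procedure Slow
【Posted】: 2016-08-30 21:29:32
【Question description】:

I have a database table from which I want to get all children (down to the n-th level) of a specified parent. For that I used the answer from here, and it works fine.

Here is the code: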

DELIMITER $$
CREATE PROCEDURE getParents(in_id INT)
  BEGIN
    DROP TEMPORARY TABLE IF EXISTS results;
    DROP TEMPORARY TABLE IF EXISTS temp2;
    DROP TEMPORARY TABLE IF EXISTS temp1;

    CREATE TEMPORARY TABLE temp1 AS
      SELECT DISTINCT * FROM agents WHERE upline_id = in_id;

    CREATE TEMPORARY TABLE results AS
      SELECT * FROM temp1;

    WHILE (SELECT count(*) FROM temp1) DO
      CREATE TEMPORARY TABLE temp2 AS
        SELECT DISTINCT *
        FROM agents
        WHERE upline_id IN (SELECT id FROM temp1);

      INSERT INTO results SELECT * FROM temp2;
      DROP TEMPORARY TABLE IF EXISTS temp1;

      CREATE TEMPORARY TABLE temp1 AS
        SELECT * FROM temp2;
      DROP TEMPORARY TABLE IF EXISTS temp2;

    END WHILE;   

    SELECT * FROM results;

  END $$
DELIMITER ;

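Here is how I use it:

CALL getParents(2060);

Everything works, but the query runs slowly. The table does not even contain more than 10,000 records.

Is there a way to optimize the stored procedure above so that it runs a bit faster?

I am new to MySQL, so I don't know how to optimize this stored procedure. Thanks for your help.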

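【Question comments】:

You are using too many queries within queries... that slows the process down.
Could you specify what exactly you are looking for? The children, grandchildren and so on of a specific parent (which is what your code does if you put the parent in "upline_id" and the child in "id" in your table)? The parents of a specific child (which is the question in your link, and what your procedure name suggests)? Or just the n-th-level grandchildren of a specific parent, without all the children in between (which is what your question says, but probably not what you mean, since this "n" does not appear in your code)?
@Solarflare: the children of a parent, but recursively
@dev02 I updated my answer; you might want to try engine=memory, maybe it will solve your speed problem.

Tags: mysql sql stored-procedures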

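【Solution 1】:

Update: I may have forgotten the optimization that is most likely to help you: I guess you are not using memory for your temporary tables. For MySQL >= 5.6 you can set default_tmp_storage_engine=MEMORY in your configuration (which has the side effect of making you forget about it when answering your next Stack Overflow question), or, if you have an older version or cannot or do not want to change your configuration, you can use engine=memory in the queries. I have updated the queries to use memory. If this really is your problem and your temporary tables were going to the hard disk, you may already be happy with your original code plus engine=memory added everywhere, since that will have the biggest effect.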
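
As a minimal sketch of the two options just described (assuming MySQL 5.6 or later for the first one, and reusing the agents table and the id 2060 from the question), the engine can be chosen either per session or per table:

```sql
-- Option 1: make MEMORY the default engine for temporary tables,
-- for the current session only (the variable exists from MySQL 5.6 on).
SET SESSION default_tmp_storage_engine = MEMORY;

-- Option 2: name the engine explicitly on each temporary table.
-- Only id and upline_id are copied: the MEMORY engine does not support
-- BLOB or TEXT columns, so SELECT * can fail if agents has such columns.
DROP TEMPORARY TABLE IF EXISTS temp1;
CREATE TEMPORARY TABLE temp1 ENGINE=MEMORY AS
  SELECT id, upline_id FROM agents WHERE upline_id = 2060;
```

Whether this helps at all depends on where the temporary tables were being stored before, so it is worth timing the procedure before and after the change.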

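Creating and dropping temporary tables are expensive operations. A first optimization is to create them once and then just delete their contents, like this: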

DELIMITER $$

CREATE PROCEDURE getParents(in_id INT)
BEGIN

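    -- TEMPORARY makes sure a permanent table of the same name is never dropped
    drop temporary table if exists temp1;
    drop temporary table if exists temp2;
    drop temporary table if exists results;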

    create temporary table temp2 engine=memory as (select id, upline_id from agents where upline_id = in_id); 
    create temporary table results engine=memory as (select id, upline_id from temp2); 
    create temporary table temp1 (id int, upline_id int) engine=memory;

    while (select count(*) from temp2) do 

        insert into temp1 (id, upline_id)
        select a.id, upline_id 
        from agents a
        where a.upline_id in (select id from temp2) ;

        insert into results (id, upline_id)
        select distinct id, upline_id
        from temp1;

        delete from temp2;

        insert into temp2 (id, upline_id)
        select distinct id, upline_id
        from temp1;

        delete from temp1;
    end while;    

    select a.* 
    from results r
    join agents a
    on a.id = r.id;

    drop temporary table if exists temp1;
    drop temporary table if exists temp2;
    drop temporary table if exists results;

End $$  

DELIMITER ;

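The next optimization would be to reduce the number of loop iterations by joining ahead, so that several levels are handled in one pass (this makes the code look more complicated, but it should be faster), and to get rid of one temporary table by storing the child level n: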

DELIMITER $$

CREATE PROCEDURE getParents(in_id INT)
BEGIN

    set @n = 1;

    -- TEMPORARY makes sure a permanent table of the same name is never dropped
    drop temporary table if exists temp1;
    drop temporary table if exists results;

    create temporary table results (id int, upline_id int, n int) engine = memory;
    insert into results (id, upline_id, n) 
    select id, upline_id, @n from agents where upline_id = in_id; 

    create temporary table temp1 (id0 int, upline_id0 int, id1 int, upline_id1 int, 
                                  id2 int, upline_id2 int, id3 int, upline_id3 int) engine = memory;

    while (select count(*) from results where n = @n) do 

        insert into temp1 
        select a0.id as id0, a0.upline_id as upline_id0, 
        a1.id as id1, a1.upline_id as upline_id1, 
        a2.id as id2, a2.upline_id as upline_id2, 
        a3.id as id3, a3.upline_id as upline_id3
        from agents a0
        left outer join agents a1
        on a1.upline_id = a0.id
        left outer join agents a2
        on a2.upline_id = a1.id
        left outer join agents a3
        on a3.upline_id = a2.id
        where a0.upline_id in (select id from results where n = @n) ;

        insert into results (id, upline_id, n)
        select distinct id0, upline_id0, @n + 1
        from temp1
        where not id0 is null;
        insert into results (id, upline_id, n)
        select distinct id1, upline_id1, @n + 2
        from temp1
        where not id1 is null;
        insert into results (id, upline_id, n)
        select distinct id2, upline_id2, @n + 3
        from temp1
        where not id2 is null;
        insert into results (id, upline_id, n)
        select distinct id3, upline_id3, @n + 4
        from temp1
        where not id3 is null;

        set @n = @n + 4;

        delete from temp1;

    end while;    

    select a.* 
    from results r
    join agents a
    on a.id = r.id;

    drop temporary table if exists temp1;
    drop temporary table if exists results;

End $$  

DELIMITER ;

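You can increase or decrease the number of joins. More joins are not necessarily faster: with sparse data you may execute joins that go unused, so you will need to do some testing. (It depends on your data, the number of children per parent, and the depth you query most often, but 3-4 is probably a good starting point. You should not set it too high, and you should test it for parents with a lot of children/grandchildren.)

But the fastest way to get your results is to look into nested sets, see Managing Hierarchical Data in MySQL. It is a bit harder to read and understand, but operations on nested sets are a lot faster (they are designed for exactly the problem you have in your database). You can have both structures in the same table at the same time (so possibly needing the current one somewhere else is no argument against it); you just have to keep both up to date when you change data. And, well, read a lot first. But it is worth your time.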
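
To give an idea of why nested sets are fast for this kind of lookup, here is a minimal sketch. It assumes that agents has been given two extra integer columns, here called lft and rgt (hypothetical names, not present in the question's table), and that they are kept numbered so that every descendant's lft lies strictly between its ancestor's lft and rgt:

```sql
-- Assumption: lft and rgt are hypothetical nested-set boundary columns
-- that have been added to agents and are maintained on every change.
SELECT child.*
FROM agents AS parent
JOIN agents AS child
  ON child.lft > parent.lft
 AND child.lft < parent.rgt      -- child lies inside the parent's interval
WHERE parent.id = 2060           -- the parent id used in the question
ORDER BY child.lft;
```

All descendants at every level come back from a single range comparison, with no loop and no temporary tables. The cost moves to writes, because inserting or moving a node means renumbering lft and rgt for part of the tree.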

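【Comments】:

I get the error Sql Error (1136): Column count doesn't match value count at row 1
@dev02, yes, it will, but so does James's solution. Could you add some sample data for id/upline_id and the result you expect for a specific call? Or take James's sample data and write down the result you expect for CALL getChildren(14); - my result is identical to his. We may have different understandings of "n-th level" or of how you define id and upline_id. upline_id is the id of the (only) parent of a child (where "child" means a child in the tree structure; a "human" child may have two parents and several children of its own, which is not what is meant here).
Your solution gives me exact results, with as many rows as there are child-parent relations, which is good, but it is still a bit slow for use in production. I don't understand James's solution, as it doesn't match the columns of my table. Thanks
Can you make it faster, or incorporate the regex approach into your example, or combine your solution with James's, or something? I just want all children of a given parent down to the n-th level, similar to WITH RECURSIVE in PostgreSQL.
@dev02: If the code you posted is the code that works for you, you only have to remove data from James's code and rename agentsZ to agents. Otherwise you need to post your table data. In his comments I suggested an improvement to his solution that works without temporary tables; if you use that, it will be faster than mine (at least if you don't have a really big tree). Either wait for him to add it or combine it yourself (I am not going to "steal" his idea, and it is too long to post in full as a comment). You could use real tables to gain speed, though (temporary tables slow things down).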
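
For readers on MySQL 8.0 or later, the WITH RECURSIVE form asked for in the comments above is available directly. It is not part of either answer in this thread and does not work on the 5.x servers the answers target. A minimal sketch, assuming the id and upline_id columns of agents and the parent id 2060 from the question (the names downline and n are illustrative):

```sql
-- Requires MySQL 8.0+ (recursive common table expressions).
WITH RECURSIVE downline (id, upline_id, n) AS (
    -- anchor: direct children of the starting parent, level 1
    SELECT id, upline_id, 1
    FROM agents
    WHERE upline_id = 2060
  UNION ALL
    -- recursive step: children of the rows found in the previous pass
    SELECT a.id, a.upline_id, d.n + 1
    FROM agents AS a
    JOIN downline AS d ON a.upline_id = d.id
)
SELECT a.*, d.n
FROM downline AS d
JOIN agents AS a ON a.id = d.id;
```

The n column plays the same role as the level counter in the second procedure of Solution 1. If the data ever contains a cycle, MySQL aborts the statement once cte_max_recursion_depth (1000 by default) is exceeded, so it does not loop forever.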

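【Solution 2】:

(Note: Solarflare pointed out that I had this recursion going the wrong way; see the edit at the end for children from a parent.)

This is the fastest way I can think of, assuming each child has only one parent; I hope that is correct. Note that I changed the table name to agentsZ because there is a DROP command at the start which, if run without the Z, would wipe the original table. The point is that the stored procedure runs out of the box, provided you change the table name to agents and replace the data column name with the actual names of the columns you need (an asterisk does not work).

Raw code: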

# DROP TABLE IF EXISTS agentsZ;

CREATE TABLE agentsZ (id TINYINT UNSIGNED PRIMARY KEY, upline_id TINYINT UNSIGNED, `data` CHAR(8));

INSERT INTO agentsZ
VALUES (1, 4, 'A'),
(2, 3, 'B'),
(5, 8, 'C'),
(6, 7, 'D'),
(4, 9, 'E'),
(3, 9, 'F'),
(9, 12, 'G'),
(8, 11, 'H'),
(7, 10, 'I'),
(12, 13, 'J'),
(11, 14, 'K'),
(10, 14, 'L');

DELIMITER $

DROP PROCEDURE IF EXISTS getParents$

CREATE PROCEDURE getParents(in_id INT)
BEGIN

    SET @VUplineID := in_id;
    SELECT id, @VUplineID := upline_id upline_id, `data` FROM agentsZ WHERE id = @VUplineID;

END$

DELIMITER ;

CALL getParents(1);

Test run:

mysql> DROP TABLE IF EXISTS agentsZ;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql>
mysql> CREATE TABLE agentsZ (id TINYINT UNSIGNED PRIMARY KEY, upline_id TINYINT UNSIGNED, `data` CHAR(8));
Query OK, 0 rows affected (0.06 sec)

mysql>
mysql> INSERT INTO agentsZ
    -> VALUES (1, 4, 'A'),
    -> (2, 3, 'B'),
    -> (5, 8, 'C'),
    -> (6, 7, 'D'),
    -> (4, 9, 'E'),
    -> (3, 9, 'F'),
    -> (9, 12, 'G'),
    -> (8, 11, 'H'),
    -> (7, 10, 'I'),
    -> (12, 13, 'J'),
    -> (11, 14, 'K'),
    -> (10, 14, 'L');
Query OK, 12 rows affected (0.02 sec)
Records: 12  Duplicates: 0  Warnings: 0

mysql>
mysql> DELIMITER $
mysql>
mysql> DROP PROCEDURE IF EXISTS getParents$
Query OK, 0 rows affected (0.02 sec)

mysql>
mysql> CREATE PROCEDURE getParents(in_id INT)
    -> BEGIN
    ->
    -> SET @VUplineID := in_id;
    -> SELECT id, @VUplineID := upline_id upline_id, `data` FROM agentsZ WHERE id = @VUplineID;
    ->
    -> END$
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> DELIMITER ;
mysql>
mysql> CALL getParents(1);
+----+-----------+------+
| id | upline_id | data |
+----+-----------+------+
|  1 |         4 | A    |
|  4 |         9 | E    |
|  9 |        12 | G    |
| 12 |        13 | J    |
+----+-----------+------+
4 rows in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Slightly less elegant, but here is another procedure that goes the other way:

DELIMITER $

DROP PROCEDURE IF EXISTS getChildren$

CREATE PROCEDURE getChildren(in_id INT)
BEGIN

    SET @VBeforeRows := -1;
    SET @VAfterRows := 0;
    SET @VDownLineIDRegex := CONCAT('^', in_id, '$');

    WHILE @VAfterRows != @VBeforeRows DO

        SET @VBeforeRows := @VAfterRows;

        DROP TEMPORARY TABLE IF EXISTS ZResults;

        CREATE TEMPORARY TABLE ZResults
        SELECT id, upline_id, IF(@VDownLineIDRegex REGEXP CONCAT('\|\^', id, '\$'), @VLoop := FALSE, @VDownLineIDRegex := CONCAT(@VDownLineIDRegex, '|^', id, '$')) idRegex, `data` FROM agentsZ WHERE upline_id REGEXP @VDownLineIDRegex;

        SELECT COUNT(*) INTO @VAfterRows FROM ZResults;

    END WHILE;

    SELECT id, upline_id, `data` FROM ZResults;

END$

DELIMITER ;

CALL getChildren(14);

Here is a copy of the output when I ran it:

mysql> DROP TABLE IF EXISTS agentsZ;
Query OK, 0 rows affected (0.02 sec)

mysql>
mysql> CREATE TABLE agentsZ (id TINYINT UNSIGNED PRIMARY KEY, upline_id TINYINT UNSIGNED, `data` CHAR(8));
Query OK, 0 rows affected (0.04 sec)

mysql>
mysql> INSERT INTO agentsZ
    -> VALUES (1, 4, 'A'),
    -> (2, 3, 'B'),
    -> (5, 8, 'C'),
    -> (6, 7, 'D'),
    -> (4, 9, 'E'),
    -> (3, 9, 'F'),
    -> (9, 12, 'G'),
    -> (8, 11, 'H'),
    -> (7, 10, 'I'),
    -> (12, 13, 'J'),
    -> (11, 14, 'K'),
    -> (10, 14, 'L');
Query OK, 12 rows affected (0.02 sec)
Records: 12  Duplicates: 0  Warnings: 0

mysql>
mysql> DELIMITER $
mysql>
mysql> DROP PROCEDURE IF EXISTS getChildren$
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>
mysql> CREATE PROCEDURE getChildren(in_id INT)
    -> BEGIN
    ->
    ->     SET @VBeforeRows := -1;
    ->     SET @VAfterRows := 0;
    ->     SET @VDownLineIDRegex := CONCAT('^', in_id, '$');
    ->
    ->     WHILE @VAfterRows != @VBeforeRows DO
    ->
    ->         SET @VBeforeRows := @VAfterRows;
    ->
    ->         DROP TEMPORARY TABLE IF EXISTS ZResults;
    ->
    ->         CREATE TEMPORARY TABLE ZResults
    ->         SELECT id, upline_id, IF(@VDownLineIDRegex REGEXP CONCAT('\|\^', id, '\$'), @VLoop := FALSE, @VDownLineIDRegex := CONCAT(@VDownLineIDRegex, '|^', id, '$')) idRegex, `data` FROM agentsZ WHERE upline_id REGEXP @VDownLineIDRegex;
    ->
    ->         SELECT COUNT(*) INTO @VAfterRows FROM ZResults;
    ->
    ->     END WHILE;
    ->
    ->     SELECT id, upline_id, `data` FROM ZResults;
    ->
    -> END$
Query OK, 0 rows affected (0.01 sec)

mysql>
mysql> DELIMITER ;
mysql>
mysql> CALL getChildren(14);
+----+-----------+------+
| id | upline_id | data |
+----+-----------+------+
|  5 |         8 | C    |
|  6 |         7 | D    |
|  7 |        10 | I    |
|  8 |        11 | H    |
| 10 |        14 | L    |
| 11 |        14 | K    |
+----+-----------+------+
6 rows in set (0.13 sec)

Query OK, 0 rows affected (0.13 sec)

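Regards,

James

【Comments】:

SET @VUplineID := 1; should be SET @VUplineID := in_id;. But as I understand the original procedure, I would have expected getParents to actually mean getChildren, since it traverses downward, but I am not so sure anymore...
Thanks for pointing that out! I'll change that straight away. Hopefully dev02 can tell us which direction is meant.
I like the last procedure, but it only finds one level of children, whereas it should go down to the n-th level. Thanks
It should go n levels deep, let me check
I cannot reproduce the one-level problem. See the output at the end of my answer from when I ran it; can you reproduce those results when you run the same code?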