【发布时间】:2011-02-12 03:54:43
【问题描述】:
我有两个表,表 A 有 700,000 个条目,表 B 有 600,000 个条目。结构如下:
表 A:
+-----------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| number | bigint(20) unsigned | YES | | NULL | |
+-----------+---------------------+------+-----+---------+----------------+
表 B:
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| number_s | bigint(20) unsigned | YES | MUL | NULL | |
| number_e | bigint(20) unsigned | YES | MUL | NULL | |
| source | varchar(50) | YES | | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
我正在尝试使用以下代码查找表 A 中的任何值是否存在于表 B 中:
$sql = "SELECT number from TableA";
$result = mysql_query($sql) or die(mysql_error());
while($row = mysql_fetch_assoc($result)) {
$number = $row['number'];
$sql = "SELECT source, count(source) FROM TableB WHERE number_s < $number AND number_e > $number GROUP BY source";
$re = mysql_query($sql) or die(mysql_error);
while($ro = mysql_fetch_array($re)) {
echo $number."\t".$ro[0]."\t".$ro[1]."\n";
}
}
我希望查询会很快,但由于某种原因,它并不快。我对选择的解释(具有特定的“数字”值)给了我以下信息:
mysql> explain SELECT source, count(source) FROM TableB WHERE number_s < 1812194440 AND number_e > 1812194440 GROUP BY source;
+----+-------------+------------+------+-------------------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+-------------------------+------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | TableB | ALL | number_s,number_e | NULL | NULL | NULL | 696325 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+-------------------------+------+---------+------+--------+----------------------------------------------+
1 row in set (0.00 sec)
有什么我可以从中挤出的优化吗?
我尝试为同一任务编写一个存储过程,但它甚至似乎一开始就不起作用...它没有给出任何语法错误...我尝试运行它一天,它是仍在运行,感觉很奇怪。
CREATE PROCEDURE Filter()
Begin
DECLARE number BIGINT UNSIGNED;
DECLARE x INT;
DECLARE done INT DEFAULT 0;
DECLARE cur1 CURSOR FOR SELECT number FROM TableA;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
CREATE TEMPORARY TABLE IF NOT EXISTS Flags(number bigint unsigned, count int(11));
OPEN cur1;
hist_loop: LOOP
FETCH cur1 INTO number;
SELECT count(*) from TableB WHERE number_s < number AND number_e > number INTO x;
IF done = 1 THEN
LEAVE hist_loop;
END IF;
IF x IS NOT NULL AND x>0 THEN
INSERT INTO Flags(number, count) VALUES(number, x);
END IF;
END LOOP hist_loop;
CLOSE cur1;
END
【问题讨论】:
-
让我直截了当地说...您正在运行 700,001 个查询,但您对它不快感到惊讶吗?
-
嗯..我不是说它不快..我只是想问是否有任何更多的优化可以让它更快...... :)
-
使用
$number BETWEEN number_s AND number_e会不会比较慢? -
并没有真正观察到性能差异。当然,我没有放时序说明,只是观察了 echo 命令的输出。
-
如果你想知道是否有行,你可能需要一个 JOIN 查询。然后数据库可以优化 1 个查询,而不是处理 700k + 查询,这会带来所有开销。
标签: php stored-procedures mysql query-optimization