[Posted]: 2014-02-27 01:44:51
[Question]:
SELECT "items".* FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id
INNER JOIN mods ON mods.id = item_mods.mod_id
AND item_mods.mod_id = 3
WHERE (items.player_id = '1')
GROUP BY items.id, item_mods.primary_value
ORDER BY item_mods.primary_value DESC NULLS LAST, items.created_at DESC LIMIT 100
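This query currently takes about 7 seconds. I have roughly 550k records in the items table, about 2.5 million in the item_mods table, and about 800 in the mods table. I have a lot of indexes, but I'm not sure I'm using the right ones.
If you were optimizing this query, what would you recommend?
Here is the EXPLAIN ANALYZE output.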
http://explain.depesz.com/s/aiYH
"Limit (cost=107274.88..107275.13 rows=100 width=554) (actual time=6648.872..6648.888 rows=100 loops=1)"
" -> Sort (cost=107274.88..107419.24 rows=57745 width=554) (actual time=6648.870..6648.879 rows=100 loops=1)"
" Sort Key: item_mods.primary_value, items.created_at"
" Sort Method: top-N heapsort Memory: 103kB"
" -> Group (cost=104634.82..105067.91 rows=57745 width=554) (actual time=6358.348..6529.342 rows=57498 loops=1)"
" -> Sort (cost=104634.82..104779.18 rows=57745 width=554) (actual time=6358.344..6423.184 rows=57498 loops=1)"
" Sort Key: items.id, item_mods.primary_value"
" Sort Method: external sort Disk: 25624kB"
" -> Nested Loop (cost=23182.35..71248.94 rows=57745 width=554) (actual time=3339.625..6127.659 rows=57498 loops=1)"
" -> Index Scan using mods_pkey on mods (cost=0.00..8.27 rows=1 width=4) (actual time=0.323..0.324 rows=1 loops=1)"
" Index Cond: (id = 3)"
" -> Merge Join (cost=23182.35..70663.22 rows=57745 width=558) (actual time=3339.298..6108.202 rows=57498 loops=1)"
" Merge Cond: (items.id = item_mods.item_id)"
" -> Index Scan using items_pkey on items (cost=0.00..45112.64 rows=543004 width=550) (actual time=3.190..2575.715 rows=543024 loops=1)"
" Filter: (player_id = 1)"
" -> Materialize (cost=23182.33..23471.20 rows=57774 width=12) (actual time=3336.099..3388.810 rows=57547 loops=1)"
" -> Sort (cost=23182.33..23326.76 rows=57774 width=12) (actual time=3336.095..3370.179 rows=57547 loops=1)"
" Sort Key: item_mods.item_id"
" Sort Method: external sort Disk: 1240kB"
" -> Bitmap Heap Scan on item_mods (cost=1084.27..17622.45 rows=57774 width=12) (actual time=31.728..3263.762 rows=57547 loops=1)"
" Recheck Cond: (mod_id = 3)"
" -> Bitmap Index Scan on primary_value_mod_id_desc (cost=0.00..1069.83 rows=57774 width=0) (actual time=29.565..29.565 rows=57547 loops=1)"
" Index Cond: (mod_id = 3)"
"Total runtime: 6652.100 ms"
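Update
I've modified the query as suggested. I was using GROUP BY to select only one row per item ID, but I guess DISTINCT works too. Here is the new query and its explain output; it still takes too long. The idea of the query is to find all items owned by player '1' that have item modifier '3', ordered by that modifier's primary value, highest first.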
SELECT DISTINCT("items".id), "item_mods".primary_value, "items".created_at
FROM "items" INNER JOIN item_mods ON item_mods.item_id = items.id
INNER JOIN mods ON mods.id = item_mods.mod_id AND item_mods.mod_id = 3
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC NULLS LAST, items.created_at DESC LIMIT 100
Explain: http://explain.depesz.com/s/t4Zq
"Limit (cost=73737.59..73738.59 rows=100 width=16) (actual time=6450.253..6450.344 rows=100 loops=1)"
" -> Unique (cost=73737.59..74315.04 rows=57745 width=16) (actual time=6450.248..6450.316 rows=100 loops=1)"
" -> Sort (cost=73737.59..73881.95 rows=57745 width=16) (actual time=6450.242..6450.272 rows=100 loops=1)"
" Sort Key: item_mods.primary_value, items.created_at, items.id"
" Sort Method: external merge Disk: 1456kB"
" -> Hash Join (cost=46944.77..68183.71 rows=57745 width=16) (actual time=3018.769..6342.109 rows=57498 loops=1)"
" Hash Cond: (item_mods.item_id = items.id)"
" -> Nested Loop (cost=1084.27..18208.45 rows=57774 width=8) (actual time=15.911..3219.086 rows=57547 loops=1)"
" -> Index Scan using mods_pkey on mods (cost=0.00..8.27 rows=1 width=4) (actual time=0.486..0.489 rows=1 loops=1)"
" Index Cond: (id = 3)"
" -> Bitmap Heap Scan on item_mods (cost=1084.27..17622.45 rows=57774 width=12) (actual time=15.416..3197.257 rows=57547 loops=1)"
" Recheck Cond: (mod_id = 3)"
" -> Bitmap Index Scan on primary_value_mod_id_desc (cost=0.00..1069.83 rows=57774 width=0) (actual time=13.517..13.517 rows=57547 loops=1)"
" Index Cond: (mod_id = 3)"
" -> Hash (cost=36420.95..36420.95 rows=543004 width=12) (actual time=2987.089..2987.089 rows=543024 loops=1)"
" Buckets: 4096 Batches: 32 Memory Usage: 811kB"
" -> Seq Scan on items (cost=0.00..36420.95 rows=543004 width=12) (actual time=0.012..2825.650 rows=543024 loops=1)"
" Filter: (player_id = 1)"
"Total runtime: 6457.586 ms"
Update 2
OK, I think I'm almost there. This query takes 6 seconds and produces the result I want:
SELECT "items".id, item_mods.primary_value
FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC, item_mods.id DESC
LIMIT 100
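But this query takes 9 ms! Note the difference in the ORDER BY. I need the rows ordered most recent first, though. I have an index on (item_mods.primary_value DESC, item_mods.id DESC), but it doesn't seem to be used?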
SELECT "items".id, item_mods.primary_value
FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC
LIMIT 100
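One way to see why the two ORDER BY variants behave so differently is to compare their plans side by side. The sketch below is only an illustration, assuming PostgreSQL 9.0 or later (for the parenthesized EXPLAIN options) and the table and column names used in the queries above. It does not change either query.

```sql
-- Slow variant: ties on primary_value are broken by item_mods.id.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "items".id, item_mods.primary_value
FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC, item_mods.id DESC
LIMIT 100;

-- Fast variant: ordered by primary_value only.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "items".id, item_mods.primary_value
FROM "items"
INNER JOIN item_mods ON item_mods.item_id = items.id AND item_mods.mod_id = 36
WHERE (items.player_id = '1')
ORDER BY item_mods.primary_value DESC
LIMIT 100;
```

Checking whether each plan contains an explicit Sort node, or instead walks an index in order and stops after 100 rows, should show where the extra time goes.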
[Comments]:
-
Use * only when you really need to select all the fields.
-
Also, make sure items.player_id, item_mods.item_id, and item_mods.mod_id each have their own index.
-
How can you select * when you have a GROUP BY?
-
If items.id is unique, first drop the GROUP BY items.id; second, why use GROUP BY without an aggregate function?
-
external merge Disk: 1456kB. SET work_mem = '50MB' and retry. Also, is there an index on items(player_id)?
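The suggestions in the last comment could be tried roughly as follows. This is a sketch, not a confirmed fix: the index name items_player_id_idx is hypothetical, and whether such an index helps depends on how selective player_id is (in the plans above, the player_id = 1 filter keeps about 543k rows of items).

```sql
-- Raise the per-operation sort/hash memory for the current session only.
SET work_mem = '50MB';

-- List the indexes that already exist on the two large tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('items', 'item_mods');

-- Hypothetical index name; only worth creating if the listing above
-- shows no existing index on items(player_id).
CREATE INDEX items_player_id_idx ON items (player_id);
```

After changing work_mem, re-running EXPLAIN ANALYZE shows whether the sorts still report an external sort or merge on disk.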
Tags: sql postgresql