【Posted】: 2012-04-10 21:34:57
【Question】:
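Introducing an ORDER BY clause in a query increases the total time, because the database has to do extra work to sort the result set:
- copy the resulting tuples into some temporary memory
- sort them (hopefully in memory, otherwise using the disk)
- stream the result to the client
What I'm missing is why adding just one column from the joined table to the ORDER BY makes such a difference in performance.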
Query 1
EXPLAIN ANALYZE
SELECT p.*
FROM product_product p
JOIN django_site d ON (p.site_id = d.id)
WHERE (p.active = true AND p.site_id = 1 )
ORDER BY d.domain, p.ordering, p.name
Query plan
Sort  (cost=3909.83..3952.21 rows=16954 width=1086) (actual time=1120.618..1143.922 rows=16946 loops=1)
  Sort Key: django_site.domain, product_product.ordering, product_product.name
  Sort Method:  quicksort  Memory: 25517kB
  ->  Nested Loop  (cost=0.00..2718.86 rows=16954 width=1086) (actual time=0.053..87.396 rows=16946 loops=1)
        ->  Seq Scan on django_site  (cost=0.00..1.01 rows=1 width=24) (actual time=0.010..0.012 rows=1 loops=1)
              Filter: (id = 1)
        ->  Seq Scan on product_product  (cost=0.00..2548.31 rows=16954 width=1066) (actual time=0.036..44.138 rows=16946 loops=1)
              Filter: (product_product.active AND (product_product.site_id = 1))
Total runtime: 1182.515 ms
Query 2
Same as above, but without ordering by django_site.domain
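Spelled out for clarity, and assuming nothing else changes from Query 1, that would be:

```sql
-- Query 2 as described: identical to Query 1 except that
-- d.domain is dropped from the ORDER BY list.
EXPLAIN ANALYZE
SELECT p.*
FROM product_product p
JOIN django_site d ON (p.site_id = d.id)
WHERE (p.active = true AND p.site_id = 1)
ORDER BY p.ordering, p.name;
```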
Query plan
Sort  (cost=3909.83..3952.21 rows=16954 width=1066) (actual time=257.094..278.905 rows=16946 loops=1)
  Sort Key: product_product.ordering, product_product.name
  Sort Method:  quicksort  Memory: 25161kB
  ->  Nested Loop  (cost=0.00..2718.86 rows=16954 width=1066) (actual time=0.075..86.120 rows=16946 loops=1)
        ->  Seq Scan on django_site  (cost=0.00..1.01 rows=1 width=4) (actual time=0.015..0.017 rows=1 loops=1)
              Filter: (id = 1)
        ->  Seq Scan on product_product  (cost=0.00..2548.31 rows=16954 width=1066) (actual time=0.052..44.024 rows=16946 loops=1)
              Filter: (product_product.active AND (product_product.site_id = 1))
Total runtime: 305.392 ms
This question may be related.
Edit: added more details
Table "public.product_product"
Column | Type | Modifiers
-------------+------------------------+-----------
id | integer | not null default nextval('product_product_id_seq'::regclass)
site_id | integer | not null
name | character varying(255) | not null
slug | character varying(255) | not null
sku | character varying(255) |
ordering | integer | not null
[snip some columns ]
Indexes:
"product_product_pkey" PRIMARY KEY, btree (id)
"product_product_site_id_key" UNIQUE, btree (site_id, sku)
"product_product_site_id_key1" UNIQUE, btree (site_id, slug)
"product_product_site_id" btree (site_id)
"product_product_slug" btree (slug)
"product_product_slug_like" btree (slug varchar_pattern_ops)
Table "public.django_site"
Column | Type | Modifiers
--------+------------------------+-----------
id | integer | not null default nextval('django_site_id_seq'::regclass)
domain | character varying(100) | not null
name | character varying(50) | not null
Indexes:
"django_site_pkey" PRIMARY KEY, btree (id)
The Postgres version is 8.4.
Some table statistics:
# select count(*) from django_site;
count
-------
1
# select count(*) from product_product;
count
-------
17540
# select active, count(*) from product_product group by active;
active | count
--------+-------
f | 591
t | 16949
# select site_id, count(*) from product_product group by site_id;
site_id | count
---------+-------
1 | 17540
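These counts suggest that every row returned by Query 1 carries the same d.domain value, so the extra sort key should not be able to change the resulting order. A minimal sketch of a query to check that, using only the tables shown above (the alias distinct_domains is just an illustrative name):

```sql
-- Count how many different domain values appear among the joined rows.
-- With a single row in django_site, this is expected to be 1.
SELECT count(DISTINCT d.domain) AS distinct_domains
FROM product_product p
JOIN django_site d ON (p.site_id = d.id)
WHERE (p.active = true AND p.site_id = 1);
```

If that does return 1, the two queries yield the same ordering and differ only in how much work each comparison does during the sort.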
【Comments】:
- site_id should be (part of) a key field in both tables. Is it?
Tags: sql performance postgresql sql-order-by collation