[Posted]: 2018-05-30 17:26:51
[Question]:
I have a problem with my query: it takes far too long (2636124 ms!):
SELECT COUNT(*) AS "__count"
FROM "dictionary_dictionary"
WHERE NOT ("dictionary_dictionary"."id" IN (SELECT U1."word_id" AS Col1
FROM "dictionary_frequencydata" U1
WHERE U1."user_id" = 1));
This query is generated by the ORM (Django). When I try to run it through the ORM, my application hangs; when I run it in psql, psql hangs as well.
EXPLAIN ANALYZE:
Aggregate  (cost=329583550.40..329583550.41 rows=1 width=8) (actual time=2636109.932..2636109.933 rows=1 loops=1)
  ->  Seq Scan on dictionary_dictionary  (cost=0.00..329583390.76 rows=63856 width=0) (actual time=2636109.922..2636109.922 rows=0 loops=1)
        Filter: (NOT (SubPlan 1))
        Rows Removed by Filter: 127712
        SubPlan 1
          ->  Materialize  (cost=0.00..4821.74 rows=135828 width=4) (actual time=0.006..12.453 rows=63856 loops=127712)
                ->  Seq Scan on dictionary_frequencydata u1  (cost=0.00..3611.60 rows=135828 width=4) (actual time=0.299..95.915 rows=127712 loops=1)
                      Filter: (user_id = 1)
                      Rows Removed by Filter: 28054
Planning time: 0.277 ms
Execution time: 2636124.744 ms
(11 wierszy)
My Django models:
class Dictionary(DateTimeModel):
    base_word = models.ForeignKey(BaseDictionary, related_name=_('dict_words'))
    word = models.CharField(max_length=64)
    version = models.ForeignKey(Version)

class FrequencyData(DateTimeModel):
    word = models.ForeignKey(Dictionary, related_name=_('frequency_data'))
    count = models.BigIntegerField(null=True, blank=True)
    source = models.ForeignKey(Source, related_name=_('frequency_data'), null=True, blank=True)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name=_('frequency_data'))
    user_ip_address = models.GenericIPAddressField(null=True, blank=True)
    date_of_checking = models.DateTimeField(null=True, blank=True)
    is_checked = models.BooleanField(default=False)
Table definitions:
\d+ dictionary_dictionary
Tabela "public.dictionary_dictionary"
Kolumna | Typ | Porównanie | Nullowalne | Domyślnie | Przechowywanie | Cel statystyk | Opis
----------------------+--------------------------+------------+------------+--------------------------------------------------------------------+----------------+---------------+------
id | integer | | not null | nextval('dictionary_dictionary_id_seq'::regclass) | plain | |
date_created | timestamp with time zone | | not null | | plain | |
date_modified | timestamp with time zone | | not null | | plain | |
word | character varying(64) | | not null | | extended | |
algorithm_version_id | integer | | not null | | plain | |
base_word_id | integer | | not null | | plain | |
Indeksy:
"dictionary_dictionary_pkey" PRIMARY KEY, btree (id)
"dictionary_phonet_algorithm_version_id_0f0af100" btree (algorithm_version_id)
"dictionary_dictionary_base_word_id_8db15cb4" btree (base_word_id)
Ograniczenia kluczy obcych:
"dictionary__algorithm_version_id_0f0af100_fk_phonetic_" FOREIGN KEY (algorithm_version_id) REFERENCES dictionary_algorithmversion(id) DEFERRABLE INITIALLY DEFERRED
"dictionary__base_word_id_8db15cb4_fk_phonetic_" FOREIGN KEY (base_word_id) REFERENCES dictionary_grammaticaldictionary(id) DEFERRABLE INITIALLY DEFERRED
Wskazywany przez:
TABLE "dictionary_frequencydata" CONSTRAINT "dictionary__word_id_c231110d_fk_phonetic_" FOREIGN KEY (word_id) REFERENCES dictionary_dictionary(id) DEFERRABLE INITIALLY DEFERRED
=========
\d+ dictionary_frequencydata
Tabela "public.dictionary_frequencydata"
Kolumna | Typ | Porównanie | Nullowalne | Domyślnie | Przechowywanie | Cel statystyk | Opis
------------------+--------------------------+------------+------------+---------------------------------------------------------------+----------------+---------------+------
id | integer | | not null | nextval('dictionary_frequencydata_id_seq'::regclass) | plain | |
date_created | timestamp with time zone | | not null | | plain | |
date_modified | timestamp with time zone | | not null | | plain | |
count | bigint | | | | plain | |
user_ip_address | inet | | | | main | |
date_of_checking | timestamp with time zone | | | | plain | |
is_checked | boolean | | not null | | plain | |
source_id | integer | | | | plain | |
user_id | integer | | not null | | plain | |
word_id | integer | | not null | | plain | |
Indeksy:
"dictionary_frequencydata_pkey" PRIMARY KEY, btree (id)
"dictionary_frequencydata_source_id_38bb205a" btree (source_id)
"dictionary_frequencydata_user_id_c6dfedce" btree (user_id)
"dictionary_frequencydata_word_id_c231110d" btree (word_id)
Ograniczenia kluczy obcych:
"dictionary__source_id_38bb205a_fk_phonetic_" FOREIGN KEY (source_id) REFERENCES dictionary_frequencysource(id) DEFERRABLE INITIALLY DEFERRED
"dictionary__user_id_c6dfedce_fk_auth_user" FOREIGN KEY (user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
"dictionary__word_id_c231110d_fk_phonetic_" FOREIGN KEY (word_id) REFERENCES dictionary_dictionary(id) DEFERRABLE INITIALLY DEFERRED
This is shared hosting. The Dictionary table has 120k rows and the FrequencyData table has 160k rows.
[Comments]:
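A rough reading of the EXPLAIN ANALYZE output above, as a back-of-the-envelope sketch rather than a definitive diagnosis: the plan shows a plain `SubPlan 1` over a `Materialize` node, so for each of the 127712 rows in `dictionary_dictionary` the materialized list of `word_id` values is scanned linearly until a match is found. The numbers below are copied from the plan; the variable names are made up for illustration.

```python
# Figures copied from the EXPLAIN ANALYZE output in the question.
outer_rows = 127712          # dictionary_dictionary rows checked by the filter
avg_rows_per_probe = 63856   # Materialize node: average rows read per loop
avg_ms_per_probe = 12.453    # Materialize node: average actual time per loop (ms)
total_ms = 2636124.744       # reported execution time (ms)

# EXPLAIN ANALYZE reports per-loop averages, so totals are average * loops.
row_visits = outer_rows * avg_rows_per_probe
materialize_ms = outer_rows * avg_ms_per_probe

print(f"rows read from the materialized list: {row_visits:,}")
print(f"time spent reading it: {materialize_ms / 1000:.0f} s")
print(f"share of total execution time: {materialize_ms / total_ms:.0%}")
```

On these figures the subplan reads roughly eight billion rows in total, which is consistent with the quadratic behaviour the plan suggests, even though each table only holds a little over 100k rows.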
-
How long do the following two queries take to run:
select count(*) from dictionary_dictionary; and select count(DISTINCT d.id) from dictionary_dictionary d join dictionary_frequencydata f on d.id = f.word_id WHERE f.user_id = 1
-
The first: 54 ms, the second: 345 ms. EXPLAIN output: pastebin.com/T96Q3ipt
-
How long does SELECT U1."word_id" AS Col1 FROM "dictionary_frequencydata" U1 WHERE U1."user_id" = 1 take to run?
-
Have you tried it with the DISTINCT keyword? For example:
SELECT COUNT(*) AS "__count" FROM "dictionary_dictionary" WHERE NOT ("dictionary_dictionary"."id" IN (SELECT DISTINCT U1."word_id" AS Col1 FROM "dictionary_frequencydata" U1 WHERE U1."user_id" = 1));
-
DISTINCT worked. Thanks! If you post this as an answer, I will mark it as accepted.
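To check that adding DISTINCT changes only the plan and not the result, here is a minimal sketch on a toy dataset. It uses an in-memory SQLite database from Python's standard library as a stand-in for PostgreSQL, so it says nothing about timing; the table contents are invented, and the `NOT EXISTS` variant is a commonly used alternative that was not proposed in the thread. `NOT EXISTS` matches `NOT IN` here only because `word_id` is declared `NOT NULL`.

```python
import sqlite3

# In-memory SQLite stand-in; only the columns the query touches are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dictionary_dictionary (
        id INTEGER PRIMARY KEY,
        word TEXT NOT NULL
    );
    CREATE TABLE dictionary_frequencydata (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        word_id INTEGER NOT NULL REFERENCES dictionary_dictionary(id)
    );
""")

# Ten words; user 1 has entries for words 1..6 (words 1..3 twice), user 2 for word 7.
conn.executemany(
    "INSERT INTO dictionary_dictionary (id, word) VALUES (?, ?)",
    [(i, f"word{i}") for i in range(1, 11)],
)
entries = [(1, w) for w in range(1, 7)] + [(1, w) for w in range(1, 4)] + [(2, 7)]
conn.executemany(
    "INSERT INTO dictionary_frequencydata (user_id, word_id) VALUES (?, ?)",
    entries,
)

QUERIES = {
    # The query generated by the ORM in the question.
    "original": """
        SELECT COUNT(*) FROM dictionary_dictionary
        WHERE NOT (id IN (SELECT U1.word_id FROM dictionary_frequencydata U1
                          WHERE U1.user_id = 1))
    """,
    # The variant suggested in the comments.
    "distinct": """
        SELECT COUNT(*) FROM dictionary_dictionary
        WHERE NOT (id IN (SELECT DISTINCT U1.word_id FROM dictionary_frequencydata U1
                          WHERE U1.user_id = 1))
    """,
    # Alternative formulation (not from the thread).
    "not_exists": """
        SELECT COUNT(*) FROM dictionary_dictionary d
        WHERE NOT EXISTS (SELECT 1 FROM dictionary_frequencydata U1
                          WHERE U1.user_id = 1 AND U1.word_id = d.id)
    """,
}

counts = {name: conn.execute(sql).fetchone()[0] for name, sql in QUERIES.items()}
print(counts)
```

All three queries count the words that user 1 has no entry for. On the Django side, the DISTINCT form would presumably come from calling `.distinct()` on the inner queryset, for example `FrequencyData.objects.filter(user_id=1).values('word_id').distinct()`; that is an assumption, since the thread does not show the ORM code.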
Tags: sql postgresql