【Posted】: 2011-02-19 18:23:14
【Question】:
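I have a table called Project which has the following relationships:
- has many Contributions
- has many Payments
In my result set, I need the following aggregate values:
- Number of unique contributors (DonorID on the Contribution table)
- Total contributed (SUM of Amount on the Contribution table)
- Total paid (SUM of PaymentAmount on the Payment table)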
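Because there are so many aggregate functions and multiple joins, using standard aggregate functions with a single GROUP BY clause gets messy. I also need to be able to sort and filter on these fields. So I've come up with two options:
Using subqueries: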
-- Each subquery is correlated to the outer row via Project.ID.
SELECT Project.ID AS PROJECT_ID,
(SELECT SUM(PaymentAmount) FROM Payment WHERE Payment.ProjectID = Project.ID) AS TotalPaidBack,
(SELECT COUNT(DISTINCT DonorID) FROM Contribution WHERE Contribution.RecipientID = Project.ID) AS ContributorCount,
(SELECT SUM(Amount) FROM Contribution WHERE Contribution.RecipientID = Project.ID) AS TotalReceived
FROM Project;
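Since the computed columns also need to be sorted and filtered, one way to sketch that is to wrap the query above in a derived table. This is an illustration only: the `ProjectTotals` alias and the threshold of 100 are made up, and it assumes the Project, Contribution and Payment tables described in the question.

```sql
-- Sketch only: ProjectTotals and the value 100 are hypothetical.
SELECT PROJECT_ID, TotalPaidBack, ContributorCount, TotalReceived
FROM (
    SELECT Project.ID AS PROJECT_ID,
           (SELECT SUM(PaymentAmount) FROM Payment
             WHERE Payment.ProjectID = Project.ID) AS TotalPaidBack,
           (SELECT COUNT(DISTINCT DonorID) FROM Contribution
             WHERE Contribution.RecipientID = Project.ID) AS ContributorCount,
           (SELECT SUM(Amount) FROM Contribution
             WHERE Contribution.RecipientID = Project.ID) AS TotalReceived
    FROM Project
) AS ProjectTotals
WHERE TotalReceived >= 100        -- filter on an aggregate column
ORDER BY ContributorCount DESC;   -- sort on an aggregate column
```

Note that a project with no contributions has a NULL TotalReceived here, so the WHERE clause drops it; wrapping the sums in IFNULL(..., 0) would change that.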
Using a temporary table:
-- Start from a clean slate on each run.
DROP TABLE IF EXISTS Project_Temp;
CREATE TEMPORARY TABLE Project_Temp (project_id INT NOT NULL, total_payments INT, total_donors INT, total_received INT, PRIMARY KEY(project_id)) ENGINE=MEMORY;
-- Pass 1: one row per project with its payment total.
INSERT INTO Project_Temp (project_id,total_payments)
SELECT `Project`.ID, IFNULL(SUM(PaymentAmount),0) FROM `Project` LEFT JOIN `Payment` ON `Payment`.ProjectID = `Project`.ID GROUP BY 1;
-- Pass 2: fill in the contribution aggregates on the rows created in pass 1.
-- COUNT never returns NULL, so it needs no IFNULL wrapper.
INSERT INTO Project_Temp (project_id,total_donors,total_received)
SELECT `Project`.ID, COUNT(DISTINCT DonorID), IFNULL(SUM(Amount),0) FROM `Project` LEFT JOIN `Contribution` ON `Contribution`.RecipientID = `Project`.ID GROUP BY 1
ON DUPLICATE KEY UPDATE total_donors = VALUES(total_donors), total_received = VALUES(total_received);
SELECT * FROM Project_Temp;
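Both approaches test out fairly comparably, in the 0.7 - 0.8 second range with 1,000 rows. But I am really concerned about scalability, and I don't want to have to re-engineer everything as my tables grow. What is the best approach?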
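To make the join problem mentioned above concrete, here is a minimal sketch of what happens when both child tables are joined in a single GROUP BY query. The column types, the ID columns on the child tables, and the sample rows are all assumptions made up for illustration; only the table names and the columns used in the queries above come from the actual schema.

```sql
-- Hypothetical minimal schema; types and sample data are assumptions.
CREATE TABLE Project (ID INT PRIMARY KEY);
CREATE TABLE Contribution (ID INT PRIMARY KEY, RecipientID INT, DonorID INT, Amount DECIMAL(10,2));
CREATE TABLE Payment (ID INT PRIMARY KEY, ProjectID INT, PaymentAmount DECIMAL(10,2));

INSERT INTO Project VALUES (1);
INSERT INTO Contribution VALUES (1, 1, 10, 100.00), (2, 1, 11, 50.00);
INSERT INTO Payment VALUES (1, 1, 20.00), (2, 1, 30.00), (3, 1, 40.00);

-- Joining both child tables at once pairs every contribution with every
-- payment (2 x 3 = 6 rows for project 1), so both sums are inflated.
SELECT p.ID,
       COUNT(DISTINCT c.DonorID) AS ContributorCount, -- 2 (DISTINCT hides the duplication)
       SUM(c.Amount)             AS TotalReceived,    -- 450.00 instead of 150.00
       SUM(pay.PaymentAmount)    AS TotalPaidBack     -- 180.00 instead of 90.00
FROM Project p
LEFT JOIN Contribution c ON c.RecipientID = p.ID
LEFT JOIN Payment pay    ON pay.ProjectID = p.ID
GROUP BY p.ID;
```

Both the subquery version and the temporary-table version avoid this because each child table is aggregated on its own before the results are combined.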
【Discussion】:
Tags: sql function performance subquery aggregate