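[Posted]: 2011-05-06 20:08:10
[Question]:
My question is about denormalization. In a database, when should derived data be stored in its own column rather than computed each time it is needed?
For example, suppose your users receive upvotes on their questions, and you display each user's reputation on their profile. When a user is upvoted, should you increment their stored reputation, or should you compute it when their profile is retrieved: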
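-- COUNT(Upvote.id) ignores the all-NULL rows that the LEFT JOINs produce,
-- so a user with no upvotes gets 0 (COUNT(*) would report 1).
SELECT User.id, COUNT(Upvote.id) AS reputation FROM User
LEFT JOIN Question
    ON Question.User_id = User.id
LEFT JOIN Upvote
    ON Upvote.Question_id = Question.id
GROUP BY User.id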
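How processor-intensive does the query for a user's reputation have to be before it is worth tracking reputation incrementally in its own column?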
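To make the two options concrete, here is a minimal sketch of the incremental approach next to the computed one. It uses Python's built-in sqlite3 module rather than MySQL so it can run standalone; the `reputation` column, the trigger name `upvote_bumps_reputation`, and the sample rows are assumptions made for illustration. The computed query counts `Upvote.id`, so users without upvotes come out as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (
    id INTEGER PRIMARY KEY,
    reputation INTEGER NOT NULL DEFAULT 0  -- hypothetical denormalized column
);
CREATE TABLE Question (
    id INTEGER PRIMARY KEY,
    User_id INTEGER NOT NULL REFERENCES User(id)
);
CREATE TABLE Upvote (
    id INTEGER PRIMARY KEY,
    Question_id INTEGER NOT NULL REFERENCES Question(id),
    User_id INTEGER NOT NULL REFERENCES User(id)
);
-- Hypothetical trigger: bump the question owner's stored reputation per upvote.
CREATE TRIGGER upvote_bumps_reputation AFTER INSERT ON Upvote
BEGIN
    UPDATE User SET reputation = reputation + 1
    WHERE id = (SELECT User_id FROM Question WHERE id = NEW.Question_id);
END;
""")

with conn:  # commit the sample data as one transaction
    conn.executemany("INSERT INTO User (id) VALUES (?)", [(1,), (2,), (3,)])
    conn.execute("INSERT INTO Question (id, User_id) VALUES (10, 1)")
    # Users 2 and 3 each upvote user 1's question.
    conn.executemany(
        "INSERT INTO Upvote (Question_id, User_id) VALUES (?, ?)",
        [(10, 2), (10, 3)],
    )

# Denormalized read: a single-table lookup.
stored = dict(conn.execute("SELECT id, reputation FROM User"))

# Normalized read: recompute from the votes every time.
computed = dict(conn.execute("""
    SELECT User.id, COUNT(Upvote.id)
    FROM User
    LEFT JOIN Question ON Question.User_id = User.id
    LEFT JOIN Upvote ON Upvote.Question_id = Question.id
    GROUP BY User.id
"""))

print(stored)
print(computed)
```

The stored column makes reads cheap but adds work to every write, and any code path that changes votes without going through the trigger (a bulk delete, for instance) can leave the two values out of sync.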
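Continuing our example, suppose an Upvote's weight depends on how many Upvotes (not reputation) the user who cast it has received. The query to retrieve a user's reputation suddenly explodes: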
SELECT
    User.id AS User_id,
    SUM(UpvoteWeight.weight) AS reputation
FROM User
LEFT JOIN Question
    ON User.id = Question.User_id
LEFT JOIN (
    SELECT
        Upvote.Question_id,
        COUNT(Upvote2.id)+1 AS weight
    FROM Upvote
    LEFT JOIN User
        ON Upvote.User_id = User.id
    LEFT JOIN Question
        ON User.id = Question.User_id
    LEFT JOIN Upvote AS Upvote2
        ON
            Question.id = Upvote2.Question_id
            AND Upvote2.date < Upvote.date
    GROUP BY Upvote.id
) AS UpvoteWeight ON Question.id = UpvoteWeight.Question_id
GROUP BY User.id
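This is far more difficult than the incremental solution. When is normalization worth it, and when do its benefits lose out to the benefits of denormalization (in this case, query simplicity and/or performance)?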
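For comparison, the incremental version of the weighted scheme might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's sqlite3 module instead of MySQL, and the columns `upvotes_received`, `reputation`, and `weight`, as well as the helper `cast_upvote`, are hypothetical names that do not appear in the schema above. The weight is fixed at the moment the vote is cast, which mirrors the `Upvote2.date < Upvote.date` condition in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (
    id INTEGER PRIMARY KEY,
    upvotes_received INTEGER NOT NULL DEFAULT 0,  -- hypothetical counter
    reputation INTEGER NOT NULL DEFAULT 0         -- hypothetical stored total
);
CREATE TABLE Question (id INTEGER PRIMARY KEY, User_id INTEGER NOT NULL);
CREATE TABLE Upvote (
    id INTEGER PRIMARY KEY,
    Question_id INTEGER NOT NULL,
    User_id INTEGER NOT NULL,
    weight INTEGER NOT NULL  -- hypothetical: weight frozen when the vote is cast
);
""")


def cast_upvote(conn, voter_id, question_id):
    """Record an upvote and update the author's totals in one transaction."""
    with conn:  # commits on success, rolls back on error
        (received,) = conn.execute(
            "SELECT upvotes_received FROM User WHERE id = ?", (voter_id,)
        ).fetchone()
        weight = received + 1  # same "+1" base weight as in the query
        conn.execute(
            "INSERT INTO Upvote (Question_id, User_id, weight) VALUES (?, ?, ?)",
            (question_id, voter_id, weight),
        )
        conn.execute(
            """UPDATE User
               SET reputation = reputation + ?,
                   upvotes_received = upvotes_received + 1
               WHERE id = (SELECT User_id FROM Question WHERE id = ?)""",
            (weight, question_id),
        )
    return weight


with conn:
    conn.executemany("INSERT INTO User (id) VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO Question (id, User_id) VALUES (?, ?)", [(10, 1), (20, 2)]
    )

w1 = cast_upvote(conn, voter_id=3, question_id=20)  # voter has 0 upvotes
w2 = cast_upvote(conn, voter_id=2, question_id=10)  # voter now has 1 upvote
w3 = cast_upvote(conn, voter_id=3, question_id=10)  # voter still has 0 upvotes

reputation = dict(conn.execute("SELECT id, reputation FROM User"))
print(w1, w2, w3)
print(reputation)
```

Each write now touches two rows, but reading a profile becomes a single-row lookup. The cost is that retracting a vote needs a matching compensating update, which the normalized query gets for free.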
[Comments]:
Tags: mysql normalization