【Title】: Calculating SUM on 3 related tables
【Posted】: 2016-10-25 19:44:39
【Question】:

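I have three tables in one-to-many relationships. My business scenario is hard to explain, so I'll use more familiar terms:

Customers -> Invoices -> InvoiceDetails

Suppose there are Customers.Value1, Invoices.Value2, and InvoiceDetails.Value3, all of type double (real).

I need the sums of Value1, Value2, and Value3 over all records that belong to customers from a specific country (my real WHERE clause has more conditions, but all of them relate only to the Customers table).

The three queries that produce the values I need look like this: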

SELECT SUM(c.Value1) FROM Customers c WHERE c.Country = <cond>
SELECT SUM(i.Value2) FROM Customers c INNER JOIN Invoices i ON c.Id = i.CustomerId WHERE c.Country = <cond>
SELECT SUM(d.Value2) FROM (Customers c INNER JOIN Invoices i ON c.Id = i.CustomerId) INNER JOIN InvoiceDetails d ON i.Id = d.InvoiceId WHERE c.Country = <cond>

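Now imagine that my WHERE clause is very complex: repeating it three times looks bad and is error-prone. Also, in this example we filter the same records three times.

Is there a way to avoid repeating the WHERE clause and do this in a single query?

Edit: In response to the answers suggesting that all three sums be computed in one joined query, let me provide data that shows why this is incorrect.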

Customers from Spain:
Customer1  Value1 = 10
Customer2  Value1 = 20

Invoices for customers from Spain:
Invoice1  Customer1 Value2 = 100
Invoice2  Customer1 Value2 = 200
Invoice3  Customer2 Value2 = 300
Invoice4  Customer2 Value2 = 400

SELECT SUM(c.Value1) FROM Customers c WHERE c.Country = 'Spain'
returns 30

SELECT SUM(c.Value1), SUM(i.Value2) FROM Customers c INNER JOIN Invoices i ON c.Id = i.CustomerId WHERE c.Country = 'Spain'
returns 60, 1000

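As you can see, the customer sum is wrong: the join repeats each customer row once per invoice, so Value1 is counted twice for each customer.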
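The double counting above can be reproduced with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table layout is inferred from the queries in the question.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Country TEXT, Value1 REAL);
    CREATE TABLE Invoices  (Id INTEGER PRIMARY KEY, CustomerId INTEGER, Value2 REAL);
    INSERT INTO Customers VALUES (1, 'Spain', 10), (2, 'Spain', 20);
    INSERT INTO Invoices  VALUES (1, 1, 100), (2, 1, 200), (3, 2, 300), (4, 2, 400);
""")

# Summing over Customers alone gives the correct customer total.
correct_value1 = conn.execute(
    "SELECT SUM(c.Value1) FROM Customers c WHERE c.Country = 'Spain'"
).fetchone()[0]

# After the join each customer row appears once per invoice,
# so SUM(c.Value1) counts every customer twice.
joined = conn.execute("""
    SELECT SUM(c.Value1), SUM(i.Value2)
    FROM Customers c
    INNER JOIN Invoices i ON c.Id = i.CustomerId
    WHERE c.Country = 'Spain'
""").fetchone()

print(correct_value1)  # 30.0
print(joined)          # (60.0, 1000.0)
```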

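【Comments】:

  • Why can't you do a single joined query and sum them all in one go? A GROUP BY should do it.
  • @MarcB Well, if I add SUM(c.Value1) to the second query, the Value1 sum will be wrong because of the duplicated values in the joined query.
  • This is EF, right? Why not use navigation properties and let EF create the SQL query for you?
  • @IvanStoev Can you give an example?
  • What are the types of c.Value1, i.Value2, d.Value2: int, decimal?

Tags: sql-server entity-framework sql-server-2012 entity-framework-6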


【Solution 1】:

Since you are using EF, you can define and use navigation properties instead of joins. For example:

public class Customer
{
    // ...
    public ICollection<Invoice> Invoices { get; set; }
}

public class Invoice
{
    // ...
    public ICollection<InvoiceDetail> Details { get; set; }
}

Now you can use a simple LINQ to Entities query like this (because you need multiple aggregates, the query uses the group-by-constant technique):

var query = 
    from c in db.Customers
    where c.Country == <cond>
    group c by 1 into g
    select new
    {
        Value1 = g.Sum(c => (double?)c.Value1) ?? 0,
        Value2 = g.SelectMany(c => c.Invoices).Sum(i => (double?)i.Value2) ?? 0,
        Value3 = g.SelectMany(c => c.Invoices).SelectMany(i => i.Details).Sum(d => (double?)d.Value2) ?? 0,
    };
var result = query.FirstOrDefault();

The nullable casts are needed to avoid an exception from Sum when the corresponding collection is empty.
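The likely reason behind the exception is that SQL's SUM over an empty set yields NULL rather than 0, and a NULL cannot be materialized into a non-nullable double. The sketch below illustrates only the SQL side of that behavior, using Python's sqlite3 module as a stand-in for SQL Server; COALESCE plays the role of the `?? 0` fallback in the LINQ query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoices (Id INTEGER PRIMARY KEY, CustomerId INTEGER, Value2 REAL)"
)

# No invoices exist, so SUM has nothing to aggregate and returns NULL (None in Python).
raw_sum = conn.execute(
    "SELECT SUM(Value2) FROM Invoices WHERE CustomerId = 1"
).fetchone()[0]

# COALESCE substitutes 0 for the NULL, much like `?? 0` does on the C# side.
safe_sum = conn.execute(
    "SELECT COALESCE(SUM(Value2), 0) FROM Invoices WHERE CustomerId = 1"
).fetchone()[0]

print(raw_sum)   # None
print(safe_sum)  # 0
```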

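Update: The query above does not produce good SQL. It is strange how much the way you write a LINQ query affects the generated SQL (it feels like going back to the days when we controlled the SQL execution plan by the way we wrote the query). Here is an alternative LINQ query: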

var query =
    from c in db.Customers
    where c.Country == "BG"
    let Value1 = (double?)c.Value1
    let Value2 = c.Invoices.Sum(i => (double?)i.Value2)
    let Value3 = c.Invoices.SelectMany(i => i.Details).Sum(i => (double?)i.Value2)
    group new { Value1, Value2, Value3 } by 1 into g
    select new
    {
        Value1 = g.Sum(e => e.Value1),
        Value2 = g.Sum(e => e.Value2),
        Value3 = g.Sum(e => e.Value3),
    };
var result = query.FirstOrDefault();

which produces something much closer to what is expected:

SELECT
    [Limit1].[K1] AS [C1],
    [Limit1].[A1] AS [C2],
    [Limit1].[A2] AS [C3],
    [Limit1].[A3] AS [C4]
    FROM ( SELECT TOP (1)
        [Project2].[K1] AS [K1],
        SUM([Project2].[A1]) AS [A1],
        SUM([Project2].[A2]) AS [A2],
        SUM([Project2].[A3]) AS [A3]
        FROM ( SELECT
            1 AS [K1],
            [Project2].[Value1] AS [A1],
            [Project2].[C1] AS [A2],
            [Project2].[C2] AS [A3]
            FROM ( SELECT
                [Project1].[Value1] AS [Value1],
                [Project1].[C1] AS [C1],
                (SELECT
                    SUM([Extent4].[Value2]) AS [A1]
                    FROM  [dbo].[Invoice] AS [Extent3]
                    INNER JOIN [dbo].[InvoiceDetail] AS [Extent4] ON [Extent3].[Id] = [Extent4].[Invoice_Id]
                    WHERE [Project1].[Id] = [Extent3].[Customer_Id]) AS [C2]
                FROM ( SELECT
                    [Extent1].[Id] AS [Id],
                    [Extent1].[Value1] AS [Value1],
                    (SELECT
                        SUM([Extent2].[Value2]) AS [A1]
                        FROM [dbo].[Invoice] AS [Extent2]
                        WHERE [Extent1].[Id] = [Extent2].[Customer_Id]) AS [C1]
                    FROM [dbo].[Customer] AS [Extent1]
                    WHERE N'BG' = [Extent1].[Country]
                )  AS [Project1]
            )  AS [Project2]
        )  AS [Project2]
        GROUP BY [K1]
    )  AS [Limit1]

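【Comments】:

  • The problem is that this query is not optimized. This is what I meant by filtering the same record set multiple times: the query your code produces filters the Customers table three times. In my scenario I need to sum 1 customer field, 7 invoice fields, and 10 product fields. By my measurements, the L2E query is about 15-20 times slower than SQL that filters the customers into a temp table and then runs the 3 queries I provided. That is really inefficient.
  • Agreed. L2E queries are quite limited and depend on the SQL translation. I have updated the answer with a query that produces better SQL, though I can't say how it performs.
  • Thanks, that query produces better SQL.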
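The temp-table approach mentioned in the first comment can be sketched as follows. The asker's actual SQL is not shown in the thread, so this is an assumption about its shape: the filter runs once into a temporary table, and the three sums are then computed against that table. Python's sqlite3 module stands in for SQL Server here; the `FilteredCustomers` name, the InvoiceDetails rows, and the 'France' customer are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers      (Id INTEGER PRIMARY KEY, Country TEXT, Value1 REAL);
    CREATE TABLE Invoices       (Id INTEGER PRIMARY KEY, CustomerId INTEGER, Value2 REAL);
    CREATE TABLE InvoiceDetails (Id INTEGER PRIMARY KEY, InvoiceId INTEGER, Value3 REAL);
    INSERT INTO Customers VALUES (1, 'Spain', 10), (2, 'Spain', 20), (3, 'France', 5);
    INSERT INTO Invoices VALUES
        (1, 1, 100), (2, 1, 200), (3, 2, 300), (4, 2, 400), (5, 3, 50);
    -- Detail rows are hypothetical.
    INSERT INTO InvoiceDetails VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 3, 4), (5, 5, 7);
""")

# Step 1: apply the (possibly complex) filter exactly once.
conn.execute("CREATE TEMP TABLE FilteredCustomers (Id INTEGER PRIMARY KEY, Value1 REAL)")
conn.execute(
    "INSERT INTO FilteredCustomers SELECT Id, Value1 FROM Customers WHERE Country = ?",
    ("Spain",),
)

# Step 2: run the three sums against the pre-filtered rows, with no WHERE clause repeated.
value1 = conn.execute("SELECT SUM(Value1) FROM FilteredCustomers").fetchone()[0]
value2 = conn.execute("""
    SELECT SUM(i.Value2)
    FROM FilteredCustomers f
    INNER JOIN Invoices i ON i.CustomerId = f.Id
""").fetchone()[0]
value3 = conn.execute("""
    SELECT SUM(d.Value3)
    FROM FilteredCustomers f
    INNER JOIN Invoices i ON i.CustomerId = f.Id
    INNER JOIN InvoiceDetails d ON d.InvoiceId = i.Id
""").fetchone()[0]

print(value1, value2, value3)  # 30.0 1000.0 10.0
```

Each sum is computed in its own query, so the joins cannot inflate the totals of the parent tables.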