【Question title】: Entity Framework GroupBy to SQL Generation
【Posted】: 2016-10-02 03:20:54
【Question】:

I am running into a performance problem with EF.

        using (var context = new CustomDbContext())
        {
            var result = context.
                TransactionLines
                .Where(x => 
                    x.Transaction.TransactionTypeId == 1433 &&
                    (x.Transaction.Eob.EobBatchId == null || x.Transaction.Eob.EobBatch.Status == EobBatchStatusEnum.Completed)
                )
                .GroupBy(x => x.VisitLine.ProcedureId)
                .Select(x => new
                {
                    Id = x.Key,
                    PaidAmount = x.Sum(t => t.PaidAmount),
                    Code = context.Procedures.Where(h => h.Id == x.Key).Select(h => h.Code).FirstOrDefault()
                }).ToArray();
        }

EF generates the following SQL:

SELECT 
1 AS [C1], 
[Project6].[ProcedureId] AS [ProcedureId], 
[Project6].[C2] AS [C2], 
[Project6].[C1] AS [C3]
FROM ( SELECT 
    [Project5].[ProcedureId] AS [ProcedureId], 
    [Project5].[C1] AS [C1], 
    (SELECT 
        SUM([Extent7].[PaidAmount]) AS [A1]
        FROM     [dbo].[TransactionLines] AS [Extent7]
        INNER JOIN [dbo].[Transactions] AS [Extent8] ON [Extent7].[TransactionId] = [Extent8].[Id]
        LEFT OUTER JOIN [dbo].[Eobs] AS [Extent9] ON [Extent8].[EobId] = [Extent9].[Id]
        LEFT OUTER JOIN [dbo].[EobBatches] AS [Extent10] ON [Extent9].[EobBatchId] = [Extent10].[Id]
        LEFT OUTER JOIN [dbo].[VisitLines] AS [Extent11] ON [Extent7].[VisitLineId] = [Extent11].[Id]
        WHERE (([Extent9].[EobBatchId] IS NULL) OR (1 = [Extent10].[Status])) AND ([Extent8].[TransactionTypeId] = 1433) AND (([Project5].[ProcedureId] = [Extent11].[ProcedureId]) OR (([Project5].[ProcedureId] IS NULL) AND ([Extent11].[ProcedureId] IS NULL)))) AS [C2]
    FROM ( SELECT 
        [Project4].[ProcedureId] AS [ProcedureId], 
        [Project4].[C1] AS [C1]
        FROM ( SELECT 
            [Project2].[ProcedureId] AS [ProcedureId], 
            (SELECT TOP (1) 
                [Extent6].[Code] AS [Code]
                FROM [dbo].[Procedures] AS [Extent6]
                WHERE [Extent6].[Id] = [Project2].[ProcedureId]) AS [C1]
            FROM ( SELECT 
                [Distinct1].[ProcedureId] AS [ProcedureId]
                FROM ( SELECT DISTINCT 
                    [Extent5].[ProcedureId] AS [ProcedureId]
                    FROM     [dbo].[TransactionLines] AS [Extent1]
                    INNER JOIN [dbo].[Transactions] AS [Extent2] ON [Extent1].[TransactionId] = [Extent2].[Id]
                    LEFT OUTER JOIN [dbo].[Eobs] AS [Extent3] ON [Extent2].[EobId] = [Extent3].[Id]
                    LEFT OUTER JOIN [dbo].[EobBatches] AS [Extent4] ON [Extent3].[EobBatchId] = [Extent4].[Id]
                    LEFT OUTER JOIN [dbo].[VisitLines] AS [Extent5] ON [Extent1].[VisitLineId] = [Extent5].[Id]
                    WHERE (([Extent3].[EobBatchId] IS NULL) OR (1 = [Extent4].[Status])) AND ([Extent2].[TransactionTypeId] = 1433)
                )  AS [Distinct1]
            )  AS [Project2]
        )  AS [Project4]
    )  AS [Project5]
)  AS [Project6]

This query takes about 3 seconds. If I write the SQL by hand with GROUP BY, it takes 1.5 seconds and uses half the CPU:

    SELECT sq.ProcedureId, SUM(sq.PaidAmount), (SELECT TOP(1) Procedures.Code From Procedures Where Procedures.Id = sq.ProcedureId) as Code
FROM(
    SELECT [Extent5].[ProcedureId] AS [ProcedureId],[Extent1].PaidAmount as [PaidAmount]
    FROM     [dbo].[TransactionLines] AS [Extent1]
    INNER JOIN [dbo].[Transactions] AS [Extent2] ON [Extent1].[TransactionId] = [Extent2].[Id]
    LEFT OUTER JOIN [dbo].[Eobs] AS [Extent3] ON [Extent2].[EobId] = [Extent3].[Id]
    LEFT OUTER JOIN [dbo].[EobBatches] AS [Extent4] ON [Extent3].[EobBatchId] = [Extent4].[Id]
    LEFT OUTER JOIN [dbo].[VisitLines] AS [Extent5] ON [Extent1].[VisitLineId] = [Extent5].[Id]
    WHERE (([Extent3].[EobBatchId] IS NULL) OR (1 = [Extent4].[Status])) AND ([Extent2].[TransactionTypeId] = 1433)
) sq
GROUP BY sq.ProcedureId

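I have tried writing the LINQ in several different ways, but I still cannot force EF to generate a GROUP BY instead of subqueries. Ideally I do not want to use functions or hand-written SQL, because the LINQ logic is built up from many conditions.

Is it possible to force EF to generate SQL exactly as it is written in LINQ?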

【Comments】:

  • First try rewriting your LINQ query with explicit joins.

Tags: c# sql-server entity-framework linq


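【Solution 1】:

Try to eliminate

context.Procedures.Where(h => h.Id == x.Key).Select(h => h.Code).FirstOrDefault()

by including Code in the GroupBy clause. I know it seems redundant, but EF is known to have trouble translating grouping operations that use anything other than key accessors and aggregates: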

//...
.GroupBy(x => new { Id = x.VisitLine.ProcedureId, x.VisitLine.Procedure.Code })
.Select(x => new
{
    Id = x.Key.Id,
    PaidAmount = x.Sum(t => t.PaidAmount),
    Code = x.Key.Code
}).ToArray();

Update: in my test environment (latest EF 6.1.3) the above generates the following SQL:

SELECT
    1 AS [C1],
    [GroupBy1].[K1] AS [ProcedureId],
    [GroupBy1].[A1] AS [C2],
    [GroupBy1].[K2] AS [Code]
    FROM ( SELECT
        [Extent5].[ProcedureId] AS [K1],
        [Extent6].[Code] AS [K2],
        SUM([Filter1].[PaidAmount]) AS [A1]
        FROM    (SELECT [Extent1].[VisitLineId] AS [VisitLineId], [Extent1].[PaidAmount] AS [PaidAmount]
            FROM    [dbo].[TransactionLine] AS [Extent1]
            INNER JOIN [dbo].[Transaction] AS [Extent2] ON [Extent1].[TransactionId] = [Extent2].[Id]
            LEFT OUTER JOIN [dbo].[Eob] AS [Extent3] ON [Extent2].[EobId] = [Extent3].[Id]
            LEFT OUTER JOIN [dbo].[EobBatch] AS [Extent4] ON [Extent3].[EobBatchId] = [Extent4].[Id]
            WHERE (1433 = [Extent2].[TransactionTypeId]) AND ([Extent3].[EobBatchId] IS NULL OR [Extent4].[Status] = 1) ) AS [Filter1]
        LEFT OUTER JOIN [dbo].[VisitLine] AS [Extent5] ON [Filter1].[VisitLineId] = [Extent5].[Id]
        LEFT OUTER JOIN [dbo].[Procedure] AS [Extent6] ON [Extent5].[ProcedureId] = [Extent6].[Id]
        GROUP BY [Extent5].[ProcedureId], [Extent6].[Code]
    )  AS [GroupBy1]

This is much better than I expected.

Update 2: EF is a strange beast. Using a double projection produces the desired result:

//...
.GroupBy(x => x.VisitLine.ProcedureId)
.Select(x => new
{
    Id = x.Key,
    PaidAmount = x.Sum(t => t.PaidAmount),
})
.Select(x => new
{
    x.Id,
    x.PaidAmount,
    Code = context.Procedures.Where(h => h.Id == x.Id).Select(h => h.Code).FirstOrDefault()
}).ToArray();

This produces the following:

SELECT
    1 AS [C1],
    [Project2].[ProcedureId] AS [ProcedureId],
    [Project2].[C1] AS [C2],
    [Project2].[C2] AS [C3]
    FROM ( SELECT
        [GroupBy1].[A1] AS [C1],
        [GroupBy1].[K1] AS [ProcedureId],
        (SELECT TOP (1)
            [Extent6].[Code] AS [Code]
            FROM [dbo].[Procedure] AS [Extent6]
            WHERE [Extent6].[Id] = [GroupBy1].[K1]) AS [C2]
        FROM ( SELECT
            [Extent5].[ProcedureId] AS [K1],
            SUM([Filter1].[PaidAmount]) AS [A1]
            FROM   (SELECT [Extent1].[VisitLineId] AS [VisitLineId], [Extent1].[PaidAmount] AS [PaidAmount]
                FROM    [dbo].[TransactionLine] AS [Extent1]
                INNER JOIN [dbo].[Transaction] AS [Extent2] ON [Extent1].[TransactionId] = [Extent2].[Id]
                LEFT OUTER JOIN [dbo].[Eob] AS [Extent3] ON [Extent2].[EobId] = [Extent3].[Id]
                LEFT OUTER JOIN [dbo].[EobBatch] AS [Extent4] ON [Extent3].[EobBatchId] = [Extent4].[Id]
                WHERE (1433 = [Extent2].[TransactionTypeId]) AND ([Extent3].[EobBatchId] IS NULL OR [Extent4].[Status] = 1) ) AS [Filter1]
            LEFT OUTER JOIN [dbo].[VisitLine] AS [Extent5] ON [Filter1].[VisitLineId] = [Extent5].[Id]
            GROUP BY [Extent5].[ProcedureId]
        )  AS [GroupBy1]
    )  AS [Project2]

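P.S. In case it is not clear, the answer to your concrete question

Is it possible to force EF to generate SQL exactly as it is written in LINQ?

is no. Instead, you have to write the LINQ query in a way that yields the desired (or a closer) SQL query.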
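When trying out different query shapes like the ones above, it helps to look at the SQL before running the query. The following is a minimal sketch, assuming EF6: calling ToString() on an EF6 query returns the SQL text without executing it, and Database.Log receives every command EF actually sends. QueryInspection and DescribeQuery are hypothetical names introduced for illustration; they are not part of EF.

```csharp
using System;
using System.Linq;

public static class QueryInspection
{
    // Hypothetical helper (not an EF API): returns whatever query.ToString() yields.
    // For an EF6 DbQuery<T> that is the SQL EF would send; the query is not executed.
    // For other IQueryable providers it is just that provider's textual representation.
    public static string DescribeQuery<T>(IQueryable<T> query)
    {
        if (query == null) throw new ArgumentNullException(nameof(query));
        return query.ToString();
    }
}

// Usage against the context from the question (requires EF6 and the model classes):
//
// using (var context = new CustomDbContext())
// {
//     context.Database.Log = Console.Write;   // log each command EF6 executes
//
//     var query = context.TransactionLines
//         .GroupBy(x => x.VisitLine.ProcedureId)
//         .Select(x => new { Id = x.Key, PaidAmount = x.Sum(t => t.PaidAmount) });
//
//     Console.WriteLine(QueryInspection.DescribeQuery(query)); // SQL text only
//     var result = query.ToArray();                            // executes and logs
// }
```

Checking for a GROUP BY in the printed text is a quick way to tell whether a given LINQ shape was translated as a grouping or as correlated subqueries.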

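【Comments】:

  • Adding a second column to the group by does not force EF to generate a GROUP BY. It adds extra columns to the subqueries and extra filters on them, which significantly degrades the performance of the current query. In my case, using a subquery for the procedure code is much better.
  • The problem is not getting the code from Procedures. The problem is that EF generates a few subqueries with the same filters, which is much less efficient than the GROUP BY approach.
  • I am not saying the subquery is inefficient. I am trying to use a LINQ construct that does not cause EF to generate unnecessary subqueries. Are you sure the suggested modification does not translate to better SQL? I cannot test it because I do not have the model classes.
  • Yes, I am sure, because my first LINQ implementation was written with these 2 columns in the GroupBy. The SQL query was the same.
  • Thanks. The double Select produces the desired result. Strange behavior. I think it would be useful to look at the source code of the translator that converts DbExpressions to SQL.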