【Question title】: Optimize query with multiple OUTER APPLY
【Posted】: 2018-03-08 11:12:32
【Question】:

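I have a query with multiple OUTER APPLY, but all of the tables have primary keys on the joined columns (so clustered indexes are used here), so I don't know how to optimize this query further. Indexed views are not an option here either, because they forbid ORDER BY and TOP.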

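So I have

  • Fields table with an Id primary key and a lot of other columns,

  • WeatherHistory table with a composite primary key (FieldId, [Date]) and a lot of columns,

  • NdviImageHistory table with FieldId, [Date], [Base64] columns (composite primary key (FieldId, [Date])), where [Base64] stores the image as base64,

  • NaturalColorImageHistory table with FieldId, [Date], [Base64] columns (composite primary key (FieldId, [Date])), where [Base64] stores the image as base64,

  • NdviHistory table with FieldId, [Date], MeanNdvi columns (composite primary key (FieldId, [Date])),

  • FieldSeasonHistory table with Field, StartDate, EndDate columns (composite primary key (FieldId, [Date])).

My query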

SELECT Fields.*,
    WeatherHistory.TempSumC AS CurrentTempSumC,
    TempSumF AS CurrentTempSumF,
    PrecipitationSumMm AS CurrentPrecipitationSumMm,
    nih.[Base64] AS CurrentNdviImageBase64,
    ncih.[Base64] AS CurrentNaturalColorImageBase64,
    MeanNdvi AS CurrentMeanNdvi,
    IsOpenSeason
FROM Fields
LEFT JOIN WeatherHistory ON FieldId = Id AND [Date] = CAST(GETUTCDATE() AS DATE)
OUTER APPLY
(
    SELECT TOP 1 [Base64]
    FROM NdviImageHistory
    WHERE FieldId = Id
    ORDER BY [Date] DESC
) nih
OUTER APPLY
(
    SELECT TOP 1 [Base64]
    FROM NaturalColorImageHistory
    WHERE FieldId = Id
    ORDER BY [Date] DESC
) ncih
OUTER APPLY
(
    SELECT TOP 1 MeanNdvi
    FROM NdviHistory
    WHERE FieldId = Id
    ORDER BY [Date] DESC
) nh
OUTER APPLY
(
    SELECT TOP 1 CASE WHEN EndDate IS NULL THEN 1 ELSE 0 END AS IsOpenSeason
    FROM FieldSeasonHistory
    WHERE FieldId = Id
    ORDER BY [StartDate] DESC
) fsh
WHERE UserId = (SELECT Id FROM Users WHERE Email = @email) AND IsArchived = 0

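I haven't created any indexes, because I assumed the automatically generated clustered indexes (based on the primary keys) should be enough (but I may be wrong). This query takes about 15 seconds to execute, and I'd like to reduce that time.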


EDIT: Added an index on the UserId and IsArchived columns of the Fields table. Query execution plan:


EDIT 2: Statistics:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

(13 row(s) affected)
Table 'FieldSeasonHistory'. Scan count 13, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'NdviHistory'. Scan count 13, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'NaturalColorImageHistory'. Scan count 13, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 604, lob physical reads 0, lob read-ahead reads 0.
Table 'NdviImageHistory'. Scan count 13, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 68, lob physical reads 0, lob read-ahead reads 0.
Table 'WeatherHistory'. Scan count 0, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 0, logical reads 228, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Fields'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 16 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
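
For reference, output of this shape is what SQL Server prints to the Messages tab when the two session options below are switched on before the query runs. This is only a sketch: the SELECT is a shortened stand-in for the full query above, and both the data type and the value given to @email are made-up placeholders, not taken from the question.

```sql
-- Report per-table reads and CPU/elapsed times for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder type and value (assumptions); the question does not show them.
DECLARE @email NVARCHAR(256) = N'user@example.com';

-- Stand-in for the full query; same filter as the original WHERE clause.
SELECT Id
FROM Fields
WHERE UserId = (SELECT Id FROM Users WHERE Email = @email)
  AND IsArchived = 0;

-- Switch the reporting off again.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The reported times cover execution on the server; they do not include the time the client spends receiving and rendering the rows.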

【Comments】:

    Tags: sql sql-server query-optimization cross-apply outer-apply


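    【Solution 1】:

    You need a multi-column index on each of the tables. The index should contain the columns in the WHERE clause first, then the columns in the ORDER BY, and then the columns in the SELECT. For example:

    • NdviImageHistory(FieldId, [Date], [Base64])
    • NaturalColorImageHistory(FieldId, [Date], [Base64])
    • and so on.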
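
In T-SQL, indexes of this kind might be declared roughly as follows. This is a sketch, not a tested recommendation: the index names are hypothetical, and the table and column names are taken from the question. A VARCHAR(MAX) column such as [Base64] cannot be an index key column, but it can be carried as a non-key INCLUDE column.

```sql
-- Key columns: the WHERE column first, then the ORDER BY column.
-- EndDate is only read in the SELECT list, so it is INCLUDEd rather than keyed.
CREATE NONCLUSTERED INDEX IX_FieldSeasonHistory_FieldId_StartDate  -- hypothetical name
    ON FieldSeasonHistory (FieldId, StartDate DESC)
    INCLUDE (EndDate);

-- [Base64] is VARCHAR(MAX): not allowed as a key column, allowed in INCLUDE.
CREATE NONCLUSTERED INDEX IX_NdviImageHistory_FieldId_Date         -- hypothetical name
    ON NdviImageHistory (FieldId, [Date] DESC)
    INCLUDE ([Base64]);
```

Where a table's clustered primary key is already (FieldId, [Date]), the second index mostly duplicates it (including a second copy of the image data), so it is probably only worth testing on tables whose clustered key does not match the WHERE and ORDER BY columns.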

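    【Comments】:

    • Do I need to create a separate index for each column, or a single index over multiple columns?
    • I mean, one per table or one per column?
    • Also, I can't index [Base64], because it's a VARCHAR(MAX) column.
    • @VadimOvchinnikov . . . Then create the index using the first two columns. These are composite indexes, that is, indexes with multiple columns.
    • OK, I see. The tables NdviImageHistory and NaturalColorImageHistory already have clustered indexes, because the first two columns are their composite primary key. I also assume the problem is related to these tables, because if I remove them from the query, it executes within a second.
    【Solution 2】: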

    Update your query like this.

    SELECT Fields.*,
        WeatherHistory.TempSumC AS CurrentTempSumC,
        TempSumF AS CurrentTempSumF,
        PrecipitationSumMm AS CurrentPrecipitationSumMm,
        nih.[Base64] AS CurrentNdviImageBase64,
        ncih.[Base64] AS CurrentNaturalColorImageBase64,
        MeanNdvi AS CurrentMeanNdvi,
        IsOpenSeason
    FROM Fields
    LEFT JOIN WeatherHistory ON FieldId = Id AND [Date] = CAST(GETUTCDATE() AS DATE)
    OUTER APPLY
    (
        SELECT MAX(NdviImageHistory.ID) MAX_ID
        FROM NdviImageHistory
        WHERE FieldId = Id
    
    ) nih_ID
    OUTER APPLY
    (
        SELECT [Base64] FROM NdviImageHistory X WHERE X.ID = nih_ID.MAX_ID
    )nih
    

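    Use additional OUTER APPLYs in the same way for all of these tables {nih, ncih, nh, fsh} and try it.

    I have shown the extra OUTER APPLY only for [nih].

    Remove the TOP 1 and ORDER BY from the OUTER APPLY joins.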
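
The tables described in the question have no ID column; their key is (FieldId, [Date]). Under that assumption, the same two-step idea could be sketched with MAX([Date]) instead of MAX(ID). The aliases f, h, x and latest are introduced here for illustration only, and the select list is cut down to the one image column.

```sql
SELECT f.Id,
       nih.[Base64] AS CurrentNdviImageBase64
FROM Fields AS f
OUTER APPLY
(
    -- Step 1: find the latest date for this field.
    SELECT MAX(h.[Date]) AS MaxDate
    FROM NdviImageHistory AS h
    WHERE h.FieldId = f.Id
) AS latest
OUTER APPLY
(
    -- Step 2: fetch the image for that single (FieldId, [Date]) row.
    SELECT x.[Base64]
    FROM NdviImageHistory AS x
    WHERE x.FieldId = f.Id
      AND x.[Date] = latest.MaxDate
) AS nih;
```

Whether this runs any faster than TOP 1 with ORDER BY [Date] DESC is not established here; comparing the two execution plans on the real data would be the way to find out.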

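    【Comments】:

    • Are you sure this will compile?
    • Yes, it works, provided you add the additional OUTER APPLYs for each table I mentioned in the solution.
    • OK, why do you think this would execute faster than the original query?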