[Posted]: 2022-01-07 14:25:05
[Question]:
I know, I know. It shouldn't be this way.
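Big picture: I am working with map data and trying to determine which street a bus stop is closest to. A street is made up of a series of points. Some streets have 2 points, but most have about half a dozen.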
My points table looks like this:
CREATE TABLE [dbo].[MapStreetsPoints](
[FeatureId] [int] NOT NULL,
[PointNumber] [int] NOT NULL,
[Latitude] [decimal](8, 5) NOT NULL,
[Longitude] [decimal](8, 5) NOT NULL,
CONSTRAINT [PK_MapStreetsPoints] PRIMARY KEY CLUSTERED
(
[FeatureId] ASC,
[PointNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
Sample data from MapStreetsPoints:
FeatureId PointNumber Latitude Longitude
----------- ----------- ---------- ----------
1 1 39.81396 -75.83017
1 2 39.81392 -75.83019
1 3 39.81387 -75.83018
1 4 39.81344 -75.83003
1 5 39.81339 -75.83000
1 6 39.81336 -75.82996
1 7 39.81333 -75.82990
1 8 39.81332 -75.82983
1 9 39.81332 -75.82977
1 10 39.81335 -75.82972
2 1 39.72170 -76.11486
2 2 39.72209 -76.11482
2 3 39.72248 -76.11474
2 4 39.72279 -76.11457
2 5 39.72362 -76.11404
2 6 39.72418 -76.11364
2 7 39.72482 -76.11321
2 8 39.72526 -76.11296
2 9 39.72560 -76.11282
2 10 39.72597 -76.11275
2 11 39.72644 -76.11274
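The table has about 2 million rows.
Basically, I filter the rows down to something manageable, self-join the table, apply a slow ORDER BY, and take the top 1 row. I do this to identify the feature (street) closest to a given point.
My inline table-valued function looks like this.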
ALTER FUNCTION dbo.fnMapReverseGeocodeNearestStreet
(
@Longitude Decimal(8,5),
@Latitude Decimal(8,5)
)
RETURNS TABLE
AS
RETURN
(
Select Top 1
A.FeatureId
From MapStreetsPoints As A
Inner Join MapStreetsPoints As B
On A.FeatureId = B.FeatureId
And A.PointNumber = B.PointNumber - 1
Where A.Longitude Between Convert(Decimal(8,5), @Longitude - 0.002)
and Convert(Decimal(8,5), @Longitude + 0.002)
And A.Latitude Between Convert(Decimal(8,5), @Latitude - 0.002)
And Convert(Decimal(8,5), @Latitude + 0.002)
And B.Longitude Between Convert(Decimal(8,5), @Longitude - 0.002)
and Convert(Decimal(8,5), @Longitude + 0.002)
And B.Latitude Between Convert(Decimal(8,5), @Latitude - 0.002)
And Convert(Decimal(8,5), @Latitude + 0.002)
Order By dbo.PerpendicularDistanceToLine(@Longitude, @Latitude,
A.Longitude,
A.Latitude,
B.Longitude,
B.Latitude)
)
The multi-statement table-valued function looks like this.
Alter FUNCTION dbo.fnMapReverseGeocodeNearestStreet_2
(
@Longitude Decimal(8,5),
@Latitude Decimal(8,5)
)
RETURNS @Output TABLE (FeatureId Int)
AS
BEGIN
-- Fill the table variable with the rows for your result set
Declare @Temp
Table (
FeatureId Int,
StartLongitude Decimal(8,5),
StartLatitude Decimal(8,5),
EndLongitude Decimal(8,5),
EndLatitude Decimal(8,5)
);
Insert
Into @Temp(FeatureId, StartLongitude, StartLatitude, EndLongitude, EndLatitude)
Select A.FeatureId,
A.Longitude As StartLongitude,
A.Latitude As StartLatitude,
B.Longitude As EndLongitude,
B.Latitude As EndLatitude
From MapStreetsPoints As A
Inner Join MapStreetsPoints As B
On A.FeatureId = B.FeatureId
And A.PointNumber = B.PointNumber - 1
Where A.Longitude Between Convert(Decimal(8,5), @Longitude - 0.002)
and Convert(Decimal(8,5), @Longitude + 0.002)
And A.Latitude Between Convert(Decimal(8,5), @Latitude - 0.002)
And Convert(Decimal(8,5), @Latitude + 0.002)
And B.Longitude Between Convert(Decimal(8,5), @Longitude - 0.002)
and Convert(Decimal(8,5), @Longitude + 0.002)
And B.Latitude Between Convert(Decimal(8,5), @Latitude - 0.002)
And Convert(Decimal(8,5), @Latitude + 0.002);
Insert
Into @Output(FeatureId)
Select Top 1 FeatureId
From @Temp T
Order By dbo.PerpendicularDistanceToLine(@Longitude, @Latitude, T.StartLongitude, T.StartLatitude, T.EndLongitude, T.EndLatitude);
RETURN
END
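The logic is the same. The difference is that I load a table variable with the intermediate results before applying the slow ORDER BY. In the inline version, it looks to me as though the sort is being applied before the filtering.
To test this, I used the following query: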
Select Map.FeatureId,
BusStop.BusStopId1,
BusStop.Description,
MapStreets.Hazardous
From BusStop
Cross Apply dbo.fnMapReverseGeocodeNearestStreet(BusStop.XCoord,BusStop.YCoord) As Map
Inner Join MapStreets
On Map.FeatureId = MapStreets.FeatureId
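The BusStop table has about 2,000 rows.
When I run the test query with the inline table-valued function, it takes 22 seconds. With the multi-statement table-valued function, it takes 17 seconds.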
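For anyone reproducing the comparison, here is a minimal sketch of how the two timings could be collected in a single session. It assumes both functions exist exactly as defined above and relies only on SET STATISTICS TIME. The elapsed times it reports depend on caching, so each query is worth running more than once.

```sql
-- Report CPU and elapsed time for each statement in the Messages output.
SET STATISTICS TIME ON;

-- Inline table-valued function version.
SELECT Map.FeatureId,
       BusStop.BusStopId1,
       BusStop.Description,
       MapStreets.Hazardous
FROM BusStop
CROSS APPLY dbo.fnMapReverseGeocodeNearestStreet(BusStop.XCoord, BusStop.YCoord) AS Map
INNER JOIN MapStreets
    ON Map.FeatureId = MapStreets.FeatureId;

-- Multi-statement table-valued function version.
SELECT Map.FeatureId,
       BusStop.BusStopId1,
       BusStop.Description,
       MapStreets.Hazardous
FROM BusStop
CROSS APPLY dbo.fnMapReverseGeocodeNearestStreet_2(BusStop.XCoord, BusStop.YCoord) AS Map
INNER JOIN MapStreets
    ON Map.FeatureId = MapStreets.FeatureId;

SET STATISTICS TIME OFF;
```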
Showplan for the ITVF:
|--Nested Loops(Inner Join, OUTER REFERENCES:([A].[FeatureId], [Expr1008]) WITH UNORDERED PREFETCH)
|--Nested Loops(Inner Join, OUTER REFERENCES:([**DatabaseName**].[dbo].[BusStop].[XCoord], [**DatabaseName**].[dbo].[BusStop].[YCoord]))
| |--Index Scan(OBJECT:([**DatabaseName**].[dbo].[BusStop].[idx_BusStop_Description]))
| |--Index Spool(SEEK:([**DatabaseName**].[dbo].[BusStop].[XCoord]=[**DatabaseName**].[dbo].[BusStop].[XCoord] AND [**DatabaseName**].[dbo].[BusStop].[YCoord]=[**DatabaseName**].[dbo].[BusStop].[YCoord]))
| |--Sort(TOP 1, ORDER BY:([Expr1004] ASC))
| |--Compute Scalar(DEFINE:([Expr1004]=[**DatabaseName**].[dbo].[PerpendicularDistanceToLine]([**DatabaseName**].[dbo].[BusStop].[XCoord],[**DatabaseName**].[dbo].[BusStop].[YCoord],[**DatabaseName**].[dbo].[MapStreetsPoints].[Longitude] as [A].[Longitude],[**DatabaseName**].[dbo].[MapStreetsPoints].[Latitude] as [A].[Latitude],[**DatabaseName**].[dbo].[MapStreetsPoints].[Longitude] as [B].[Longitude],[**DatabaseName**].[dbo].[MapStreetsPoints].[Latitude] as [B].[Latitude])))
| |--Hash Match(Inner Join, HASH:([A].[FeatureId], [A].[PointNumber])=([B].[FeatureId], [Expr1007]), RESIDUAL:([**DatabaseName**].[dbo].[MapStreetsPoints].[FeatureId] as [A].[FeatureId]=[**DatabaseName**].[dbo].[MapStreetsPoints].[FeatureId] as [B].[FeatureId] AND [**DatabaseName**].[dbo].[MapStreetsPoints].[PointNumber] as [A].[PointNumber]=[Expr1007]))
| |--Index Seek(OBJECT:([**DatabaseName**].[dbo].[MapStreetsPoints].[MapStreetsPoints4] AS [A]), SEEK:([A].[Longitude] >= CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[XCoord]-(0.002),0) AND [A].[Longitude] <= CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[XCoord]+(0.002),0)), WHERE:([**DatabaseName**].[dbo].[MapStreetsPoints].[Latitude] as [A].[Latitude]>=CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[YCoord]-(0.002),0) AND [**DatabaseName**].[dbo].[MapStreetsPoints].[Latitude] as [A].[Latitude]<=CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[YCoord]+(0.002),0)) ORDERED FORWARD)
| |--Compute Scalar(DEFINE:([Expr1007]=[**DatabaseName**].[dbo].[MapStreetsPoints].[PointNumber] as [B].[PointNumber]-(1)))
| |--Index Seek(OBJECT:([**DatabaseName**].[dbo].[MapStreetsPoints].[MapStreetsPoints4] AS [B]), SEEK:([B].[Longitude] >= CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[XCoord]-(0.002),0) AND [B].[Longitude] <= CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[XCoord]+(0.002),0)), WHERE:([**DatabaseName**].[dbo].[MapStreetsPoints].[Latitude] as [B].[Latitude]>=CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[YCoord]-(0.002),0) AND [**DatabaseName**].[dbo].[MapStreetsPoints].[Latitude] as [B].[Latitude]<=CONVERT(decimal(8,5),[**DatabaseName**].[dbo].[BusStop].[YCoord]+(0.002),0)) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([**DatabaseName**].[dbo].[MapStreets].[PK_MapStreets]), SEEK:([**DatabaseName**].[dbo].[MapStreets].[FeatureId]=[**DatabaseName**].[dbo].[MapStreetsPoints].[FeatureId] as [A].[FeatureId]) ORDERED FORWARD)
XML Version of showplan for multi-step version
The showplan for the multi-step version is:
|--Table Insert(OBJECT:(@Temp), SET:([FeatureId] = [**DatabaseName**].[dbo].[MapStreetsPoints].[FeatureId] as [A].[FeatureId],[StartLongitude] = [**DatabaseName**].[dbo].[MapStreetsPoints].[Longitude] as [A].[Longitude],[StartLatitude] = [UDSD 2021
|--Top(ROWCOUNT est 0)
|--Nested Loops(Inner Join, OUTER REFERENCES:([B].[FeatureId], [Expr1013]))
|--Compute Scalar(DEFINE:([Expr1013]=[**DatabaseName**].[dbo].[MapStreetsPoints].[PointNumber] as [B].[PointNumber]-(1)))
| |--Index Seek(OBJECT:([**DatabaseName**].[dbo].[MapStreetsPoints].[MapStreetsPoints4] AS [B]), SEEK:([B].[Longitude] >= CONVERT(decimal(8,5),[@Longitude]-(0.002),0) AND [B].[Longitude] <= CONVERT(decimal(8,5),[@Longitude]+(0.0
|--Index Seek(OBJECT:([**DatabaseName**].[dbo].[MapStreetsPoints].[MapStreetsPoints_PointNumber] AS [A]), SEEK:([A].[PointNumber]=[Expr1013] AND [A].[FeatureId]=[**DatabaseName**].[dbo].[MapStreetsPoints].[FeatureId] as [B].[FeatureI
|--Table Insert(OBJECT:([**DatabaseName**].[dbo].[fnMapReverseGeocodeNearestStreet_2]), SET:([FeatureId] = @Temp.[FeatureId] as [T].[FeatureId]))
|--Sort(TOP 1, ORDER BY:([Expr1005] ASC))
|--Compute Scalar(DEFINE:([Expr1005]=[**DatabaseName**].[dbo].[PerpendicularDistanceToLine]([@Longitude],[@Latitude],@Temp.[StartLongitude] as [T].[StartLongitude],@Temp.[StartLatitude] as [T].[StartLatitude],@Temp.[EndLongitude] as [T]
|--Table Scan(OBJECT:(@Temp AS [T]))
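I don't understand why the inline table-valued function is slower. My best guess is that the ORDER BY is applied too early and therefore runs over too many rows. I should mention that, once the filter has been applied, there are usually fewer than 100 rows for the ORDER BY to look at.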
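One way to check the row-count figure for a single point is to run the filter and self-join on their own, without the ORDER BY. The sketch below does that. The coordinates are simply taken from the sample data above (FeatureId 1, PointNumber 5) for illustration and are not a real bus stop.

```sql
-- Illustrative coordinates copied from the sample rows, not an actual bus stop.
DECLARE @Longitude decimal(8,5), @Latitude decimal(8,5);
SET @Longitude = -75.83000;
SET @Latitude  = 39.81339;

-- Count the street segments (consecutive point pairs) that survive the
-- bounding-box filter, i.e. the rows the ORDER BY would have to rank.
SELECT COUNT(*) AS CandidateSegments
FROM dbo.MapStreetsPoints AS A
INNER JOIN dbo.MapStreetsPoints AS B
    ON A.FeatureId = B.FeatureId
   AND A.PointNumber = B.PointNumber - 1
WHERE A.Longitude BETWEEN CONVERT(decimal(8,5), @Longitude - 0.002)
                      AND CONVERT(decimal(8,5), @Longitude + 0.002)
  AND A.Latitude  BETWEEN CONVERT(decimal(8,5), @Latitude - 0.002)
                      AND CONVERT(decimal(8,5), @Latitude + 0.002)
  AND B.Longitude BETWEEN CONVERT(decimal(8,5), @Longitude - 0.002)
                      AND CONVERT(decimal(8,5), @Longitude + 0.002)
  AND B.Latitude  BETWEEN CONVERT(decimal(8,5), @Latitude - 0.002)
                      AND CONVERT(decimal(8,5), @Latitude + 0.002);
```

If the count is small here but the plan for the full query still shows the scalar UDF being evaluated over many more rows, that would support the "sort applied too early" guess.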
PerpendicularDistanceToLine is a multi-statement UDF that returns a scalar. It does no table access; it just applies a series of math operations to its inputs.
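[Comments]:
-
The only thing I can think of is that maybe the index in the first version is slowing down the sort?
-
Are you really using SQL 2005, or is that a mis-tag?
-
Yes. It really is SQL Server 2005. I still have a few customers on 2005. (Sad face.)
-
A scalar UDF in the ORDER BY will definitely be slow. I suggest you look at converting dbo.PerpendicularDistanceToLine itself to an inline TVF, then CROSS APPLYing it and ordering by that. You should probably be using a GIS data type such as geography, though I'm not sure whether SQL Server 2005 supports it.
-
@MartinSmith - I edited the question to include the XML version of the showplan.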
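The inline-TVF suggestion in the comments above could be sketched roughly as follows. The body of dbo.PerpendicularDistanceToLine is not shown in the question, so everything here is an assumption: the function name fnPerpendicularDistanceToLine_Inline is hypothetical, longitude and latitude are treated as flat planar coordinates, and the distance is measured to the infinite line through the two points rather than to the segment. Treat it as an illustration of the pattern, not as a drop-in replacement.

```sql
-- Hypothetical inline TVF; the math is an assumed stand-in for the real UDF.
CREATE FUNCTION dbo.fnPerpendicularDistanceToLine_Inline
(
    @PointX decimal(8,5),
    @PointY decimal(8,5),
    @StartX decimal(8,5),
    @StartY decimal(8,5),
    @EndX   decimal(8,5),
    @EndY   decimal(8,5)
)
RETURNS TABLE
AS
RETURN
(
    SELECT Distance =
        CASE
            -- Degenerate segment: fall back to point-to-point distance.
            WHEN @StartX = @EndX AND @StartY = @EndY
                THEN SQRT(SQUARE(CONVERT(float, @PointX - @StartX))
                        + SQUARE(CONVERT(float, @PointY - @StartY)))
            -- |(x2-x1)(y1-y0) - (x1-x0)(y2-y1)| / length of (P1, P2)
            ELSE ABS(  CONVERT(float, @EndX - @StartX) * CONVERT(float, @StartY - @PointY)
                     - CONVERT(float, @StartX - @PointX) * CONVERT(float, @EndY - @StartY))
                 / SQRT(SQUARE(CONVERT(float, @EndX - @StartX))
                      + SQUARE(CONVERT(float, @EndY - @StartY)))
        END
);
GO

-- Usage: same filter and self-join as the question, ordered by the applied distance.
DECLARE @Longitude decimal(8,5), @Latitude decimal(8,5);
SET @Longitude = -75.83000;   -- illustrative values from the sample data
SET @Latitude  = 39.81339;

SELECT TOP 1 A.FeatureId
FROM dbo.MapStreetsPoints AS A
INNER JOIN dbo.MapStreetsPoints AS B
    ON A.FeatureId = B.FeatureId
   AND A.PointNumber = B.PointNumber - 1
CROSS APPLY dbo.fnPerpendicularDistanceToLine_Inline(
                @Longitude, @Latitude,
                A.Longitude, A.Latitude,
                B.Longitude, B.Latitude) AS D
WHERE A.Longitude BETWEEN CONVERT(decimal(8,5), @Longitude - 0.002)
                      AND CONVERT(decimal(8,5), @Longitude + 0.002)
  AND A.Latitude  BETWEEN CONVERT(decimal(8,5), @Latitude - 0.002)
                      AND CONVERT(decimal(8,5), @Latitude + 0.002)
  AND B.Longitude BETWEEN CONVERT(decimal(8,5), @Longitude - 0.002)
                      AND CONVERT(decimal(8,5), @Longitude + 0.002)
  AND B.Latitude  BETWEEN CONVERT(decimal(8,5), @Latitude - 0.002)
                      AND CONVERT(decimal(8,5), @Latitude + 0.002)
ORDER BY D.Distance;
```

Because an inline TVF is expanded into the calling query, the optimizer sees the arithmetic directly instead of an opaque per-row function call. Whether that actually beats the 17 and 22 second timings on this data would need to be measured.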
Tags: sql sql-server sql-server-2005