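[Posted]: 2017-09-12 16:38:04
[Question]:
I have the following CTE, which uses recursion to get the level of each node in a hierarchy. Since there are 27 levels, at the end I fetch the name for each level, because end users are not interested in seeing GUIDs.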
with EmpTree
as
(
select e.DWH_Dim_TFS_File_DWH_File_Guid, cast(cast(e.DWH_Dim_TFS_File_DWH_File_Guid as binary(4)) as varbinary(max)) as EmpHier,
1 as EmployeeLevel
from [DWH].[Dim_TFS_File_View] as e
where e.DWH_Dim_TFS_File_DWH_FileParent_Guid is null
union all
select c.DWH_Dim_TFS_File_DWH_File_Guid, cast(p.EmpHier + cast(c.DWH_Dim_TFS_File_DWH_File_Guid as binary(4)) as varbinary(max)),
EmployeeLevel +1 as EmployeeLevel
from EmpTree as p
join [DWH].[Dim_TFS_File_View] as c
on c.DWH_Dim_TFS_File_DWH_FileParent_Guid = p.DWH_Dim_TFS_File_DWH_File_Guid
)
select TOP 100 PERCENT DWH_Dim_TFS_File_DWH_File_Guid
,EmployeeLevel
,(SELECT [File_Name] from [DWH].[Dim_TFS_File_View] as pu where nullif(cast(substring(EmpHier, 1, 4) as int), 0) = pu.DWH_Dim_TFS_File_DWH_File_Guid) level1
,(SELECT [File_Name] from [DWH].[Dim_TFS_File_View] as pu where nullif(cast(substring(EmpHier, 5, 4) as int), 0) = pu.DWH_Dim_TFS_File_DWH_File_Guid) level2
,(SELECT [File_Name] from [DWH].[Dim_TFS_File_View] as pu where nullif(cast(substring(EmpHier, 9, 4) as int), 0) = pu.DWH_Dim_TFS_File_DWH_File_Guid) level3
from EmpTree
order by DWH_Dim_TFS_File_DWH_File_Guid
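I have 27 levels... The view I am pulling the information from has three indexes: (parent, child), (parent), and (child).
The table has 200M rows and is still growing, so this query is very slow. I think much of that is down to all the per-level "name grabbing"...
Is there a more efficient way to get my result? Perhaps with some joins?
Please help if you can.
Thanks!!
With the version below I am not getting the recursion right; only the anchor part is returned.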
with EmpTree
as
(
select e.DWH_Dim_TFS_File_DWH_FileParent_Guid,e.DWH_Dim_TFS_File_DWH_File_Guid,
1 as Depth,
File_Name_String = CAST(CAST(e.File_Name AS BINARY(100)) AS VARBINARY(8000))
from [dbo].[Hierarchy_Luis] as e
where e.DWH_Dim_TFS_File_DWH_FileParent_Guid is null
union all
select e.DWH_Dim_TFS_File_DWH_FileParent_Guid,e.DWH_Dim_TFS_File_DWH_File_Guid,
p.Depth +1 as Depth,
File_Name_String = CAST(CONCAT(p.File_Name_String, CAST(e.File_Name AS BINARY(100))) AS VARBINARY(8000))
from [dbo].[Hierarchy_Luis] as e
join EmpTree as p
on e.DWH_Dim_TFS_File_DWH_FileParent_Guid = p.DWH_Dim_TFS_File_DWH_File_Guid
)
SELECT
p.DWH_Dim_TFS_File_DWH_File_Guid,
p.Depth,
Level01 =CAST(SUBSTRING(p.File_Name_String, 1, 100) as nvarchar(100)),
Level02 =CAST(SUBSTRING(p.File_Name_String, 101, 100) as nvarchar(100)),
Level03 =CAST(SUBSTRING(p.File_Name_String, 201, 100) as nvarchar(100)),
Level04 =CAST(SUBSTRING(p.File_Name_String, 301, 100) as nvarchar(100)),
Level05 =CAST(SUBSTRING(p.File_Name_String, 401, 100) as nvarchar(100)),
Level07 =CAST(SUBSTRING(p.File_Name_String, 501, 100) as nvarchar(100)),
Level08 =CAST(SUBSTRING(p.File_Name_String, 601, 100) as nvarchar(100)),
Level09 =CAST(SUBSTRING(p.File_Name_String, 701, 100) as nvarchar(100)),
Level10 =CAST(SUBSTRING(p.File_Name_String, 801, 100) as nvarchar(100)),
Level11 =CAST(SUBSTRING(p.File_Name_String, 901, 100) as nvarchar(100)),
Level12 =CAST(SUBSTRING(p.File_Name_String, 1001, 100) as nvarchar(100)),
Level13 =CAST(SUBSTRING(p.File_Name_String, 1101, 100) as nvarchar(100)),
Level14 =CAST(SUBSTRING(p.File_Name_String, 1201, 100) as nvarchar(100)),
Level15 =CAST(SUBSTRING(p.File_Name_String, 1301, 100) as nvarchar(100)),
Level16 =CAST(SUBSTRING(p.File_Name_String, 1401, 100) as nvarchar(100)),
Level17 =CAST(SUBSTRING(p.File_Name_String, 1501, 100) as nvarchar(100)),
Level18 =CAST(SUBSTRING(p.File_Name_String, 1601, 100) as nvarchar(100)),
Level19 =CAST(SUBSTRING(p.File_Name_String, 1701, 100) as nvarchar(100)),
Level20 =CAST(SUBSTRING(p.File_Name_String, 1801, 100) as nvarchar(100)),
Level21 =CAST(SUBSTRING(p.File_Name_String, 1901, 100) as nvarchar(100)),
Level22 =CAST(SUBSTRING(p.File_Name_String, 2001, 100) as nvarchar(100)),
Level23 =CAST(SUBSTRING(p.File_Name_String, 2101, 100) as nvarchar(100)),
Level24 =CAST(SUBSTRING(p.File_Name_String, 2201, 100) as nvarchar(100)),
Level25 =CAST(SUBSTRING(p.File_Name_String, 2301, 100) as nvarchar(100)),
Level26 =CAST(SUBSTRING(p.File_Name_String, 2401, 100) as nvarchar(100)),
Level27 =CAST(SUBSTRING(p.File_Name_String, 2501, 100) as nvarchar(100))
FROM EmpTree p
[Comments]:
-
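You will need to provide some sample data and the expected output. Joining on a substring looks sketchy to me... but then you are also casting to binary and then to varbinary(max), so it is hard to tell what you are actually doing...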
-
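First thing... BINARY(4) is not big enough to hold a GUID, so you are probably getting bad data from CAST(CAST(e.DWH_Dim_TFS_File_DWH_File_Guid AS BINARY(4)) AS VARBINARY(MAX)) AS EmpHier. As for performance, I have to assume the 3 correlated subqueries are the real bottleneck here, but we would need usable test data to verify that.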
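To illustrate the truncation concern, here is a minimal sketch. It assumes the key column really is a uniqueidentifier, which the question does not confirm, and the GUID literal is made up for the demo.

```sql
-- Hypothetical GUID value, used only to show what the cast keeps.
DECLARE @g uniqueidentifier = '6F9619FF-8B86-D011-B42D-00C04FC964FF';

SELECT
    CAST(@g AS binary(16))            AS full_guid,       -- all 16 bytes survive
    CAST(@g AS binary(4))             AS truncated_guid,  -- only the first 4 bytes are kept
    DATALENGTH(CAST(@g AS binary(4))) AS bytes_kept;      -- 4
```

Two different GUIDs that share the same leading 4 bytes would become indistinguishable after such a cast. If the column is actually a 4-byte int, binary(4) holds it without loss.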
-
Looking at the code more closely... the correlated subqueries are not needed. You can simply add File_Name to the recursive CTE.
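A minimal sketch of that idea, carrying the name along in the recursion instead of looking it up afterwards. The table variable @Files, its short column names, the int key type, and the sample rows are all assumptions for illustration; they stand in for [DWH].[Dim_TFS_File_View].

```sql
-- Hypothetical sample data standing in for the real view.
DECLARE @Files TABLE (
    File_Guid   int           NOT NULL PRIMARY KEY,
    Parent_Guid int           NULL,
    [File_Name] nvarchar(100) NOT NULL
);

INSERT INTO @Files (File_Guid, Parent_Guid, [File_Name])
VALUES (1, NULL, N'root'),
       (2, 1,    N'src'),
       (3, 2,    N'main.sql'),
       (4, 1,    N'docs');

WITH FileTree AS
(
    -- Anchor: rows without a parent are the roots.
    SELECT f.File_Guid,
           1 AS Depth,
           CAST(f.[File_Name] AS nvarchar(4000)) AS NamePath
    FROM @Files AS f
    WHERE f.Parent_Guid IS NULL

    UNION ALL

    -- Recursive step: append the child's name to the parent's path.
    -- The CAST keeps the column type identical to the anchor's.
    SELECT c.File_Guid,
           p.Depth + 1,
           CAST(p.NamePath + N'/' + c.[File_Name] AS nvarchar(4000))
    FROM FileTree AS p
    JOIN @Files AS c
      ON c.Parent_Guid = p.File_Guid
)
SELECT File_Guid, Depth, NamePath
FROM FileTree
ORDER BY NamePath
OPTION (MAXRECURSION 100);
```

With this sample data, row 3 comes back at depth 3 with the path root/src/main.sql. Every name is read once during the recursion, so no per-level lookup against the view is needed afterwards.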
-
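CTEs perform poorly with deep recursion and large data sets. You may be better off with a plain loop, so that it does not have to cache all the data in one go.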
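The loop suggestion could look roughly like the following sketch, which expands the hierarchy one level per iteration into a temp table. The names @Files and #Tree, the column names, and the sample rows are hypothetical, and whether this beats the CTE on 200M rows would have to be measured.

```sql
-- Hypothetical sample data standing in for the real view.
DECLARE @Files TABLE (
    File_Guid   int           NOT NULL PRIMARY KEY,
    Parent_Guid int           NULL,
    [File_Name] nvarchar(100) NOT NULL
);

INSERT INTO @Files (File_Guid, Parent_Guid, [File_Name])
VALUES (1, NULL, N'root'), (2, 1, N'src'), (3, 2, N'main.sql'), (4, 1, N'docs');

CREATE TABLE #Tree (
    File_Guid int            NOT NULL PRIMARY KEY,
    Depth     int            NOT NULL,
    NamePath  nvarchar(4000) NOT NULL
);

DECLARE @Depth int = 1, @Rows int;

-- Level 1: the roots.
INSERT INTO #Tree (File_Guid, Depth, NamePath)
SELECT f.File_Guid, 1, f.[File_Name]
FROM @Files AS f
WHERE f.Parent_Guid IS NULL;

SET @Rows = @@ROWCOUNT;

-- Each pass adds the children of the rows inserted in the previous pass.
WHILE @Rows > 0
BEGIN
    INSERT INTO #Tree (File_Guid, Depth, NamePath)
    SELECT c.File_Guid, @Depth + 1, p.NamePath + N'/' + c.[File_Name]
    FROM #Tree AS p
    JOIN @Files AS c
      ON c.Parent_Guid = p.File_Guid
    WHERE p.Depth = @Depth;

    SET @Rows = @@ROWCOUNT;   -- stop once a pass adds nothing
    SET @Depth += 1;
END;

SELECT File_Guid, Depth, NamePath
FROM #Tree
ORDER BY NamePath;

DROP TABLE #Tree;
```

Each iteration is an ordinary set-based join, so it can use the (parent) index on the source. On a large table, an additional index on #Tree (Depth) would probably help the WHERE p.Depth = @Depth filter.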
-
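Jason, you are right, but these are not ordinary GUIDs; that is just what they are called in the database. Also, I can fetch File_Name inside the recursion, but I would still have to produce those 27 columns. Maybe I could add one column holding the whole path separated by "/", and then use a substring function for each level to pick out that level's name?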
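One way to make the per-level substring step simple is to store each name in a fixed-width character slot rather than a "/"-separated path, so level n always starts at position (n - 1) * 100 + 1. The sketch below assumes names of at most 100 characters; @Slots and its values are hypothetical and stand in for the column the recursion would build.

```sql
-- Hypothetical three-level path, one name per 100-character slot.
DECLARE @Slots nvarchar(4000) =
      CAST(N'root'     AS nchar(100))
    + CAST(N'src'      AS nchar(100))
    + CAST(N'main.sql' AS nchar(100));

SELECT
    Level01 = NULLIF(RTRIM(SUBSTRING(@Slots,   1, 100)), N''),
    Level02 = NULLIF(RTRIM(SUBSTRING(@Slots, 101, 100)), N''),
    Level03 = NULLIF(RTRIM(SUBSTRING(@Slots, 201, 100)), N''),
    Level04 = NULLIF(RTRIM(SUBSTRING(@Slots, 301, 100)), N'');  -- NULL: no fourth level
```

27 slots of 100 characters need 2700 characters, which fits in nvarchar(4000). Two details in the second query of the question are worth checking against this: BINARY(100) holds only 50 nvarchar characters, and the column list jumps from Level05 to Level07, so the labels from offset 501 onward are shifted by one level.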
Tags: sql-server performance tsql recursion ssms