SQL Recursive CTE: replacing records in each recursion

【Question title】: SQL Recursive CTE Replacing records in each recursion
【Posted】: 2020-05-16 10:44:12
【Question】:

I have a table like this:

ItemID  ItemFormula
100     'ID_3+ID_5'
110     'ID_2+ID_6'
120     'ID_100+ID_110'
130     'ID_120+ID_4'

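This is a simplified version of a formula table with nearly 1000 records and up to 40 levels of reference (items that are used in other items). The task is to flatten each formula to a single level of reference, so that no item's formula contains another defined item. For example, for id=130 in the table above I should get '((ID_3+ID_5)+(ID_2+ID_6))+ID_4'.

EDIT: The operations are not limited to "+", and there is a recognizable character between the items. I removed that character for the sake of simplicity.

I can use a recursive CTE for this. My problem is that, because the references go so deep, the recursive select joins a large number of records and takes a long time to complete.

My question is: can I keep only the previous recursion's rows each time the recursion runs?

Here is my CTE code: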

WITH Formula
  AS  (SELECT A.ItemID
             ,'ID_' + CONVERT(VARCHAR(20), A.ItemID) AS ItemText
             ,CONVERT(VARCHAR(MAX), A.ItemFormula) AS ItemFormula 
       FROM (VALUES (100,'ID_3+ID_5'),
                    (110,'ID_2+ID_6'),
                    (120,'ID_100+ID_110'),
                    (130,'ID_120+ID_4')   
                ) A (ItemID,ItemFormula)

  )
    ,REC
  AS
      (
          SELECT A.ItemID
                ,A.ItemText
                ,A.ItemFormula
                ,1 AS LevelID
          FROM Formula A
          UNION ALL
          SELECT A.ItemID
                ,A.ItemText
                ,' '
                 + TRIM (REPLACE (REPLACE (A.ItemFormula, B.ItemText, ' ( ' + B.ItemFormula + ' ) '), '  ', ' '))
                 + ' ' AS ItemFormula
                ,A.LevelID + 1 AS LevelID
          FROM REC A
              CROSS APPLY
          (
              SELECT *
              FROM
              (
                  SELECT *
                        ,ROW_NUMBER () OVER (ORDER BY GETDATE ()) AS RowNum
                  FROM Formula B2
                  WHERE CHARINDEX (B2.ItemText, A.ItemFormula) > 0
              ) B3
              WHERE B3.RowNum = 1
          ) B
      )
    ,FinalQ
  AS
      (
          SELECT A2.ItemID
                ,A2.ItemFormula
                ,A2.LevelID
          FROM
          (
              SELECT A.ItemID
                    ,REPLACE (TRIM (A.ItemFormula), ' ', '') AS ItemFormula
                    ,A.LevelID
                    ,ROW_NUMBER () OVER (PARTITION BY A.ItemID ORDER BY A.LevelID DESC) AS RowNum
              FROM REC A
          ) A2
          WHERE A2.RowNum = 1
      )
SELECT * FROM FinalQ A2 ORDER BY A2.ItemID;

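Thanks in advance.

【Question comments】:

Tag your question with the database you are using. Is + the only operation allowed? Can we assume a better format for the references, such as [ID_1]?
ROW_NUMBER () OVER (ORDER BY GETDATE ()) - this orders the rows non-deterministically, because GETDATE() is evaluated once for the whole statement. Depending on what you need to do, you may have better luck doing this in a stored procedure, or pulling the data into a regular application and building a dependency graph. Or rework the design so that you don't have to parse a list every time, only a single id.
@Clockwork-Muse It is there to guarantee that the join returns only one row for items that reference more than one item, so that the items in a formula are processed one at a time. ROW_NUMBER requires an ORDER BY clause, so I put that one in. The order doesn't matter here.
@GordonLinoff I answered your questions in the EDIT I posted.

Tags: sql sql-server recursion replace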

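【Answer 1】:

My question is: can I keep only the previous recursion's rows each time the recursion runs?

No. A recursive CTE keeps adding rows to the ones found in previous iterations. There is no control that lets you remove rows while the recursive CTE is iterating.

You can, however, filter them out after the recursive CTE has finished, perhaps in a secondary CTE that considers only the last meaningful row for each item (by whatever rule you define).

The only vaguely similar idea I know of is in PostgreSQL, where a recursive CTE may use UNION in place of UNION ALL so that rows identical to ones already produced are not added again. That is still different from what you need.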
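To make the UNION versus UNION ALL remark concrete, here is a minimal sketch. It runs against SQLite purely because SQLite ships with Python and also accepts UNION inside a recursive CTE; it is a stand-in for PostgreSQL, not SQL Server, where the recursive member must be joined with UNION ALL. The `edge` table and the `reach` CTE are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edge (src INTEGER, dst INTEGER);
    -- hypothetical data: a cycle 1 -> 2 -> 3 -> 1
    INSERT INTO edge VALUES (1, 2), (2, 3), (3, 1);
""")

# With UNION, a row that was already produced is discarded, so the
# recursion stops once the cycle leads back to node 1.
# With UNION ALL the same query would keep producing rows forever.
reachable = conn.execute("""
    WITH RECURSIVE reach(node) AS (
        SELECT 1
        UNION
        SELECT e.dst FROM edge e JOIN reach r ON e.src = r.node
    )
    SELECT node FROM reach ORDER BY node
""").fetchall()

print(reachable)  # [(1,), (2,), (3,)]
```

Deduplication only drops rows that are identical to earlier ones. It does not replace an older row with a newer one, which is why it does not solve the problem in the question.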

【Comments】:

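【Answer 2】:

This is an extremely complicated problem. Here are some ideas:

1. Figure out which items don't need any insertions. These are the ones that don't reference any other item.
2. Build an ordering for the insertions. An insertion can go into an item once the inserted item has been defined. A recursive CTE can be used for this.
3. Enumerate the insertions. Everything from (1) gets a "1". The rest are numbered in order.
4. Process the insertions in that order.

Here is my solution: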

with ordering as (
      select itemid, itemtext, itemformula, convert(varchar(max), null) as otheritemtext, 1 as lev
      from formula f
      where not exists (select 1
                        from formula f2 join
                             string_split(f.itemformula, '+') s
                             on f2.itemtext = s.value
                        where f2.itemid <> f.itemid
                       )
       union all
       select f.itemid, f.itemtext, f.itemformula, convert(varchar(max), s.value), lev + 1
       from formula f cross apply
            string_split(f.itemformula, '+') s join
            ordering o
            on o.itemtext = s.value
        -- where lev <= 2
     ),
     ordered as (
      select distinct o.*,
             dense_rank() over (order by (case when lev = 1 then -1 else lev end), (case when lev = 1 then '' else otheritemtext end)) as seqnum
      from ordering o
     ),
     cte as (
      select o.itemid, o.itemtext, o.itemformula, convert(varchar(max), o.otheritemtext) as otheritemtext,
             o.itemformula as newformula, o.seqnum, 1 as lev
      from ordered o
      where seqnum = 1
      union all
      select cte.itemid, o.itemtext, o.itemformula, convert(varchar(max), cte.itemtext),
             replace(o.itemformula, o.otheritemtext, concat('(', cte.newformula, ')')),  o.seqnum, cte.lev + 1
      from cte join
           ordered o
           on cte.itemtext = o.otheritemtext and cte.seqnum < o.seqnum
     )
select *
from cte;

There is also a db<>fiddle.
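The idea of expanding each item only after the items it references have been expanded can also be sketched outside the database, as one of the question comments suggests. The following is not a translation of the query above but a different technique: a depth-first expansion with memoization in Python, which visits items in dependency order implicitly and expands each item exactly once. The `formulas` dictionary and the names `expand_all` and `TOKEN` are assumptions made for the illustration.

```python
import re

# Sample data from the question: item id -> formula text.
formulas = {
    100: "ID_3+ID_5",
    110: "ID_2+ID_6",
    120: "ID_100+ID_110",
    130: "ID_120+ID_4",
}

TOKEN = re.compile(r"ID_(\d+)")  # matches a whole reference such as ID_120


def expand_all(formulas):
    """Expand every formula until it references only undefined (leaf) ids."""
    done = {}            # item id -> fully expanded formula
    in_progress = set()  # ids currently being expanded, for cycle detection

    def expand(item_id):
        if item_id in done:
            return done[item_id]
        if item_id in in_progress:
            raise ValueError(f"circular reference at ID_{item_id}")
        in_progress.add(item_id)

        def substitute(match):
            ref = int(match.group(1))
            if ref in formulas:          # defined item: insert its expansion
                return "(" + expand(ref) + ")"
            return match.group(0)        # leaf id: keep as written

        done[item_id] = TOKEN.sub(substitute, formulas[item_id])
        in_progress.discard(item_id)
        return done[item_id]

    for item_id in formulas:
        expand(item_id)
    return done


expanded = expand_all(formulas)
print(expanded[130])  # ((ID_3+ID_5)+(ID_2+ID_6))+ID_4
```

Because every expanded formula is cached in `done`, an item that is referenced from many places is expanded only once, which is the property the question asks for when it wants to avoid carrying all earlier rows along.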

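【Comments】:

【Answer 3】:

You can exploit the logical order of the formulas, if there is one (Item_100 cannot reference Item_150), and process the items in descending order. The following uses LIKE, which does not work for formulas with overlapping patterns (e.g. ID_10 and ID_100). You could work around that with some string manipulation, or by keeping the ItemIDs at a fixed length (e.g. ID_10010 and ID_10100, i.e. number the items starting from a high number such as 10000).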

declare @f table
(
ItemId int,
ItemFormula varchar(1000)
);

insert into @f(ItemId, ItemFormula)
values 
(100, 'ID_3+ID_5'),
(110, 'ID_2+ID_6'),
(120, 'ID_100+ID_110'),
(130, 'ID_120+ID_4'),
(140, '(ID_130+ID_110)/ID_100'),
(150, 'sqrt(ID_140, ID_130)'), 
(160, 'ID_150-ID_120+ID_140');


;with cte
as
(
select f.ItemId, replace(cast(f.ItemFormula as varchar(max)), isnull('ID_' + cast(r.ItemId as varchar(max)), ''), isnull('(' + r.ItemFormula+ ')', '')) as therepl, 1 as lvl
from @f as f
outer apply (
    select *
    from
    (
    select rr.*, row_number() over(order by rr.ItemId desc) as rownum
    from @f as rr
    where f.ItemFormula like '%ID_' + cast(rr.ItemId as varchar(1000)) + '%'
    ) as src
    where rownum = 1
    ) as r
union all
select c.ItemId, replace(c.therepl, 'ID_' + cast(r.ItemId as varchar(max)), '(' + r.ItemFormula+ ')'), c.lvl+1
from cte as c
cross apply (
    select *
    from
    (
    select rr.*, row_number() over(order by rr.ItemId desc) as rownum
    from @f as rr
    where c.therepl like '%ID_' + cast(rr.ItemId as varchar(1000)) + '%'
    ) as src
    where rownum = 1
    ) as r
),
rown
as
(
select *, row_number() over (partition by itemid order by lvl desc) as rownum
from cte
)
select *
from rown
where rownum = 1;
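The overlapping-pattern caveat mentioned before the query (ID_10 versus ID_100) is easy to reproduce. The sketch below, in Python rather than T-SQL, shows how a plain substring replacement corrupts the longer id and how a pattern that refuses to match when another digit follows avoids it. The helper name `replace_reference` and the placeholder `(X)` are made up for the illustration.

```python
import re

formula = "ID_10+ID_100"

# Naive substring replacement also rewrites the prefix of ID_100.
naive = formula.replace("ID_10", "(X)")


def replace_reference(text, item_id, replacement):
    """Replace ID_<item_id> only when no further digit follows it."""
    pattern = re.compile(r"ID_" + str(item_id) + r"(?!\d)")
    # A function is used as the replacement so that backslashes in
    # `replacement` are never interpreted as regex escapes.
    return pattern.sub(lambda _match: replacement, text)


safe = replace_reference(formula, 10, "(X)")

print(naive)  # (X)+(X)0
print(safe)   # (X)+ID_100
```

In T-SQL the same effect needs either a delimiter after every id (the question mentions that the real data has a recognizable character between items) or the fixed-length ids suggested in this answer.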

【Comments】: