【Question title】: T-SQL - using STUFF to concatenate grouped columns and removing duplicates
【Posted】: 2021-07-01 13:37:28
【Question description】:

I have a table that looks like this:

EmailAddress: nvarchar(255)
MarketingEmailOptIn: nvarchar(50)
NewsletterOptIn: nvarchar(50)
ThoughtLeaderOptIn: nvarchar(50)

My SQL statement below takes the data above and concatenates the "subscription types", using a comma as the delimiter:

SELECT  
    EmailAddress,
    STUFF((SELECT ',' + 
              CASE
                 WHEN B.MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' 
                 WHEN B.ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader'
                 WHEN B.NewsletterOptIn = 'TRUE' THEN 'Newsletter'
              END
          FROM UK_AGT_AgentForms_TEST_DE B 
          WHERE ISNULL(B.EmailAddress, '') = ISNULL(A.EmailAddress, '')
          FOR XML PATH('')), 1, 1, '') AS Subscriptions
FROM
    UK_AGT_AgentForms_TEST_DE A
GROUP BY 
    EmailAddress 

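Running this SQL produces the following output:

Note, however, that MarketingEmail is listed twice because the source table also lists it twice (rows 1 and 2). I need to omit any duplicates that are detected, so that the resulting table looks like this:

I'm new to the STUFF keyword and a bit lost on how to detect duplicates at runtime. Any advice is appreciated. Thanks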

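【Question comments】:

  • What version of SQL Server are you using?
  • Why are the last 3 columns nvarchar(50) when a varchar(5) would suffice?
  • @Larnu No idea. Unfortunately, I'm not at liberty to change the data structure.
  • Perhaps replace FROM UK_AGT_AgentForms_TEST_DE B with FROM (SELECT DISTINCT * FROM UK_AGT_AgentForms_TEST_DE) B?
  • I think string_agg() might suit you better, if you can use it.

Tags: sql-server tsql stuff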
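
The DISTINCT idea from the comments can also be applied per subscription name rather than per source row. The following is only a sketch under assumptions: the table variable @Data and its sample rows are illustrative stand-ins for UK_AGT_AgentForms_TEST_DE, and the CROSS APPLY (VALUES ...) unpivot is one way among several to turn the three flag columns into rows.

```sql
-- Hypothetical sample data standing in for UK_AGT_AgentForms_TEST_DE
DECLARE @Data table (
    EmailAddress nvarchar(255),
    MarketingEmailOptIn nvarchar(50),
    NewsletterOptIn nvarchar(50),
    ThoughtLeaderOptIn nvarchar(50)
);

INSERT INTO @Data VALUES
    ( 'mike@mikemarks.com', 'TRUE', NULL, NULL ),
    ( 'mike@mikemarks.com', 'TRUE', 'TRUE', NULL );

SELECT
    A.EmailAddress,
    STUFF((
        -- DISTINCT collapses repeated subscription names for this address
        SELECT DISTINCT ',' + V.Subscription
        FROM @Data B
        -- Turn the three flag columns into up to three rows per source row
        CROSS APPLY (VALUES
            (CASE WHEN B.MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' END),
            (CASE WHEN B.NewsletterOptIn = 'TRUE' THEN 'Newsletter' END),
            (CASE WHEN B.ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader' END)
        ) AS V (Subscription)
        WHERE ISNULL(B.EmailAddress, '') = ISNULL(A.EmailAddress, '')
          AND V.Subscription IS NOT NULL
        FOR XML PATH('')
    ), 1, 1, '') AS Subscriptions  -- STUFF strips the single leading comma
FROM @Data A
GROUP BY A.EmailAddress;
```

Unlike a single CASE expression, which yields at most one name per row, this form can emit every flag that is set on a row. An address with no opt-ins gets NULL in Subscriptions.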


【Solution 1】:

Try something like this:

DECLARE @Data table (
    EmailAddress nvarchar(255),
    MarketingEmailOptIn nvarchar(50),
    NewsletterOptIn nvarchar(50),
    ThoughtLeaderOptIn nvarchar(50)
);

INSERT INTO @Data VALUES
    ( 'mike@mikemarks.com', 'TRUE', NULL, NULL ),
    ( 'mike@mikemarks.com', 'TRUE', 'TRUE', NULL ),
    ( 'mike@mikemarks.com', 'TRUE', NULL, 'TRUE' );

SELECT
    EmailAddress
    , STUFF ( ( CASE WHEN EOptIn = 'TRUE' THEN ',MarketingEmail' ELSE '' END
        + CASE WHEN NOptIn = 'TRUE' THEN ',Newsletter' ELSE '' END
        + CASE WHEN TOptIn = 'TRUE' THEN ',ThoughtLeader' ELSE '' END 
    ), 1, 1, '' ) AS Subscriptions
FROM (

    -- Collapse to one row per address. MAX() yields 'TRUE' when any row
    -- opted in (assuming the columns hold only 'TRUE', 'FALSE' or NULL).
    SELECT
        EmailAddress
        , MAX ( MarketingEmailOptIn ) AS EOptIn
        , MAX ( NewsletterOptIn ) AS NOptIn
        , MAX ( ThoughtLeaderOptIn ) AS TOptIn
    FROM @Data A --UK_AGT_AgentForms_TEST_DE
    GROUP BY EmailAddress

) AS x
ORDER BY 
    EmailAddress;

Returns

+--------------------+-----------------------------------------+
|    EmailAddress    |              Subscriptions              |
+--------------------+-----------------------------------------+
| mike@mikemarks.com | MarketingEmail,Newsletter,ThoughtLeader |
+--------------------+-----------------------------------------+

【Comments】:

    【Solution 2】:

    If you have SQL Server 2017 or later, you can use STRING_AGG() to simplify this:

    SELECT   
        EmailAddress,
            STRING_AGG(CASE
                     WHEN MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' 
                     WHEN ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader'
                     WHEN NewsletterOptIn = 'TRUE' THEN 'Newsletter'
                  END, ', ') AS Subscriptions
    FROM
        UK_AGT_AgentForms_TEST_DE
    GROUP BY 
        EmailAddress
    

    If you still see duplicates, you can roll them up first with conditional aggregation in a nested query:

    SELECT  
        EmailAddress,
        -- Prefix each name with a comma, then let STUFF drop the leading one
        STUFF(
              CASE WHEN MarketingEmailOptIn > 0 THEN ',MarketingEmail' ELSE '' END
            + CASE WHEN ThoughtLeaderOptIn > 0 THEN ',ThoughtLeader' ELSE '' END
            + CASE WHEN NewsletterOptIn > 0 THEN ',Newsletter' ELSE '' END
            , 1, 1, '') AS Subscriptions
    FROM (
        SELECT EmailAddress
            , SUM(CASE WHEN MarketingEmailOptIn = 'TRUE' THEN 1 ELSE 0 END) MarketingEmailOptIn
            , SUM(CASE WHEN ThoughtLeaderOptIn = 'TRUE' THEN 1 ELSE 0 END) ThoughtLeaderOptIn
            , SUM(CASE WHEN NewsletterOptIn = 'TRUE' THEN 1 ELSE 0 END) NewsletterOptIn
        FROM UK_AGT_AgentForms_TEST_DE
        GROUP BY EmailAddress
    ) T
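
For completeness, STRING_AGG() can also be made duplicate-free by feeding it a de-duplicated derived table. This is a rough sketch, not part of the answer above: @Data and its rows are hypothetical sample data, and it assumes SQL Server 2017 or later.

```sql
-- Hypothetical sample data standing in for UK_AGT_AgentForms_TEST_DE
DECLARE @Data table (
    EmailAddress nvarchar(255),
    MarketingEmailOptIn nvarchar(50),
    NewsletterOptIn nvarchar(50),
    ThoughtLeaderOptIn nvarchar(50)
);

INSERT INTO @Data VALUES
    ( 'mike@mikemarks.com', 'TRUE', NULL, NULL ),
    ( 'mike@mikemarks.com', 'TRUE', 'TRUE', NULL ),
    ( 'mike@mikemarks.com', 'TRUE', NULL, 'TRUE' );

SELECT
    D.EmailAddress,
    -- WITHIN GROUP makes the order of names deterministic
    STRING_AGG(D.Subscription, ',') WITHIN GROUP (ORDER BY D.Subscription) AS Subscriptions
FROM (
    -- One row per distinct (address, subscription) pair
    SELECT DISTINCT A.EmailAddress, V.Subscription
    FROM @Data A
    CROSS APPLY (VALUES
        (CASE WHEN A.MarketingEmailOptIn = 'TRUE' THEN 'MarketingEmail' END),
        (CASE WHEN A.NewsletterOptIn = 'TRUE' THEN 'Newsletter' END),
        (CASE WHEN A.ThoughtLeaderOptIn = 'TRUE' THEN 'ThoughtLeader' END)
    ) AS V (Subscription)
    WHERE V.Subscription IS NOT NULL  -- drops flags that are not set
) AS D
GROUP BY D.EmailAddress;
```

One trade-off to be aware of: because rows without any set flag are filtered out, an address with no opt-ins does not appear in the result at all.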
    

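    【Comments】:

      【Solution 3】:

      Phew. I had to play around with this one. It may not be the perfect solution, but I think I managed to achieve what you're after. It doesn't use the STUFF function, though. It simply concatenates each string and then removes the last comma.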

      SELECT EmailAddress, CASE WHEN LEN(Subscriptions) > 0 THEN LEFT(Subscriptions, LEN(Subscriptions) - 1) ELSE '' END AS Subscriptions
      FROM (
          SELECT EmailAddress, CONCAT(
                  CASE WHEN SUM(CASE WHEN MarketingEmailOptIn = 'TRUE' THEN 1 ELSE 0 END) > 0 THEN 'MarketingEmail, ' ELSE '' END,
                  CASE WHEN SUM(CASE WHEN NewsletterOptIn = 'TRUE' THEN 1 ELSE 0 END) > 0 THEN 'Newsletter, ' ELSE '' END,
                  CASE WHEN SUM(CASE WHEN ThoughtLeaderOptIn = 'TRUE' THEN 1 ELSE 0 END) > 0 THEN 'ThoughtLeader, ' ELSE '' END
              ) AS Subscriptions
          FROM UK_AGT_AgentForms_TEST_DE 
          GROUP BY EmailAddress
      ) AS a
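
If SQL Server 2017 or later is available, the trailing-comma trimming can probably be avoided altogether with CONCAT_WS(), which skips NULL arguments and only places the separator between the values that remain. A minimal sketch, again using a hypothetical @Data table variable in place of the real table:

```sql
-- Hypothetical sample data standing in for UK_AGT_AgentForms_TEST_DE
DECLARE @Data table (
    EmailAddress nvarchar(255),
    MarketingEmailOptIn nvarchar(50),
    NewsletterOptIn nvarchar(50),
    ThoughtLeaderOptIn nvarchar(50)
);

INSERT INTO @Data VALUES
    ( 'mike@mikemarks.com', 'TRUE', NULL, NULL ),
    ( 'mike@mikemarks.com', 'TRUE', 'TRUE', NULL );

SELECT
    EmailAddress,
    -- Each CASE yields the name when any row opted in, otherwise NULL;
    -- CONCAT_WS ignores the NULLs, so no stray separators are produced.
    CONCAT_WS(',',
        CASE WHEN MAX(CASE WHEN MarketingEmailOptIn = 'TRUE' THEN 1 ELSE 0 END) = 1 THEN 'MarketingEmail' END,
        CASE WHEN MAX(CASE WHEN NewsletterOptIn = 'TRUE' THEN 1 ELSE 0 END) = 1 THEN 'Newsletter' END,
        CASE WHEN MAX(CASE WHEN ThoughtLeaderOptIn = 'TRUE' THEN 1 ELSE 0 END) = 1 THEN 'ThoughtLeader' END
    ) AS Subscriptions
FROM @Data
GROUP BY EmailAddress;
```

Here an address with no opt-ins yields an empty string rather than NULL.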
      

      【Comments】:
