【Question Title】: Remove empty paragraph elements from HTML via SQL
【Posted】: 2020-02-20 05:40:58
【Question Description】:

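This post has been edited because the original wording and structure asked for the work to be done for me, rather than asking for the best way to do it.

Problem

The email template (shown below) contains tags (enclosed in "$" characters) that a SQL Server trigger replaces with the relevant data (if it is not null). After processing, some tags may not have been replaced.

I can replace those tags with empty varchars, but that leaves large areas of blank space when the email is rendered in an email client or browser.

Embedding JavaScript in the email is a consideration for later development, but it is outside the scope of my work. This must be achieved entirely in SQL.

Question

What is a smarter way to achieve the same result?

Tags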

$ChargeAmount_Changed$
$StartDate_Changed$
$EndDate_Changed$
$EscalationAmount_Changed$
$AccountCode_Changed$
$COSAccountCode_Changed$
$InvoiceDesc_Changed$
$BillingPeriod_Changed$
$BillingCycle_Changed$
$FinanceParty_Changed$
$FinanceAmount_Changed$
$BillingCustomerCode_Changed$
$ChargeAmount_Crtitical$
$StartDate_Critical$
$EndDate_Critical$
$EscalationAmount_Critical$
$AccountCode_Critical$
$COSAccountCode_Critical$
$InvoiceDescription_Critical$
$BillingPeriod_Critical$
$BillingCycle_Critical$
$FinanceParty_Critical$
$FinanceAmount_Critical$
$BillingCustomerCode_Critical$

Code

CREATE FUNCTION fn_bpo_SALSEmail_RemoveTags(
        @Html       VARCHAR(MAX)
,       @EmailFlag  VARCHAR(50)
)
RETURNS VARCHAR(MAX)
WITH    ENCRYPTION
AS 
BEGIN
    DECLARE @RowID INT
    ,   @Tag VARCHAR(50)
    ,   @StartIndex INT = 0
    ,   @EndIndex INT
    ,   @StartTag VARCHAR(50)
    ,   @EndTag VARCHAR(50)
    ,   @RowVar VARCHAR (MAX)
    ,   @DeterminedRow INT = 0

    DECLARE @Tags TABLE    (
        fldTagID    INT IDENTITY (1, 1) PRIMARY KEY
    ,   fldTag      VARCHAR(50) NOT NULL
    );

    INSERT INTO @Tags
    SELECT      fldTag 
    FROM        tblSALSEmailLoopingFields WITH (NOLOCK)
    WHERE       fldEmailFlag = @EmailFlag 


    SELECT @RowID = COUNT(fldTag) FROM @Tags

    WHILE @RowID <> 0
    BEGIN   
        SET @DeterminedRow = 0
        SELECT  @Tag = fldTag 
        FROM    @Tags
        WHERE   fldTagID = @RowID

        SET @StartIndex = PATINDEX('%' + @Tag +'%', @Html)
        SET @EndIndex = @StartIndex + LEN(@Tag)

        -- Expression found
        IF @StartIndex > 0
        BEGIN
        -- Have not found the whole row yet
        WHILE @DeterminedRow = 0
        BEGIN
            --See if the index is at the start of the opening paragraph element for the row
            SET @StartTag = SUBSTRING(@Html, @StartIndex, LEN('<p'))

            --See if the index is at the end of the closing paragraph element for the row
            SET @EndTag = SUBSTRING(@Html, @EndIndex,  LEN('</p>'))
            --March the start index back
            IF @StartTag <> '<p' 
            BEGIN
                SET @StartIndex = @StartIndex - 1
            END

            --March the end index forward
            IF @EndTag <> '</p>'
            BEGIN
                SET @EndIndex = @EndIndex + 1
            END

            -- Found the whole row! replace with empty var
            IF @StartTag = '<p' AND @EndTag = '</p>'
            BEGIN
                SET @DeterminedRow = 1
                -- Get the whole row
                SET @RowVar = SUBSTRING(@Html, @StartIndex, @EndIndex - @StartIndex + LEN(@EndTag))
                SET @Html = REPLACE(@Html, @RowVar, '')
            END
        END
        END
        SET @Html = REPLACE(@Html, @Tag, '') 
        SET @RowID = @RowID - 1
    END

    RETURN @Html
END

HTML template


-- Start of template
<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title> </title>
    <style type="text/css">
        .cs2654AE3A {
            text-align: left;
            text-indent: 0pt;
            margin: 0pt 0pt 0pt 0pt
        }

        .cs37BA8FCA {
            color: #000000;
            background-color: transparent;
            font-family: Verdana;
            font-size: 9pt;
            font-weight: normal;
            font-style: normal;
        }
    </style>
</head>

<body>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">A contract item fee has been ??fldAction?? on Contract No.
            ??fldContractNo?? , for customer ??fldCustomerCode?? .</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">The fee ??fldAction?? is ??fldFeeType??</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">Located on ??fldItemType?? : ??fldItemCode??</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$ChargeAmount_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$ChargeAmount_Crtitical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$StartDate_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$StartDate_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$EndDate_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$EndDate_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$EscalationAmount_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$EscalationAmount_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$AccountCode_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$AccountCode_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$COSAccountCode_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$COSAccountCode_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$InvoiceDesc_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$InvoiceDescription_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$BillingPeriod_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$BillingPeriod_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$BillingCycle_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$BillingCycle_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$FinanceParty_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$FinanceParty_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$FinanceAmount_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$FinanceAmount_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$BillingCustomerCode_Changed$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">$BillingCustomerCode_Critical$</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">This change was made by : ??fldEmpFullName??</span></p>
    <p class="cs2654AE3A"><span class="cs37BA8FCA">&nbsp;</span></p>
</body>

</html>
-- End of template

【Question Discussion】:

    Tags: html sql-server optimization


    【Solution 1】:

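    In my opinion, this is not a job for the database.

    My approach would be: search for a tag that was not replaced, then search backwards for

    <p
    

    save that index (e.g. from PATINDEX), then search forwards for

    </p>
    

    and now you can delete everything between the two indexes. (Don't forget to account for the length of the search string here.)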
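The backward and forward search described above could be sketched in T-SQL roughly as follows. This is only an illustration under assumptions: the sample @Html and @Tag values are made up, paragraphs are assumed not to be nested, and '<p' is assumed to occur only at the start of a paragraph element (it would also match '<pre', for example). CHARINDEX is used rather than PATINDEX because PATINDEX treats the underscore in the tag names as a single-character wildcard.

```sql
-- Hypothetical sample input; in the question these come from the function.
DECLARE @Html VARCHAR(MAX) =
      '<p class="a"><span>Kept</span></p>'
    + '<p class="a"><span>$StartDate_Changed$</span></p>'
    + '<p class="a"><span>Also kept</span></p>';
DECLARE @Tag VARCHAR(50) = '$StartDate_Changed$';

DECLARE @TagPos INT = CHARINDEX(@Tag, @Html);
DECLARE @RevPos INT, @StartIndex INT, @EndIndex INT;

IF @TagPos > 0
BEGIN
    -- Backward search: reverse the text in front of the tag and find 'p<',
    -- which is the nearest '<p' when read from right to left.
    SET @RevPos = CHARINDEX('p<', REVERSE(LEFT(@Html, @TagPos - 1)));

    -- Forward search for the closing tag, starting at the tag itself.
    SET @EndIndex = CHARINDEX('</p>', @Html, @TagPos);

    IF @RevPos > 0 AND @EndIndex > 0
    BEGIN
        -- Map the reversed position back to the position of '<' in @Html.
        SET @StartIndex = @TagPos - 1 - @RevPos;

        -- Delete from '<p' through '</p>'; the + 4 accounts for LEN('</p>').
        SET @Html = STUFF(@Html, @StartIndex, @EndIndex - @StartIndex + 4, '');
    END
END

SELECT @Html AS CleanedHtml;
```

With the sample input, the middle paragraph is removed and the two surrounding paragraphs are left unchanged. If no enclosing paragraph is found, @Html is returned as it was.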

    Another approach: if you have to handle a task like this, you can do it easily on the front end with JavaScript/jQuery.

    $('p').each(function( index ) {if($(this).html() === '') { this.remove()}})
    

    This removes all empty paragraphs.

    Edit:

    Loop through the HTML and select each paragraph into a table variable. Then you can handle it easily.

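    -- @Html is assumed to be declared already (e.g. the function parameter).
    DECLARE @tbl TABLE (
        [ID] INT IDENTITY(1,1) NOT NULL,
        [paragraph] NVARCHAR(MAX) NOT NULL
    );

    DECLARE @StartIndex INT
        ,   @EndIndex INT;

    WHILE CHARINDEX('<p', @Html) > 0
    BEGIN
        SET @StartIndex = CHARINDEX('<p', @Html);
        -- Look for the closing tag only after the opening one
        SET @EndIndex = CHARINDEX('</p>', @Html, @StartIndex);

        -- Unclosed paragraph: stop rather than loop forever
        IF @EndIndex = 0 BREAK;

        -- Copy the whole paragraph, including the 4 characters of '</p>'
        INSERT INTO @tbl ([paragraph])
        SELECT SUBSTRING(@Html, @StartIndex, @EndIndex - @StartIndex + 4);

        -- Continue with the text that follows '</p>'
        SET @Html = SUBSTRING(@Html, @EndIndex + 4, LEN(@Html));
    END

    -- Text outside the paragraphs (e.g. <head>, whitespace between them) is not captured in @tbl
    SELECT * FROM @tbl;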
    

    Result

    You can rebuild your HTML string with STRING_AGG (SQL Server 2017+):

     -- rebuild the string, keeping the paragraphs in their original order
     SELECT STRING_AGG([paragraph], '') WITHIN GROUP (ORDER BY [ID]) FROM @tbl
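Once the paragraphs are in a table, the unreplaced ones can be removed with a single set-based DELETE before the string is rebuilt. The following is a minimal sketch under assumptions: the rows in @tbl and @Tags are made-up sample data, and @Tags is a hypothetical stand-in for the tag list the question loads from tblSALSEmailLoopingFields.

```sql
-- Hypothetical stand-ins for the paragraph table and the tag list.
DECLARE @tbl TABLE (
    [ID] INT IDENTITY(1,1) NOT NULL,
    [paragraph] NVARCHAR(MAX) NOT NULL
);
DECLARE @Tags TABLE (fldTag VARCHAR(50) NOT NULL);

INSERT INTO @tbl ([paragraph]) VALUES
    (N'<p><span>Charge amount changed</span></p>'),
    (N'<p><span>$StartDate_Changed$</span></p>'),
    (N'<p><span>End date changed</span></p>');

INSERT INTO @Tags (fldTag) VALUES
    ('$StartDate_Changed$'),
    ('$EndDate_Changed$');

-- Remove every paragraph that still contains one of the known tags.
DELETE p
FROM @tbl AS p
WHERE EXISTS (
    SELECT 1
    FROM @Tags AS t
    WHERE CHARINDEX(t.fldTag, p.[paragraph]) > 0
);

-- Rebuild the remaining paragraphs in their original order (SQL Server 2017+).
SELECT STRING_AGG([paragraph], N'') WITHIN GROUP (ORDER BY [ID]) AS RebuiltHtml
FROM @tbl;
```

Spacer paragraphs that contain only &nbsp; are not touched by this sketch; they would need a separate rule if the remaining gaps matter.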
    

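    【Discussion】:

    • That's a good suggestion! I had considered it, but it is outside the scope of my work. I have already managed to remove the paragraph elements from the template; I wanted to know whether there is a better way to remove the elements via SQL than the way I did it. The original post has been edited to reflect this.
    • I'm not at my desk right now, but I think there may be a second solution. I'll take a look later.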