【Question Title】: Remove HTML tags from string in Excel VBA
【Posted】: 2024-01-25 14:11:01
【Question】:

I want to remove all HTML tags from a string in Excel VBA.

For example:

before_text = "text1 <br> text2 <a href = 'www.data.com' id = 'data'>text3</a> text4"

after_text = RemoveTags(before_text)

Expected result:

after_text = "text1  text2 text3 text4"

【Question Discussion】:

Tags: string vba excel tags


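【Solution 1】:
Use the vbscript.regexp object and replace everything that looks like a tag with an empty string. It is created with CreateObject (late binding), so no library reference has to be added to the project.

Code: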

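Function RemoveHTML(ByVal text As String) As String
    ' Late-bound regex object, so no project reference is required
    Dim regexObject As Object
    Set regexObject = CreateObject("vbscript.regexp")

    With regexObject
        ' "<", optional "!", any run of characters other than "<" or ">", then ">".
        ' Matches ordinary tags and simple comments; a comment that itself
        ' contains "<" or ">" is not matched as a whole.
        .Pattern = "<!*[^<>]*>"
        .Global = True        ' replace every match, not only the first
        .IgnoreCase = True
        .MultiLine = True
    End With

    ' Only the markup is removed; surrounding whitespace is left untouched
    RemoveHTML = regexObject.Replace(text, "")
End Function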
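
Because the function above only deletes the tags, the spaces around them remain (hence the double space in "text1  text2" in the question's expected result). If single spaces are preferred, a follow-up pass can collapse whitespace. The following is a minimal sketch under that assumption; the names RemoveHTMLAndTidy and DemoRemoveHTMLAndTidy are hypothetical and not part of the original answer.

```vba
' Hypothetical variant: strip tags, then collapse whitespace runs and trim.
Function RemoveHTMLAndTidy(ByVal text As String) As String
    Dim re As Object
    Set re = CreateObject("vbscript.regexp")
    re.Global = True

    ' Pass 1: same tag/comment pattern as in the answer
    re.Pattern = "<!*[^<>]*>"
    text = re.Replace(text, "")

    ' Pass 2: turn every run of whitespace into a single space
    re.Pattern = "\s+"
    text = re.Replace(text, " ")

    RemoveHTMLAndTidy = Trim$(text)
End Function

Sub DemoRemoveHTMLAndTidy()
    ' Writes the result to the Immediate window (Ctrl+G in the VBA editor)
    Debug.Print RemoveHTMLAndTidy("text1 <br> text2 <a href = 'www.data.com' id = 'data'>text3</a> text4")
    ' prints: text1 text2 text3 text4
End Sub
```

Note that this also replaces line breaks inside the text with spaces, which may or may not be wanted for multi-line cell content.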

【Discussion】:

    【Solution 2】:

    Building on @zhihar's answer: to remove all HTML from the selected cells, you can loop over the selection.

    Function RemoveHTML(text As String) As String
        Dim regexObject As Object
        Set regexObject = CreateObject("vbscript.regexp")
    
        With regexObject
            .Pattern = "<!*[^<>]*>"    'html tags and comments
            .Global = True
            .IgnoreCase = True
            .MultiLine = True
        End With
    
        RemoveHTML = regexObject.Replace(text, "")
    End Function
    
    
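    Sub StripHtmlSelected()
        Dim cell As Range    ' declared so the code compiles under Option Explicit

        ' Selection can be a chart or shape; only continue for a cell range
        If TypeName(Selection) <> "Range" Then Exit Sub

        For Each cell In Selection.Cells
            ' Skip formulas and non-text values (numbers, errors, blanks)
            If Not cell.HasFormula And VarType(cell.Value) = vbString Then
                cell.Value = RemoveHTML(cell.Value)
            End If
        Next cell
    End Sub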
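
For large selections, calling RemoveHTML per cell creates a new regex object each time. One possible refinement, sketched here as an assumption rather than as part of the original answer, is to build the regex once and reuse it inside the loop. The name StripHtmlSelectedShared is hypothetical.

```vba
' Hypothetical variant: one shared regex object for the whole selection.
Sub StripHtmlSelectedShared()
    Dim re As Object
    Dim cell As Range

    ' Do nothing if a chart, shape, etc. is selected instead of cells
    If TypeName(Selection) <> "Range" Then Exit Sub

    Set re = CreateObject("vbscript.regexp")
    re.Pattern = "<!*[^<>]*>"   ' same tag/comment pattern as RemoveHTML
    re.Global = True

    For Each cell In Selection.Cells
        ' Only touch constant text cells; formulas, numbers and errors are skipped
        If Not cell.HasFormula And VarType(cell.Value) = vbString Then
            cell.Value = re.Replace(cell.Value, "")
        End If
    Next cell
End Sub
```

To try it, select some cells that contain HTML text and run the macro from the Macros dialog (Alt+F8). The change overwrites the cell contents, so testing on a copy of the data is advisable.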
    

    【Discussion】: