【Question title】: How to replace the HTML-tagged text in a Word document in VB.NET
【Posted】: 2015-02-27 17:47:05
【Question】:

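I have VB.NET code that finds and replaces text in a Word document file (.docx). I am using OpenXml for this. However, I only want to replace text that is marked with HTML-style tags, and the tags should always be removed once the new text has been inserted into the document.

My code is: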

Imports System.IO
Imports System.Text.RegularExpressions
Imports DocumentFormat.OpenXml.Packaging

Public Sub SearchAndReplace(ByVal document As String)

    ' Open the .docx for editing; the Using block disposes the package on exit.
    Using wordDoc As WordprocessingDocument = WordprocessingDocument.Open(document, True)
        Dim docText As String = Nothing

        ' Read the raw XML of the main document part.
        Using sr As New StreamReader(wordDoc.MainDocumentPart.GetStream())
            docText = sr.ReadToEnd()
        End Using

        ' Replace the tagged placeholder in the raw XML.
        Dim regexText As New Regex("<ReplaceText>")
        docText = regexText.Replace(docText, "Hi Everyone!")

        ' Overwrite the main document part with the modified XML.
        Using sw As New StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create))
            sw.Write(docText)
        End Using
    End Using
End Sub

【Comments】:

  • You need to use capturing groups.
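
To illustrate the comment about capturing groups, here is a minimal sketch on a plain string. The module name `CaptureGroupSketch` and the helper `ReplaceTagged` are hypothetical, and the pattern assumes a tag is a run of word characters between angle brackets. Group 1 captures only the tag name, while the whole match (brackets included) is what gets replaced, so the tag disappears together with the placeholder.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CaptureGroupSketch
    ' Hypothetical helper: replaces every <Name> placeholder, brackets included.
    ' Note: "$" in the replacement string is treated as a substitution marker.
    Function ReplaceTagged(input As String, replacement As String) As String
        ' Group 1 captures the tag name; the full match also covers "<" and ">".
        Dim pattern As New Regex("<(\w+)>")
        Return pattern.Replace(input, replacement)
    End Function

    Sub Main()
        Dim m As Match = Regex.Match("Dear <ReplaceText>,", "<(\w+)>")
        ' Prints the captured tag name: ReplaceText
        Console.WriteLine(m.Groups(1).Value)
        ' Prints: Dear Hi Everyone!,
        Console.WriteLine(ReplaceTagged("Dear <ReplaceText>,", "Hi Everyone!"))
    End Sub
End Module
```

This only demonstrates the regex part. Inside a .docx, a literal "<" typed in Word is stored escaped (as `&lt;`) in the underlying XML, so a pattern run against the raw XML would have to account for that.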

Tags: vb.net


【Solution 1】:

Here is something that may help you solve the problem.

Imports System.Text.RegularExpressions
Module Module1
    Sub Main()
        Dim Text As String = "Blah<foo>Blah"
        'Print the original text
        Console.WriteLine(Text)
        'Group 1 captures the opening "<", group 2 the closing ">"
        Dim regex As New Regex("(<)[\w/]+(>)")
        'Print the text after replacing what lies between capturing groups 1 and 2.
        'Capturing groups are the parts of the pattern enclosed in parentheses.
        Console.WriteLine(regex.Replace(Text, "$1foo has been replaced.$2"))
        'Update Text
        Text = regex.Replace(Text, "$1foo has been replaced.$2")
        'Remove the first "<" (InStr is 1-based, Remove is 0-based)
        Dim p As Integer = InStr(Text, "<")
        Text = Text.Remove(p - 1, 1)
        'Remove the first ">"
        Dim pp As Integer = InStr(Text, ">")
        Text = Text.Remove(pp - 1, 1)
        'Print the final text
        Console.WriteLine(Text)
        Console.ReadLine()
    End Sub

End Module

Output:

Blah<foo>Blah
Blah<foo has been replaced.>Blah
Blahfoo has been replaced.Blah

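The code above will not work if there is more than one tag per line, because only the first "<" and the first ">" are removed.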
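
One way around the multiple-tag limitation is to replace each whole tag in a single pass, so no separate bracket-removal step is needed. The following is a rough sketch, not part of the original answer: the module name `MultiTagSketch`, the helper `ReplaceAllTags`, and the dictionary of tag names to replacement values are all assumptions made for illustration. It uses the `Regex.Replace` overload that takes a `MatchEvaluator`, and leaves unknown tags untouched.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module MultiTagSketch
    ' Hypothetical helper: replaces every <Name> tag with the value stored
    ' under "Name"; tags without an entry are left unchanged.
    Function ReplaceAllTags(input As String, values As IDictionary(Of String, String)) As String
        Return Regex.Replace(input, "<(\w+)>",
            Function(m As Match) As String
                Dim replacement As String = Nothing
                ' Group 1 holds the tag name without the angle brackets.
                If values.TryGetValue(m.Groups(1).Value, replacement) Then
                    Return replacement
                End If
                Return m.Value
            End Function)
    End Function

    Sub Main()
        Dim values As New Dictionary(Of String, String) From {
            {"Name", "Alice"},
            {"City", "Paris"}
        }
        ' Prints: Hello Alice from Paris, <Unknown>!
        Console.WriteLine(ReplaceAllTags("Hello <Name> from <City>, <Unknown>!", values))
    End Sub
End Module
```

Because the evaluator returns the replacement text directly, "$" characters in the values are not interpreted as substitution markers.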

I would recommend not using regular expressions to parse HTML.
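
For the .docx case in the question, one alternative to running a regex over the raw XML is to let the Open XML SDK expose the text nodes and only touch their contents. This is a sketch under assumptions, not a tested solution: the module `OpenXmlSketch` and the helper `ReplaceTaggedText` are hypothetical names, and it assumes each tag sits entirely inside a single text element. Word often splits text across several runs, in which case a tag spanning two elements would not be found.

```vbnet
Imports System.Linq
Imports System.Text.RegularExpressions
Imports DocumentFormat.OpenXml.Packaging
' Alias avoids a name clash between the Text element and the System.Text namespace.
Imports W = DocumentFormat.OpenXml.Wordprocessing

Module OpenXmlSketch
    ' Hypothetical helper: replaces every <tag> found inside a single text
    ' element of the main document part with the given replacement.
    Sub ReplaceTaggedText(path As String, replacement As String)
        Using wordDoc As WordprocessingDocument = WordprocessingDocument.Open(path, True)
            Dim body = wordDoc.MainDocumentPart.Document
            ' The Text property is already unescaped, so "<" appears literally here.
            For Each t As W.Text In body.Descendants(Of W.Text)().ToList()
                t.Text = Regex.Replace(t.Text, "<\w+>", Function(m As Match) replacement)
            Next
            body.Save()
        End Using
    End Sub
End Module
```

A call such as `ReplaceTaggedText("C:\temp\sample.docx", "Hi Everyone!")` (the path is a placeholder) would then update the file in place.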

【Comments】:
