【Question title】: Remove linebreak node in HtmlAgilityPack?
【Posted】: 2010-09-10 22:39:23
【Question】:

I am trying to retrieve this text from a web page without the line break:

<span class="listingTitle">888-I-AM-JUNK. Canada's most trusted BIG LOAD junk removal<br />specialist!</span></a>

How can I do this?

Here is my code so far. I am using VB.

        Dim content As String = ""
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.Load(WebBrowser1.DocumentStream)
        Dim hnc As HtmlAgilityPack.HtmlNodeCollection = doc.DocumentNode.SelectNodes("//span[@class='listingTitle']")
        For Each link As HtmlAgilityPack.HtmlNode In hnc
            Dim replaceUnwanted As String = ""
            replaceUnwanted = link.InnerText.Replace("&amp;", "&")
            replaceUnwanted = replaceUnwanted.Replace("&#39;", "'")
            replaceUnwanted = replaceUnwanted.Replace("See full business details", "")

            content &= replaceUnwanted & vbNewLine
        Next
        RichTextBox1.Text = content
        Me.RichTextBox1.Lines = Me.RichTextBox1.Text.Split(New Char() {ControlChars.Lf}, _
                                                   StringSplitOptions.RemoveEmptyEntries)

I need to remove the <br />.

【Comments】:

    Tags: vb.net html-agility-pack


    【Solution 1】:

    How about using the same kind of ordinary string manipulation?

    replaceUnwanted = replaceUnwanted.Replace(vbCrLf, "")
    

    If you are dealing with the HTML inside the <span>...</span>:

    ' Note: ToLower() lowercases the entire string, not only the tags.
    replaceUnwanted = replaceUnwanted.ToLower().Replace("<br>", "").Replace("<br />", "")
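
    Calling ToLower() before Replace also lowercases the surrounding text (for example "BIG LOAD"). As a minimal sketch of one way around that, and not part of the original answer, the tags can be matched case-insensitively with a regular expression. The module name, the helper name StripLineBreaks, and the sample string below are hypothetical and exist only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module BrCleanupSketch
    ' Hypothetical helper: removes <br>, <br/> and <br /> in any letter case,
    ' then removes CR and LF characters. All other text keeps its casing.
    Function StripLineBreaks(ByVal text As String) As String
        Dim withoutTags As String = Regex.Replace(text, "<br\s*/?>", "", RegexOptions.IgnoreCase)
        Return withoutTags.Replace(vbCr, "").Replace(vbLf, "")
    End Function

    Sub Main()
        ' Assumed sample input modeled on the markup in the question.
        Dim sample As String = "BIG LOAD junk removal<BR />" & vbCrLf & "specialist!"
        Console.WriteLine(StripLineBreaks(sample))
    End Sub
End Module
```

    In the question's loop this would stand in for the chained Replace calls. If the words on either side of the break should stay separated, a single space could be used as the replacement instead of an empty string.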
    

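    【Comments】:

    • Thanks a lot p.cambell, "replaceUnwanted = replaceUnwanted.ToLower().Replace(vbCrLf, "")" did the trick. I don't know how I didn't think of that.
    • @Datadayne: You bet, glad to help. Obviously ToLower() doesn't really buy you anything in the vbCrLf case; I had just copied and pasted it from the BR example. I made the edit just for kicks. Here's an upvote for your question!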