【Posted】:2012-01-07 09:48:55
【Question】:
Hi everyone, I'm trying to find the parameters of a Shockwave Flash video in a web page's source. The source looks like this:
<object align="middle" width="480" height="320" viewastext="" id="player" codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,0,0" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000">
<param value="sameDomain" name="allowScriptAccess">
<param value="http://mediawebsite.com/lcmplayer.swf?autoStart=1&hidecontrols=1&&noresize=1&file=http%3A%2F%2Ftx02.us.mediawebsite.com%2Fedge2%2F31dfty452611%26sec%3D1090" name="movie">
<param value="best" name="quality">
<param value="#000000" name="bgcolor">
<param value="true" name="allowFullScreen">
<param value="" name="FlashVars">
<embed align="middle" width="480" height="320" pluginspage="http://www.macromedia.com/go/getflashplayer" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="sameDomain" name="player" bgcolor="#000000" flashvars="" quality="best" src="http://mediawebsite.com/lcmplayer.swf?autoStart=1&hidecontrols=1&&noresize=1&file=http%3A%2F%2Ftx02.us.mediawebsite.com%2Fedge2%2F31dfty452611%26sec%3D1090">
</object>
All I need to get from the above is this:
http://mediawebsite.com/lcmplayer.swf?autoStart=1&hidecontrols=1&&noresize=1&file=http%3A%2F%2Ftx02.us.mediawebsite.com%2Fedge2%2F31dfty452611%26sec%3D1090
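or whatever is in the HTML code. Oh, and of course the link in the HTML may change on every refresh, which is why I need to read whatever is inside the param rather than hard-code it.
I'm using HtmlAgilityPack, as stated in the title, with VB.NET 2008.
This is the code I'm currently using to load the HTML and parse it: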
Imports HtmlAgilityPack
Imports System.Text.RegularExpressions
Private Sub getVidLink()
    Dim doc As New HtmlDocument()
    'doc.LoadHtml("<html><body><p><table id=""foo""><tr><th>hello</th></tr><tr><td>world</td></tr></table></body></html>")
    doc.Load("C:\kathryn\fpHTML.html")
    For Each table As HtmlNode In doc.DocumentNode.SelectNodes("//object")
        Debug.Print("Found: " + table.Id)
        For Each row As HtmlNode In table.SelectNodes("param")
            Debug.Print(row.Id)
        Next
    Next
End Sub
But it doesn't find any values for the params. They all come back blank...
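A likely explanation, offered as a sketch rather than a confirmed fix: `row.Id` returns an element's `id` attribute, and none of the `<param>` elements above has one, so each iteration prints an empty string. Assuming the goal is the `value` attribute of the param whose `name` is `movie`, something along these lines may work. `GetMovieUrl` and the module name are hypothetical names introduced only for illustration; `SelectSingleNode` and `GetAttributeValue` are HtmlAgilityPack methods.

```vb
Imports System.Diagnostics
Imports HtmlAgilityPack

Module VidLinkSketch
    ' Hypothetical helper (not from the original post): returns the value
    ' attribute of the <param name="movie"> element, or "" if none is found.
    Function GetMovieUrl(ByVal doc As HtmlDocument) As String
        ' SelectSingleNode returns Nothing when the XPath matches nothing.
        Dim movieParam As HtmlNode = doc.DocumentNode.SelectSingleNode("//param[@name='movie']")
        If movieParam Is Nothing Then
            Return String.Empty
        End If
        ' Read the attribute by name; the second argument is the fallback value.
        Return movieParam.GetAttributeValue("value", String.Empty)
    End Function

    Sub Main()
        Dim doc As New HtmlDocument()
        doc.Load("C:\kathryn\fpHTML.html")
        Debug.Print("Movie URL: " & GetMovieUrl(doc))
    End Sub
End Module
```

Selecting by the `name` attribute instead of by position should keep this working even if the order of the `<param>` elements changes between page loads. In the original loop, replacing `row.Id` with `row.GetAttributeValue("value", "")` would print every param's value instead.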
Any help would be great!
David
【Comments】:
Tags: html vb.net html-parsing html-agility-pack