[Posted]: 2014-04-23 19:22:09
[Question]:
I want to parse this file (only the important part is shown):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
...
</head>
<body onload="Xaprb.InputMask.setupElementMasks()">
<div align="center">
<table> ... </table>
<table width="900" height="500" border="0" cellpadding="0"
cellspacing="0" class="content">
<tr>
<td width="45"> </td>
<td width="210" valign="top">
<div class="np_table">
<div class="np_bl">
<div class="np_br">
<div class="np_tl">
<div class="np_tr">
<span class="name_heading">Hello</span><br />
<span class="name_content">**NAME I NEED**</span><br />
<br /> <span class="name_heading">Number:</span><br />
<span class="name_content">**NUMBER I NEED**</span>
</div>
</div>
</div>
</div>
</div> <br>
<div class="menu"> ... </div>
<p> </p>
</td>
<td width="600" valign="top">
<div class="content_table">
<div class="ct_bl">
<div class="ct_br">
<div class="ct_tl">
<div class="ct_tr">
<span class="heading">...</span>
<p><b>**I need this number too: 250**</b> <br />
<br />
Here is the datum I want: **17-04-2014**. <br />
Please do not...</p>
<p><b>...</b></p>
<br /><br>
</div>
</div>
</div>
</div>
</div>
</td>
</body>
</html>
Now I want four strings: the two numbers, the date, and the name. I have this code:
HttpClient client = new HttpClient();
var doc = new HtmlAgilityPack.HtmlDocument();
var html = await client.GetStringAsync("http://example.com");
doc.LoadHtml(html);
var name = ???
var numberone = ???
var numbertwo = ???
var date = ???
But I don't know how to get these values with HTML Agility Pack. Can anyone help me, or give me a hint?
[Comments]:
-
You may find this useful: stackoverflow.com/questions/846994/…
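-
A possible starting point, sketched only against the HTML fragment in the question. It uses LINQ over `Descendants()` rather than XPath, because as far as I know the Windows Phone build of HTML Agility Pack does not expose `SelectNodes`/`SelectSingleNode`. The `PageFields` class and its `Extract` method are hypothetical names made up for this sketch, and the regular expressions assume the number is the first run of digits in the `<b>` element and the date is written as dd-MM-yyyy, as in the sample.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical holder for the four values; not part of HTML Agility Pack.
public sealed class PageFields
{
    public string Name;
    public string NumberOne;
    public string NumberTwo;
    public string Date;

    public static PageFields Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // In the sample, the name and the first number are the only
        // <span class="name_content"> elements, in that order.
        var nameSpans = doc.DocumentNode.Descendants("span")
            .Where(s => s.GetAttributeValue("class", "") == "name_content")
            .Select(s => HtmlEntity.DeEntitize(s.InnerText).Trim())
            .ToList();

        // The second number and the date sit in the first <p>
        // inside <div class="ct_tr">.
        var ctTr = doc.DocumentNode.Descendants("div")
            .FirstOrDefault(d => d.GetAttributeValue("class", "") == "ct_tr");
        var para = ctTr == null ? null : ctTr.Descendants("p").FirstOrDefault();
        var bold = para == null ? null : para.Descendants("b").FirstOrDefault();

        var boldText = bold == null ? "" : HtmlEntity.DeEntitize(bold.InnerText);
        var paraText = para == null ? "" : HtmlEntity.DeEntitize(para.InnerText);

        // Assumption: first run of digits in the bold text is the number,
        // and the date has the form dd-MM-yyyy.
        var numberMatch = Regex.Match(boldText, @"\d+");
        var dateMatch = Regex.Match(paraText, @"\d{2}-\d{2}-\d{4}");

        return new PageFields
        {
            Name = nameSpans.Count > 0 ? nameSpans[0] : null,
            NumberOne = nameSpans.Count > 1 ? nameSpans[1] : null,
            NumberTwo = numberMatch.Success ? numberMatch.Value : null,
            Date = dateMatch.Success ? dateMatch.Value : null
        };
    }
}
```

With the code from the question, this would be called as `var fields = PageFields.Extract(html);` after `GetStringAsync` returns. Any field whose element is missing from the real page comes back as `null`, so check for that before using the values.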
Tags: c# html parsing windows-phone-8