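【Posted】: 2015-06-09 06:46:13
【Question】:
I'm just trying to pull some information from the AEC website (for example http://apps.aec.gov.au/eSearch/LocalitySearchResults.aspx?filter=3977&filterby=Postcode). The XPath query I'm running is "//x:tbody/x:tr/x:td[4]/x:a". I have tested it in XPath Checker (a Firefox extension), where it pulls out the relevant locality data.
I then use PHP to load the page, run the query, and iterate over the results.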
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
curl_close($ch);
# Create a DOM parser object
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXpath($dom);
$elements = $xpath->query( '//tbody/tr/td[4]/a');
foreach ($elements as $element) {
echo $element;
}
I then get:
Warning: Invalid argument supplied for foreach() in /home/givesh5/public_html/dig/electoratesearch.php on line 41
The query seems to return some kind of boolean rather than a list of matches?
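One way to check what the query actually returned is sketched below. It assumes the failing script ran the prefixed expression from XPath Checker; the one-row `$html` string is a hypothetical stand-in for the curl response, not the real page. The sketch separates two cases: an expression that cannot be evaluated (the `x:` prefix was never registered with `DOMXPath`), and a valid expression that simply matches nothing.

```php
<?php
// Hypothetical one-row stand-in for the fetched page (not the real AEC response).
$html = '<table><tr><td>VIC</td><td>SKYE</td><td>3977</td>'
      . '<td><a href="#">Isaacs</a></td></tr></table>';

libxml_use_internal_errors(true);   // collect HTML parse errors instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Case 1: the "x" prefix is not registered, so the expression cannot be evaluated.
// The @ only silences the accompanying warning for this demonstration.
$withPrefix = @$xpath->query('//x:tbody/x:tr/x:td[4]/x:a');
var_dump($withPrefix === false);

// Case 2: a valid expression, but the parsed source contains no <tbody>,
// so the result is a DOMNodeList with nothing in it.
$withTbody = $xpath->query('//tbody/tr/td[4]/a');
var_dump($withTbody instanceof DOMNodeList, $withTbody->length);
```

Passing `false` to `foreach` is what produces the "Invalid argument supplied for foreach()" warning, whereas an empty `DOMNodeList` just iterates zero times.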
The relevant markup is as follows:
<table cellspacing="0" rules="all" border="1" id="ContentPlaceHolderBody_gridViewLocalities" style="border-collapse:collapse;">
<tr class="headingLink">
<th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$StateAb')">State</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$LocalityNm')">Locality/Suburb</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$Postcode')">Postcode</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$DivisionNm')">Electorate</a></th><th scope="col"><a href="javascript:__doPostBack('ctl00$ContentPlaceHolderBody$gridViewLocalities','Sort$DivisionNmRedistributed')">Redistributed Electorate</a></th><th scope="col">Other Locality(s)</th>
</tr><tr>
<td>VIC</td><td>BOTANIC RIDGE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CANNONS CREEK</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE EAST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE EAST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE NORTH</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE SOUTH</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>CRANBOURNE WEST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Holt&filterby=Electorate&divid=216">Holt</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>DEVON MEADOWS</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>FIVEWAYS</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td><a href="LocalitySearchResults.aspx?filter=DEVON+MEADOWS&filterby=LocalityorSuburb&state=VIC">DEVON MEADOWS</a></td>
</tr><tr>
<td>VIC</td><td>JUNCTION VILLAGE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Flinders&filterby=Electorate&divid=211">Flinders</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>SANDHURST</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Isaacs&filterby=Electorate&divid=219">Isaacs</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>SKYE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Dunkley&filterby=Electorate&divid=210">Dunkley</a></td><td></td><td> </td>
</tr><tr>
<td>VIC</td><td>SKYE</td><td><a href="LocalitySearchResults.aspx?filter=3977&filterby=Postcode">3977</a></td><td><a href="LocalitySearchResults.aspx?filter=Isaacs&filterby=Electorate&divid=219">Isaacs</a></td><td></td><td> </td>
</tr>
</table>
【Comments】:
-
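DOMXPath::query() returns false if the expression is malformed or the context node is invalid.
Could you provide the relevant part of the markup you are parsing? An XPath derived from Firefox comes from the live DOM, which can contain implied markup that is not in the page source, so expressions obtained that way are unreliable. Also, what exactly are you trying to extract?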
-
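I've updated the OP with the markup, thanks. In this case I'm trying to get the link text for each locality (for example, the Text inside a link). For the first two rows, that would be "Flinders".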
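Putting the comments together, a minimal sketch of the extraction might look like the following. It assumes the markup matches what is posted in the question; the `$html` here is a trimmed, hypothetical copy of that table (with placeholder `href` values) rather than the live response. The expression drops both the `x:` prefix and the `tbody` step, since neither is present in the document that `loadHTML()` builds from the source, and it reads each link's text instead of echoing the element itself.

```php
<?php
// Trimmed, hypothetical copy of the table from the question; a real run would
// use the string returned by curl_exec() instead.
$html = <<<HTML
<table id="ContentPlaceHolderBody_gridViewLocalities">
<tr class="headingLink"><th>State</th><th>Locality/Suburb</th><th>Postcode</th><th>Electorate</th></tr>
<tr><td>VIC</td><td>BOTANIC RIDGE</td><td><a href="#">3977</a></td><td><a href="#">Flinders</a></td></tr>
<tr><td>VIC</td><td>CRANBOURNE</td><td><a href="#">3977</a></td><td><a href="#">Holt</a></td></tr>
</table>
HTML;

libxml_use_internal_errors(true);   // collect HTML parse errors instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Rows are direct children of the table in the source, so there is no tbody step.
$links = $xpath->query(
    '//table[@id="ContentPlaceHolderBody_gridViewLocalities"]/tr/td[4]/a'
);

$electorates = [];
if ($links !== false) {             // query() returns false when it cannot evaluate
    foreach ($links as $link) {
        // A DOMElement cannot be echoed directly; read its text content instead.
        $electorates[] = trim($link->textContent);
    }
}

echo implode("\n", $electorates), "\n";
```

Anchoring on the table's `id` keeps the query from matching fourth-column links in any other table on the page; whether that `id` stays stable on the live site is an assumption.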