XPath retrieve nodeContent: Answers

【Question title】: XPath retrieve nodeContent
【Posted】: 2011-11-01 08:30:01
【Question description】:

I want to retrieve the name of the "Course Leader" from this web page: http://www.westminster.ac.uk/schools/computing/undergraduate/computer-networks/bsc-honours-computer-network-security. How can I do this? I tried

//div[starts-with(@id,'content_div')]/*[self::h3 or self::h4 and .='Course Leader' or 'Course Leaders']/following-sibling::p[1]

but it returns the wrong data. I need to select the nodeContent that comes after "Course Leader".

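【Question comments】:

How are you retrieving the data? The W3C Validator says this is not valid XML.
Sorry, I pasted the wrong snippet; I have edited the question. I am also using hpple to parse the HTML data.

Tags: html objective-c xml xpath html-parsing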

【Solution 1】:

Try this XPath:

//div[starts-with(@id, 'content_div')]
    /p[
        (preceding-sibling::*[1][self::h3] or preceding-sibling::*[1][self::h4]) 
            and (preceding-sibling::*[1] = 'Course Leader' 
                or preceding-sibling::*[1] = 'Course Leaders')
     ]
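
To see why the expression from the question over-selects, note that in XPath 1.0 `and` binds more tightly than `or`, and a non-empty string literal such as 'Course Leaders' converts to true. The predicate in the question is therefore true for every child of the div. The sketch below compares both expressions using Java's standard javax.xml.xpath package rather than hpple. The sample markup, the class name CourseLeaderXPath, and the name "Dr Jane Doe" are invented for illustration; the real page is HTML that may not be well-formed XML, so it would need an HTML parser first.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CourseLeaderXPath {
    // Hypothetical, well-formed stand-in for the real page markup.
    static final String SAMPLE =
        "<html><body><div id='content_div_1'>"
      + "<h3>Course Overview</h3><p>Overview text</p>"
      + "<h4>Course Leader</h4><p>Dr Jane Doe</p>"
      + "<h4>Fees</h4><p>Fee text</p>"
      + "</div></body></html>";

    // Expression from the question: the predicate is always true.
    static final String ORIGINAL =
        "//div[starts-with(@id,'content_div')]"
      + "/*[self::h3 or self::h4 and .='Course Leader' or 'Course Leaders']"
      + "/following-sibling::p[1]";

    // Expression from the answer: test the element directly before each p.
    static final String FIXED =
        "//div[starts-with(@id, 'content_div')]"
      + "/p[(preceding-sibling::*[1][self::h3] or preceding-sibling::*[1][self::h4])"
      + " and (preceding-sibling::*[1] = 'Course Leader'"
      + " or preceding-sibling::*[1] = 'Course Leaders')]";

    // Evaluates an XPath 1.0 expression and returns the text of each matched node.
    static List<String> select(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Matches all three paragraphs in the sample, not just the course leader.
        System.out.println(select(ORIGINAL, SAMPLE));
        // Matches only the paragraph that directly follows "Course Leader".
        System.out.println(select(FIXED, SAMPLE));
    }
}
```

On this sample the expression from the question matches every paragraph, while the one above matches only the paragraph right after the "Course Leader" heading. Results on the live page depend on its actual markup.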

【Comments】: