What is the XPath to find only the first occurrence of a class tag in each div? Answers

[Question title]: What is the XPath to find only the first occurrence of a class tag in each div?
[Posted]: 2014-08-04 10:50:07
[Question description]:

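I'm trying to scrape some text from a website that has a list of products. What is the XPath to get the text of only the first occurrence of a class tag in each div? In the code below, I need the text of the first span "bar" in each div "foo".

So I need an XPath that gives me only "year A", "year C", etc.

I'm new to this and don't know how to do it. Any help you can offer is much appreciated!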

<div class="foo">                       
    <span class="bar">year A</span>
    <span class="qux">some text</span>
    <span class="bar">year B</span>
</div>

<div class="foo">                       
    <span class="bar">year C</span>
    <span class="qux">some text</span>
    <span class="bar">year D</span>
</div>

Etc.

Using something like //span[@class='bar'][1]/text() gets me only "year A".

Using something like //*[contains(@class, 'bar')]/text() gets me "year A", "year B", "year C" and "year D".

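I'm scraping multiple pages, and the number of items differs from page to page. The class name "bar" is used only for the elements I need, so the problem described in "What is the XPath expression to find only the first occurrence?" does not apply here.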

[Question discussion]:

Tags: html xpath

[Solution 1]:

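With //div[@class = 'foo']/span[@class = 'bar'][1] you select, in each div with class foo, the first child span whose class attribute is bar. If the parent's class or name does not matter, use //*/span[@class = 'bar'][1].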
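As a rough illustration of what "first matching child span per parent" means, here is a minimal Python sketch using only the standard library. It does not evaluate the XPath above directly: xml.etree.ElementTree implements only a subset of XPath, so the sketch calls find() once per parent instead, which returns the first matching child. The <root> wrapper and the helper name first_bar_per_parent are assumptions made for this example, and real-world HTML would usually need an HTML parser rather than an XML one.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a hypothetical <root> element
# so that it parses as a single well-formed XML document.
SAMPLE = """
<root>
  <div class="foo">
    <span class="bar">year A</span>
    <span class="qux">some text</span>
    <span class="bar">year B</span>
  </div>
  <div class="foo">
    <span class="bar">year C</span>
    <span class="qux">some text</span>
    <span class="bar">year D</span>
  </div>
</root>
"""


def first_bar_per_parent(root, parent_path=".//div[@class='foo']"):
    """Return the first child span with class 'bar' of each matched parent.

    Hypothetical helper for illustration; parent_path is an ElementTree path.
    """
    firsts = []
    for parent in root.findall(parent_path):
        # find() returns only the first matching child, or None if there is none.
        span = parent.find("span[@class='bar']")
        if span is not None:
            firsts.append(span)
    return firsts


root = ET.fromstring(SAMPLE)
years = [span.text for span in first_bar_per_parent(root)]
print(years)  # ['year A', 'year C']
```

Passing ".//*" as parent_path corresponds to the second variant, where the parent's name and class are ignored; on this sample it yields the same two spans.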

[Discussion]:

[Solution 2]:

This one works well in an XPath tester:

//div[@class='foo']/span[@class='bar'][1]/text()

If you don't actually need the text nodes themselves, you can also leave out text() and select the span elements:

//div[@class='foo']/span[@class='bar'][1]
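One practical reason to select the elements rather than the text nodes when scraping a product list: an XPath node-set simply omits any div that has no matching span, so the results can drift out of step with the products. The sketch below, again standard-library Python rather than a direct XPath evaluation, keeps one entry per div and reads the text afterwards. The markup is hypothetical (the second div deliberately has no span "bar"), as is the <root> wrapper.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second div has no span with class "bar".
SAMPLE = """
<root>
  <div class="foo">
    <span class="bar">year A</span>
    <span class="bar">year B</span>
  </div>
  <div class="foo">
    <span class="qux">some text</span>
  </div>
  <div class="foo">
    <span class="qux">some text</span>
    <span class="bar">year C</span>
  </div>
</root>
"""

root = ET.fromstring(SAMPLE)

# Element form: one entry per div, holding the first span "bar" or None.
elements = [
    div.find("span[@class='bar']")
    for div in root.findall(".//div[@class='foo']")
]

# Text form: reduce each element to its string, keeping None as a placeholder.
texts = [el.text if el is not None else None for el in elements]
print(texts)  # ['year A', None, 'year C']
```

With the element in hand, attributes remain available as well, for example elements[0].get("class").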

[Discussion]:

Great, this seems to work well for the example provided. When the code gets more complex, the accepted answer offers more precision.