JSoup: get the first child of a div

【Question title】: JSoup get first child of div
【Posted】: 2015-12-27 22:13:13
【Question】:

I am trying to parse a structure like the one below with JSoup.

<div class="bigClass">
    <a href="foo.com"> Field 1</a>
    <a href="bar.com"> Field 2</a>
    <a href="baz.com"> Field 3</a>
</div>

Right now I am using the following code to get the entire text content of the div with the class "bigClass":

doc = Jsoup.connect("http://foobar.com").userAgent(userAgent).timeout(1000).get();
price = doc.getElementsByClass("bigClass");
System.out.println(price.text());

How can I get only the first child ("Field 1"), regardless of the <a> element's class and URL?

A similar question for BeautifulSoup in Python: Beautiful soup getting the first child

【Question comments】:

Tags: java parsing jsoup

【Answer 1】:

You are probably looking for:

doc.getElementsByClass("bigClass").first().child(0)

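getElementsByClass("bigClass") returns every element that has the class bigClass,
but we want one specific element (presumably the first one), so we call first().
Then, on that first element, child(0) selects its first child element (child indices start at 0).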
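
The chain above can be tried out without a network request. The following is a minimal sketch, assuming the jsoup library is on the classpath; the class name FirstChildExample and the helper firstChildText are made up for illustration, and the HTML is the snippet from the question parsed from a string. It also guards against the two cases in which the one-liner would fail: no element with that class, or an element with no child elements.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical example class, not part of the original answer.
class FirstChildExample {

    // Returns the text of the first child element of the first element
    // carrying the given class, or null if there is no such element.
    static String firstChildText(String html, String className) {
        Document doc = Jsoup.parse(html);

        // first() returns null when no element has the class.
        Element container = doc.getElementsByClass(className).first();
        if (container == null || container.children().isEmpty()) {
            return null;
        }

        // child(0) counts child elements only; text nodes are skipped.
        return container.child(0).text();
    }

    public static void main(String[] args) {
        String html = "<div class=\"bigClass\">"
                + "<a href=\"foo.com\"> Field 1</a>"
                + "<a href=\"bar.com\"> Field 2</a>"
                + "<a href=\"baz.com\"> Field 3</a>"
                + "</div>";

        System.out.println(firstChildText(html, "bigClass"));
    }
}
```

Because text() normalizes and trims whitespace, the leading space inside the link does not appear in the result.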

【Comments】:

Simpler/shorter: doc.select("div.bigClass > a:first-child").

【Answer 2】:

You can also use one of the following two options:

Option 1

doc.select("div.bigClass > a:first-of-type");

Demo: http://try.jsoup.org/~btbp8Fb1xrPf38dTYbplLz5lA3Y

Option 2

doc.select("div.bigClass > a:first-child");

Demo: http://try.jsoup.org/~mj8CAaWTtQEicyd75bSHDV3_KeA
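
The two options return the same element for the markup in the question, but they are not interchangeable in general. The sketch below assumes jsoup is on the classpath and uses a modified version of the question's HTML in which a <span> comes before the links; the class name SelectorComparison, the helper textOf, and the extra <span> are assumptions added for illustration. With that markup, a:first-of-type still finds the first <a>, while a:first-child matches nothing because the div's first child is not an <a>.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical example class, not part of the original answer.
class SelectorComparison {

    // Returns the combined text of all elements matched by the CSS query,
    // or an empty string when nothing matches.
    static String textOf(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).text();
    }

    public static void main(String[] args) {
        // Same structure as in the question, plus a leading <span>.
        String html = "<div class=\"bigClass\">"
                + "<span>Label</span>"
                + "<a href=\"foo.com\"> Field 1</a>"
                + "<a href=\"bar.com\"> Field 2</a>"
                + "</div>";

        // First <a> among its siblings, whatever precedes it.
        System.out.println("first-of-type: '"
                + textOf(html, "div.bigClass > a:first-of-type") + "'");

        // An <a> that is also the very first child; none exists here.
        System.out.println("first-child:   '"
                + textOf(html, "div.bigClass > a:first-child") + "'");
    }
}
```

If the goal is "the first link inside the div", first-of-type is the more forgiving choice; if the goal is "whatever element comes first", child(0) from Answer 1 does not depend on the tag name at all.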

【Comments】: