[Posted]: 2019-02-14 22:15:33
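[Question]:
I have this fairly complex HTML that I want to parse with JSoup. I have tried several things, but none of them worked. Basically, I want to get the second table, read all of its rows, and append them to a string.
My attempt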
val document = Jsoup.parse(it.data)
val tableElements = document.select("table:eq(2) > tbody")
for (element in tableElements) {
    val data = element.select("td")
    try {
        Timber.i("${data[0].select("small").text()} : ${data[1].select("small").text()}")
    } catch (e: Exception) {
    }
}
The part I want to extract
<table>
<tbody>
<tr class="">
<td class="odsazena" align="left"><small>User's identification number: </small></td>
<td class="odsazena" align="left"><small>34565</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Study programme: </small></td>
<td class="odsazena" align="left"><small>Informatics</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Type of study: </small></td>
<td class="odsazena" align="left"><small>Bachelor</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Form of study: </small></td>
<td class="odsazena" align="left"><small>full-time, attendance method</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Standard length of study: </small></td>
<td class="odsazena" align="left"><small>3</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Number of credits required to complete your study: </small></td>
<td class="odsazena" align="left"><small>180</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Number of credits enrolled for the whole study: </small></td>
<td class="odsazena" align="left"><small>120</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Number of credits obtained during your whole course of study: </small></td>
<td class="odsazena" align="left"><small>90</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Your prospective academic degree: </small></td>
<td class="odsazena" align="left"><small>Bc.</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Beginning of study: </small></td>
<td class="odsazena" align="left"><small>09/01/2017</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Resolution of admission: </small></td>
<td class="odsazena" align="left"><small>Admitted without the entrance exam</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Progress of study: </small></td>
<td class="odsazena" align="left"><small>enrolled</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Mode of completion: </small></td>
<td class="odsazena" align="left"><small><i>not stated</i></small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Current financing: </small></td>
<td class="odsazena" align="left"><small>study fully financed from ME SK</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Final thesis topic: </small></td>
<td class="odsazena" align="left"><small><i>not stated</i></small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Final thesis supervisor: </small></td>
<td class="odsazena" align="left"><small><i>not stated</i></small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Language of study: </small></td>
<td class="odsazena" align="left"><small>Slovak</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Card number:</small></td>
<td class="odsazena" align="left"><small>123456</small></td>
</tr>
</tbody>
</table>
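Now, what exactly is the problem? With my attempt, the code never prints what I want: in its current state it simply skips the for loop. What I am trying to do is select the second table ("table:eq(2)") and get the elements inside its "tbody".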
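One thing that may be worth checking: in Jsoup, `:eq(n)` matches on the zero-based sibling index, so `table:eq(2)` selects a table that is the third child of its parent rather than the second table in the document. If nothing matches, the selection is empty and the loop body never runs. Below is a minimal sketch of the intended behaviour, assuming the target really is the second `<table>` in document order (nested tables would also count) and that each row has at least two cells. The function name `secondTableAsText` is hypothetical and used only for illustration.

```kotlin
import org.jsoup.Jsoup

// Hypothetical helper: returns one "label value" line per row of the
// second <table> found in the HTML, or an empty string if there is none.
fun secondTableAsText(html: String): String {
    val document = Jsoup.parse(html)
    // select("table") returns tables in document order; index 1 is the second one.
    val table = document.select("table").getOrNull(1) ?: return ""
    return table.select("tr")
        .map { row -> row.select("td") }
        // Skip rows that do not have both a label cell and a value cell.
        .filter { cells -> cells.size >= 2 }
        .joinToString("\n") { cells -> "${cells[0].text()} ${cells[1].text()}" }
}
```

Iterating over `tr` elements, rather than calling `select("td")` on the whole `tbody`, keeps each label paired with its value. Calling `text()` on the cell also picks up values wrapped in `<i>`, such as "not stated".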
[Comments]: