【Question title】: CSS Selector only selects the first row
【Posted】: 2016-02-05 15:36:23
【Question】:

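I am parsing an HTML page and have a very long CSS selector (I can't come up with a shorter one because the page markup is so poor). It should select every tr in a table, but it only selects the second row. What am I missing?

Selector: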

body > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(3) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(8) > td:nth-child(1) > table:nth-child(4) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) tr:not(:first-child)

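The page contains several nested tables, but the first 90% of the selector hardly matters. Once the table I want is selected, I follow it with "[space]tr:not(...)", so it should select all descendant rows, shouldn't it?

Sample HTML page (I can't link to the real one because it requires a login): http://pastebin.com/gprXTvzz

Once the selector has reached the table I want (at ... > tbody:nth-child(1) tr:not(:first-child)), the markup looks like this: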

<tbody>
   <tr valign="bottom">
      <td class="blackmedium" width="80"><b>Part Number</b></td>
      <td class="blackmedium" width="100"><b>Manufacturer</b></td>
      <td class="blackmedium" width="40"><b>Abbr.</b></td>
      <td class="blackmedium" width="50"><b>WIX Part Number</b></td>
      <td class="blackmedium" width="50"><b>Lead Time</b></td>
   </tr>
   <tr>
      <td class="blackmedium" width="80">A0002701098</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="http://www.wixindustrialfilters.com/cross.aspx?Part=W03AT780" target="_blank">W03AT780</a>
      </td>
      <td class="blackmedium" width="50">
         STOCK
      </td>
   </tr>
   <tr bgcolor="#e0e0e0">
      <td class="blackmedium" width="80">A0002701598 Discontinued</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=58892','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">58892</a>
      </td>
      <td class="blackmedium" width="50">
      </td>
   </tr>
   <tr>
      <td class="blackmedium" width="80">A0002772395</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=51249','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">51249</a>
      </td>
      <td class="blackmedium" width="50">
      </td>
   </tr>
   <tr bgcolor="#e0e0e0">
      <td class="blackmedium" width="80">A0002772895</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=57701','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">57701</a>
      </td>
      <td class="blackmedium" width="50">
      </td>
   </tr>
</tbody>

【Comments】:

    Tags: html css-selectors html-parsing


    【Answer 1】:

    body > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(3) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(8) > td:nth-child(1) > table:nth-child(4) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) tr:not(:first-child)

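    This doesn't exactly answer your question, but when the markup is hard to parse and I need to find an element nested deep inside horrible table markup, I prefer to locate it by the presence of a specific header inside it. In this case, you can find the table that has the Part Number header. Example XPath: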

    //table[tr[1]/td/b = "Part Number"]
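To illustrate the "find the table by its header" idea without an XPath engine, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It assumes the markup is well-formed enough to parse as XML, which real pages often are not; the `SAMPLE` string and the helper names `direct_rows` and `find_table_by_header` are hypothetical and not part of the original page or answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the nested page markup.
SAMPLE = """<html><body>
<table><tr><td>
  <table>
    <tr><td><b>Part Number</b></td><td><b>Manufacturer</b></td></tr>
    <tr><td>A0002701098</td><td>MERCEDES-BENZ</td></tr>
    <tr><td>A0002772395</td><td>MERCEDES-BENZ</td></tr>
  </table>
</td></tr></table>
</body></html>"""


def direct_rows(table):
    """Rows that belong to this table itself, with or without a <tbody>."""
    return table.findall("./tr") + table.findall("./tbody/tr")


def find_table_by_header(root, header):
    """Return the first table whose first row has a bold cell equal to header."""
    for table in root.iter("table"):  # document order, outer tables first
        rows = direct_rows(table)
        if not rows:
            continue
        bold_cells = rows[0].findall("./td/b")
        if any((b.text or "").strip() == header for b in bold_cells):
            return table
    return None


root = ET.fromstring(SAMPLE)
table = find_table_by_header(root, "Part Number")
row_count = len(direct_rows(table))
print(row_count)
```

The outer wrapper table is skipped because its first row has no bold header cell, so only the inner parts table matches, however deeply it is nested.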
    

    Then you can apply the "not first child" CSS selector to that table:

    tr:not(:first-child)
    

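    Alternatively, you can use the adjacent sibling selector (it matches a tr element that immediately follows another tr element, which logically excludes the first row):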

    tr + tr
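The two selectors pick the same rows when every child of the parent is a tr. The sketch below emulates both rules by hand in Python, purely as an illustration; it is not a CSS engine, and the `TBODY` fragment and the function names `not_first_child` and `adjacent_sibling` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the parts table body.
TBODY = ET.fromstring(
    "<tbody>"
    "<tr><td>Part Number</td></tr>"
    "<tr><td>A0002701098</td></tr>"
    "<tr><td>A0002701598 Discontinued</td></tr>"
    "</tbody>"
)


def not_first_child(parent, tag):
    """Emulate `tag:not(:first-child)` among the children of parent."""
    return [el for i, el in enumerate(parent) if el.tag == tag and i > 0]


def adjacent_sibling(parent, tag):
    """Emulate `tag + tag`: a tag element directly preceded by a tag element."""
    children = list(parent)
    return [
        el
        for prev, el in zip(children, children[1:])
        if prev.tag == tag and el.tag == tag
    ]


by_not_first = [tr.find("td").text for tr in not_first_child(TBODY, "tr")]
by_adjacent = [tr.find("td").text for tr in adjacent_sibling(TBODY, "tr")]
print(by_not_first == by_adjacent, by_not_first)
```

The results would differ only if a non-tr element sat between rows or before the first row, which is not the case in the table shown in the question.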
    

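    Hope this simplifies things.

    【Discussion】:

    • I can't use XPath, but I solved it by first getting all the tables and then, since I know the index of the one I need, selecting all of its tr elements in the next statement. Yours should work too. (Using jSoup)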
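The index-based workaround described in this comment can be sketched as follows. The commenter used jSoup; this version uses Python's `xml.etree.ElementTree` only as a stand-in, and the `PAGE` markup and the `TABLE_INDEX` value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page: a wrapper table holding two others.
PAGE = ET.fromstring("""<html><body>
<table><tbody><tr><td>
  <table><tbody><tr><td>navigation</td></tr></tbody></table>
  <table><tbody>
    <tr><td>Part Number</td></tr>
    <tr><td>A0002772395</td></tr>
    <tr><td>A0002772895</td></tr>
  </tbody></table>
</td></tr></tbody></table>
</body></html>""")

TABLE_INDEX = 2  # assumption: position of the wanted table in document order

tables = list(PAGE.iter("table"))  # step 1: every table, outer tables first
target = tables[TABLE_INDEX]       # step 2: pick the table by its known index
rows = list(target.iter("tr"))     # step 3: all rows inside that table
cells = [row.find("td").text for row in rows]
print(cells)
```

This depends on the table's position staying stable across pages, and `iter("tr")` would also return rows of any tables nested inside the target, so it is more fragile than matching on the header text.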