Unable to select an HTML element with a BeautifulSoup CSS selector, although the same CSS selector finds the element in JS

【Question title】: Unable to select HTML element using BeautifulSoup CSS-selector but was able to get the element in JS using CSS-selectors
【Posted】: 2019-09-09 21:15:35
【Question】:

I am using Python and the BeautifulSoup HTML parser to select an HTML element, but I cannot get it to work.

# Log in with the requests Session so that it can be reused
response = requests_session.post(login_url, headers=headers, data=data_credentials)

search_url = 'https://www.website.com/search.php'
p_id = '342953'

response = requests_session.get(search_url, headers=headers, params={'query': p_id, 'type': 'p'})
redirected_urls = response.url
th_soup = BeautifulSoup(response.content, 'html.parser')
trx_ht = th_soup.select("body > table > tbody > tr > td > table > tbody > tr:nth-child(2) > td:nth-child(2) > div:nth-child(3) > table > tbody > tr:nth-child(11) > td > table > tbody > tr:nth-child(4) > td:nth-child(5) > input[type='hidden']:nth-child(1)")

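【Comments】:

What is website.com/search.php? I get a 404 and assume it is a dummy URL. Apart from the obvious omissions, I doubt this can be debugged without seeing the DOM you are trying to scrape.
@ggorlen That's not true. I used a dummy URL on purpose. My actual code uses the real URL.
@ggorlen But the selector is the real one, and it works in JavaScript code.
OK, but how can I debug your code without the DOM? Also, it is best not to roll back formatting edits that clearly improve the post.
@ggorlen is right. If you don't provide the actual code that causes the problem, nobody can help.

Tags: python-3.x beautifulsoup css-selectors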

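【Solution 1】:

Based on the HTML you provided on pastebin, the hidden input can be located with a .find_all() call that filters on specific attributes. If the field you want always starts with qtyb-, you can combine a regular expression with BeautifulSoup to find all matching elements, as follows: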

from bs4 import BeautifulSoup
import re

# Read the HTML in from a file (normally requests is used)
with open('sm7iXcUq.html', encoding='utf-8') as f_html:
    html = f_html.read()

soup = BeautifulSoup(html, 'html.parser')

# BeautifulSoup applies the pattern with search(), so anchor it with ^
# to match only names that start with qtyb-
for i in soup.find_all('input', attrs={'type': 'hidden', 'name': re.compile(r'^qtyb-')}):
    print(i)

For the HTML you provided, this returns a single element:

<input name="qtyb-52843099" type="hidden" value="1"/>

The value of name can be obtained with:

i['name']

This approach gives you every element whose name matches.
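To try the approach without the original pastebin file, here is a minimal, self-contained sketch. The `SAMPLE_HTML` string and the `hidden_qty_fields` helper are hypothetical and exist only for illustration; only the `qtyb-52843099` input comes from the output shown above.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page. Only the qtyb-52843099
# input is taken from the answer; the other inputs are made up.
SAMPLE_HTML = """
<form>
  <input type="hidden" name="qtyb-52843099" value="1"/>
  <input type="hidden" name="token" value="abc"/>
  <input type="text" name="qtyb-visible" value="2"/>
</form>
"""


def hidden_qty_fields(html):
    """Return {name: value} for hidden inputs whose name starts with 'qtyb-'."""
    soup = BeautifulSoup(html, "html.parser")
    # The compiled pattern is applied with search(), so ^ anchors it to the start.
    pattern = re.compile(r"^qtyb-")
    matches = soup.find_all("input", attrs={"type": "hidden", "name": pattern})
    # tag.get() returns None instead of raising if an input has no value attribute.
    return {tag["name"]: tag.get("value") for tag in matches}


fields = hidden_qty_fields(SAMPLE_HTML)
print(fields)
```

Returning a dictionary keeps each name together with its value, which may be convenient if the page contains several qtyb- fields.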

【Comments】:

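【Solution 2】:

You could also use the following. It assumes that input[value="1"][name] is constant across sources (the attribute value is quoted because an unquoted value that starts with a digit is not a valid CSS identifier):

soup.select_one('input[value="1"][name]')['name']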
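A CSS-only variant can also match on the name prefix rather than on the value. The sketch below is an illustration under assumptions, not the answerer's code: `SAMPLE_HTML` and `first_qty_name` are hypothetical names, and the `[name^="qtyb-"]` prefix selector is swapped in for the value-based one. It also guards against `select_one` returning `None` when nothing matches, since subscripting `None` would raise a `TypeError`.

```python
from bs4 import BeautifulSoup

# Hypothetical one-line document containing the input shown in Solution 1.
SAMPLE_HTML = '<form><input type="hidden" name="qtyb-52843099" value="1"/></form>'


def first_qty_name(html):
    """Return the name of the first hidden qtyb- input, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # [name^="qtyb-"] is the CSS "attribute starts with" selector.
    tag = soup.select_one('input[type="hidden"][name^="qtyb-"]')
    return tag["name"] if tag is not None else None


name = first_qty_name(SAMPLE_HTML)
print(name)
```

Matching on attributes instead of a long positional path avoids depending on the exact table nesting of the page.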

【Comments】: