【Question title】: Cannot access Beautifulsoup second div
【Posted】: 2023-09-26 20:16:01
【Question】:

I have the HTML below and I want to access "Sisa 1", but it always fails. Can anyone help?

<dd>
<div class="product-list__stock--branch">
 <div data-id="2" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Online / COD</b>
    <span>Stok tersedia</span>
 </div>
 <div data-id="3" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Toko Semarang</b>
    <span>Stok tersedia</span>
    <span class="tag2 tag--warning" style="color:white;">Sisa 1</span>
  </div>
  <div class="product-list__stock--available-branch-trigger product-list__stock--available-branch-trigger--sold-out">
     Tidak tersedia di toko lain.
  </div>
</div>
</dd>

【Comments】:

  • The div tags do not match. Are you sure the HTML is correct?
  • Yes, I scraped it from an Indonesian online shop.
  • let x = document.querySelector("span.tag2.tag--warning").innerText;
  • The result is still None.

Tags: python web-scraping beautifulsoup


【Solution 1】:

Here is a solution using BeautifulSoup:

div = soup.find("div", {"data-id": "3"})

This returns the div that contains Sisa 1. To get the actual "Sisa 1" text:

text = div.contents[2].text  # assumes there are no whitespace text nodes between the child tags
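
Indexing into `.contents` is sensitive to formatting: when the markup is indented as in the question, the whitespace between child tags is parsed as text nodes and shifts the positions. A minimal sketch, assuming the markup structure from the question (the trimmed `html` string and the names `child_tags` and `sisa_text` are illustrative), that keeps only real tags before indexing:

```python
from bs4 import BeautifulSoup, Tag

# Trimmed copy of the markup from the question (assumption: same structure).
html = """
<div data-id="3" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Toko Semarang</b>
    <span>Stok tersedia</span>
    <span class="tag2 tag--warning" style="color:white;">Sisa 1</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"data-id": "3"})

# .contents holds both tags and the whitespace strings between them,
# so filter down to Tag objects before relying on positions.
child_tags = [node for node in div.contents if isinstance(node, Tag)]

# The last child tag is the <span> that carries the stock label.
sisa_text = child_tags[-1].get_text(strip=True)
print(sisa_text)
```

Filtering by `Tag` keeps the positional idea of the answer while making it independent of how the page happens to be indented.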

【Comments】:

  • I tried it and the result is None. Here is my script: link = requests.get('jakartanotebook.com/…soup2 = BeautifulSoup(link,'lxml') div = soup2.find("div", {"data-id": "3"}) try : text = div.contents[2].text except Exception as e: text = None text="Not found" print(text) ... Sorry, I don't know how to format code in this comment box...

【Solution 2】:

You can access the element by its class like this:

from bs4 import BeautifulSoup

html = """
<dd>
<div class="product-list__stock--branch">
 <div data-id="2" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Online / COD</b>
    <span>Stok tersedia</span>
 </div>
 <div data-id="3" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Toko Semarang</b>
    <span>Stok tersedia</span>
    <span class="tag2 tag--warning" style="color:white;">Sisa 1</span>
  </div>
  <div class="product-list__stock--available-branch-trigger product-list__stock--available-branch-trigger--sold-out">
     Tidak tersedia di toko lain.
  </div>
</div>
</dd>
"""
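soup = BeautifulSoup(html, 'html.parser')
# find() returns the first element whose class list contains "tag2", or None if there is no match
all_rows = soup.find(class_="tag2")

print(all_rows.text)  # Sisa 1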
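
The `querySelector` suggestion from the question's comments has a close BeautifulSoup counterpart in `select_one`, which accepts a CSS selector. A small sketch, assuming the markup from the question; the trimmed `html` string and the name `label` are illustrative:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumption: same structure).
html = """
<div class="product-list__stock--branch">
  <div data-id="2" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Online / COD</b>
    <span>Stok tersedia</span>
  </div>
  <div data-id="3" data-stock="1" class="product-list__stock product-list__stock--ready">
    <b>Toko Semarang</b>
    <span>Stok tersedia</span>
    <span class="tag2 tag--warning" style="color:white;">Sisa 1</span>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Match the warning <span> only inside the branch with data-id="3".
tag = soup.select_one('div[data-id="3"] span.tag2.tag--warning')

# select_one returns None when nothing matches, so guard before reading text.
label = tag.get_text(strip=True) if tag is not None else None
print(label)
```

Scoping the selector to `div[data-id="3"]` avoids picking up a similar tag from another branch if the page lists several.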

【Comments】:

  • @NugrohoMoristianto Did you copy exactly the same code?
  • Here is my code: link = requests.get('jakartanotebook.com/… soup2 = BeautifulSoup(link,'lxml') all_rows = soup2.find(class_="tag2") try : text1 = all_rows.text except Exception as e: text1 = None text1="Not found" print(text1)
  • @NugrohoMoristianto Could you post a link to your code? (I can't copy it from here.)
  • @NugrohoMoristianto Every website uses different tags.
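
The scripts quoted in the comments pass the `requests` response object itself to `BeautifulSoup`, whereas the constructor expects markup (a string or a file-like object), so passing `response.text` is the usual pattern. The sketch below also checks for `None` explicitly instead of wrapping the lookup in a broad `try`/`except`. It is only an illustration: the helper names `extract_stock_tag` and `fetch_stock_tag` are hypothetical, the URL is left to the caller, and whether the live page actually contains a `tag2` element in its served HTML is an assumption that cannot be verified from the thread.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_stock_tag(markup: str) -> Optional[str]:
    """Return the text of the first element with class "tag2", or None."""
    soup = BeautifulSoup(markup, "html.parser")
    tag = soup.find(class_="tag2")
    if tag is None:
        # No such element in this markup (different tags, or content
        # that is not present in the HTML the server returned).
        return None
    return tag.get_text(strip=True)


def fetch_stock_tag(url: str) -> Optional[str]:
    """Hypothetical helper: download a page and extract the stock label."""
    import requests  # third-party; imported here so the parser above works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Pass the decoded body, not the Response object, to the parser.
    return extract_stock_tag(response.text)


print(extract_stock_tag('<span class="tag2 tag--warning">Sisa 1</span>'))
print(extract_stock_tag("<p>no stock tag here</p>"))
```

If `extract_stock_tag` returns `None` for the downloaded page, printing `response.text` and searching it for "Sisa" is a quick way to see whether the label is in the served HTML at all.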