【Question title】: Unable to extract div HTML content in Scrapy (Python)
【Posted】: 2022-07-15 23:16:59
【Question】:

I am scraping some data from this URL.

I want to extract the HTML content of the description div.

Here is my code:

response.xpath("//*[@id='tab-description']/p").extract()

But it also returns extra data that I do not want.

I expect the output to look like this:

<p>    <strong>Brand Name: </strong>NoEnName_Null  <br>  <strong>Material: </strong>Cloth  <br>  <strong>Warning: </strong>2+  <br>  <strong>Function: </strong>Cooperation/Interpersonal Relations Developing  <br>  <strong>Dimensions: </strong>2/3/4/5/6 M  <br>  <strong>Design: </strong>Other  <br>  <strong>Age Range: </strong>&gt; 3 years old  <br>  <strong>Sports: </strong>Gymnastics  <br>  <strong>Type: </strong>Other  <br>  <strong>Gender: </strong>Unisex  <br>  <strong>Diameter: </strong>2/3/4/5/6 M  <br>  <strong>Material: </strong>210T Polyester fabric, 450mm water proof  <br>  <strong>Color: </strong>as the photo  <br>  <strong>handle number: </strong>8-28   </p>

【Question discussion】:

    Tags: python html web-scraping scrapy scrapy-shell


    【Answer 1】:
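    from bs4 import BeautifulSoup
    import requests
    
    # Fetch the product page and parse it with Python's built-in HTML parser
    r = requests.get('https://bbdealz.com/product/funny-sports-game-2m-3m-4m-5m-6m-diameter-outdoor-rainbow-umbrella-parachute-toy-jump-sack-ballute-play-game-mat-toy-kids-gift/')
    soup = BeautifulSoup(r.text, 'html.parser')
    # Select the description tab by its id
    info = soup.select_one('#tab-description')
    # .text prints only the visible text; print(str(info)) would print the HTML markup instead
    print(info.text)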
    

    【Discussion】:
