【Question title】: How to fetch an image URL from an img tag inside a div on the Flipkart site?
【Posted】: 2019-09-20 15:34:04
【Question】:

I am trying to fetch image URLs from the Flipkart site using Beautiful Soup, but I get a KeyError. I tried to read the URL from the img tags with a given class, using 'alt src' as the attribute key.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.flipkart.com/men/shirts/casual-party-wear-shirts/prsid=2oq,s9b,mg4,vg6&p[]=facets.price_range.from%3DMin&p[]=facets.price_range.to%3D799&otracker=sp_browse_announcement_search.flipkart.com")
html = BeautifulSoup(r.text, 'lxml')

# Find every <img> tag whose class is _3togXc
for img in html('img', '_3togXc'):
    print(img['alt src'])  # KeyError: no attribute is named 'alt src'

The expected result is the image URL:

src="https://rukminim1.flixcart.com/image/309/371/jtsz3bk0/shirt/p/n/r/3xl-twtblshirtful-sh4-tripr-original-imaffycxgppmkknv.jpeg?q=50"

...but I get a KeyError instead.

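【Comments】:

  • Could you update your post to show the entire error traceback? That would make the problem much easier to spot.
  • When I open that URL I get: "Unfortunately the page you are looking for has been moved or deleted".

Tags: python web-scraping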


【Solution 1】:

The following code should help you get started:

import requests
from bs4 import BeautifulSoup

url = 'https://matplotlib.org/tutorials/introductory/sample_plots.html'
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# find() returns only the first match; use select() to get every match
image_div = soup.find('div', {'class': 'figure align-center'})  # the complete div element
image_tag = image_div.select('img')  # list of <img> elements inside the div
imageLink = image_tag[0]['src']  # 'src' and 'alt' are separate attributes
imageAlt = image_tag[0]['alt']

# Some manipulation, if required: turn the relative path into an absolute URL
imageLink = imageLink.replace('../../', 'https://matplotlib.org/')
print(imageLink)
print(imageAlt)

Also see this page for some useful selectors: https://sites.google.com/view/way2learnings/programming-languages/python/python-libraries/beautifulsoup
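Applied to the question, the same idea might look like the sketch below. It runs against a small inline HTML snippet rather than the live Flipkart page, so the markup, the `image_urls` helper, and the `https://example.com/` base URL are all made up for illustration; only the `_3togXc` class name comes from the question. The point is that `alt` and `src` are two separate attributes, and that `Tag.get()` returns `None` for a missing attribute where indexing would raise a KeyError.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a product listing page.
SAMPLE_HTML = """
<div class="card">
  <img class="_3togXc" alt="Blue shirt" src="/image/309/371/shirt-blue.jpeg?q=50">
  <img class="_3togXc" alt="Shirt without a src attribute">
  <img class="other" alt="Logo" src="/logo.png">
</div>
"""


def image_urls(html, css_class, base_url):
    """Return absolute src URLs of the <img> tags that carry css_class."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img", class_=css_class):
        src = img.get("src")  # None (not a KeyError) when the attribute is missing
        if src:
            # Resolve relative paths against the page URL
            urls.append(urljoin(base_url, src))
    return urls


print(image_urls(SAMPLE_HTML, "_3togXc", "https://example.com/"))
```

If the real page fills in `src` lazily or through JavaScript, the attribute may simply be absent from the HTML that `requests` receives. In that case the loop skips the tag instead of crashing, and it is worth printing `img.attrs` to see which attributes are actually there.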

【Comments】:
