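For the top news item on each page, you can get the image source directly from the 'src' attribute.
First use the find() method to navigate to the div that contains the image. Then, within that div, find the img tag and read its source from its attributes.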
import requests
from bs4 import BeautifulSoup
url = 'https://www.engadget.com/reviews/latest/page/10/'
res = requests.get(url)
soup = BeautifulSoup(res.text, 'html.parser')
# the div that wraps the top story's thumbnail
div = soup.find('div', {"class": "o-rating_thumb@m-"})
print(div.find('img').attrs['src'])
Output:
https://o.aolcdn.com/images/dims?resize=810%2C455&crop=810%2C455%2C0%2C0&quality=80&image_uri=https%3A%2F%2Fo.aolcdn.com%2Fimages%2Fdims%3Fcrop%3D1400%252C933%252C0%252C0%26quality%3D85%26format%3Djpg%26resize%3D1600%252C1066%26image_uri%3Dhttp%253A%252F%252Fo.aolcdn.com%252Fhss%252Fstorage%252Fmidas%252F85a4e2b124ba329ab520e80e306f07eb%252F206517051%252FIMG_5243e.jpg%26client%3Da1acac3e1b3290917d92%26signature%3Dcea6158d0bf02768d31ee67f2694be6cafaf200c&client=amp-blogside-v2&signature=08a97a1109f1c3287c6766fa284104c6f78770fe
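If the class name is missing from a page, find() returns None, and calling .find('img') on it raises an AttributeError. Below is a minimal sketch of a guarded lookup. The helper name first_image_src and the sample markup are hypothetical stand-ins for illustration, not the real Engadget page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<div class="o-rating_thumb@m-"><img src="https://example.com/top.jpg"></div>
"""

def first_image_src(html, div_class="o-rating_thumb@m-"):
    """Return the src of the first img inside the matching div, or None."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", {"class": div_class})
    if div is None:        # no div with that class on this page
        return None
    img = div.find("img")
    if img is None:        # the div contains no image
        return None
    return img.get("src")  # Tag.get returns None when the attribute is absent

print(first_image_src(SAMPLE_HTML))
print(first_image_src("<p>no matching div</p>"))
```

Returning None instead of raising lets the caller decide whether a missing image is an error or just something to skip.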
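Edit, to scrape every news item on the page:
Although the first image has a src attribute, the subsequent images require the data-original attribute instead (you can confirm this by viewing the page source). I think that is why you were getting the AttributeError.
I was able to scrape all the news items like this: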
import requests
from bs4 import BeautifulSoup
url='https://www.engadget.com/reviews/latest/page/10/'
res=requests.get(url)
soup=BeautifulSoup(res.text,'html.parser')
articles=soup.find_all('article',{"class":"o-hit"})
for article in articles:
    print("Heading: ", article.find('h2').text.strip())  # heading
    print("Summary: ", article.find('p').text.strip())  # summary
    print("Image Source:", article.find('img').attrs['data-original'])  # image src
    print()
Output:
Heading: Netflix will remove user reviews from its website next month
Summary: Last year five-star ratings got the ax, and now written reviews will fade away too.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=300%2C200&quality=80&image_uri=https%3A%2F%2Fs.aolcdn.com%2Fhss%2Fstorage%2Fmidas%2F884e68f9a829f3a26db5b729f00ccd03%2F206508290%2FEnglish.jpg&client=amp-blogside-v2&signature=b37eb21e95cef8cebe1f3c741b8bb29eb3471dcc
Heading: Smart ForTwo Electric Drive quick spin review
Summary: The saddest way to spend $25,000.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=300%2C200&quality=80&image_uri=https%3A%2F%2Fs.aolcdn.com%2Fhss%2Fstorage%2Fmidas%2Fedbdfdfeff2e77567cd0c4a73484d108%2F206502307%2Fsmartfortwo.jpg&client=amp-blogside-v2&signature=a9fc05d80d4b4d8ba6ef33453510c138632bab81
Heading: Vivo's all-screen NEX S is a frustrating glimpse of the future
Summary: Spoiler alert: It's really cool, but don't bother importing one.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F29%2F5b36ac0e523dc352bd46785a%2F5b36aedc884c2354eb33d663_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=725c8033196a2ae3500e2144830d14b03e7abc0e
Heading: Sonos Beam review: Smart features trump minor audio compromises
Summary: Bringing the soundbar into the smart home era.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F27%2F5b32f579523dc352bd3f66f3%2F5b32fbf2884c2354eb33d62f_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=4ad311aeb5cb23907fd99ec12d962b148646163d
Heading: BlackBerry KEY2 review: The undisputed keyboard king
Summary: This is the best Android-powered BlackBerry, if that means anything to you.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F26%2F5b3188ee523dc36212a7ff02%2F5b318be5802b94347b7e586b_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=5438cdf814480be5856d38db73695f86ade186ea
Heading: Amazon Echo Look review: Good selfie taker, so-so stylist
Summary: An AI is no match for my style instincts.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F25%2F5b30cbfce880db6107cb7ad0%2F5b30cde61aa5fc22c7bbf187_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=308e9f00afcb968da05823ce0d0718ccc6e43cb4
Heading: Mitsubishi’s Outlander Plug-In Hybrid is an understated surprise
Summary: Mitsubishi is back, even though it actually never left.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F21%2F5b2bc80f523dc36212a2be79%2F5b2bc8a6884c2319c410c008_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=a00b8466fa281051de4d64b1223fe99f97315985
Heading: Amazon Fire TV Cube review: Alexa still needs work as a TV guide
Summary: This device was bound to be made at some point, but is it worth it?
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F21%2F5b2bb81edbaab36faf00ed0e%2F5b2bddfb884c2319c410c00c_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=baa2db64e12d013ab712d823238fc3efeee693f8
Heading: HTC U12+ review: Fundamentally flawed
Summary: The phone's pressure-sensitive power and volume keys are kinda the worst.
Image Source: https://o.aolcdn.com/images/dims?thumbnail=386%2C217&quality=80&image_uri=https%3A%2F%2Fimg.vidible.tv%2Fprod%2F2018-06%2F21%2F5b28cd94f50775726418990a%2F5b2bd7d4b46ab33c496c1607_1920x1080_U_v1.jpg&client=amp-blogside-v2&signature=8518ce5c141fb85b935794fbd3bd283d32508484
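
Since some images carry the URL in src and others in data-original, one option is to try data-original first and fall back to src. This is a rough sketch under that assumption. The helper names image_url and article_images and the sample markup are made up for illustration, and the real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one plain image, one lazy-loaded image, one article without an image.
SAMPLE_HTML = """
<article class="o-hit"><h2>First</h2><img src="https://example.com/a.jpg"></article>
<article class="o-hit"><h2>Second</h2><img data-original="https://example.com/b.jpg"></article>
<article class="o-hit"><h2>Third</h2></article>
"""

def image_url(img):
    """Prefer data-original (lazy-loaded images), fall back to src, else None."""
    if img is None:
        return None
    return img.get("data-original") or img.get("src")

def article_images(html):
    """Return one image URL (or None) per article with class o-hit."""
    soup = BeautifulSoup(html, "html.parser")
    articles = soup.find_all("article", {"class": "o-hit"})
    return [image_url(article.find("img")) for article in articles]

for url in article_images(SAMPLE_HTML):
    print("Image Source:", url)
```

Checking data-original first also covers the case where a lazy-loaded img has a placeholder in src and the real URL in data-original.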