【Posted】: 2021-02-24 06:45:35
【Question】:
The link I am scraping: https://www.indusind.com/in/en/personal/cards/credit-card.html
import requests
from bs4 import BeautifulSoup

IndusInd_url = "https://www.indusind.com/in/en/personal/cards/credit-card.html"
# Fetch the page with a plain HTTP request (default requests headers)
html = requests.get(IndusInd_url)
soup = BeautifulSoup(html.content, 'lxml')
# Dump the parsed document to see what the server actually returned
print(soup)
# Print the title of every card in the product list
for x in soup.select("#display-product-cards .text-primary"):
    print(x.get_text())
With the code above I am trying to scrape the card titles, but unfortunately this is the output I get:
<html><body><p>This website is secured against online attacks. Your request was blocked due to suspicious behavior<br/>
<br/>
Client IP : 124.123.170.109<br/>
<br/>
Incident Time : 2021-02-24 06:28:10 UTC <br/>
<br/>
Incident ID : YDXx@m6g3nSFLvi5lGg4wgAAAf8<br/>
<br/>
If you feel it was a legitimate request, please contact the website owner for further investigation and remediation with a screenshot of this page.</p></body></html>
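The output above is the site's block page rather than the card listing, so the selector loop has nothing to match and prints no titles. A minimal sketch for detecting that case before parsing is given here; the helper name `looks_blocked` and the marker list are hypothetical, and the marker strings are simply copied from the output above on the assumption that the block page keeps this wording.

```python
# Assumption: these phrases are taken from the block page shown above and
# may change if the site alters its firewall message.
BLOCK_MARKERS = (
    "Your request was blocked",
    "Incident ID",
)


def looks_blocked(page_text: str) -> bool:
    """Return True if the response body resembles the block page."""
    return any(marker in page_text for marker in BLOCK_MARKERS)


# Shortened copy of the response from the question, used as a sample input.
sample = (
    "<html><body><p>This website is secured against online attacks. "
    "Your request was blocked due to suspicious behavior<br/></p></body></html>"
)

if looks_blocked(sample):
    print("Blocked by the site; no card data in this response.")
```

Running a check like this on `html.text` before building the soup makes the script fail loudly instead of silently printing no titles.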
Is there any alternative approach I can use to scrape these details?
Any help is much appreciated!
【Discussion】:
Tags: python selenium web-scraping beautifulsoup python-requests
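One thing that may be worth trying is sending browser-like request headers, since a plain scripted request carries a default client User-Agent that some firewalls reject. Whether this particular site accepts such requests is not known, so the following is only a sketch under that assumption. It uses the standard library's `urllib.request`; the names `BROWSER_HEADERS`, `build_request` and `fetch_html` are hypothetical, and the header values are illustrative examples rather than values known to pass the site's checks.

```python
from urllib.request import Request, urlopen

URL = "https://www.indusind.com/in/en/personal/cards/credit-card.html"

# Assumption: illustrative browser-like headers; not verified against this site.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url: str) -> Request:
    """Build a GET request that carries the browser-like headers."""
    return Request(url, headers=BROWSER_HEADERS)


def fetch_html(url: str, timeout: float = 15.0) -> str:
    """Download the page and decode it; raises on HTTP errors such as 403."""
    with urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

The string returned by `fetch_html(URL)` could then be passed to `BeautifulSoup(..., 'lxml')` in place of `html.content`. If the block page still comes back, driving a real browser with Selenium (one of the question's tags) is another option, and it is also worth checking whether the site's terms permit automated access.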