Get only href with BeautifulSoup: answers

【Question title】: Get only href beautifulsoup
【Posted】: 2019-10-11 05:11:04
【Question description】:

I have the following soup:

<a href="https://www.abc1.com">
    <h3>ABC1</h3>
</a>
<a href="https://www.abc2.com">
    <h3>ABC2</h3>
</a>
<a href="https://www.abc3.com">
   <h3>ABC3</h3>
</a>

From this, I want to get all the href values. Currently, I am doing:

links = soup.find_all('a')

But this prints empty arrays, like this:

[][][]

Does anyone know a better way?

【Question discussion】:

What is linkWithTitles? Please add the full code.
It is a soup object; I just named it to my own liking.
@UditHariVashisht please take another look at the question.

Tags: python python-3.x beautifulsoup python-requests

【Solution 1】:

I was able to get the href values with the following code:

for link in links:
    print(link['href'])
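
For reference, here is a minimal self-contained sketch of the same idea, run against the markup from the question. The extra `<a name="no-link">` tag is not in the original post; it is added only to show that `href=True` skips anchors without an `href`, so `link["href"]` cannot raise `KeyError`.

```python
from bs4 import BeautifulSoup

# Markup from the question, plus one hypothetical anchor without href.
html = """
<a href="https://www.abc1.com">
    <h3>ABC1</h3>
</a>
<a href="https://www.abc2.com">
    <h3>ABC2</h3>
</a>
<a href="https://www.abc3.com">
   <h3>ABC3</h3>
</a>
<a name="no-link">anchor without href</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True keeps only <a> tags that actually carry an href attribute.
hrefs = [link["href"] for link in soup.find_all("a", href=True)]
print(hrefs)
```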

【Discussion】:

【Solution 2】:

cont = soup.find_all('a')

link = []
for href in cont:
    # list.append() returns None, so it should not be wrapped in print()
    link.append(href.get('href'))

# output
print(link)
# ['https://www.abc1.com', 'https://www.abc2.com', 'https://www.abc3.com']
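
The same loop can probably be written more compactly with a CSS selector. The sketch below is one possible variant, not part of the original answer; the shortened `html` string and the `titled` dictionary are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup (two links only).
html = (
    '<a href="https://www.abc1.com"><h3>ABC1</h3></a>'
    '<a href="https://www.abc2.com"><h3>ABC2</h3></a>'
)
soup = BeautifulSoup(html, "html.parser")

# "a[href]" selects only <a> tags that have an href attribute.
links = [a.get("href") for a in soup.select("a[href]")]

# Optionally pair each link with the text of the tag that wraps it.
titled = {a.get_text(strip=True): a["href"] for a in soup.select("a[href]")}

print(links)
print(titled)
```

`select()` relies on the soupsieve package, which is normally installed together with recent versions of beautifulsoup4.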

【Discussion】:

【Solution 3】:

Make sure you have installed the library first (pip install beautifulsoup4).

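from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3 replacement for urllib2
import re

html_page = urlopen("https://yourwebsite")
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html_page, "html.parser")
links = []

# Keep only absolute links; "https?" matches both http:// and https://
for link in soup.find_all('a', attrs={'href': re.compile("^https?://")}):
    links.append(link.get('href'))

print(links)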
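
How the regular expression is anchored decides which links survive the filter: a pattern starting with `^http://` matches plain-HTTP URLs only, while `^https?://` also accepts HTTPS. The following offline sketch illustrates the difference; the `html` string and the example.com URLs are made up for the demonstration and do not come from the original answer.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup mixing several kinds of href values.
html = """
<a href="http://example.com/a">plain http</a>
<a href="https://example.com/b">https</a>
<a href="/relative/c">relative</a>
<a href="mailto:someone@example.com">mail</a>
"""
soup = BeautifulSoup(html, "html.parser")

# Passing a compiled regex as the href filter keeps tags whose href matches it.
http_only = [a["href"] for a in soup.find_all("a", href=re.compile(r"^http://"))]
http_or_https = [a["href"] for a in soup.find_all("a", href=re.compile(r"^https?://"))]

print(http_only)
print(http_or_https)
```

Relative and `mailto:` links are dropped by both patterns, which may or may not be what a given scraper wants.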

【Discussion】: