【Question title】:Extracting title from link in Python (Beautiful Soup)
【Posted】:2025-11-27 21:20:03
【Question description】:

I am new to Python and want to extract the title from a link. So far I have the following, but I have hit a dead end:

import requests
from bs4 import BeautifulSoup
page = requests.get("http://books.toscrape.com/")
soup = BeautifulSoup(page.content, 'html.parser')
books = soup.find("section")
book_list = books.find_all(class_="product_pod")
tonight = book_list[0]

for book in book_list:
    price = book.find(class_="price_color").get_text()
    title = book.find('a')
    print (price)
    print (title.contents[0])

【Question comments】:

  • Do you want to get the content?

Tags: python beautifulsoup


【Solution 1】:

To extract the title from the link, read the anchor's title attribute.

Example:

import requests
from bs4 import BeautifulSoup
page = requests.get("http://books.toscrape.com/")
soup = BeautifulSoup(page.content, 'html.parser')

for a in soup.select('h3 > a'):
    print(a['title'])

Prints:

A Light in the Attic
Tipping the Velvet
Soumission
Sharp Objects
Sapiens: A Brief History of Humankind
The Requiem Red
The Dirty Little Secrets of Getting Your Dream Job
The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull
The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics
The Black Maria
Starving Hearts (Triangular Trade Trilogy, #1)
Shakespeare's Sonnets
Set Me Free
Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)
Rip it Up and Start Again
Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991
Olio
Mesaerion: The Best Science Fiction Stories 1800-1849
Libertarianism for Beginners
It's Only the Himalayas
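The reason for reading the attribute rather than the link text can be shown with a minimal offline sketch. The markup below is a hypothetical sample modeled on one book entry, not fetched from the site, and the fallback in safe_title is an added assumption rather than part of the answer above:

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on a single book entry (an assumption, not live data).
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="catalogue/a-light-in-the-attic_1000/index.html" title="A Light in the Attic">A Light in the ...</a></h3>
</article>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
link = soup.select_one("h3 > a")

link_text = link.get_text()                 # visible text, which may be shortened
full_title = link["title"]                  # raises KeyError if the attribute is absent
safe_title = link.get("title", link_text)   # falls back to the visible text instead

print(link_text)
print(full_title)
```

Using link.get("title", ...) instead of link["title"] is a design choice for pages where some anchors might lack the attribute; on markup where it is always present, both forms return the same value.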

【Comments】:

    【Solution 2】:

    You can use this:

    import requests
    from bs4 import BeautifulSoup
    page = requests.get("http://books.toscrape.com/")
    soup = BeautifulSoup(page.content, 'html.parser')
    books = soup.find("section")
    book_list = books.find_all(class_="product_pod")
    
    for book in book_list:
        # The thumbnail's alt text holds the full book title.
        title = book.select_one('a img')['alt']
        print(title)
    

    Output:

    A Light in the Attic
    Tipping the Velvet
    Soumission
    Sharp Objects
    Sapiens: A Brief History of Humankind
    The Requiem Red...
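
Since the question also printed the price, here is a small sketch that pairs each title with its price. It runs against a hypothetical two-entry sample shaped like the page's product_pod blocks (an assumption, not live data), and extract_books is an illustrative helper name, not something from the answer:

```python
from bs4 import BeautifulSoup

# Hypothetical sample shaped like the page's product_pod blocks (not fetched from the site).
SAMPLE_HTML = """
<section>
  <article class="product_pod">
    <div class="image_container"><a href="a.html"><img src="a.jpg" alt="A Light in the Attic" class="thumbnail"></a></div>
    <h3><a href="a.html" title="A Light in the Attic">A Light in the ...</a></h3>
    <p class="price_color">£51.77</p>
  </article>
  <article class="product_pod">
    <div class="image_container"><a href="b.html"><img src="b.jpg" alt="Tipping the Velvet" class="thumbnail"></a></div>
    <h3><a href="b.html" title="Tipping the Velvet">Tipping the Velvet</a></h3>
    <p class="price_color">£53.74</p>
  </article>
</section>
"""


def extract_books(html):
    """Return a list of (title, price) tuples, one per product_pod block."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for book in soup.find_all(class_="product_pod"):
        image = book.select_one("a img")           # thumbnail inside the first link
        price = book.find(class_="price_color")
        if image is None or price is None:
            continue                               # skip entries missing either piece
        results.append((image.get("alt", ""), price.get_text(strip=True)))
    return results


books = extract_books(SAMPLE_HTML)
for title, price in books:
    print(title, price)
```

Checking for None before indexing avoids an exception when an entry has no thumbnail or no price element.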
    

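    【Comments】:

    • This is not the anchor's title.
    • Fixed, please try it.
    • Works perfectly.
    【Solution 3】:

    With only a small change to your existing code, you can use the alt text, which in this example contains the book title.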

    print (title.contents[0].attrs["alt"])
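
Why this works can be sketched offline: the first link in each block wraps the thumbnail image, so its first child is a tag rather than text. The markup below is a hypothetical sample written without whitespace between the tags (an assumption, not live data):

```python
from bs4 import BeautifulSoup

# Hypothetical single entry; no whitespace between <a> and <img>,
# so the image is the first child of the link.
SAMPLE_HTML = (
    '<article class="product_pod">'
    '<div class="image_container">'
    '<a href="a.html"><img src="a.jpg" alt="A Light in the Attic" class="thumbnail"></a>'
    '</div>'
    '<h3><a href="a.html" title="A Light in the Attic">A Light in the ...</a></h3>'
    '</article>'
)

book = BeautifulSoup(SAMPLE_HTML, "html.parser").find(class_="product_pod")

title = book.find("a")            # first <a> in the block: the image link
first_child = title.contents[0]   # here this is the <img> tag, not a text node

print(first_child.name)           # tag name of the first child
print(first_child.attrs["alt"])   # alt text carried by the image
```

If a page did put whitespace between the link and the image, contents[0] would be a text node instead of a tag, so looking the image up by name with title.find("img") is likely the more robust choice.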
    

    【Comments】: