BeautifulSoup not printing links

【Question title】: beautifulsoup not printing links
【Posted】: 2015-03-07 05:43:12
【Question】:

I'm scraping an RSS feed:

from bs4 import BeautifulSoup
import urllib2


# Python 2 script: read the feed URL from standard input.
url = raw_input("")

def rss_get_items(url):
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    soup = BeautifulSoup(response)

    for item_node in soup.find_all('item'):
        item = {}
        for subitem_node in item_node.findChildren():
            key = subitem_node.name
            value = subitem_node.text
            item[key] = value
        yield item

if __name__ == '__main__':
    for item in rss_get_items(url):
        print item['title']
        print item['pubdate']
        print item['link']
        print item['guid']
        print item['description']

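I took part of this script from an answer posted on this site, so credit goes to that person; I've forgotten the original post and the name of the user who wrote it. Anyway, I can't get the link to print. It just doesn't work, and I'd like to know why.

I could do what the documentation does: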

for link in soup.find_all('a'):
    print(link.get('href'))
# http://example.com/elsie
# http://example.com/lacie
# http://example.com/tillie

That works, but out of curiosity I'd like to know how to get the first approach to print the links.

I'm using the aljazeera.com RSS feed.

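【Comments】:

Which URL are you using?
I tried it with feeds.bbci.co.uk/news/rss.xml and it worked fine; adding the URL you're interested in would help.
@alecxe Please read my edit.
@Stedy Please read my edit.
I tried it and had no problem. Are you using the full URL "aljazeera.com/Services/Rss/?PostingId=2007731105943979989"?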

Tags: python xml python-2.7 beautifulsoup

【Answer 1】:

When you scrape XML content, use an XML parser to build your soup.

soup = BeautifulSoup(response, 'xml')
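
To see why the parser choice matters, here is a minimal sketch rather than a definitive explanation of the asker's setup. It is written for Python 3 and assumes both bs4 and lxml are installed (the 'xml' parser needs lxml). The one-item feed and the helper name first_item are made up for illustration and are not taken from the question. The idea is that HTML parsers treat <link> as an empty (void) element and lowercase tag names, while the XML parser does neither.

```python
from bs4 import BeautifulSoup

# Hypothetical one-item feed, invented for this illustration.
RSS = """<rss version="2.0">
  <channel>
    <item>
      <title>Example story</title>
      <link>http://example.com/story</link>
      <pubDate>Sat, 07 Mar 2015 05:43:12 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def first_item(markup, parser):
    """Return {tag name: text} for the first <item>, built with the given parser."""
    soup = BeautifulSoup(markup, parser)
    item_node = soup.find('item')
    # find_all(True) matches every descendant tag, like findChildren() in the question.
    return {child.name: child.get_text() for child in item_node.find_all(True)}


html_item = first_item(RSS, 'html.parser')  # HTML rules: <link> is a void tag
xml_item = first_item(RSS, 'xml')           # XML rules: <link> keeps its text

print(repr(html_item['link']))  # empty string: the URL was parsed outside <link>
print(repr(xml_item['link']))   # the URL
print(sorted(html_item))        # tag names lowercased, e.g. 'pubdate'
print(sorted(xml_item))         # case preserved, e.g. 'pubDate'
```

Under the HTML parser the URL text ends up next to an empty <link> tag instead of inside it, which would explain why item['link'] prints nothing. Note also that the XML parser keeps the feed's original casing, so the key becomes 'pubDate' rather than 'pubdate'.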

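【Comments】:

That didn't seem to fix it; instead I now get this error: pastebin.com/B2QSzdqw
@Fischer That's a KeyError, because the item doesn't have that key. To avoid it, print the value like this: item.get('pubdate'), which returns None instead of raising a KeyError. And please consider accepting the answer.
I always accept answers, you can check my profile :) But before I do, I want to make sure the solution works, because I don't like asking follow-up questions after accepting an answer; accepting an answer = question closed forever :)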
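
The item.get('pubdate') suggestion from the comments can be sketched as follows. The item dictionary here is hypothetical; it assumes the feed was parsed with the 'xml' parser, which preserves the camel-case tag name pubDate, so the lowercase key is absent.

```python
# Hypothetical item, shaped like what rss_get_items() might yield
# when the soup is built with the 'xml' parser (tag case is preserved).
item = {
    'title': 'Example story',
    'pubDate': 'Sat, 07 Mar 2015 05:43:12 GMT',
}

missing = item.get('pubdate')          # key absent: None instead of a KeyError
fallback = item.get('pubdate', 'n/a')  # or supply an explicit default
pub_date = item.get('pubDate')         # the key as it appears in the feed

print(missing)
print(fallback)
print(pub_date)
```

Using .get() hides the KeyError, but if the value comes back as None it is worth printing item.keys() to check how the keys are actually spelled.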