Python 3 web scraper extremely simple not working

【Question title】: Python 3 web scraper extremely simple not working
【Posted】: 2018-09-09 22:34:30
【Question】:

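I'm working through the book "The Self-Taught Programmer" and have run into a problem with some Python code. The program runs without any errors, but it doesn't produce any output.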

import urllib.request
from bs4 import BeautifulSoup


class Scraper:
    def __init__(self, site):
        self.site = site

    def scrape(self):
        r = urllib.request\
            .urlopen(self.site)
        html = r.read()
        parser = "html.parser"
        sp = BeautifulSoup(html, parser)
        for tag in sp.find_all("a"):
            url = tag.get("href")
            if url is None:
                continue
            if "html" in url:
                print("\n" + url)

news = "https://news.google.com/"
Scraper(news).scrape()

【Comments】:

Tags: python beautifulsoup urllib

【Solution 1】:

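Look at the last "if" statement. If a URL does not contain the text "html", nothing is printed for it, so a page whose links never contain "html" produces no output at all. Try removing that check and un-indenting the print call: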

class Scraper:
    def __init__(self, site):
        self.site = site

    def scrape(self):
        r = urllib.request\
            .urlopen(self.site)
        html = r.read()
        parser = "html.parser"
        sp = BeautifulSoup(html, parser)
        for tag in sp.find_all("a"):
            url = tag.get("href")
            if url is None:
                continue
            print("\n" + url)
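To see the effect of the filter without touching the network, here is a minimal sketch. It is not the book's code: it swaps BeautifulSoup for the standard library's html.parser so it runs with no extra packages, and the LinkCollector class and the SAMPLE markup are made up for illustration. None of the sample hrefs contain "html", which is assumed to mirror what the asker saw on the live page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper, not from the book)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Same guard as `if url is None: continue` in the original code.
        if href is not None:
            self.links.append(href)


# Made-up sample markup; note that no href contains the text "html".
SAMPLE = """
<a href="./articles/abc123">Story one</a>
<a href="https://example.com/news?id=7">Story two</a>
<a name="top">Anchor without href</a>
"""

collector = LinkCollector()
collector.feed(SAMPLE)

all_links = collector.links
# This is the check the answer suggests removing.
html_links = [url for url in all_links if "html" in url]

print("all links:", all_links)
print("links containing 'html':", html_links)
```

With the filter in place the second list is empty, so the original loop would print nothing even though links were found.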

【Discussion】:

Now I'm getting "no module named urlib"??
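urllib is a built-in library in reasonably recent versions of Python. You could try installing it with pip install urllib, but I don't think that's even possible unless your Python is old. Can you post the whole error/traceback? Maybe on pastebin so it's readable?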
If your default Python has not been upgraded to a later version, run python3 myfile.py (python --version will show which version you have).
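The two checks suggested in the comments can also be done from inside Python. This is only a sketch: is_importable is a hypothetical helper name, and whether the asker's file really contained the one-"l" spelling from the error message is an assumption.

```python
import importlib.util
import sys

# Show which interpreter is actually running the script.
print(sys.version)


def is_importable(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# The standard-library package is spelled with two l's.
print("urllib:", is_importable("urllib"))
# The spelling from the error message, for comparison.
print("urlib:", is_importable("urlib"))
```

If the first check reports that urllib is available while the error still mentions "urlib", the import line in the script is the first place to look.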