Python 3 web scraping code for Yahoo Finance stock: answers

【Question title】: Python 3 web scraping code for Yahoo Finance stock
【Posted】: 2014-01-02 14:59:00
【Question】:

Here is Python 3 code for scraping the AAPL stock price from Yahoo Finance.

import urllib.request
from bs4 import BeautifulSoup as bs4

htmlfile = urllib.request.urlopen("http://finance.yahoo.com/q?s=AAPL")

htmltext = htmlfile.read()

for price in htmltext.find(attrs={'id':"yfs_184_aapl"}):
    print (price)

Apparently the code works in Python 2.7 with only minor modifications. However, it does not work in the Python 3.3.3 shell. This is the error it shows:

Traceback (most recent call last):
  File "C:/Python33/python codes/webstock2.py", line 8, in <module>
    for price in htmltext.find(attrs={'id':"yfs_184_aapl"}):
TypeError: find() takes no keyword arguments

I have learned that a string pattern can be converted to bytes with str.encode. I am not sure whether I can apply that to this code.

Edit 1: final working code after @Martijn's suggested changes

    import urllib.request
    from bs4 import BeautifulSoup as bs4

    htmlfile = urllib.request.urlopen("http://finance.yahoo.com/q?s=AAPL")

    htmltext = htmlfile.read()

    soup = bs4(htmltext)

    for price in soup.find_all(id="yfs_l84_aapl"):
        print (price)

It prints nothing. Can you figure this out? Thanks again.

【Comments on the question】:

Better still, fetch the quote in CSV format and skip screen scraping entirely: download.finance.yahoo.com/d/quotes.csv?s=AAPL&f=sl1. More details: gummy-stuff.org/Yahoo-data.htm
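As a rough sketch of the CSV route suggested in this comment, the response body could be parsed with the standard csv module. This assumes each row holds a quoted ticker symbol followed by the last trade price, which is what f=sl1 appears to request. The function name parse_quotes and the sample row are hypothetical, and the fetch itself is left out.

```python
import csv
import io


def parse_quotes(csv_text):
    """Parse 'symbol,last price' rows into a list of (symbol, price) tuples."""
    rows = csv.reader(io.StringIO(csv_text))
    # Skip blank rows; each remaining row is assumed to be [symbol, last_price].
    return [(symbol, float(price)) for symbol, price in (r for r in rows if r)]


# Hypothetical response body, made up for illustration (not real market data).
sample = '"AAPL",553.13\r\n'
print(parse_quotes(sample))
```

Keeping the parsing separate from the download makes it easy to check against a saved response before pointing it at a live URL.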

Tags: python python-3.x web-scraping stockquotes

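【Solution 1】:

You are calling str.find() (bytes.find() in Python 3, where read() returns bytes), not BeautifulSoup.find(). You forgot to build the soup object first: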

soup = bs4(htmltext)

for price in soup.find(attrs={'id':"yfs_184_aapl"}):
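The TypeError from the question can be reproduced without any network access. The snippet below is only an illustration: the bytes literal stands in for the result of htmlfile.read(), and try_find_with_attrs is a hypothetical helper name.

```python
# Stand-in for htmlfile.read(); in Python 3, urlopen(...).read() returns bytes.
htmltext = b'<span id="yfs_l84_aapl">553.13</span>'


def try_find_with_attrs(data):
    """Return the message of the TypeError raised by find(attrs=...), if any."""
    try:
        data.find(attrs={"id": "yfs_l84_aapl"})
    except TypeError as exc:
        return str(exc)
    return None


print(type(htmltext).__name__)
print(try_find_with_attrs(htmltext))

# The built-in find() only does substring search and returns an index (or -1).
print(htmltext.find(b"yfs_l84_aapl"))
```

The exact wording of the error message varies between Python versions, but in each case the built-in find() rejects keyword arguments because it knows nothing about HTML attributes.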

But if you are going to loop, you really need to call find_all():

for price in soup.find_all(id="yfs_l84_aapl"):

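You don't actually have to use the attrs keyword argument; specifying the attributes directly as keyword arguments works fine too.

You do have to use the correct id attribute; it is yfs_l84_aapl (the letter l, followed by the digits 8 and 4), not the digit 1.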
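The points above can be tried on a small inline snippet. This is a sketch only: the HTML string is made up and the real page markup may differ, and passing "html.parser" explicitly is a choice made here rather than something the original code does. It also shows why a mistyped id produces no output instead of an error.

```python
from bs4 import BeautifulSoup

# Made-up snippet standing in for the quote page; the real markup may differ.
html = '<div><span id="yfs_l84_aapl">553.13</span></div>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag, or None when nothing matches.
tag = soup.find(id="yfs_l84_aapl")
print(tag.get_text() if tag is not None else "no match")

# find_all() always returns a list, so it is safe to loop over.
prices = [t.get_text() for t in soup.find_all(id="yfs_l84_aapl")]
print(prices)

# Mistyped id (digit 1 instead of letter l): empty list, so a loop prints nothing.
typo_matches = soup.find_all(id="yfs_184_aapl")
print(typo_matches)
```

Looping over the result of find() is a different operation: iterating a Tag walks its children, and iterating None raises a TypeError, which is why find_all() is the better fit for a for loop.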

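【Comments】:

I have updated the question (see the code added in the edit). Unfortunately, it gives neither an error nor any output. Just blank.
Oops. Silly mistake. Thank you very much for pointing it out. I had been scratching my head. :)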