【Question title】: No Output When Running BeautifulSoup Python Code
【Posted】: 2014-09-23 15:32:46
【Question】:

I recently tried the following Python code, which uses BeautifulSoup and comes from this question, where it seemed to work for the asker.

import urllib2
import bs4
import string
from bs4 import BeautifulSoup

badwords = set([
    'cup','cups',
    'clove','cloves',
    'tsp','teaspoon','teaspoons',
    'tbsp','tablespoon','tablespoons',
    'minced'
])

def cleanIngred(s):

    s=s.strip()

    s=s.strip(string.digits + string.punctuation)

    return ' '.join(word for word in s.split() if not word in badwords)

def cleanIngred(s):
    # remove leading and trailing whitespace
    s = s.strip()
    # remove numbers and punctuation in the string
    s = s.strip(string.digits + string.punctuation)
    # remove unwanted words
    return ' '.join(word for word in s.split() if not word in badwords)

def main():
    url = "http://allrecipes.com/Recipe/Slow-Cooker-Pork-Chops-II/Detail.aspx"
    data = urllib2.urlopen(url).read()
    bs = BeautifulSoup.BeautifulSoup(data)

    ingreds = bs.find('div', {'class': 'ingredients'})
    ingreds = [cleanIngred(s.getText()) for s in ingreds.findAll('li')]

    fname = 'PorkRecipe.txt'
    with open(fname, 'w') as outf:
        outf.write('\n'.join(ingreds))

if __name__=="__main__":
    main()

But for some reason I can't get it to work in my case. I get this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-4-55411b0c5016> in <module>()
     41 
     42 if __name__=="__main__":
---> 43     main()

<ipython-input-4-55411b0c5016> in main()
     31     url = "http://allrecipes.com/Recipe/Slow-Cooker-Pork-Chops-II/Detail.aspx"
     32     data = urllib2.urlopen(url).read()
---> 33     bs = BeautifulSoup.BeautifulSoup(data)
     34 
     35     ingreds = bs.find('div', {'class': 'ingredients'})

AttributeError: type object 'BeautifulSoup' has no attribute 'BeautifulSoup'

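I suspect this is because I am using bs4 rather than BeautifulSoup. I tried replacing the line bs = BeautifulSoup.BeautifulSoup(data) with bs = bs4.BeautifulSoup(data); the error went away, but there is still no output. Are there too many possible causes to guess at?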

【Comments】:

  • They use import BeautifulSoup; you use from bs4 import BeautifulSoup. You should either use bs = BeautifulSoup(data), or import bs4 and then bs = bs4.BeautifulSoup(data).

Tags: python beautifulsoup


【Solution 1】:

The original code uses BeautifulSoup version 3:

import BeautifulSoup

You switched to BeautifulSoup version 4, but you also changed the style of the import:

from bs4 import BeautifulSoup

Either remove that line, since your file already has a suitable import earlier on:

import bs4

and then use:

bs = bs4.BeautifulSoup(data)

or change the latter line to:

bs = BeautifulSoup(data)

(and remove the import bs4 line).
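The cause of the AttributeError can be reproduced without BeautifulSoup installed. The following is a minimal sketch, not the original code: it assumes that `fractions.Fraction` from the standard library can stand in for `bs4.BeautifulSoup`, since the same name-binding rules apply to any module and class.

```python
# Stand-in demonstration (assumption for illustration only):
# the fractions module plays the role of bs4, Fraction the role of BeautifulSoup.
import fractions
from fractions import Fraction

# After "from module import Name", Name is bound to the class itself,
# not to a module, so Name.Name is an attribute lookup on the class and fails.
try:
    Fraction.Fraction
    message = None
except AttributeError as exc:
    message = str(exc)

print(message)

# Both import styles reach the very same class object,
# so either spelling works as long as it is used consistently.
same_class = fractions.Fraction is Fraction
print(same_class)
```

By the same reasoning, `bs4.BeautifulSoup(data)` pairs with `import bs4`, and `BeautifulSoup(data)` pairs with `from bs4 import BeautifulSoup`; mixing the two spellings is what triggers the error.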

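You may also want to review the Porting code to BS4 section of the BeautifulSoup documentation, so you can make any other changes needed to upgrade the code you found and get the most out of BeautifulSoup version 4.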

The script otherwise works fine and produces a new file, PorkRecipe.txt; it does not write anything to standard output.
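If some console feedback is wanted, one option is to print a short confirmation after writing the file. This is only a sketch under stated assumptions: the ingredient list is a hard-coded sample taken from the file listing in this answer rather than scraped data, and the file is written to a temporary directory instead of the working directory.

```python
import os
import tempfile

# Sample data (assumption): a few ingredient names copied from the listing,
# used here in place of the scraped result.
ingreds = ["olive oil", "chicken broth", "paprika"]

# Write to a temporary directory so the sketch leaves the working directory alone.
fname = os.path.join(tempfile.mkdtemp(), "PorkRecipe.txt")
with open(fname, "w") as outf:
    outf.write("\n".join(ingreds))

# Writing to a file prints nothing by itself, so report the result explicitly.
print("Wrote", len(ingreds), "lines to", fname)

# Read the file back to confirm what was stored.
with open(fname) as inf:
    contents = inf.read()
print(contents)
```

In the original script, the equivalent change would be a single print call after the with block in main().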

The file contents after fixing the bs4.BeautifulSoup reference:

READY IN 4+ hrs

Slow Cooker Pork Chops II

Amazing Pork Tenderloin in the Slow Cooker

Jerre's Black Bean and Pork Slow Cooker Chili

Slow Cooker Pulled Pork

Slow Cooker Sauerkraut Pork Loin

Slow Cooker Texas Pulled Pork

Oven-Fried Pork Chops

Pork Chops for the Slow Cooker

Tangy Slow Cooker Pork Roast

Types of Cooking Oil

Garlic: Fresh Vs. Powdered

All about Paprika

Types of Salt
olive oil
chicken broth
garlic,
paprika
garlic powder
poultry seasoning
dried oregano
dried basil
thick cut boneless pork chops
salt and pepper to taste
PREP 10 mins
COOK 4 hrs
READY IN 4 hrs 10 mins
In a large bowl, whisk together the olive oil, chicken broth, garlic, paprika, garlic powder, poultry seasoning, oregano, and basil. Pour into the slow cooker. Cut small slits in each pork chop with the tip of a knife, and season lightly with salt and pepper. Place pork chops into the slow cooker, cover, and cook on High for 4 hours. Baste periodically with the sauce

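【Comments】:

  • @MaxPower: The script works otherwise; you need to check whether the file was created, not whether there is output on the console.
  • @Martjin Thanks very much, I should be more careful when using different versions!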