【Question title】: Python extract empty tag with beautifulsoup
【Posted】: 2018-06-14 18:45:43
【Question】:

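I have the following loop, which extracts specific tags from every file in a directory and writes them to a .csv file. However, some files have empty tags, and I get the following error: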

Traceback (most recent call last):  
  File "newsbank2.py", line 27, in <module>  
    author = fauthor.text  
AttributeError: 'NoneType' object has no attribute 'text'

For these cases, how can I write a blank to the csv file instead? My code is below.

import csv
import os

from bs4 import BeautifulSoup

path = "my directory"

for filename in os.listdir(path):

    if filename.endswith('.htm'):
        fname = os.path.join(path,filename)
        with open(fname, 'r') as f:
            soup = BeautifulSoup(f.read(),'html.parser')
            ftitle = soup.find("div", class_="title")
            title = ftitle.text
            fsource = soup.find("div", class_="source")
            source = fsource.text
            source = source.replace("Browse Issues", " ")
            publication = source.split("-")[0].strip()
            fauthor = soup.find("li", class_="author first")
            author = fauthor.text
            fbody = soup.find("div", class_="body")
            body = fbody.text
            f = csv.writer(open("testcsv","a"))
            f.writerow([title, source, author, body])

【Question comments】:

    Tags: python beautifulsoup nonetype


    【Solution 1】:

    Use something like this:

    title = ftitle.text if hasattr(ftitle, 'text') else ''
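
The error comes from `soup.find()` returning `None` when no tag matches, and `None` has no `text` attribute. Here is a minimal sketch of the `hasattr` guard on both a present and a missing tag. The `html` string is a made-up fragment for illustration, not one of the question's files:

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: it has a title div but no author list item.
html = '<div class="title">Sample headline</div>'
soup = BeautifulSoup(html, "html.parser")

ftitle = soup.find("div", class_="title")
fauthor = soup.find("li", class_="author first")  # no match, so this is None

# hasattr(None, "text") is False, so the fallback "" is used for the author.
title = ftitle.text if hasattr(ftitle, "text") else ""
author = fauthor.text if hasattr(fauthor, "text") else ""

print(repr(title))
print(repr(author))
```

The same guard would be needed on every field that may be missing (`fsource`, `fauthor`, `fbody`), not only on `ftitle`.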
    

    Or the following should also work:

    title = ftitle.text if ftitle else ''
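
To avoid repeating the conditional for each field, the check can be moved into a small helper. The sketch below is one possible way to restructure the question's loop, not the only one. The names `text_or_blank`, `extract_row`, and `write_rows` are hypothetical, and `is not None` is used in place of the truthiness test purely to make the intent explicit:

```python
import csv
import os

from bs4 import BeautifulSoup


def text_or_blank(tag):
    """Return the tag's text, or an empty string when find() returned None."""
    return tag.text if tag is not None else ""


def extract_row(html):
    """Build one CSV row [title, source, author, body] from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    title = text_or_blank(soup.find("div", class_="title"))
    source = text_or_blank(soup.find("div", class_="source"))
    source = source.replace("Browse Issues", " ")
    author = text_or_blank(soup.find("li", class_="author first"))
    body = text_or_blank(soup.find("div", class_="body"))
    return [title, source, author, body]


def write_rows(path, out_name):
    """Append one row per .htm file in `path` to the CSV file `out_name`."""
    # Open the output once and let the with-block close it when done.
    with open(out_name, "a", newline="") as out:
        writer = csv.writer(out)
        for filename in sorted(os.listdir(path)):
            if filename.endswith(".htm"):
                with open(os.path.join(path, filename), "r") as f:
                    writer.writerow(extract_row(f.read()))
```

With the values from the question, the call would be `write_rows("my directory", "testcsv")`. Any file that lacks one of the tags then gets an empty cell in that column instead of raising `AttributeError`.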
    

    【Comments】:
