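【Posted】: 2015-08-18 05:08:51
【Question】:
I'm trying to pull information from Wikipedia with the function below, but I get an AttributeError because a function call returns None. Can anyone explain why it returns None?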
import wikipedia as wp
import string

def add_section_info(search):
    HTML = wp.page(search).html().encode("UTF-8")  # gets HTML source from Wikipedia
    with open("temp.xml", 'w') as t:  # write HTML to xml format
        t.write(HTML)
    table_of_contents = []
    dict_of_section_info = {}
    # This extracts the info in the table of contents
    with open("temp.xml", 'r') as r:
        for line in r:
            if "toclevel" in line:
                new_string = line.partition("#")[2]
                content_title = new_string.partition("\"")[0]
                tbl = string.maketrans("_", " ")
                content_title = content_title.translate(tbl)
                table_of_contents.append(content_title)
    print wp.page(search).section("Aortic rupture")  # this is None, but shouldn't be
    for item in table_of_contents:
        section = wp.page(search).section(item).encode("UTF-8")
        print section
        if section == "":
            continue
        else:
            dict_of_section_info[item] = section
    with open("Section_Info.txt", 'a') as sect:
        sect.write(search)
        sect.write("------------------------------------------\n")
        for item in dict_of_section_info:
            sect.write(item)
            sect.write("\n\n")
            sect.write(dict_of_section_info[item])
            sect.write("####################################\n\n")

add_section_info("Abdominal aortic aneurysm")
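What I don't understand is that if I run add_section_info("HIV"), for example, it works perfectly.
The source code of the imported wikipedia module is here
The output of the code above is: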
Abdominal aortic aneurysm
Signs and symptoms
Traceback (most recent call last):
  File "/home/pharoslabsllc/Documents/wikitest.py", line 79, in <module>
    add_section_info(line)
  File "/home/pharoslabsllc/Documents/wikitest.py", line 30, in add_section_info
    section = wp.page(search).section(item).encode("UTF-8")
AttributeError: 'NoneType' object has no attribute 'encode'
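The traceback shows `.encode` being called on the return value of `section(item)`, which is `None` for at least one title. Below is a minimal sketch of a guard for that loop. It does not explain why the lookup returns `None`; it only keeps the loop from crashing and records which titles failed. The helper `collect_sections` and the `get_section` callable are hypothetical names introduced for illustration: `get_section` stands in for `wp.page(search).section`, and a plain dictionary is used here so the example runs without network access. The sketch is written for Python 3.

```python
def collect_sections(get_section, titles):
    """Return (found, missing) for the given section titles.

    get_section is any callable that returns the section text, or None
    when the title cannot be found (a hypothetical stand-in for
    wp.page(search).section).
    """
    found = {}
    missing = []
    for title in titles:
        text = get_section(title)
        if text is None:
            # Record the failing title instead of calling a method on None.
            missing.append(title)
        elif text != "":
            # Skip empty sections, as the original loop does.
            found[title] = text
    return found, missing


# Usage with a fake lookup table in place of a real Wikipedia page.
fake_page = {"Signs and symptoms": "Some text.", "Causes": ""}
titles = ["Signs and symptoms", "Causes", "Aortic rupture"]
found, missing = collect_sections(fake_page.get, titles)
print(found)
print(missing)
```

Printing the `missing` list gives a concrete set of titles to investigate, rather than stopping at the first failure.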
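【Comments】:
- Can you tell us where this error occurs? Just add the traceback to the question.
- Inside the loop that fails, try print(repr(item)).
- You have a hard-coded value. What happens if you use print wp.page(search).section(item) instead of print wp.page(search).section("Aortic rupture")?
- If I print wp.page(search).section(item) inside the for loop, I get None. That's the part I don't understand; it should be text.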
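The `print(repr(item))` suggestion above is useful because two strings can look the same when printed and still compare unequal. The sketch below illustrates that idea only; it is not a diagnosis of this particular failure. The helper `find_near_matches` and the sample strings are hypothetical, and the code is written for Python 3.

```python
def find_near_matches(candidate, known_titles):
    """Return known titles that differ from candidate only by whitespace.

    Hypothetical helper: collapses every run of whitespace (including
    non-breaking spaces and newlines) to a single space before comparing.
    """
    def normalize(text):
        return " ".join(text.split())

    target = normalize(candidate)
    return [t for t in known_titles if t != candidate and normalize(t) == target]


known = ["Signs and symptoms", "Aortic rupture"]
# Made-up title with a non-breaking space and a trailing newline.
item = "Aortic\u00a0rupture\n"

print(item)        # looks like an ordinary title
print(repr(item))  # exposes the hidden characters
print(find_near_matches(item, known))
```

If `repr` shows the extracted title exactly as expected, the mismatch lies elsewhere, for example in how the library locates section headings in the page content.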
Tags: python wikipedia attributeerror nonetype