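【Posted】:2018-06-08 23:41:21
【Question】:
I am trying to read a word from a text file and check whether that word appears inside an XML tag, including when the file contains special characters. Here is the code: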
import lxml.objectify
from lxml import etree
import codecs
import xml.etree.cElementTree as ET
file_path = "C:\Users\HP\Downloads\Morphalou-2.0.xml"
for event, elem in ET.iterparse(file_path, events=("start", "end")):
    if elem.tag == 'orthography' and event == 'start':
        data = elem.text
        f = codecs.open ('test.txt', encoding="ISO-8859-1")
        for line in f:
            check = line
            if check in data:
                print (check,":", "true")
                break
            else:
                print (check,":", "false")
                break
    elem.clear()
When I run print (check), the word looks the way I want, "garçon", but when I add the test
if check in data:
    print (check,":", "true")
    break
else:
    print (check,":", "false")
    break
this is what I get:
(u'gar\xe7on', ':', 'false')
I think the result should be true! What am I missing? Does anyone know what is going on? Thanks in advance for any help.
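A note on the output itself: the `u'gar\xe7on'` in the printed tuple is not a corrupted string. In Python 2, `print (a, b, c)` prints a tuple, and a tuple displays the repr of each element, in which `ç` is written as the escape `\xe7`. The short Python 3 sketch below illustrates the same effect; using `ascii()` as a stand-in for Python 2's repr is an assumption made for illustration only.

```python
word = "garçon"

# ascii() escapes non-ASCII characters, much as Python 2's repr() did for unicode strings
escaped = ascii(word)
print(escaped)

# The escaped spelling and the readable spelling denote the same string
same = ("gar\xe7on" == word)
print(same)
```

So the escaped form alone does not explain the `false` result; the comparison logic is the more likely place to look.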
【Discussion】:
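Judging only from the code shown, a few things may contribute to the unexpected `false`. The ElementTree documentation states that at a "start" event the element's `text` is not guaranteed to be set yet, so reading it at "end" is safer. Lines read from a file normally keep their trailing newline, which would make the test fail for any word that is not on the last line. Both branches also `break`, so only the first line of `test.txt` is ever compared, and a `false` is printed for every `orthography` element that holds a different word. The following is a minimal Python 3 sketch under those assumptions, not a definitive fix: the function name `find_words` and the sample data are hypothetical, and it tests for an exact match rather than the substring test used in the question.

```python
import io
import xml.etree.ElementTree as ET


def find_words(xml_source, words, tag="orthography"):
    """Return {word: bool}: whether each word equals the text of some <tag> element."""
    # Strip newlines and surrounding whitespace from the words read from the file
    wanted = {w.strip() for w in words if w.strip()}
    found = set()
    # Listen for "end" events only: the element's text is complete by then
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == tag and elem.text:
            text = elem.text.strip()
            if text in wanted:
                found.add(text)
        elem.clear()  # release memory while streaming a large file
    return {w: (w in found) for w in wanted}


# Hypothetical sample data standing in for Morphalou-2.0.xml and test.txt
sample_xml = io.BytesIO(
    "<lexicon><entry><orthography>garçon</orthography></entry>"
    "<entry><orthography>fille</orthography></entry></lexicon>".encode("utf-8")
)
sample_words = ["garçon\n", "chien\n"]

result = find_words(sample_xml, sample_words)
for word, present in sorted(result.items()):
    print(word, ":", present)
```

With real files, `xml_source` could be the path to the XML file and `words` an open text file such as `open('test.txt', encoding='ISO-8859-1')`, assuming that encoding is the correct one for the file. Collecting the words into a set first means the XML is streamed once instead of reopening the word list for every element.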