BeautifulSoup doesn't parse XML loaded from local file

【Question title】: BeautifulSoup doesn't parse XML loaded from local file
【Posted】: 2017-03-31 01:40:50
【Question】:

My Python script uses BeautifulSoup, and it gets None when it tries to find an element in XML loaded from a local file:

xmlData = None

with open('conf//test2.xml', 'r') as xmlFile:
    xmlData = xmlFile.read()

# this creates a soup object out of xmlData,
# which is properly loaded from file above
xmlSoup = BeautifulSoup(xmlData, "html.parser")

# this resolves to None
subElemX = xmlSoup.root.singleelement.find('subElementX', recursive=False)

The file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
    <repeatingElement id="1"/>
    <repeatingElement id="2"/>
</root>

I also have a REST GET service that returns the same XML, and when I read it with requests.get it is parsed just fine:

resp = requests.get(serviceURL, headers=headers)

respXML = resp.content.decode("utf-8")

restSoup = BeautifulSoup(respXML, "html.parser")

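Why does it work for the REST response but not for data read from the local file?

Update: I know Python is case-sensitive and that singleelement != singleElement, but case is ignored when parsing the web service response.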

【Comments】:

Print xmlData and respXML and compare what you get.
singleelement != singleElement
Interesting, it works with the REST service.

Tags: python xml beautifulsoup

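【Solution 1】:

Two things are needed to make it work:

Change the parser feature from html.parser to xml (you are parsing XML data, and XML != HTML).
Change singleelement to singleElement.

With the changes applied (works for me):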

xmlSoup = BeautifulSoup(xmlData, "xml")

subElemX = xmlSoup.root.singleElement.find('subElementX', recursive=False)
print(subElemX)  # prints <subElementX>XYZ</subElementX>

【Comments】:

I got bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: xml. Do you need to install a parser library?
@amphibient Ah, yes, you need to install lxml for this to work.
I tried python -m pip install lxml, but got ERROR: b"'xslt-config' is not recognized as an internal or external command,\r\noperable program
stackoverflow.com/questions/40640026/…
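
If installing lxml is not an option, one possible fallback is the standard-library xml.etree.ElementTree module, which needs no extra packages. This is only a sketch and not something proposed in the thread: it inlines the XML from the question, and the variable names (XML_DATA, sub_elem_x, wrong_case, ids) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The document from the question, inlined so the sketch is self-contained.
XML_DATA = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
    <repeatingElement id="1"/>
    <repeatingElement id="2"/>
</root>
"""

# fromstring() returns the document's root element (<root>) directly.
root = ET.fromstring(XML_DATA)

# Tag names are matched case-sensitively, so the exact spelling is required.
sub_elem_x = root.find("singleElement/subElementX")
wrong_case = root.find("singleelement/subelementx")

# findall() returns every direct child with the given tag name.
ids = [elem.get("id") for elem in root.findall("repeatingElement")]

print(sub_elem_x.text)  # XYZ
print(wrong_case)       # None
print(ids)              # ['1', '2']
```

To read from the file instead, ET.parse('conf//test2.xml').getroot() should give the same root element.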

【Solution 2】:

Apparently, HTML is a case-insensitive language, so html.parser converts all tag names to lowercase internally. Given that, the following line should work:

subElemX = xmlSoup.root.singleelement.find('subelementx', recursive=False)

In general, though, you should not use an HTML parser to parse XML documents. XML is very strict about its syntax, and for good reason.
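
The lowercasing can be seen without BeautifulSoup at all, because its "html.parser" feature is built on Python's standard-library html.parser module. The sketch below is illustrative only: TagCollector is a hypothetical helper class, not part of any library, and it simply records the tag names exactly as the parser reports them.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start-tag names as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes the tag name already converted to lowercase.
        self.tags.append(tag)


collector = TagCollector()
collector.feed(
    "<root><singleElement><subElementX>XYZ</subElementX></singleElement></root>"
)
collector.close()

print(collector.tags)  # ['root', 'singleelement', 'subelementx']
```

The mixed-case names from the XML never reach the tree, which would explain why only the all-lowercase lookups match in a soup built with "html.parser".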
