Removing a particular element from an XML string using lxml (Python 3.5): answers

【Question title】: To remove a particular element from the xml-string using lxml Python 3.5
【Posted】: 2016-09-26 22:06:21
【Question description】:

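I have the following XML as input to a Python function. I want to find any element with a null value (firstChild.nodeValue), remove it from the XML completely, and return the result as a string. I am restricted to using only the lxml module. Can I get some help with this?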

<country name="Liechtenstein">
    <rank></rank>
    <a></a>
    <b></b>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>

I would like the output to be:

<country name="Liechtenstein">
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>

I am free to use a constant list of tag names, which I can iterate over to look up each element's text. Here is the list: a = ('rank','year','a','b','gdppc','neighbor')

Please help!

【Question discussion】:

Tags: python xml lxml python-3.5

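【Solution 1】:

You can use a union to find all of the nodes with a single XPath expression. Then, assuming you want to remove the nodes that have no text, you can call tree.remove(node):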

x = """<country name="Liechtenstein">
    <rank></rank>
    <a></a>
    <b></b>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>"""

from lxml import etree


tree = etree.fromstring(x)

a = ('rank','year','a','b','gdppc','neighbor')

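# Build one XPath union ("//rank|//year|...") and drop matches that have no text.
# tree.remove(node) works here because every match is a direct child of the root.
for node in tree.xpath("|".join(map("//{}".format, a))):
    if not node.text:
        tree.remove(node)
print(etree.tostring(tree).decode("utf-8"))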

This gives you:

<country name="Liechtenstein">
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E">345</neighbor>
</country>
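
The loop above calls tree.remove(node), which only succeeds when the matched element is a direct child of the root. As a minimal sketch for the more general case, where empty elements may be nested deeper, each node can be removed through its own parent instead. The function name remove_empty and the nested sample string are illustrative assumptions, not part of the original answer:

```python
from lxml import etree


def remove_empty(xml_string, tags):
    """Return xml_string with empty elements named in tags removed."""
    root = etree.fromstring(xml_string)
    # Build one XPath union such as "//rank|//year|//a".
    query = "|".join("//{}".format(tag) for tag in tags)
    for node in root.xpath(query):
        # Treat missing or whitespace-only text as empty; keep nodes with children.
        if len(node) == 0 and not (node.text or "").strip():
            parent = node.getparent()
            if parent is not None:  # never try to remove the root itself
                parent.remove(node)
    return etree.tostring(root).decode("utf-8")


# Hypothetical nested input: <rank> sits one level below the root.
sample = "<country><info><rank></rank><year>2008</year></info><a/></country>"
result = remove_empty(sample, ("rank", "year", "a"))
print(result)
```

Note that lxml removes an element together with its tail text, so any text that directly follows a removed element goes with it.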

【Discussion】:

Hi Padriac, I am getting an lxml.etree.XPathEvalError: Undefined namespace prefix error. Sorry, I have updated my question with the actual XML, which has namespaces. Your answer works for my earlier sample XML that had no namespaces. Basically, with the namespace in the tags, I guess it cannot evaluate the XPath.
'for node in tree.xpath("|".join(map("//{}".format, a))): File "lxml.etree.pyx", line 1587, in lxml.etree._Element.xpath (src\lxml\lxml.etree.c:57803) File "xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__ (src\lxml\lxml.etree.c:166824) File "xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result (src\lxml\lxml.etree.c:165783) lxml.etree.XPathEvalError: Undefined namespace prefix'
@sreenimmala, yes, because your edit completely changed your question; what you have now provided is a long way from what you originally asked.
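
The "Undefined namespace prefix" error is raised when an XPath expression contains a prefixed name but no prefix-to-URI mapping is passed to xpath(). The asker's namespaced XML is not shown in this thread, so the following is only a sketch under assumptions: the prefix c, the URI http://example.com/ns, and the sample document are all made up for illustration.

```python
from lxml import etree

# Hypothetical namespaced input; the asker's real XML is not shown here.
xml = (
    '<c:country xmlns:c="http://example.com/ns" name="Liechtenstein">'
    "<c:rank></c:rank><c:year>2008</c:year>"
    "</c:country>"
)
root = etree.fromstring(xml)

# Prefix -> URI mapping that the XPath engine uses to resolve "c:".
ns = {"c": "http://example.com/ns"}
tags = ("rank", "year")

# Prefix every tag name, then pass the mapping through the namespaces keyword.
query = "|".join("//c:{}".format(tag) for tag in tags)
for node in root.xpath(query, namespaces=ns):
    if not node.text:
        node.getparent().remove(node)

result = etree.tostring(root).decode("utf-8")
print(result)
```

The prefix used in the mapping does not have to match the one in the document; only the URI must match.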

【Solution 2】:

The code below works :)

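from lxml import etree

# Written as a method of a class that is not shown, hence the self parameter.
def remove_empty_elements(self, xml_input):
    tree = etree.fromstring(xml_input)
    # Matches elements with a text node equal to a single space;
    # elements with no text at all (e.g. <rank></rank>) are not matched.
    for found in tree.xpath("//*[text()=' ']"):
        print("deleted " + str(found))
        # Remove via the parent so nested elements are handled too.
        found.getparent().remove(found)
    print(etree.tostring(tree).decode("utf-8"))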
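
The XPath text()=' ' used above matches only elements whose text is exactly one space. A possible variant, sketched here as an assumption about what "empty" should mean, treats any leaf element with no text or only whitespace as empty and returns the string instead of printing it. The function name strip_blank_leaves is hypothetical:

```python
from lxml import etree


def strip_blank_leaves(xml_input):
    """Remove leaf elements whose text is missing or whitespace-only."""
    root = etree.fromstring(xml_input)
    # not(*)               -> the element has no child elements
    # not(normalize-space()) -> its text is empty after trimming whitespace
    # Caution: this also matches text-less leaves that carry attributes.
    for found in root.xpath("//*[not(*) and not(normalize-space())]"):
        parent = found.getparent()
        if parent is not None:  # skip the root, which has no parent
            parent.remove(found)
    return etree.tostring(root).decode("utf-8")


cleaned = strip_blank_leaves(
    "<country><rank></rank><a> </a><year>2008</year></country>"
)
print(cleaned)
```

Because the test does not mention tag names, it behaves the same whether or not the document uses namespaces.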

【Discussion】: