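Delete SVG element based on text (answers)

【Question title】: Delete SVG element based on text
【Posted】: 2020-05-19 08:21:47
【Question description】:

I have an SVG file. I am trying to get rid of some elements that contain specific text: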

<g style="font-family:'ARIAL'; stroke:none; fill:rgb(127,0,0);" >
<g font-size="53.4132" >
<text id="cv_126" x="168" y="474.78" transform="rotate(330 168 474.78) translate(168 -474.78) scale(1 1) translate(-168 474.78) ">SomeSpecificText</text>
<text id="cv_127" x="336" y="474.78" transform="rotate(330 336 474.78) translate(336 -474.78) scale(1 1) translate(-336 474.78) ">SomeSpecificTextBis</text>
</g>
</g>

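The example above illustrates what I need to do: I need to delete the whole block (<g><g> ... </g></g>) because it contains SomeSpecificText and SomeSpecificTextBis. I have to do this for any "block" or "element" that contains one text or the other.

I would like to use Python and lxml for this, since it apparently provides the necessary tools, but I don't know how to use it. For now I have this code: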

tree = etree.parse(open("myFile.svg"))

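But I don't know which method I should use. I have seen some discussion of XPath and tried, for example, tree.xpath('.//g[contains(text(), "SomeSpecific")]'), but it returns an empty list.

Edit

I tried the following to capture the structure containing "someSpecificText" (a partial match is needed), but targets is still an empty list: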

tree = etree.parse(open("svg/myFile_ezdxf.svg"))
targets = tree.xpath('//g[./g[contains(text(),"SomeText")]]', namespaces = {"svg" : "http://www.w3.org/2000/svg"})
for target in targets:
    target.getparent().remove(target)

Here is also the header of my SVG file:

<?xml version="1.0" encoding="utf-8" ?>
<!-- Generated by SomeCompanySoftware -->
<!-- www.somecompany.com -->
<!DOCTYPE svg PUBLIC '-//W3C//DTD SVG 1.0//EN' 
'http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd'>
<svg contentScriptType="text/ecmascript" xmlns:xlink="http://www.w3.org/1999/xlink" zoomAndPan="magnify" 
contentStyleType="text/css" preserveAspectRatio="xMidYMid meet" 
width="840" height="593.48" viewBox="0 0 840 593.48" 
version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:cvjs="http://www.somecompany.com/" stroke-linecap="round" stroke-linejoin="round" fill-rule="evenodd" >

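【Comments】:

Your XPath attempt most likely fails because SVG is usually in a default namespace. Try Jack's answer below, and if that does not work, add the complete svg start tag (or ideally a minimal but complete SVG that we can reproduce with) to your question.
Unfortunately I cannot post the whole SVG, because it contains a lot of sensitive data and I do not have time to anonymize it. I tried the solution, but it did not work. I will update my post to show what I tried.
In your edit you bind the "svg" prefix correctly, but you do not use it in the XPath. Also, text is an element, so using text() inside your contains() will not work. Here is how I would do it: targets = tree.xpath('//svg:g[./svg:g[.//svg:text[contains(.,"SomeSpecificText")]]][.//svg:text[contains(.,"SomeSpecificTextBis")]]', namespaces={"svg": "http://www.w3.org/2000/svg"}) (the .// should not normally be necessary, but without them I was not selecting anything)

Tags: python svg lxml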
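
The two points raised in the last comment can be checked on a small document. The sketch below is only an illustration: the SVG string is a cut-down, hypothetical stand-in for the real file, and the variable names are made up for this example. It compares an unprefixed query, a prefixed query, a text() test on the g element itself, and a test on the string value of the child text elements.

```python
from lxml import etree

# Hypothetical, cut-down stand-in for the real file (default SVG namespace kept).
SVG = b"""<svg xmlns="http://www.w3.org/2000/svg">
<g><g font-size="53.4132">
<text id="cv_126">SomeSpecificText</text>
<text id="cv_127">SomeSpecificTextBis</text>
</g></g>
</svg>"""

NS = {"svg": "http://www.w3.org/2000/svg"}
root = etree.fromstring(SVG)

# Unprefixed names only match elements in no namespace, so this finds nothing.
no_prefix = root.xpath("//g")

# With the prefix bound and actually used, both <g> elements are found.
with_prefix = root.xpath("//svg:g", namespaces=NS)

# text() looks at the <g> element's own text nodes (whitespace or nothing here),
# not at the content of its <text> children.
own_text = root.xpath('//svg:g[contains(text(), "SomeSpecific")]', namespaces=NS)

# Testing the string value of the <text> child elements does match the inner <g>.
via_child = root.xpath('//svg:g[svg:text[contains(., "SomeSpecific")]]', namespaces=NS)

print(len(no_prefix), len(with_prefix), len(own_text), len(via_child))  # prints: 0 2 0 1
```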

【Solution 1】:

You can definitely do this with lxml:

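from lxml import etree
tree = etree.parse("myFile.svg")
# The file declares the SVG default namespace, so bind a prefix and use it in the XPath
ns = {"svg": "http://www.w3.org/2000/svg"}
targets = tree.xpath('//svg:g[./svg:g[svg:text="SomeSpecificTextBis" or svg:text="SomeSpecificText"]]', namespaces=ns)
for target in targets:
    target.getparent().remove(target)
print(etree.tostring(tree, pretty_print=True).decode())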
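
The query above compares exact strings, while the question's edit asks for a partial match. The following is a minimal sketch of a partial-match variant, assuming the two-level <g><g> ... </g></g> layout from the question and the SVG default namespace from the file header. The helper name remove_blocks and the sample document are hypothetical; the search string is passed as an XPath variable ($needle), which lxml accepts as a keyword argument to xpath().

```python
from lxml import etree

NS = {"svg": "http://www.w3.org/2000/svg"}

def remove_blocks(root, needles):
    """Remove each <g><g>...</g></g> block holding a <text> that contains any needle."""
    removed = 0
    for needle in needles:
        # $needle is an XPath variable, so the search string needs no manual quoting.
        blocks = root.xpath(
            "//svg:g[svg:g[svg:text[contains(., $needle)]]]",
            namespaces=NS, needle=needle)
        for block in blocks:
            parent = block.getparent()
            # A block nested inside one that was just detached still has a parent;
            # a detached top-level block does not, so it is skipped.
            if parent is not None:
                parent.remove(block)
                removed += 1
    return removed

# Hypothetical sample: one block to drop, one to keep.
SVG = b"""<svg xmlns="http://www.w3.org/2000/svg">
<g><g><text>SomeSpecificText</text><text>SomeSpecificTextBis</text></g></g>
<g><g><text>Keep me</text></g></g>
</svg>"""

root = etree.fromstring(SVG)
count = remove_blocks(root, ["SomeSpecific"])
remaining = [str(s) for s in root.xpath("//svg:text/text()", namespaces=NS)]
print(count, remaining)
```

Because "SomeSpecific" is a substring of both sample strings, a single needle is enough here; passing several needles simply runs the query once per needle on the already modified tree.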

【Discussion】:

【Solution 2】:

I found a way to do the task:

from lxml import etree

tree = etree.parse("myFile.svg")
root = tree.getroot()
targets = ["SomeText", "SomeText2"]
# Snapshot the elements first so the tree is not modified while it is being walked
for element in list(root.iter("*")):
    # Substring match against the element's own leading text
    if element.text is not None and any(item in element.text for item in targets):
        element.getparent().remove(element)
with open("myModifiedFile.svg", "wb") as f:
    f.write(etree.tostring(tree))
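
The loop above removes only the element whose own text matches, whereas the question asks for the whole enclosing block to go. One possible extension, sketched here under the assumption that "the block" means the outermost <g> ancestor of a matching <text>, climbs the ancestors before removing. The helper names enclosing_block and remove_blocks and the sample document are hypothetical.

```python
from lxml import etree

SVG_NS = "http://www.w3.org/2000/svg"
G_TAG = f"{{{SVG_NS}}}g"        # "{http://www.w3.org/2000/svg}g"
TEXT_TAG = f"{{{SVG_NS}}}text"

def enclosing_block(element):
    """Return the outermost <g> ancestor of element, or None if there is none."""
    ancestors = list(element.iterancestors(G_TAG))  # nearest ancestor first
    return ancestors[-1] if ancestors else None

def remove_blocks(root, targets):
    """Remove the outermost <g> around every <text> containing one of targets."""
    # Collect first, then remove, so the tree is not changed while it is walked.
    hits = [el for el in root.iter(TEXT_TAG)
            if el.text and any(t in el.text for t in targets)]
    removed = 0
    for el in hits:
        block = enclosing_block(el)
        # A block that was already detached has no parent any more; skip it.
        if block is not None and block.getparent() is not None:
            block.getparent().remove(block)
            removed += 1
    return removed

# Hypothetical sample: one block to drop, one to keep.
SVG = b"""<svg xmlns="http://www.w3.org/2000/svg">
<g><g><text>SomeText here</text><text>SomeText2</text></g></g>
<g><g><text>Keep me</text></g></g>
</svg>"""

root = etree.fromstring(SVG)
removed = remove_blocks(root, ["SomeText", "SomeText2"])
left = [el.text for el in root.iter(TEXT_TAG)]
print(removed, left)
```

If the real drawing wraps everything in one top-level <g>, this assumption would remove far too much; in that case stopping at a fixed number of levels (for example the grandparent) may be the safer choice.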

【Discussion】:

【Solution 3】:

You can do this with Beautiful Soup 4 and Python 3. For your example, this code will do it:

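#!/usr/local/bin/python3
from bs4 import BeautifulSoup

# features="xml" parses the file as XML (needs lxml installed), so SVG names such as viewBox keep their case
with open('svg.svg') as f:
    tree = BeautifulSoup(f, features="xml")

for item in tree.find_all("text"):
    # Exact match on the stripped content of each <text> element
    if item.getText().strip() in ("SomeSpecificText", "SomeSpecificTextBis"):
        # Drop the <g><g> ... </g></g> block: two levels above the <text>
        item.findParent().findParent().decompose()

print(tree)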

It is a bit fragile, since I do not know your exact logic, but you can refine it.

【Discussion】: