【Question title】:Search and replace elements in an XML file using Python
【Posted】:2019-07-13 20:01:45
【Question】:

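I need to search for an element in an XML file and replace its value with another one. The replacement should happen only in the entry where a condition matches.

I have the following XML file.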

<?xml vn="1.0" encoding="UTF-8"?>
<proj>
    <mV>4.0.0</mV>

    <gId>com.test</gId>
    <aId>console</aId>
    <vn>1.0</vn>

    <bld>
        <plugins>
            <plugin>
                <gId>org.apache.maven.plugins</gId>
                <aId>maven-compiler-plugin</aId>
                <vn>1.1</vn>
                <configuration>
                    <source>1.0</source>
                    <target>1.0</target>
                    <showWarnings>true</showWarnings>
                </configuration>
            </plugin>
        </plugins>
    </bld>
    <dps>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-api</aId>
            <vn>1.7.20</vn>
        </dp>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-log</aId>
            <vn>1.7.25</vn>
        </dp>
    </dps>
</proj>

Below is the replacement code.

import xml.etree.ElementTree as ET

aIdValue = "sk-log"
tree = ET.parse('test.xml')
root = tree.getroot()
dp = root.findall(".//xmlns:dp")
for d in dp:
    aId = d.find("xmlns:aId")
    vn = d.find("xmlns:vn")
    if aIdValue == aId.text:
        print(aId.text)
        print(vn.text)
        vn.text = vn.text  # debugging placeholder: assigns the same value back
        tree.write('test.xml')

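So the values I get from the print statements are aId.text = sk-log and vn.text = 1.7.25. I need to replace 1.7.25 with somevalue in that particular entry only. The code above does not work for me. How can I do this?

The expected output would be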

<?xml vn="1.0" encoding="UTF-8"?>
<proj>
    <mV>4.0.0</mV>

    <gId>com.test</gId>
    <aId>console</aId>
    <vn>1.0</vn>

    <bld>
        <plugins>
            <plugin>
                <gId>org.apache.maven.plugins</gId>
                <aId>maven-compiler-plugin</aId>
                <vn>1.1</vn>
                <configuration>
                    <source>1.0</source>
                    <target>1.0</target>
                    <showWarnings>true</showWarnings>
                </configuration>
            </plugin>
        </plugins>
    </bld>
    <dps>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-api</aId>
            <vn>1.7.20</vn>
        </dp>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-log</aId>
            <vn>somevalue</vn>
        </dp>
    </dps>
</proj>

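【Comments】:

  • This line looks suspicious: "vn.text = vn.text". Obviously it does nothing. Did you mean something that actually changes the text? Perhaps "vn.text = 'somevalue'"?
  • vn.text = vn.text: you can ignore this. I only put it there while debugging. @joe admin
  • Yes.. I will update the question with the expected output
  • @moong Did my answer help? If so, you can mark it as accepted. Cheers
  • I have marked it as accepted

Tags: python xml search replace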


【Solution 1】:

Here is what you need:

import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()
aIdValue = "sk-log"

# Visit every <dp> element anywhere in the tree.
for elt in root.iter("dp"):
    print("%s - %s" % (elt.tag, elt.text))
    aId = elt.find("aId")
    vn = elt.find("vn")
    print(aId.text)
    print(vn.text)
    if aId.text == aIdValue:
        print("vn will be changed.")
        vn.text = '1.8.0'

# encoding='unicode' makes ElementTree write text (str) rather than bytes.
tree.write('test.xml', encoding='unicode')
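
As a variation on the loop above, ElementTree's limited XPath support can select the matching <vn> elements directly with a child-text predicate. This is only a sketch: the set_version helper and the trimmed inline document are made up for illustration, it assumes the file has no XML namespace, and it assumes the artifact id contains no single quote.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample document (assumption: no XML namespace).
XML = """<proj>
    <dps>
        <dp><gId>org.sk</gId><aId>sk-api</aId><vn>1.7.20</vn></dp>
        <dp><gId>org.sk</gId><aId>sk-log</aId><vn>1.7.25</vn></dp>
    </dps>
</proj>"""


def set_version(root, artifact_id, new_version):
    """Hypothetical helper: set <vn> in every <dp> whose <aId> equals artifact_id.

    Returns the number of <vn> elements that were changed.
    """
    # [aId='...'] keeps only <dp> elements that have a matching <aId> child.
    matches = root.findall(".//dp[aId='%s']/vn" % artifact_id)
    for vn in matches:
        vn.text = new_version
    return len(matches)


root = ET.fromstring(XML)
changed = set_version(root, "sk-log", "somevalue")
print(changed)
print(ET.tostring(root, encoding="unicode"))
```

Because the search starts at <dp>, the top-level <vn> and the plugin's <vn> are never touched. To work on a file, ET.parse and tree.write can be used as in the answer's code.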

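【Comments】:

  • An ns0: prefix is being added to every line. How can I avoid this?
  • Thanks for accepting my answer as correct. What do you mean by "prefix"? I tested the code on my PC (Windows 10) and it works fine. I should mention that I removed the first line "" from the test.xml file.
  • Thanks for the quick reply. ns0: is added as a prefix to every tag, for example <ns0:aId>sk-api</aId>. How can I avoid this?
  • The ns0: prefix is a reference to a namespace in the XML file: do you have such a namespace in your XML file? As I said, I removed the first line "" from test.xml. Could you provide the "real" test.xml you are using?
  • Try adding ET.register_namespace('', "your NameSpace") after tree = ET.parse('test.xml')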
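
The ns0: prefix discussed in these comments can be reproduced and avoided with a small sketch like the following. The real file was never posted, so the namespace URI http://example.com/proj and the query prefix "p" are placeholders, not values from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only for this example.
NS_URI = "http://example.com/proj"

XML = """<proj xmlns="http://example.com/proj">
    <dps>
        <dp><aId>sk-api</aId><vn>1.7.20</vn></dp>
        <dp><aId>sk-log</aId><vn>1.7.25</vn></dp>
    </dps>
</proj>"""

# Serialize NS_URI as the default namespace instead of an "ns0:" prefix.
ET.register_namespace("", NS_URI)

root = ET.fromstring(XML)
ns = {"p": NS_URI}  # the prefix "p" only has meaning inside these queries

for dp in root.findall(".//p:dp", ns):
    if dp.find("p:aId", ns).text == "sk-log":
        dp.find("p:vn", ns).text = "somevalue"

result = ET.tostring(root, encoding="unicode")
print(result)
```

The prefix-to-URI mapping passed to find and findall is also what a path such as "xmlns:dp" needs; without a mapping for the prefix, ElementTree cannot resolve it.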
【Solution 2】:

Use BeautifulSoup with find_next()

list_test.xml:

<?xml vn="1.0" encoding="UTF-8"?>
<proj>
    <mV>4.0.0</mV>

    <gId>com.test</gId>
    <aId>console</aId>
    <vn>1.0</vn>

    <bld>
        <plugins>
            <plugin>
                <gId>org.apache.maven.plugins</gId>
                <aId>maven-compiler-plugin</aId>
                <vn>1.1</vn>
                <configuration>
                    <source>1.0</source>
                    <target>1.0</target>
                    <showWarnings>true</showWarnings>
                </configuration>
            </plugin>
        </plugins>
    </bld>
    <dps>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-api</aId>
            <vn>1.7.20</vn>
        </dp>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-log</aId>
            <vn>1.7.25</vn>
        </dp>
    </dps>
</proj>

Then:

from bs4 import BeautifulSoup
with open('list_test.xml','r') as f:
    soup = BeautifulSoup(f.read(), "html.parser")
    aid = soup.find_all('aid')
    for s in aid:
        if s.text == 'sk-log':
            vn = s.find_next('vn')
            print("Original Value: {}".format(vn.text))
            vn.string = 'SomeValue'
            print("Replaced value: {}".format(vn.text))

Output:

Original Value: 1.7.25
Replaced value: SomeValue

Edit:

To write it back to the same XML file, we use soup.prettify():

from bs4 import BeautifulSoup
with open('list_test.xml','r') as f:
    soup = BeautifulSoup(f.read(), features="lxml")
    aid = soup.find_all('aid')
    for s in aid:
        if s.text == 'sk-log':
            vn = s.find_next('vn')
            print("Original Value: {}".format(vn.text))
            vn.string = 'SomeValue'
            print("Replaced value: {}".format(vn.text))

with open("list_test.xml", "w") as f_write:
    f_write.write(soup.prettify())

Output:

<?xml vn="1.0" encoding="UTF-8"?>
<html>
 <body>
  <proj>
   <mv>
    4.0.0
   </mv>
   <gid>
    com.test
   </gid>
   <aid>
    console
   </aid>
   <vn>
    1.0
   </vn>
   <bld>
    <plugins>
     <plugin>
      <gid>
       org.apache.maven.plugins
      </gid>
      <aid>
       maven-compiler-plugin
      </aid>
      <vn>
       1.1
      </vn>
      <configuration>
       <source>
        1.0
       </source>
       <target>
        1.0
       </target>
       <showwarnings>
        true
       </showwarnings>
      </configuration>
     </plugin>
    </plugins>
   </bld>
   <dps>
    <dp>
     <gid>
      org.sk
     </gid>
     <aid>
      sk-api
     </aid>
     <vn>
      1.7.20
     </vn>
    </dp>
    <dp>
     <gid>
      org.sk
     </gid>
     <aid>
      sk-log
     </aid>
     <vn>
      SomeValue
     </vn>
    </dp>
   </dps>
  </proj>
 </body>
</html>
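
Note that the output above has lower-cased tag names (aid, gid, showwarnings) and an added <html><body> wrapper, because the document went through an HTML parser. If the original tag case matters, one option is an XML-aware parser. The sketch below swaps BeautifulSoup for the standard library's xml.dom.minidom, so it is a different technique from this answer's; the inline document is a trimmed copy of the sample.

```python
from xml.dom import minidom

# Trimmed-down copy of the sample document.
XML = """<proj>
    <dps>
        <dp><gId>org.sk</gId><aId>sk-api</aId><vn>1.7.20</vn></dp>
        <dp><gId>org.sk</gId><aId>sk-log</aId><vn>1.7.25</vn></dp>
    </dps>
</proj>"""

doc = minidom.parseString(XML)

for dp in doc.getElementsByTagName("dp"):
    aid = dp.getElementsByTagName("aId")[0]
    vn = dp.getElementsByTagName("vn")[0]
    # firstChild is the text node holding the element's content.
    if aid.firstChild.data == "sk-log":
        vn.firstChild.data = "somevalue"

# Serializing the root element keeps tag case and adds no wrapper elements.
result = doc.documentElement.toxml()
print(result)
```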

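【Comments】:

  • How can I dump this back into the same file?
  • @moong Edited for that.
【Solution 3】:

Have you tried using the xmltodict module to convert the XML into a dict and then convert it back to XML?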

Here is a little guide.

And here the repository.

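Here is a small function that replaces elements in a dict. It runs into problems when two keys are equal, but replacing elements whose keys are not repeated works fine; at least it worked for me: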

def changes_dict(self, tree, change):
    """Function that changes the values of a json with the keys given
    :param tree: Json to be changed
    :param change: Dictionary with the keys to be changed and the new values: {field1: value1, field2: value2,..., fieldN: valueN}"""
    if isinstance(tree,(list,tuple)):
        res = []
        for subItem in tree:
            result = self.changes_dict(subItem, change)
            res.append(result)
        return res
    elif isinstance(tree,dict):
        for nodeName in tree.keys():
            subTree = tree[nodeName]
            if nodeName in list(change.keys()):
                tree[nodeName] = {'value': str(change[nodeName])}
                change.pop(nodeName)
                if not change:
                    break
            else:
                tree[nodeName] = self.changes_dict(subTree, change)
        return tree
    elif isinstance(tree, str):
        return tree

I wrote this program and it works well:

# -*- coding: utf-8 -*-

import xmltodict, json

def changes_dict(tree, change, wordHelp):
    """Function that changes the values of a json with the keys given
    :param tree: Json to be changed
    :param change: Dictionary with the keys to be changed and the new values: {field1: value1, field2: value2,..., fieldN: valueN}
    :param wordHelp: Word that must be in the values of the dict that contains the change"""
    if isinstance(tree,(list,tuple)):
        res = []
        for subItem in tree:
            result = changes_dict(subItem, change, wordHelp)
            res.append(result)
        return res
    elif isinstance(tree,dict):
        for nodeName in tree.keys():
            subTree = tree[nodeName]
            if nodeName in list(change.keys()) and wordHelp in list(tree.values()):
                # Store a plain string so that unparse() emits <vn>somevalue</vn>.
                tree[nodeName] = str(change[nodeName])
                change.pop(nodeName)
                if not change:
                    break
            else:
                tree[nodeName] = changes_dict(subTree, change, wordHelp)
        return tree
    elif isinstance(tree, str):
        return tree

x = """
 <proj>
    <mV>4.0.0</mV>

        <gId>com.test</gId>
        <aId>console</aId>
        <vn>1.0</vn>

        <bld>
            <plugins>
                <plugin>
                    <gId>org.apache.maven.plugins</gId>
                    <aId>maven-compiler-plugin</aId>
                    <vn>1.1</vn>
                    <configuration>
                        <source>1.0</source>
                        <target>1.0</target>
                        <showWarnings>true</showWarnings>
                    </configuration>
                 </plugin>
             </plugins>
         </bld>
         <dps>
         <dp>
            <gId>org.sk</gId>
            <aId>sk-api</aId>
            <vn>1.7.20</vn>
        </dp>
        <dp>
            <gId>org.sk</gId>
            <aId>sk-log</aId>
            <vn>1.7.25</vn>
        </dp>
    </dps>
</proj> """

# Round-trip through JSON to get plain dicts and lists; json.loads avoids eval.
dicti = json.loads(json.dumps(xmltodict.parse(x)))
dicti_changed = changes_dict(dicti, {'vn': 'somevalue'}, 'sk-log')
print(xmltodict.unparse(dicti_changed))

Regards

【Comments】:
