Comparing two XML files irrespective of their order

【Question title】: Comparing two XML files irrespective of their order
【Posted】: 2015-06-29 21:36:43
【Question】:

I am currently working on a Python project and have run into a small problem with comparing two XML files in Python. Suppose we have two XML files:

File A:

<m1:time timeinterval="5">
   <m1:vehicle distance="40" speed="5"/>

   <m1:location hours = "1" path = '1'>
      <m1:feature color="2" type="a">564</m1:feature>
      <m1:feature color="3" type="b">570</m1:feature>
      <m1:feature color="4" type="c">570</m1:feature>
   </m1:location>

   <m1:location hours = "5" path = '1'>
      <m1:feature color="6" type="a">560</m1:feature>
      <m1:feature color="7" type="b">570</m1:feature>
      <m1:feature color="8" type="c">580</m1:feature>
   </m1:location>

   <m1:location hours = "9" path = '1'>
      <m1:feature color="10" type="a">560</m1:feature>
      <m1:feature color="11" type="b">570</m1:feature>
      <m1:feature color="12" type="c">580</m1:feature>
   </m1:location>
</m1:time>

File B:

<m1:time timeinterval="6">
   <m1:vehicle distance="40" speed="5"/>

   <m1:location hours = "5" path = '1'>
      <m1:feature color="6" type="a">560</m1:feature>
      <m1:feature color="7" type="b">570</m1:feature>
      <m1:feature color="8" type="c">580</m1:feature>
   </m1:location>

   <m1:location hours = "1" path = '1'>
      <m1:feature color="2" type="a">564</m1:feature>
      <m1:feature color="3" type="b">570</m1:feature>
      <m1:feature color="4" type="c">570</m1:feature>
   </m1:location>

   <m1:location hours = "9" path = '1'>
      <m1:feature color="10" type="a">560</m1:feature>
      <m1:feature color="11" type="b">570</m1:feature>
      <m1:feature color="12" type="c">580</m1:feature>
   </m1:location>

</m1:time>

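My question is: how can I compare file A and file B in Python so that they are reported as the same even though the "location" elements appear in a different order in the two files?
I have tried various approaches and also looked at this question, but in this project I want to develop a method of my own; I cannot use any of the tools that are already available.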

What I have tried so far:

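I am using lxml. I take the individual attributes of the child elements from file A and store them in a list. Then I compare the elements and child attributes of file B against the values stored in that list.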

First of all, this approach does not work, and I cannot think of any efficient procedure for this task. Could you shed some light on this?

Thanks.

【Comments】:

Tags: python xml comparison lxml

【Solution 1】:

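It sounds like you need an XML parser. My first suggestion is to use a DOM parser (or to write a very basic one yourself). By reading both XML files into trees and then comparing those trees, you can easily verify whether they are the same.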
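One way to sketch the "read both trees, then compare" idea is to reduce each element to a canonical form in which the children are sorted, so sibling order no longer matters. This is only an illustration under assumptions, not the answerer's code: it uses the standard-library `xml.etree.ElementTree` (lxml's `etree` exposes the same element interface for the calls used here), the helper names `canonical` and `same_xml` are made up, and the sample documents declare a hypothetical namespace URI for the `m1` prefix because the question does not show one.

```python
import xml.etree.ElementTree as ET

def canonical(elem):
    """Return a nested tuple describing elem, independent of child order."""
    # Sorting the children's canonical forms removes the effect of sibling order.
    children = tuple(sorted(canonical(child) for child in elem))
    text = (elem.text or "").strip()  # ignore indentation-only text
    attrs = tuple(sorted(elem.attrib.items()))
    return (elem.tag, attrs, text, children)

def same_xml(xml_a, xml_b):
    """True if both documents hold the same elements, in any sibling order."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))

# Hypothetical sample data; the namespace URI is an assumption.
FILE_A = """<m1:time xmlns:m1="http://example.com/m1" timeinterval="5">
  <m1:location hours="1" path="1"><m1:feature color="2" type="a">564</m1:feature></m1:location>
  <m1:location hours="5" path="1"><m1:feature color="6" type="a">560</m1:feature></m1:location>
</m1:time>"""

FILE_B = """<m1:time xmlns:m1="http://example.com/m1" timeinterval="5">
  <m1:location hours="5" path="1"><m1:feature color="6" type="a">560</m1:feature></m1:location>
  <m1:location hours="1" path="1"><m1:feature color="2" type="a">564</m1:feature></m1:location>
</m1:time>"""

print(same_xml(FILE_A, FILE_B))
```

Note that this comparison also covers the root element's attributes, so two files whose `timeinterval` values differ would be reported as different.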

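This is not very efficient, though. You could instead do the validation while reading the second XML file. In that case you have to remove each element as it is matched, and make sure that no unmatched elements are left over at the end.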
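The match-and-remove idea can be sketched with a multiset: count the child elements of the first file, cross one off for each matching child found in the second file, and check that nothing is left over. This is a minimal sketch under assumptions. The names `signature` and `unmatched_children` are hypothetical, it uses the standard-library `xml.etree.ElementTree` rather than lxml, it parses both files fully instead of streaming, and it compares only the children of the root element, not the root's own attributes.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def signature(elem):
    """Hashable description of an element that ignores child order."""
    kids = tuple(sorted(signature(child) for child in elem))
    text = (elem.text or "").strip()
    return (elem.tag, tuple(sorted(elem.attrib.items())), text, kids)

def unmatched_children(xml_a, xml_b):
    """Return (missing, extra): children only in A, and children only in B."""
    root_a = ET.fromstring(xml_a)
    root_b = ET.fromstring(xml_b)
    pending = Counter(signature(child) for child in root_a)
    extra = []
    for child in root_b:
        sig = signature(child)
        if pending[sig] > 0:
            pending[sig] -= 1   # matched: remove one occurrence
        else:
            extra.append(sig)   # present in B but not (or no longer) in A
    missing = list(pending.elements())  # whatever was never matched
    return missing, extra
```

The two files contain the same top-level children exactly when both returned lists are empty. A `Counter` is used instead of a set so that repeated identical elements are matched one for one.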

But I am curious why your list approach does not work. Can you give more information about that?

【Discussion】:

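Basically, the problem is that I have two huge XML files to compare, maybe 5 to 6 MB each. Since the data is not always fixed in my case, I want to generate the lists dynamically. In the example above, for instance, I would like to generate lists such as location_hours_1=[], location_hours_5=[], location_hours_9=[]. Once these lists are created, I can compare the contents of the other file against each list separately. I followed the method here for the dynamic generation, but in my case it shows some errors.
I think that approach becomes very hard to follow for deeply nested element structures, and is therefore very error-prone. I would really recommend building a small tree that can be compared. If you need help, you can email me.
Hi, thanks for the feedback. Yes, I am currently working on it. I know my dynamic-list approach is error-prone and not efficient at all, so some other approach is worth trying. By the way, how can I find your email address? No email ID is mentioned on your page.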
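The dynamically named lists mentioned in the discussion above (location_hours_1, location_hours_5, ...) could be replaced by a single dictionary keyed by the `hours` attribute, which avoids generating variable names at run time. The following is only a sketch under assumptions, not code from the thread: `features_by_hours` is a hypothetical helper, the namespace URI bound to `m1` is invented because the question does not show one, and each `hours` value is assumed to occur at most once per file.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the m1 prefix (not given in the question).
NS = {"m1": "http://example.com/m1"}

def features_by_hours(xml_text):
    """Map each location's 'hours' value to a sorted list of its features."""
    root = ET.fromstring(xml_text)
    table = {}
    for loc in root.findall("m1:location", NS):
        features = sorted(
            (f.get("type"), f.get("color"), (f.text or "").strip())
            for f in loc.findall("m1:feature", NS)
        )
        # Assumes 'hours' is unique per file; a repeated value would overwrite.
        table[loc.get("hours")] = features
    return table

def same_locations(xml_a, xml_b):
    """Dictionary equality ignores the order in which locations were read."""
    return features_by_hours(xml_a) == features_by_hours(xml_b)
```

Because the keys come from the data itself, the same function works whichever `hours` values a file contains.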