【Question title】: How to read some contents of XML files and write them into a text file?
【Posted】: 2016-05-29 16:44:18
【Question】:

I have the following XML file, and I want to read the contents of the <seg> elements and save them to a plain text file using Python. I used the DOM module.

<?xml version="1.0"?>
<mteval>
  <tstset setid="default" srclang="any" trglang="TRGLANG" sysid="SYSID">
    <doc docid="ntpmt-dev-2000/even1k.cn.seg.txt">
      <seg id="1">therefore , can be obtained having excellent properties ( good stability and solubility of the balance of the crystal as a pharmaceutical compound is not possible to predict .</seg>
      <seg id="3">compound ( I ) are preferably crystalline , in particular , has good stability and solubility equilibrium and suitable for industrial prepared type A crystal is preferred .</seg>
      <seg id="4">method B included in the catalyst such as DMF , and the like in the presence of a compound of formula ( II ) with thionyl chloride or oxalyl chloride to give an acyl chloride , in the presence of a base of the acid chloride with alcohol ( IV ) ( O ) by reaction of esterification .</seg>
    </doc>
  </tstset>
</mteval>
from xml.dom.minidom import parse
import xml.dom.minidom

dom = xml.dom.minidom.parse(r"path_to_xml file")
file = dom.documentElement
seg = dom.getElementsByTagName("seg")
for item in seg:
    sent = item.firstChild.data
    print(sent,sep='')

file = open(r'file.txt','w')
file.write(sent)
file.close()

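When I run the code above, all the lines are printed to the screen, but file.txt contains only the last <seg> (seg id=4). I actually want to save all of the sentences to the file. Is there something wrong with my code?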

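【Comments】:

  • That is because you only write the last item found to the file. The write to the file needs to be inside the loop as well.
  • As you suggested, I moved the file write into the loop and tried several times, but the result is the same: always just the last sentence.
  • You need to open the file for appending with 'a'; otherwise opening it with 'w' overwrites the file.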
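
The difference between the two modes mentioned in the last comment can be seen in a small sketch. The file names and sample sentences below are made up for illustration. Reopening with 'w' inside the loop truncates the file each time, so only the last sentence survives, while 'a' appends.

```python
import os
import tempfile

sentences = ["one", "two", "three"]  # hypothetical stand-ins for the <seg> texts
tmp_dir = tempfile.mkdtemp()
w_path = os.path.join(tmp_dir, "mode_w.txt")
a_path = os.path.join(tmp_dir, "mode_a.txt")

for sent in sentences:
    # 'w' truncates the file on every open, discarding earlier writes.
    with open(w_path, "w") as f:
        f.write(sent + "\n")
    # 'a' keeps the existing content and adds to the end.
    with open(a_path, "a") as f:
        f.write(sent + "\n")

with open(w_path) as f:
    w_lines = f.read().splitlines()
with open(a_path) as f:
    a_lines = f.read().splitlines()
```

Note that 'a' also keeps whatever an earlier run of the script left in the file, so opening the file once with 'w' before the loop is usually the simpler fix.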

Tags: python xml text


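【Solution 1】:

You call file.write(sent) only once, after the loop has finished, so only the last sentence is written. Open the file before the loop and add the marked line inside it: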

file = open(r'file.txt','w')

for item in seg:
    sent = item.firstChild.data
    print(sent,sep='')
    file.write(sent)  # <---- this line

file.close()
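
A slightly more defensive variant of the same idea, as a sketch: it uses a `with` block so the file is closed even if an error occurs, and it writes a newline after each sentence so they do not run together in the output. The function name `write_segments` and the inline sample XML are assumptions for illustration; the question's code reads from a file path with `xml.dom.minidom.parse` instead.

```python
import os
import tempfile
import xml.dom.minidom

# Hypothetical sample standing in for the question's XML file.
SAMPLE_XML = """<?xml version="1.0"?>
<mteval>
  <tstset setid="default">
    <doc docid="example">
      <seg id="1">first sentence .</seg>
      <seg id="3">second sentence .</seg>
      <seg id="4">third sentence .</seg>
    </doc>
  </tstset>
</mteval>"""


def write_segments(dom, out_path):
    """Write the text of every <seg> element to out_path, one per line."""
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for item in dom.getElementsByTagName("seg"):
            # Skip empty <seg/> elements, which have no firstChild.
            if item.firstChild is None:
                continue
            out.write(item.firstChild.data + "\n")
            written += 1
    return written


dom = xml.dom.minidom.parseString(SAMPLE_XML)
out_path = os.path.join(tempfile.mkdtemp(), "file.txt")
count = write_segments(dom, out_path)
```

To use it on a real file, replace the `parseString` call with `xml.dom.minidom.parse(path)` and pass the desired output path.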

【Comments】:

  • I followed your instructions and the problem is solved, thank you!