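【Question title】: XML: remove child node of a node
【Posted】: 2011-02-09 14:53:51
【Question description】:

I want to find all nodes in an XML file that have a certain tag name, say "foo". If a foo element has direct child nodes named "bar", I want to remove those child nodes. The result should be written to a file.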

<myDoc>
  <foo>
    <bar/> <!-- remove this one -->
  </foo>
  <foo>
    <anyThing>
      <bar/> <!-- don't remove this one -->
    </anyThing>
  </foo>
</myDoc>

Thanks for any hints. As the tag indicates, I would like to do this in Python.

【Comments】:

    Tags: python


    【Solution 1】:

    You can use ElementTree:

    from xml.etree.ElementTree import ElementTree
    tree = ElementTree()
    tree.parse('in.xml')
    
    foos = tree.findall('foo')
    for foo in foos:
      bars = foo.findall('bar')
      for bar in bars:
        foo.remove(bar)
    
    tree.write('out.xml')
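
The snippet above only looks at foo elements directly under the root, which is enough for the sample in the question. If foo can appear at any depth, something like the following sketch may help. It uses `iter()` to find every foo and a bare-tag `findall()` to pick up direct bar children only. The helper name `remove_direct_children` and the inline `SAMPLE` string are made up for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Inline copy of the question's sample so the sketch runs without a file.
SAMPLE = """<myDoc>
  <foo>
    <bar/>
  </foo>
  <foo>
    <anyThing>
      <bar/>
    </anyThing>
  </foo>
</myDoc>"""


def remove_direct_children(root, parent_tag, child_tag):
    """Remove child_tag elements that are direct children of a parent_tag element.

    Hypothetical helper; returns the number of removed elements.
    """
    removed = 0
    # Materialize the iterator first so the tree is not mutated while walking it.
    for parent in list(root.iter(parent_tag)):
        # findall() with a bare tag name matches direct children only.
        for child in parent.findall(child_tag):
            parent.remove(child)
            removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed_count = remove_direct_children(root, "foo", "bar")
result = ET.tostring(root, encoding="unicode")
```

To work on files instead, parse with `ET.parse('in.xml')`, pass `tree.getroot()` to the helper, and finish with `tree.write('out.xml')` as in the answer.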
    

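    【Comments】:

      【Solution 2】:

      Many thanks to miles82 for giving me the clue that solved my problem. My case is removing multiple nodes and/or elements from my XML file. Sample data from my huge original file looks like this: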

      <Data>
         <horsedata>
                  <horse_name>DO NOT DELETE</horse_name>
                  <stats_data>
                          <stat type="ALL_WEATHR">
                                      GOT TO GO
                          </stat>
                  </stats_data>
      
                  <sire><sirename>GOT TO GO</sirename><tmmark>M</tmmark><stud_fee>5000</stud_fee>
      
                  </sire>
                  <dam><damname>GOT TO GO</damname><damsire>WISED UP</damsire>
      
                  </dam>
      
                  <jockey><stat_breed>GOT TO GO</stat_breed><jock_disp>Lopez Charles C</jock_disp>
                          <stats_data>
                                  <stat type="LAST30">
                                          GOT TO GO
                                  </stat>                                                                                                             
                          </stats_data>
                  </jockey>
      
                  <workoutdata>Some More Text ... </workoutdata>
                  <workoutdata>Yet More Text</workoutdata>
      
                  <ppdata><racedate>20150801</racedate><trackcode>DO NOT DELETE</trackcode><trackname>Monmouth Park</trackname></ppdata>
                  <ppdata><racedate>20150715</racedate><trackcode>DO NOT DELET</trackcode><trackname>Belmont Park</trackname><racenumber>2</racenumber><racebreed>TB</racebreed><country>USA</country><racetype>MCL</racetype><raceclass>MC</raceclass><claimprice>20000</claimprice><purse>29000</purse><classratin>76</classratin><trackcondi>FT</trackcondi><distance>600</distance><disttype>F</disttype><aboutdist/><courseid>D</courseid><surface>D</surface><pulledofft>0</pulledofft><winddirect/><windspeed>0</windspeed><trackvaria>12</trackvaria><sealedtrac/><racegrade>0</racegrade><agerestric>3U</agerestric><sexrestric/><statebredr/><abbrevcond/><postpositi>4</postpositi><favorite>0</favorite><weightcarr>119</weightcarr><jockfirst>Eduardo</jockfirst><jockmiddle/><jocklast>Ulloa</jocklast><jocksuffix/><jockdisp>Ulloa Eduardo</jockdisp><equipment>BF</equipment><medication/><fieldsize>6</fieldsize><posttimeod>54.25</posttimeod><shortcomme>3w upper, gave way</shortcomme><longcommen>slight bobble st, vied 4p early, cleared 2p turn, 3w 1/4,gave way</longcommen><gatebreak>3</gatebreak><position1>1</position1><lenback1>-50.00</lenback1><horsetime1>22.49</horsetime1><leadertime>22.49</leadertime><pacefigure>108</pacefigure><position2>1</position2><lenback2>-150.00</lenback2><horsetime2>46.56</horsetime2><leadertim2>46.56</leadertim2><pacefigur2>77</pacefigur2><positionst>5</positionst><lenbackstr>810.00</lenbackstr><horsetimes>60.75</horsetimes><leadertim3>59.40</leadertim3><dqindicato/><positionfi>6</positionfi><lenbackfin>1700.00</lenbackfin><horsetimef>75.50</horsetimef><leadertim4>72.67</leadertim4><speedfigur>33</speedfigur><turffigure>0.0</turffigure><winnersspe>71</winnersspe><foreignspe>-97</foreignspe><horseclaim>0</horseclaim><biasstyle>F</biasstyle><biaspath>N</biaspath><complineho>Lightning Ron</complineho><complinele>275.00</complinele><complinewe>124</complinewe><complinedq/><complineh2>Thomas Knight</complineh2><complinel2>25.00</complinel2><complinew2>119</complinew2><complined2/><complineh3>Heavy Hitter</complineh3><complinel3>450.00</complinel3><complinew3>124</complinew3><complined3/><linebefore/><lineafter/><domesticpp>1</domesticpp><oflfinish>6</oflfinish><runup_dist>64</runup_dist><rail_dist>-1</rail_dist><apprweight>0</apprweight><vd_claim/><vd_reason/></ppdata>
          </horsedata>
      </Data>
      

      I am only interested in keeping three elements, namely horse_name, workoutdata and ppdata ...

      import xml.etree.ElementTree as ET

      tree = ET.parse('bel.xml')
      root = tree.getroot()

      # Find every horsedata element at any depth below the root.
      horses = tree.findall('.//horsedata')

      for horse in horses:
          # findall() with a bare tag name returns lists of direct children,
          # so removing from horse afterwards is safe.
          stats = horse.findall('stats_data')
          sires = horse.findall('sire')
          dams = horse.findall('dam')
          jockeys = horse.findall('jockey')
          trainers = horse.findall('trainer')

          for stat in stats:
              horse.remove(stat)
          for sire in sires:
              horse.remove(sire)
          for dam in dams:
              horse.remove(dam)
          for jockey in jockeys:
              horse.remove(jockey)
          for trainer in trainers:
              horse.remove(trainer)

      tree.write('junk.xml')
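
Since the goal is to keep only three elements, one possible alternative is to invert the logic: list the tags to keep and drop every other direct child of horsedata. The sketch below is a rough illustration under that assumption. The `keep_only` helper, the `KEEP` set and the heavily trimmed `SAMPLE` are hypothetical names and data, not the original file.

```python
import xml.etree.ElementTree as ET

# Heavily trimmed stand-in for the real file, for illustration only.
SAMPLE = """<Data>
  <horsedata>
    <horse_name>DO NOT DELETE</horse_name>
    <sire><sirename>GOT TO GO</sirename></sire>
    <jockey><stat_breed>GOT TO GO</stat_breed></jockey>
    <workoutdata>Some More Text ... </workoutdata>
    <ppdata><racedate>20150801</racedate></ppdata>
  </horsedata>
</Data>"""

# Tags that should survive under each horsedata element.
KEEP = {"horse_name", "workoutdata", "ppdata"}


def keep_only(parent, keep_tags):
    """Remove every direct child of parent whose tag is not in keep_tags."""
    # Iterate over a copy: removing from an element while iterating it skips items.
    for child in list(parent):
        if child.tag not in keep_tags:
            parent.remove(child)


root = ET.fromstring(SAMPLE)
# findall() returns a list, so the tree can be modified inside the loop.
for horse in root.findall(".//horsedata"):
    keep_only(horse, KEEP)

remaining = [child.tag for child in root.find("horsedata")]
```

A keep-list is shorter, but it also silently removes any element you forgot to list, so the explicit removals above may be the safer choice when the schema is not fully known.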
      

      This is the final output:

      <Data>
          <horsedata>
                  <horse_name>DO NOT DELETE</horse_name>
                  <workoutdata>Some More Text ... </workoutdata>
                  <workoutdata>Yet More Text</workoutdata>
      
                  <ppdata><racedate>20150801</racedate><trackcode>DO NOT DELETE</trackcode><trackname>Monmouth Park</trackname></ppdata>
                  <ppdata><racedate>20150715</racedate><trackcode>DO NOT DELET</trackcode><trackname>Belmont Park</trackname></ppdata>
          </horsedata>
      </Data>
      

      【Comments】:
