How to return specific text from node in XML file?

[Question title]: How to return specific text from node in XML file?
[Posted]: 2019-06-13 11:23:45
[Question]:

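I'm trying to return values from an XML document by parsing its text. I have a working version that looks for a specific value and then returns the text of a specific element below it.

However, I can't get it to work when I want to return the text based on an attribute instead of an element.

Here is an example of the XML document: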

<?xml version = '1.0' encoding = 'UTF-8'?>
<ADI3 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <Asset xsi:type="offer:OfferType" uriId="url.com/assetID">
      <offer:BillingId>DUMMY</offer:BillingId>
   </Asset>
   <Asset xsi:type="title:TitleType">
      <core:Description deprecated="true" xmlns:core="urn:cablelabs:md:xsd:core:3.0">Title Package</core:Description>
      <core:Ext xsi:type="ExtType" xmlns:core="urn:cablelabs:md:xsd:core:3.0">
         <TestipediaInfo>
            <test:SeriesInfo xml:lang="en" seasonNumber="2" episodeNumber="9">
               <test:SeriesBrief>A very nice title</test:SeriesBrief>
               <test:EpisodeInfo>
                  <test:SummaryShort>Peter and the crew travel to Greenland.</test:SummaryShort>
               </test:EpisodeInfo>
            </test:SeriesInfo>
         </TestipediaInfo>
      </core:Ext>
   </Asset>
   <Asset xsi:type="offer:OfferType" uriId="url.com/assetID">
      <core:Description deprecated="true" xmlns:core="urn:cablelabs:md:xsd:core:3.0">Series Poster</core:Description>
      <content:SourceUrl>A-typical-file-name_1000x1500.jpg</content:SourceUrl>
   </Asset>
</ADI3>

I used this Groovy code:

File file = new File("stackoverflowtest.xml")

def str = file.text

def xmlSlurper = new XmlSlurper(false,false)
def root = xmlSlurper.parseText(str)
def path = 'Asset."core:Ext".TestipediaInfo."test:SeriesInfo".find{it.@"xml:lang" == "en"}."test:EpisodeInfo"."test:SummaryShort"'
def xpathRes = Eval.x(root, "x.$path")

print(xpathRes)

This prints the value of test:SummaryShort.

However, I would like to be able to run a similar XPath-style expression (e.g. 'Asset."core:Ext".TestipediaInfo."test:SeriesInfo".find{it.@"xml:lang" == "en"}."test:EpisodeInfo"."test:SummaryShort"') that returns the text of <content:SourceUrl> when <core:Description deprecated="true" xmlns:core="urn:cablelabs:md:xsd:core:3.0">Series Poster</core:Description> contains "Series Poster".

[Comments]:

Have you tried 'Asset."core:Description".find{it.@"." == "Series Poster"}."content:SourceUrl"'?

Tags: xml xpath groovy xml-parsing

[Solution 1]:

Not sure why you're using Eval...

You should be able to just run:

root.Asset.findAll { it.'core:Description'.@deprecated == 'true' }.'content:SourceUrl'*.text()

This returns a list with the text of every SourceUrl in any Asset node whose Description has the attribute deprecated set to true.
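
If the goal is to match on the text of core:Description ("Series Poster"), as the question describes, rather than on its deprecated attribute, a variation along these lines might work. This is only a sketch: the XML is a trimmed copy of the sample from the question, and the variable names posterAsset, sourceUrl and urls are illustrative, not part of the original code.

```groovy
// Groovy 3+ import; on Groovy 2.x drop this line, since groovy.util.XmlSlurper is imported by default.
import groovy.xml.XmlSlurper

// Trimmed copy of the sample document from the question.
def xml = '''<ADI3 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <Asset xsi:type="title:TitleType">
      <core:Description deprecated="true" xmlns:core="urn:cablelabs:md:xsd:core:3.0">Title Package</core:Description>
   </Asset>
   <Asset xsi:type="offer:OfferType" uriId="url.com/assetID">
      <core:Description deprecated="true" xmlns:core="urn:cablelabs:md:xsd:core:3.0">Series Poster</core:Description>
      <content:SourceUrl>A-typical-file-name_1000x1500.jpg</content:SourceUrl>
   </Asset>
</ADI3>'''

// Non-validating and not namespace-aware, as in the question,
// so prefixed names such as 'core:Description' are treated as plain strings.
def root = new XmlSlurper(false, false).parseText(xml)

// Pick the first Asset whose core:Description text is exactly "Series Poster",
// then read the text of its content:SourceUrl child.
def posterAsset = root.Asset.find { it.'core:Description'.text() == 'Series Poster' }
def sourceUrl = posterAsset.'content:SourceUrl'.text()
println sourceUrl

// For comparison, the attribute-based filter from above returns a list of texts.
def urls = root.Asset.findAll { it.'core:Description'.@deprecated == 'true' }.'content:SourceUrl'*.text()
println urls
```

Filtering on the Asset and then stepping down to content:SourceUrl is needed because SourceUrl is a sibling of Description, not a child of it. If several Assets could match, findAll with *.text() would return all of the matching URLs instead of only the first.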

[Discussion]: