【Question title】: Filtering using XSLT for specific node values
【Posted】: 2014-01-30 23:34:03
【Question】:

I need to filter huge, redundant XML files. The easy part is eliminating all nodes that have neither attributes nor content:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:if test=". != '' or ./@* != ''">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

But I also need to filter out nodes containing

<type>0</type>

nodes containing only

<whatever id="-1"/>

and nodes containing only empty attributes, for example:

  <dateacquired year="" month="" day="" long="" unformatted=""/>

An excerpt of my (machine-generated) input file is:

<record table="book" id="1">
<bookdata>
  <bookid unformatted="1">1</bookid>
  <marked bool="False">No</marked>
  <lastmodified year="2013" month="09" day="25" long="Wednesday, September 25, 2013" unformatted="20130925">09/25/2013</lastmodified>
  <title>Intervista Col Vampiro</title>
  <fulltitle>Ciclo Dei Vampiri: Intervista Col Vampiro</fulltitle>
  <fulltitle2>Intervista Col Vampiro (Ciclo Dei Vampiri)</fulltitle2>
  <referenceno>BB00001</referenceno>
  <publishdate year="1993" month="" day="" long="1993" unformatted="1993">1993</publishdate>
  <copyrightdate year="" month="" day="" long="" unformatted=""/>
  <type id="-1"/>
  <authors sort="Rice, Anne">
    <author id="1">
      <name>Anne Rice</name>
      <sortby>Rice, Anne</sortby>
      <roles/>
    </author>
  </authors>
  <credits/>
  <image1>
    <filename>Book_1_3.jpg</filename>
    <type>2</type>
    <notes/>
  </image1>
  <image2>
    <filename/>
    <type>0</type>
    <notes/>
  </image2>
  <image3>
    <filename/>
    <type>0</type>
    <notes/>
  </image3>
  <image4>
    <filename/>
    <type>0</type>
    <notes/>
  </image4>
  <image5>
    <filename/>
    <type>0</type>
    <notes/>
  </image5>
  <image6>
    <filename/>
    <type>0</type>
    <notes/>
  </image6>
  <image7>
    <filename/>
    <type>0</type>
    <notes/>
  </image7>
  <image8>
    <filename/>
    <type>0</type>
    <notes/>
  </image8>
  <image9>
    <filename/>
    <type>0</type>
    <notes/>
  </image9>
  <subtitle/>
  <titlesort>Intervista Col Vampiro</titlesort>
  <publisher id="1">Salani</publisher>
  <publicationplace id="-1"/>
  <isbn/>
  <lccn/>
  <lccallnum/>
  <dewey>823.9</dewey>
  <country id="-1"/>
  <pages unformatted="283">283</pages>
  <numberofsections unformatted="0">0</numberofsections>
  <printedby id="-1"/>
  <binding id="-1"/>
  <edition id="1">Ebook</edition>
  <printing id="-1"/>
  <language id="-1"/>
  <series id="1">Ciclo Dei Vampiri</series>
  <releaseno unformatted="0">0</releaseno>
  <originaltitle>Interview With The Vampire</originaltitle>
  <originalsubtitle/>
  <originalpublisher id="-1"/>
  <originalcountry id="-1"/>
  <originallanguage id="-1"/>
  <originalcopyright year="1976" month="" day="" long="1976" unformatted="1976">1976</originalcopyright>
  <price integer="8" fraction="0" unformatted="8.0">8.00</price>
  <value integer="0" fraction="0" unformatted="0.0">0.00</value>
  <sellingprice integer="0" fraction="0" unformatted="0.0">0.00</sellingprice>
  <changeinvalue>0.00</changeinvalue>
  <changeinvaluepr>0.00</changeinvaluepr>
  <condition id="-1"/>
  <appraiser id="-1"/>
  <insurance id="-1"/>
  <registered year="2005" month="09" day="10" long="Saturday, September 10, 2005" unformatted="20050910">09/10/2005</registered>
  <status id="-1"/>
  <dateacquired year="" month="" day="" long="" unformatted=""/>
  <acquiredfrom id="-1"/>
  <personalrating id="-1"/>
  <category id="1">Horror-Gotico</category>
  <subcategory id="-1"/>
  <owner id="-1"/>
  <location id="-1"/>
  <keywords>
    <keyword id="1">Vampiro</keyword>
    <keyword id="2">Vampiri</keyword>
  </keywords>
  <newbook bool="False">No</newbook>
  <onloan bool="False">No</onloan>
  <overdue bool="False">No</overdue>
  <borrower id="-1"/>
  <borrowercategory id="-1"/>
  <dateborrowed year="" month="" day="" long="" unformatted=""/>
  <datedue year="" month="" day="" long="" unformatted=""/>
  <reserved bool="False">No</reserved>
  <reservedto id="-1"/>
  <reserveddate year="" month="" day="" long="" unformatted=""/>
  <awards/>
  <awardyear/>
  <awarddetails/>
  <nominations/>
  <nominationyear/>
  <nominationdetails/>
  <custom01/>
  <custom02/>
  <custom03>http://www.ddunlimited.net/viewtopic.php?f=1079&amp;t=3749847</custom03>
  <custom04/>
  <custom05 id="-1"/>
  <custom06 id="-1"/>
  <custom07 id="-1"/>
  <custom08 id="-1"/>
  <custom09 year="" month="" day="" long="" unformatted=""/>
  <custom10 integer="0" fraction="0" unformatted="0.0">0.00</custom10>
  <custom11 bool="True">Yes</custom11>
  <custom12 bool="False">No</custom12>
  <custom13 bool="False">No</custom13>
  <custom14 bool="True">Yes</custom14>
  <custom15 bool="False">No</custom15>
  <custom16 bool="False">No</custom16>
  <custom17 bool="False">No</custom17>
  <custom18 bool="False">No</custom18>
  <notes>ed2k://|file|eBook.ITA.001.Anne.Rice.Intervista.Col.Vampiro.(doc.lit.pdf.rtf).[Hyps].rar|1998285|81D4C283C03E5787170A33C335577533|/</notes>
  <synopsis>A San Francisco alle soglie del 2000 il giornalista Mallory viene avvicinato da Louis De Point Du Lac, vampiro dal 1791, quando era un proprietario terriero presso New Orleans. Ridotto alla disperazione per la perdita della moglie e della figlioletta vieneiniziato alla sua tenebrosa e ferina esistenza da Lestat, collega di origini parigine, che cerca invano di far superare al discepolo l&apos;innata repulsione per l&apos;omicidio. Invano Louis si ciba di sangue di ratti e galline, e fà fuggire i servi incendiando la casa. Ormai Lestat lo domina e lo coinvolge in efferate uccisioni di innocenti. Una bimba orfana, Claudia, viene &quot;adottata&quot; dai due e si rivela feroce quant&apos;altri mai.</synopsis>
  <reviews/>
  <weblinks/>
  <weblinktype id="1"/>
  <filelinks/>
  <filelinktype id="1"/>
  <barcode/>
  <originalseries id="-1"/>
  <originalreleaseno unformatted="0">0</originalreleaseno>
  <readhistory/>
  <lastread year="" month="" day="" long="" unformatted=""/>
  <readcount unformatted="0">0</readcount>
  <dustjacketcondition id="-1"/>
  <dimensions_width integer="0" fraction="0" unformatted="0.0">0.00</dimensions_width>
  <dimensions_height integer="0" fraction="0" unformatted="0.0">0.00</dimensions_height>
  <dimensions_depth integer="0" fraction="0" unformatted="0.0">0.00</dimensions_depth>
  <coverprice integer="0" fraction="0" unformatted="0.0">0.00</coverprice>
  <coverprice_currency id="-1"/>
  <booklinks/>
</bookdata>
<contentsdata items="0"/>
</record>

The desired output is:

<record table="book" id="1">
<bookdata>
  <bookid unformatted="1">1</bookid>
  <marked bool="False">No</marked>
  <lastmodified year="2013" month="09" day="25" long="Wednesday, September 25, 2013" unformatted="20130925">09/25/2013</lastmodified>
  <title>Intervista Col Vampiro</title>
  <fulltitle>Ciclo Dei Vampiri: Intervista Col Vampiro</fulltitle>
  <fulltitle2>Intervista Col Vampiro (Ciclo Dei Vampiri)</fulltitle2>
  <referenceno>BB00001</referenceno>
  <publishdate year="1993" month="" day="" long="1993" unformatted="1993">1993</publishdate>
  <authors sort="Rice, Anne">
    <author id="1">
      <name>Anne Rice</name>
      <sortby>Rice, Anne</sortby>
    </author>
  </authors>
  <image1>
    <filename>Book_1_3.jpg</filename>
    <type>2</type>
  </image1>
  <titlesort>Intervista Col Vampiro</titlesort>
  <publisher id="1">Salani</publisher>
  <dewey>823.9</dewey>
  <pages unformatted="283">283</pages>
  <numberofsections unformatted="0">0</numberofsections>
  <edition id="1">Ebook</edition>
  <series id="1">Ciclo Dei Vampiri</series>
  <releaseno unformatted="0">0</releaseno>
  <originaltitle>Interview With The Vampire</originaltitle>
  <originalcopyright year="1976" month="" day="" long="1976" unformatted="1976">1976</originalcopyright>
  <price integer="8" fraction="0" unformatted="8.0">8.00</price>
  <value integer="0" fraction="0" unformatted="0.0">0.00</value>
  <sellingprice integer="0" fraction="0" unformatted="0.0">0.00</sellingprice>
  <changeinvalue>0.00</changeinvalue>
  <changeinvaluepr>0.00</changeinvaluepr>
  <registered year="2005" month="09" day="10" long="Saturday, September 10, 2005" unformatted="20050910">09/10/2005</registered>
  <category id="1">Horror-Gotico</category>
  <keywords>
    <keyword id="1">Vampiro</keyword>
    <keyword id="2">Vampiri</keyword>
  </keywords>
  <newbook bool="False">No</newbook>
  <onloan bool="False">No</onloan>
  <overdue bool="False">No</overdue>
  <reserved bool="False">No</reserved>
  <custom03>http://www.ddunlimited.net/viewtopic.php?f=1079&amp;t=3749847</custom03>
  <custom10 integer="0" fraction="0" unformatted="0.0">0.00</custom10>
  <custom11 bool="True">Yes</custom11>
  <custom12 bool="False">No</custom12>
  <custom13 bool="False">No</custom13>
  <custom14 bool="True">Yes</custom14>
  <custom15 bool="False">No</custom15>
  <custom16 bool="False">No</custom16>
  <custom17 bool="False">No</custom17>
  <custom18 bool="False">No</custom18>
  <notes>ed2k://|file|eBook.ITA.001.Anne.Rice.Intervista.Col.Vampiro.(doc.lit.pdf.rtf).[Hyps].rar|1998285|81D4C283C03E5787170A33C335577533|/</notes>
  <synopsis>A San Francisco alle soglie del 2000 il giornalista Mallory viene avvicinato da Louis De Point Du Lac, vampiro dal 1791, quando era un proprietario terriero presso New Orleans. Ridotto alla disperazione per la perdita della moglie e della figlioletta vieneiniziato alla sua tenebrosa e ferina esistenza da Lestat, collega di origini parigine, che cerca invano di far superare al discepolo l&apos;innata repulsione per l&apos;omicidio. Invano Louis si ciba di sangue di ratti e galline, e fà fuggire i servi incendiando la casa. Ormai Lestat lo domina e lo coinvolge in efferate uccisioni di innocenti. Una bimba orfana, Claudia, viene &quot;adottata&quot; dai due e si rivela feroce quant&apos;altri mai.</synopsis>
  <weblinktype id="1"/>
  <filelinktype id="1"/>
  <originalreleaseno unformatted="0">0</originalreleaseno>
  <readcount unformatted="0">0</readcount>
  <dimensions_width integer="0" fraction="0" unformatted="0.0">0.00</dimensions_width>
  <dimensions_height integer="0" fraction="0" unformatted="0.0">0.00</dimensions_height>
  <dimensions_depth integer="0" fraction="0" unformatted="0.0">0.00</dimensions_depth>
  <coverprice integer="0" fraction="0" unformatted="0.0">0.00</coverprice>
</bookdata>
<contentsdata items="0"/>
</record>

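The problem is that I have not really understood transformations, and when I tried to read up on them I could not find an easy-to-understand tutorial. Any pointers are welcome!

As an added bonus, I would also like to filter out specific "empty" items such as the dimensions_* elements above.

TIA

【Comments】:

  • Why are some empty nodes (e.g. <credits/>) kept in the output while others are not?
  • @LegoStormtroopr Because I edited the entry by hand and forgot to delete it ;) It should go along with the others. I have edited the above and hope I have not forgotten anything. I need to remove: empty tags (no text and no attributes); empty tags (no text, some empty attributes); default tags (tags with an id="-1" attribute); specific tags (<type>0</type>); and possibly some other specific tags (e.g. <dimensions_width integer="0" fraction="0" unformatted="0.0">0.00</dimensions_width>).

Tags: xslt filter transform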


【Solution 1】:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

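  <!-- Remove elements whose text is numerically zero. The comparison is numeric, so "0.00" matches as well as "0". -->
  <xsl:template match="*[normalize-space(.) = 0]" />
  <!-- Remove elements with no text whose attributes, if any, are all empty. -->
  <xsl:template match="*[normalize-space(.) = '' and count(@*[. = '']) = count(@*)]" />
  <!-- write more empty templates for nodes that should be removed -->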

</xsl:stylesheet>

Note that count(@*[. = '']) = count(@*) can be written as not(@*[. != '']) if you prefer.
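The "more empty templates" mentioned in the stylesheet comment could look roughly like the sketch below. It is only an illustration built from the removal rules listed in the question's comments, not part of the original answer: it assumes that "default" elements are the ones with no text and id="-1", and it compares <type>0</type> as a string, so that values such as 0.00 in other elements are left alone.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity template: copy everything that no more specific rule matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- No text, and every attribute (if there are any) is empty -->
  <xsl:template match="*[normalize-space(.) = '' and not(@*[. != ''])]"/>

  <!-- Assumed "default" placeholder: no text and id="-1" -->
  <xsl:template match="*[normalize-space(.) = '' and @id = '-1']"/>

  <!-- Only the literal <type>0</type>; a string comparison, so "0.00" does not match -->
  <xsl:template match="type[. = '0']"/>

  <!-- Optional bonus rule: dimensions_* elements whose value is numerically zero -->
  <xsl:template match="*[starts-with(name(), 'dimensions_') and number(.) = 0]"/>
</xsl:stylesheet>
```

An empty template produces no output for the node it matches, which is why each rule needs no body. These rules do not remove a parent that becomes empty only after its children are dropped (for example an image element left with nothing but whitespace); that case would still need a rule of its own.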

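【Comments】:

  • OK, thanks. It works! It leaves a blank line wherever a tag was removed, though; how can I get rid of the blank lines?
  • @Zio Yes, you can. For a start, try <xsl:output indent="yes" /> and <xsl:template match="text()[normalize-space() = '']" />. The former tells the XSL processor to indent the output (doh!), the latter removes empty text nodes (i.e. the whitespace between elements). The output format depends on your processor and how well it indents.
  • But really... don't spend too much time making it pretty. Insignificant whitespace has that name for a reason.
  • It turns out I cannot use <xsl:template match="*[normalize-space(.) = 0]" /> because it cuts out "interesting" nodes. What I really need is to kill all "image*" nodes that do not have a "filename" child node. I tried several things, including: *[starts-with(name(), 'image') and not(filename)]; *[starts-with(name(), 'image') and not(count(filename))]; *[starts-with(name(), 'image') and not(descendant::*[name() = 'filename'])]; //image2[text() = 0]; and more variations, all to no avail. Please help.
  • In principle your first attempt, match="*[starts-with(name(), 'image') and not(filename)]", should do the trick. The others are not wrong either, just not as concise. Try the conditions independently and see which one fails.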
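
Judging from the input excerpt in the question, every image element does have a filename child; in image2 through image9 it is simply empty (<filename/>). That would explain why not(filename) never matches. The sketch below is not from the thread; it assumes the intent is to drop image* elements whose filename is missing or blank, and it reuses the whitespace template suggested in the comments above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <!-- Identity template: copy everything that no more specific rule matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop whitespace-only text nodes so removed elements leave no blank lines -->
  <xsl:template match="text()[normalize-space() = '']"/>

  <!-- Drop image* elements whose filename child is absent or has no text -->
  <xsl:template match="*[starts-with(name(), 'image') and normalize-space(filename) = '']"/>
</xsl:stylesheet>
```

normalize-space(filename) takes the string value of the first filename child and yields an empty string when there is none, so one predicate covers both the missing and the empty case. The starts-with test matches any element whose name begins with "image", so a stricter name test may be needed if other such elements exist in the real files.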