【Question title】: Using XSLT to split a string
【Posted】: 2013-05-20 14:41:59
【Question description】:

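I'm using XSLT to transform an XML file into a delimited format that Excel can split into columns (sample code is shown further down). For example, when opened in Excel, the delimited version might look like this: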

+---------------+---------------+----------+
|URL            |Title          | Version  |
+---------------+---------------+----------+
|dogs_are_cool  |Dogs are cool  | May 2013 |
+---------------+---------------+----------+

The problem is that every URL has a version appended to it. In the example above, dogs_are_cool is actually dogs_are_cool_may2013.html.

I'd like to do two things with the appended version:

  • Remove the version when printing the URL.
  • Reformat the version and print it.

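My guess is that the best approach is to split the URL on underscores somehow, then put the last element into a variable and print the other elements in order, reinserting the underscores.

I don't know how to do that.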

Sample XML:

<contents Url="toc_animals_may2013.html" Title="Animals">
    <contents Url="toc_apes_may2013.html" Title="Apes">
        <contents Url="chimps_may2013.html" Title="Some Stuff About Chimps" />
    </contents>
    <contents Url="toc_cats" Title="Cats">
        <contents Url="hairless_cats_may2013.html" Title="OMG Where Did the Hair Go?"/>
        <contents Url="wild_cats_may2013.html" Title="These Things Frighten Me"/>
    </contents>
    <contents Url="toc_dogs_may2013.html" Title="Dogs">
        <contents Url="toc_snorty_dogs_may2013.html" Title="Snorty Dogs">
            <contents Url="boston_terriers_may2013.html" Title="Boston Terriers" />
            <contents Url="french_bull_dogs_may2013.html" Title="Frenchies" />
        </contents>
    </contents>
</contents>

Sample XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" indent="no"/>

    <!-- This variable sets the delimiter symbol that Excel will use to separate the cells -->
    <xsl:variable name="delimiter">@</xsl:variable>

    <xsl:template match="contents">

        <!-- Prints the URL -->
        <xsl:value-of select="@Url"/>
        <xsl:copy-of select="$delimiter" />

        <!-- Prints the title -->
        <xsl:apply-templates select="@Title"/>
        <xsl:copy-of select="$delimiter" />

        <!-- I'd like to print the version here -->
        <xsl:copy-of select="$delimiter" />
    </xsl:template>

    <xsl:template match="/">
        <xsl:apply-templates select="//contents"/>
    </xsl:template>

</xsl:stylesheet>

【Question comments】:

    Tags: xml xslt


    【Solution 1】:

    With a few more helper templates this turns into something of an XSLT beast, but it seems to do the trick...

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text" indent="no"/>
      <!-- This variable sets the delimiter symbol that Excel will use to separate the cells -->
      <xsl:variable name="delimiter">@</xsl:variable>
    
      <xsl:template match="contents">
        <!-- Prints the URL -->
        <xsl:choose>
          <xsl:when test="contains(@Url, '.')">
            <xsl:call-template name="substring-before-last">
              <xsl:with-param name="list" select="@Url"/>
              <xsl:with-param name="delimiter" select="'_'"/>
            </xsl:call-template>            
          </xsl:when>
          <xsl:otherwise><xsl:value-of select="@Url"/></xsl:otherwise>
        </xsl:choose>
        <xsl:copy-of select="$delimiter"/>
    
        <!-- Prints the title -->
        <xsl:apply-templates select="@Title"/>
        <xsl:copy-of select="$delimiter"/>
    
        <!-- Now do all the tricks to format the version -->
        <xsl:variable name="withExtension">
          <xsl:call-template name="substring-after-last">
            <xsl:with-param name="string" select="@Url"/>
            <xsl:with-param name="delimiter" select="'_'"/>
          </xsl:call-template>
        </xsl:variable>
    
        <xsl:variable name="withoutExtension">
          <xsl:call-template name="substring-before-last">
            <xsl:with-param name="list" select="$withExtension"/>
            <xsl:with-param name="delimiter" select="'.'"/>
          </xsl:call-template>
        </xsl:variable>
    
        <xsl:variable name="withoutSpace">
          <xsl:value-of select="concat(translate(substring($withoutExtension, 1, 1), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'), substring($withoutExtension, 2))"/>
        </xsl:variable>
    
        <xsl:variable name="year">
          <xsl:value-of select="translate($withoutSpace,translate($withoutSpace, '0123456789', ''), '')"/>
        </xsl:variable>
    
        <xsl:value-of select="concat(substring-before($withoutSpace, $year), ' ', $year)"/>
        <xsl:copy-of select="$delimiter"/>
      </xsl:template>
    
      <xsl:template match="/">
        <xsl:apply-templates select="//contents"/>
      </xsl:template>
    
      <xsl:template name="substring-before-last">
        <xsl:param name="list"/>
        <xsl:param name="delimiter"/>
        <xsl:choose>
          <xsl:when test="contains($list, $delimiter)">
            <xsl:value-of select="substring-before($list,$delimiter)"/>
            <xsl:choose>
              <xsl:when test="contains(substring-after($list,$delimiter),$delimiter)">
                <xsl:value-of select="$delimiter"/>
              </xsl:when>
            </xsl:choose>
            <xsl:call-template name="substring-before-last">
              <xsl:with-param name="list" select="substring-after($list,$delimiter)"/>
              <xsl:with-param name="delimiter" select="$delimiter"/>
            </xsl:call-template>
          </xsl:when>
        </xsl:choose>
      </xsl:template>
    
      <xsl:template name="substring-after-last">
        <xsl:param name="string"/>
        <xsl:param name="delimiter"/>
        <xsl:choose>
          <xsl:when test="contains($string, $delimiter)">
            <xsl:call-template name="substring-after-last">
              <xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
              <xsl:with-param name="delimiter" select="$delimiter"/>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="$string"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>
    
    </xsl:stylesheet>
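
    The `year` variable in the stylesheet above relies on a nested `translate()` call that is easy to misread. A minimal standalone sketch of that idiom follows, assuming a hard-coded sample value; the names `version` and `nonDigits` are illustrative and do not appear in the answer.

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>

      <!-- Hypothetical sample value; in the answer it is derived from the last URL token. -->
      <xsl:variable name="version" select="'May2013'"/>

      <xsl:template match="/">
        <!-- Inner step: delete every digit, leaving only the non-digit characters ("May"). -->
        <xsl:variable name="nonDigits" select="translate($version, '0123456789', '')"/>
        <!-- Outer step: delete those characters from the original, leaving the digits ("2013"). -->
        <xsl:variable name="year" select="translate($version, $nonDigits, '')"/>
        <!-- The text before the year is the month; join the two with a space. -->
        <xsl:value-of select="concat(substring-before($version, $year), ' ', $year)"/>
      </xsl:template>
    </xsl:stylesheet>
    ```

    Applied to any XML input, this prints May 2013. The idiom assumes the digits form a single run at the end of the value; if digits were scattered through the string, `substring-before` would no longer find the collected digits as one substring.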
    

    Output:

    toc_animals@Animals@May 2013@toc_apes@Apes@May 2013@chimps@Some Stuff About Chimps@May 2013@toc_cats@Cats@ @hairless_cats@OMG Where Did the Hair Go?@May 2013@wild_cats@These Things Frighten Me@May 2013@toc_dogs@Dogs@May 2013@toc_snorty_dogs@Snorty Dogs@May 2013@boston_terriers@Boston Terriers@May 2013@french_bull_dogs@Frenchies@May 2013@
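
    The output above comes out as one long line because the stylesheet never writes a line break. If one record per line is wanted, a line feed can be emitted at the end of the `contents` template. A trimmed-down sketch that prints only the URL and title to show the idea, not a replacement for the full stylesheet:

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>

      <xsl:variable name="delimiter" select="'@'"/>

      <xsl:template match="/">
        <xsl:apply-templates select="//contents"/>
      </xsl:template>

      <xsl:template match="contents">
        <xsl:value-of select="concat(@Url, $delimiter, @Title)"/>
        <!-- End each record with a line feed so every element lands on its own row. -->
        <xsl:text>&#xA;</xsl:text>
      </xsl:template>
    </xsl:stylesheet>
    ```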
    

    【Comments】:

      【Solution 2】:

      If you can use XSLT 2.0, this becomes much simpler.

      XML input

      <contents Url="toc_animals_may2013.html" Title="Animals">
          <contents Url="toc_apes_may2013.html" Title="Apes">
              <contents Url="chimps_may2013.html" Title="Some Stuff About Chimps" />
          </contents>
          <contents Url="toc_cats" Title="Cats">
              <contents Url="hairless_cats_may2013.html" Title="OMG Where Did the Hair Go?"/>
              <contents Url="wild_cats_may2013.html" Title="These Things Frighten Me"/>
          </contents>
          <contents Url="toc_dogs_may2013.html" Title="Dogs">
              <contents Url="toc_snorty_dogs_may2013.html" Title="Snorty Dogs">
                  <contents Url="boston_terriers_may2013.html" Title="Boston Terriers" />
                  <contents Url="french_bull_dogs_may2013.html" Title="Frenchies" />
              </contents>
          </contents>
      </contents>
      

      XSLT 2.0

      <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="text"/>
          <xsl:strip-space elements="*"/>
      
          <xsl:param name="delim" select="'@'"/>
      
          <xsl:template match="contents">
              <xsl:variable name="urlTokens" select="tokenize(@Url,'_')"/>
              <xsl:value-of select="$urlTokens[not(position() = last())]" separator="_"/>
              <xsl:value-of select="$delim"/>
              <xsl:value-of select="concat(@Title,$delim)"/>
              <xsl:analyze-string select="$urlTokens[last()]" regex="([a-z])([a-z]+)([0-9]+)">
                  <xsl:matching-substring>
                      <xsl:value-of select="concat(upper-case(regex-group(1)),regex-group(2),' ',regex-group(3))"/>               
                  </xsl:matching-substring>
              </xsl:analyze-string>
              <xsl:text>&#xA;</xsl:text>
              <xsl:apply-templates/>
          </xsl:template>
      
      </xsl:stylesheet>
      

      Output

      toc_animals@Animals@May 2013
      toc_apes@Apes@May 2013
      chimps@Some Stuff About Chimps@May 2013
      toc@Cats@
      hairless_cats@OMG Where Did the Hair Go?@May 2013
      wild_cats@These Things Frighten Me@May 2013
      toc_dogs@Dogs@May 2013
      toc_snorty_dogs@Snorty Dogs@May 2013
      boston_terriers@Boston Terriers@May 2013
      french_bull_dogs@Frenchies@May 2013
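
      Note that the unversioned URL `toc_cats` is printed as `toc`, because the last underscore-separated token is always dropped. A hedged variant, assuming the version suffix always has the shape underscore, lowercase letters, digits, then `.html`, strips the suffix only when it is actually present. The `versionRegex` variable is an illustrative name, not part of the answer above.

      ```xml
      <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="text"/>
          <xsl:strip-space elements="*"/>

          <xsl:param name="delim" select="'@'"/>

          <!-- Assumed suffix shape: "_" + lowercase month + digits + ".html" at the end of the URL. -->
          <xsl:variable name="versionRegex" select="'_([a-z])([a-z]+)([0-9]+)\.html$'"/>

          <xsl:template match="contents">
              <!-- replace() returns the URL unchanged when the suffix is absent. -->
              <xsl:value-of select="concat(replace(@Url, $versionRegex, ''), $delim, @Title, $delim)"/>
              <!-- Reformat the version only when the URL actually carries one. -->
              <xsl:analyze-string select="@Url" regex="{$versionRegex}">
                  <xsl:matching-substring>
                      <xsl:value-of select="concat(upper-case(regex-group(1)), regex-group(2), ' ', regex-group(3))"/>
                  </xsl:matching-substring>
              </xsl:analyze-string>
              <xsl:text>&#xA;</xsl:text>
              <xsl:apply-templates/>
          </xsl:template>

      </xsl:stylesheet>
      ```

      With the sample input, the `toc_cats` row would come out as `toc_cats@Cats@`, while the versioned rows are unchanged.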
      

      【Comments】:

      • Thanks, Daniel. The 2.0 solution is certainly much simpler, but Excel doesn't seem to like 2.0.