【Question title】: How to read comment text with SAX Java parser
【Posted】: 2017-01-08 17:44:15
【Question】:

I only want to read the comments attached to the object tags in my XML file, using the SAX parser in Java.

Here is an excerpt of my file:

<!-- Object Seed term: day, WikiTitle: day-->
<object id="15155220" name="solar day, twenty-four hour period, 24-hour interval, mean solar day, twenty-four hours, si day, día, days, si days, day duration, day, civil day">
    <!-- class: "calendar day" -->
    <class id="15157041" name="calendar day, civil day"></class>
    <!-- class: "unit of time" -->
    <class id="15154774" name="time units, unit of time, time unit, units of time"></class>
    <!-- class: "" -->
    <class id="15113229" name="period of time, time period, period"></class>
    <!-- class: "" -->
    <class id="00000000" name="time"></class>
    <genericPhysicalDescription>
        <!-- hasPart: "" -->
        <hasPart id="15228378" name="hour, time of day"></hasPart>
        <!-- hasPart: "" -->
        <hasPart id="15157225" name="day"></hasPart>
        <!-- partOf: "calendar" -->
        <partOf id="15173479" name="calendrics, calendar, dating style, calendarist, calendars, birthday calendar, calendar strip, secular calendar, calandar, agriculture calendar, calendar system, criminal calendar"></partOf>
        <!-- partOf: "" -->
        <partOf id="15206296" name="month"></partOf>
        <!-- partOf: "" -->
        <partOf id="15157225" name="day"></partOf>
    </genericPhysicalDescription>
</object>

【Question discussion】:

    Tags: java xml parsing sax


    【Solution 1】:

    With an ordinary content handler, javax.xml.parsers.SAXParser does not report comments. It ignores them.

    org.xml.sax.ext.LexicalHandler lets you capture comments when parsing with an org.xml.sax.XMLReader. See another Stack Overflow post or the tutorial at Oracle for an example.
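To illustrate the LexicalHandler mechanism on its own, here is a minimal sketch that collects every comment in a document. The class name CommentCollector and the helper collect are hypothetical, and obtaining the XMLReader through SAXParserFactory is an assumption made for this sketch. It extends org.xml.sax.ext.DefaultHandler2, which already provides empty implementations of the LexicalHandler and ContentHandler callbacks, so only comment needs to be overridden.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: gathers the text of every XML comment it sees.
public class CommentCollector extends DefaultHandler2 {

  private final List<String> comments = new ArrayList<>();

  // LexicalHandler callback, invoked once per <!-- ... --> comment.
  @Override
  public void comment(char[] ch, int start, int length) {
    comments.add(new String(ch, start, length).trim());
  }

  public List<String> getComments() {
    return comments;
  }

  // Parses the given XML text and returns its comments in document order.
  public static List<String> collect(String xml) throws Exception {
    CommentCollector handler = new CommentCollector();
    XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    // Comments are only delivered if a lexical handler is registered.
    reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
    reader.setContentHandler(handler);
    reader.parse(new InputSource(new StringReader(xml)));
    return handler.getComments();
  }

  public static void main(String[] args) throws Exception {
    String xml = "<!-- Object Seed term: day, WikiTitle: day-->"
        + "<object id=\"15155220\"><!-- class: \"calendar day\" -->"
        + "<class id=\"15157041\"></class></object>";
    for (String c : collect(xml)) {
      System.out.println(c);
    }
  }
}
```

Note that this collects all comments without regard to the elements around them; tying a comment to a specific element requires tracking element events as well.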

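    If you want to associate a comment with the element that immediately follows it, you can additionally pass an org.xml.sax.ContentHandler to the parser and use it to track the rest of the XML content. I modified the code mentioned above to print only object elements that are immediately preceded by a comment: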

    import org.xml.sax.*;
    import org.xml.sax.ext.*;
    import org.xml.sax.helpers.*;
    
    import java.io.IOException;
    
    public class Test implements LexicalHandler, ContentHandler {
    
      private String  lastComment;
    
      public void startDTD(String name, String publicId, String systemId) throws SAXException {
      }
      public void endDTD() throws SAXException {
      }
      public void startEntity(String name) throws SAXException {
      }
      public void endEntity(String name) throws SAXException {
      }
      public void startCDATA() throws SAXException {
      }
      public void endCDATA() throws SAXException {
      }
      public void comment(char[] text, int start, int length) throws SAXException {
        this.lastComment = new String(text, start, length).trim();
      }
    
      public void characters(char[] ch, int start, int length) {
      }
      public void endDocument() {
      }
      public void endElement(String uri, String localName, String qName) {
      }
      public void endPrefixMapping(String prefix) {
      }
      public void ignorableWhitespace(char[] ch, int start, int length) {
      }
      public void processingInstruction(String target, String data) {
      }
      public void setDocumentLocator(Locator locator) {
      }
      public void skippedEntity(String name) {
      }
      public void startDocument() {
      }
      public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Compare string contents with equals(); == only compares references.
        if ("object".equals(localName)) {
          if (this.lastComment != null) {
            System.out.println("Element object with comment found: \"" + this.lastComment + "\"");
            this.lastComment = null;
          }
        } else {
          this.lastComment = null;
        }
      }
      public void startPrefixMapping(String prefix, String uri) {
      }
    
      public static void main(String[] args) {
        Test test = new Test();
        XMLReader parser;
    
        try {
          parser = XMLReaderFactory.createXMLReader();
        } catch (SAXException ex1) {
          try {
            parser = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
          } catch (SAXException ex2) {
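            // Report why no parser could be created instead of exiting silently.
            System.out.println(ex2.getMessage());
            return;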
          }
        }
    
        try {
          parser.setProperty("http://xml.org/sax/properties/lexical-handler", test);
        } catch (SAXNotRecognizedException e) {
          System.out.println(e.getMessage());
          return;
        } catch (SAXNotSupportedException e) {
          System.out.println(e.getMessage());
          return;
        }
    
        parser.setContentHandler(test);
    
        try {
          parser.parse("test.xml");
        } catch (SAXParseException e) {
          System.out.println(e.getMessage());
        } catch (SAXException e) { 
          System.out.println(e.getMessage());
        } catch (IOException e) {
          System.out.println(e.getMessage());
        }
      }
    }
    

    Save this code as "Test.java" and your XML content as "test.xml". After compiling and running it, you should get the following output:

    $ javac Test.java 
    $ java Test 
    Element object with comment found: "Object Seed term: day, WikiTitle: day"
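The same idea extends to elements other than object. The following is a sketch only: the class name CommentPairs, the helper collect, and the "name#id: comment" output format are hypothetical choices for illustration. It records each comment together with the element that follows it, and discards a pending comment when a closing tag is reached so that it is not attributed to a later sibling. The reader is taken from SAXParserFactory, which is not namespace-aware by default, so the sketch matches on qName.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: pairs each comment with the element that follows it.
public class CommentPairs extends DefaultHandler2 {

  private final List<String> pairs = new ArrayList<>();
  private String lastComment;

  // Remember the most recent comment until an element event consumes it.
  @Override
  public void comment(char[] ch, int start, int length) {
    lastComment = new String(ch, start, length).trim();
  }

  // An opening tag right after a comment: record the pair and reset.
  @Override
  public void startElement(String uri, String localName, String qName, Attributes atts) {
    if (lastComment != null) {
      pairs.add(qName + "#" + atts.getValue("id") + ": " + lastComment);
      lastComment = null;
    }
  }

  // A closing tag means the pending comment precedes no element at this level.
  @Override
  public void endElement(String uri, String localName, String qName) {
    lastComment = null;
  }

  // Parses the given XML text and returns the recorded pairs in document order.
  public static List<String> collect(String xml) throws Exception {
    CommentPairs handler = new CommentPairs();
    XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
    reader.setContentHandler(handler);
    reader.parse(new InputSource(new StringReader(xml)));
    return handler.pairs;
  }

  public static void main(String[] args) throws Exception {
    String xml = "<!-- Object Seed term: day, WikiTitle: day-->"
        + "<object id=\"15155220\"><!-- class: \"calendar day\" -->"
        + "<class id=\"15157041\"></class><class id=\"15154774\"></class></object>";
    for (String p : collect(xml)) {
      System.out.println(p);
    }
  }
}
```

Elements without a preceding comment, such as the second class element in main, produce no entry.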
    

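    【Discussion】:

    • This code reads all of the comments once parsing starts.
    • Yes, LexicalHandler does not track elements; you also need to set a ContentHandler on the parser to track the other XML content so that comments and elements can be associated with each other. I updated my answer to print only the comment of the object element.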