【Question title】: non-validating DocumentBuilder trying to read DTD file
【Posted】: 2014-09-04 19:41:40
【Question】:

Why does the non-validating DocumentBuilder in the SSCCE below try to read the DTD file?

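import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class FooMain {

    // The DOCTYPE points at an external DTD that does not exist on disk.
    private static final String XML_INSTANCE = "<?xml version=\"1.0\"?>                        "+
                                               "<!DOCTYPE note SYSTEM \"does-not-exist.dtd\">  "+
                                               "<a/>                                           ";

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false);
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();

        InputStream is = new ByteArrayInputStream(XML_INSTANCE.getBytes("UTF-8"));
        Document doc = builder.parse(is); // throws FileNotFoundException
    }
}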

The code fails with:

[java] Exception in thread "main" java.io.FileNotFoundException: /lhome/minimal-for-SO/does-not-exist.dtd (No such file or directory)
 [java]     at java.io.FileInputStream.open(Native Method)
 [java]     at java.io.FileInputStream.<init>(FileInputStream.java:146)
 [java]     at java.io.FileInputStream.<init>(FileInputStream.java:101)
 [java]     at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
 [java]     at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
 [java]     at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
 [java]     at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
 [java]     at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
 [java]     at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
 [java]     at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
 [java]     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
 [java]     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
 [java]     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
 [java]     at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
 [java]     at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
 [java]     at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
 [java]     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
 [java]     at FooMain.main(FooMain.java:35)

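Given that the builder is non-validating, I would expect it at least not to crash when the file cannot be found (if not to skip looking for the DTD file altogether). So what prevents the document from being parsed, given that a non-validating builder should not need access to the DTD?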

【Comments】:

    Tags: java xml java-7 dtd


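    【Solution 1】:

    setValidating(false) only turns off validation; the parser still loads the external DTD, because a DTD can also declare entities and default attribute values. To ignore the DTD declaration and its references, you have to set a few more features: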

    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
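For context, here is a minimal sketch of the question's SSCCE with those two features applied. It assumes the parser behind DocumentBuilderFactory is Apache Xerces or the JDK's built-in derivative of it, since both feature URIs are Xerces-specific; the class name NoDtdLoadExample and the helper parseWithoutDtd are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class NoDtdLoadExample {

    // Same shape as the question's input: the DOCTYPE names a DTD that is not on disk.
    static final String XML = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE note SYSTEM \"does-not-exist.dtd\">"
            + "<a/>";

    // Hypothetical helper: parses without ever fetching the external DTD.
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false);
        factory.setValidating(false);
        // Xerces-specific features; other parsers may reject them with
        // a ParserConfigurationException.
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        try (InputStream is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            return builder.parse(is);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithoutDtd(XML);
        // Prints the root element name instead of failing with FileNotFoundException.
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The features are set on the factory before newDocumentBuilder() is called, so every builder created from that factory inherits them.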
    

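    If you are building a web application, I recommend disabling the resolution of DTD entities globally, because it can be a security vulnerability (XML external entity injection, XXE).

    For example: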

    <?xml version="1.0" encoding="ISO-8859-1"?>
     <!DOCTYPE foo [  
      <!ELEMENT foo ANY >
       <!ENTITY xxe SYSTEM "file:///dev/random" >]><foo>&xxe;</foo>
    

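    When the parser tries to expand &xxe; with the contents of /dev/random, a read that never reaches end of file, the server can hang or crash.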
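One way to act on that advice, sketched under the assumption that the parser is Xerces or the JDK's built-in one, is to reject any document that carries a DOCTYPE at all, using the Xerces feature disallow-doctype-decl. The class and method names below are hypothetical, and the sample payload points at a nonexistent file rather than /dev/random so the demo stays harmless.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class XxeHardeningExample {

    // An XXE-style payload; the entity target is a placeholder path.
    static final String XXE = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE foo [<!ELEMENT foo ANY>"
            + "<!ENTITY xxe SYSTEM \"file:///does-not-exist\">]>"
            + "<foo>&xxe;</foo>";

    // Hypothetical helper: a builder that refuses any DOCTYPE declaration.
    static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Xerces-specific: a DOCTYPE becomes a fatal parse error.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true when the hardened builder refuses to parse the input.
    static boolean isRejected(String xml) throws Exception {
        try {
            newHardenedBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("XXE payload rejected: " + isRejected(XXE));
        System.out.println("Plain document rejected: " + isRejected("<foo/>"));
    }
}
```

This setting is stricter than what the question needs: it would also reject the question's own document, which has a DOCTYPE. When DOCTYPE declarations must be accepted, a possible alternative is to set the SAX features http://xml.org/sax/features/external-general-entities and http://xml.org/sax/features/external-parameter-entities to false, together with load-external-dtd.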

    【Comments】:
