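【Posted】: 2015-11-16 00:22:53
【Question】:
I have a large XML document (100 GB) and want to parse it, extract information from it, and store that information in an RDF triple store.
I have found out how to parse large XML files with Java, and I know how to read and write RDF files with the Jena RDF API.
- How can I create the triples according to my OWL ontology, which I created with Protege?
- Is it possible to read/load this OWL ontology, create instances of its classes as triples, and store them in an RDF file using Jena?
The main problem is the very large number of instances (triples) that will be created.
Sample of the XML file: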
<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>
      <name>Gaella, Matt</name>
      <initial>MG</initial>
    </author>
    <title>User Guide</title>
    <price>45.95</price>
    <publish_date>2010-10-01</publish_date>
  </book>
  <book id="bk102">
    <author>
      <name>Rall, Kimiou</name>
      <initial>KR</initial>
    </author>
    <title>Midnight Scene</title>
    <price>5.75</price>
    <publish_date>2011-12-02</publish_date>
  </book>
  <book id="bk103">
    <author>
      <name>Colin, Evian</name>
      <initial>EC</initial>
    </author>
    <title>Cool Ascendant</title>
    <price>5.50</price>
    <publish_date>2012-11-03</publish_date>
  </book>
  <book id="bk104">
    <author>
      <name>Cortes, Smith</name>
      <initial>SC</initial>
    </author>
    <title>Farmer Legacy</title>
    <price>10.50</price>
    <publish_date>2013-03-04</publish_date>
  </book>
  . . .
</catalog>
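For a file of this size, a streaming (pull) parser is one way to keep only a single <book> in memory at a time. The following is a minimal sketch using the JDK's StAX API (javax.xml.stream). The class name CatalogReader, the method readBooks, and the flat field map are illustrative assumptions, not something taken from the question; the element names come from the sample above.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams <book> elements one at a time.
public class CatalogReader {

    /** Reads the catalog and passes each book to the callback as a flat field map. */
    public static void readBooks(Reader in, Consumer<Map<String, String>> onBook)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text/CDATA chunks so each element's text arrives as a whole.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader xml = factory.createXMLStreamReader(in);

        Map<String, String> book = null;          // fields of the book currently being read
        StringBuilder text = new StringBuilder(); // text of the innermost open element

        while (xml.hasNext()) {
            switch (xml.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    text.setLength(0);
                    if (xml.getLocalName().equals("book")) {
                        book = new LinkedHashMap<>();
                        book.put("id", xml.getAttributeValue(null, "id"));
                    }
                    break;
                }
                case XMLStreamConstants.CHARACTERS: {
                    text.append(xml.getText());
                    break;
                }
                case XMLStreamConstants.END_ELEMENT: {
                    String name = xml.getLocalName();
                    if (name.equals("book")) {
                        onBook.accept(book);      // hand over the finished book, then forget it
                        book = null;
                    } else if (book != null && !name.equals("author")) {
                        // Leaf elements: name, initial, title, price, publish_date.
                        book.put(name, text.toString().trim());
                    }
                    text.setLength(0);
                    break;
                }
                default:
                    break;
            }
        }
        xml.close();
    }

    public static void main(String[] args) throws XMLStreamException {
        String sample = "<catalog><book id=\"bk101\"><author><name>Gaella, Matt</name>"
                + "<initial>MG</initial></author><title>User Guide</title>"
                + "<price>45.95</price><publish_date>2010-10-01</publish_date></book></catalog>";
        readBooks(new StringReader(sample), book -> System.out.println(book));
    }
}
```

For the real file, the StringReader would be replaced by a buffered reader over the file; memory use then depends on the size of one book, not on the size of the catalog.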
OWL-DL ontology:
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:swrlb="http://www.w3.org/2003/11/swrlb#"
    xmlns="http://www.owl-ontologies.com/OntologyBooks.owl#"
    xmlns:xsp="http://www.owl-ontologies.com/2005/08/07/xsp.owl#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:protege="http://protege.stanford.edu/plugins/owl/protege#"
    xmlns:swrl="http://www.w3.org/2003/11/swrl#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xml:base="http://www.owl-ontologies.com/OntologyBooks.owl">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Book">
    <owl:disjointWith>
      <owl:Class rdf:ID="Author"/>
    </owl:disjointWith>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:allValuesFrom>
          <owl:Class rdf:about="#Author"/>
        </owl:allValuesFrom>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="hasAuthor"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty>
          <owl:ObjectProperty rdf:about="#hasAuthor"/>
        </owl:onProperty>
        <owl:someValuesFrom>
          <owl:Class rdf:about="#Author"/>
        </owl:someValuesFrom>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:DatatypeProperty rdf:ID="price"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:DatatypeProperty rdf:ID="publishDate"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty>
          <owl:DatatypeProperty rdf:ID="title"/>
        </owl:onProperty>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
  </owl:Class>
  <owl:Class rdf:about="#Author">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:DatatypeProperty rdf:ID="initial"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:DatatypeProperty rdf:ID="name"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
    <owl:disjointWith rdf:resource="#Book"/>
  </owl:Class>
  <owl:ObjectProperty rdf:ID="isAuthorOf">
    <rdfs:domain rdf:resource="#Author"/>
    <rdfs:range rdf:resource="#Book"/>
    <owl:inverseOf>
      <owl:ObjectProperty rdf:about="#hasAuthor"/>
    </owl:inverseOf>
  </owl:ObjectProperty>
  <owl:ObjectProperty rdf:about="#hasAuthor">
    <owl:inverseOf rdf:resource="#isAuthorOf"/>
    <rdfs:domain rdf:resource="#Book"/>
    <rdfs:range rdf:resource="#Author"/>
  </owl:ObjectProperty>
  <owl:DatatypeProperty rdf:about="#publishDate">
    <rdfs:domain rdf:resource="#Book"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#date"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:about="#price">
    <rdfs:domain rdf:resource="#Book"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#float"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:about="#initial">
    <rdfs:domain rdf:resource="#Author"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:about="#name">
    <rdfs:domain rdf:resource="#Author"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:about="#title">
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
    <rdfs:domain rdf:resource="#Book"/>
  </owl:DatatypeProperty>
</rdf:RDF>
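One way to turn a book record into instances of this ontology without building a large in-memory model is to write each record straight out as N-Triples lines, using the ontology's namespace for the classes and properties. The sketch below does this with plain string output instead of the Jena API, so it is a swapped-in technique rather than the Jena-based approach the question asks about. The class name BookTriples, the method writeBook, and the way individual IRIs are minted (namespace + book id, and "author_" + book id for the author) are assumptions made for illustration; it also assumes the ids contain only IRI-safe characters.

```java
import java.io.PrintWriter;

// Hypothetical helper: serializes one book as N-Triples using the ontology's terms.
public class BookTriples {
    static final String NS = "http://www.owl-ontologies.com/OntologyBooks.owl#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    static String iri(String value) {
        return "<" + value + ">";
    }

    /** Builds an N-Triples literal; datatype is an XSD local name, or null for a plain string. */
    static String literal(String lexical, String datatype) {
        String escaped = lexical.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r");
        String quoted = "\"" + escaped + "\"";
        return datatype == null ? quoted : quoted + "^^" + iri(XSD + datatype);
    }

    static void triple(PrintWriter out, String s, String p, String o) {
        out.println(s + " " + p + " " + o + " .");
    }

    /** Writes the eight triples describing one book and its author. */
    public static void writeBook(PrintWriter out, String id, String authorName, String initial,
                                 String title, String price, String publishDate) {
        String book = iri(NS + id);                 // assumption: IRI minted from the book id
        String author = iri(NS + "author_" + id);   // assumption: one author individual per book

        triple(out, book, iri(RDF_TYPE), iri(NS + "Book"));
        triple(out, author, iri(RDF_TYPE), iri(NS + "Author"));
        triple(out, book, iri(NS + "hasAuthor"), author);
        triple(out, author, iri(NS + "name"), literal(authorName, null));
        triple(out, author, iri(NS + "initial"), literal(initial, null));
        triple(out, book, iri(NS + "title"), literal(title, null));
        triple(out, book, iri(NS + "price"), literal(price, "float"));
        triple(out, book, iri(NS + "publishDate"), literal(publishDate, "date"));
    }

    public static void main(String[] args) {
        PrintWriter out = new PrintWriter(System.out, true);
        writeBook(out, "bk101", "Gaella, Matt", "MG", "User Guide", "45.95", "2010-10-01");
    }
}
```

Because N-Triples is line-based, the output can be appended book by book while the XML is being streamed, and the resulting file can then be loaded into a triple store in bulk. The datatypes float and date follow the rdfs:range declarations of price and publishDate in the ontology above.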
【Comments】:
- First, show us what you have already tried.