【Posted】: 2013-07-16 20:59:23
【Question】:
I have some RDF/XML data that I want to parse so that I can access its nodes. It looks like this:
<!-- http://purl.obolibrary.org/obo/VO_0000185 -->
<owl:Class rdf:about="&obo;VO_0000185">
<rdfs:label>Influenza virus gene</rdfs:label>
<rdfs:subClassOf rdf:resource="&obo;VO_0000156"/>
<obo:IAO_0000117>YH</obo:IAO_0000117>
</owl:Class>
<!-- http://purl.obolibrary.org/obo/VO_0000186 -->
<owl:Class rdf:about="&obo;VO_0000186">
<rdfs:label>RNA vaccine</rdfs:label>
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<rdf:Description rdf:about="&obo;VO_0000001"/>
<owl:Restriction>
<owl:onProperty rdf:resource="&obo;BFO_0000161"/>
<owl:someValuesFrom rdf:resource="&obo;VO_0000728"/>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
<rdfs:subClassOf rdf:resource="&obo;VO_0000001"/>
<obo:IAO_0000116>Using RNA may eliminate the problem of having to tailor a vaccine for each individual patient with their specific immunity. The advantage of RNA is that it can be used for all immunity types and can be taken from a single cell. DNA vaccines need to produce RNA which then prompts the manufacture of proteins. However, RNA vaccine eliminates the step from DNA to RNA.</obo:IAO_0000116>
<obo:IAO_0000115>A vaccine that uses RNA(s) derived from a pathogen organism.</obo:IAO_0000115>
<obo:IAO_0000117>YH</obo:IAO_0000117>
</owl:Class>
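The complete RDF/XML file can be found here.
What I want to do is the following:
- Find the chunk that contains the entry <rdfs:subClassOf rdf:resource="&obo;VO_0000001"/>
- Access the literal term defined by <rdfs:label>...</rdfs:label>
So in the example above, the code would go through the second chunk and output: "RNA vaccine".
I am currently stuck with the following code, in which I cannot get access to the nodes. What is the right way to do this? Solutions that use something other than XML::LibXML are welcome.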
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
use Carp;
use File::Basename;
use XML::LibXML 1.70;
my $filename = "VO.owl";
# Obtained from http://svn.code.sf.net/p/vaccineontology/code/trunk/src/ontology/VO.owl
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file( $filename );
foreach my $chunk ($doc->findnodes('/owl:Class')) {
my ($label) = $chunk->findnodes('./rdfs:label');
my ($subclass) = $chunk->findnodes('./rdfs:subClassOf');
print $label->to_literal;
print $subclass->to_literal;
}
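One likely reason the loop above never reaches a node is that the XPath expressions use the prefixes owl: and rdfs: without registering them, and that owl:Class is a child of the rdf:RDF root rather than the root itself. The following is a minimal sketch of that idea, not a definitive fix: it assumes the three standard namespace URIs shown, uses a trimmed-down inline sample in place of VO.owl, and the helper name labels_of_direct_subclasses is hypothetical. It also only handles this particular XML layout of the RDF.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Namespace URIs behind the rdf:, rdfs: and owl: prefixes (standard W3C values).
my %NS = (
    rdf  => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    rdfs => 'http://www.w3.org/2000/01/rdf-schema#',
    owl  => 'http://www.w3.org/2002/07/owl#',
);

# Trimmed-down stand-in for VO.owl (assumption: the real file has the same
# rdf:RDF root and declares the &obo; entity in its DOCTYPE).
my $SAMPLE = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY obo "http://purl.obolibrary.org/obo/">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="&obo;VO_0000185">
    <rdfs:label>Influenza virus gene</rdfs:label>
    <rdfs:subClassOf rdf:resource="&obo;VO_0000156"/>
  </owl:Class>
  <owl:Class rdf:about="&obo;VO_0000186">
    <rdfs:label>RNA vaccine</rdfs:label>
    <rdfs:subClassOf rdf:resource="&obo;VO_0000001"/>
  </owl:Class>
</rdf:RDF>
XML

# Hypothetical helper: labels of every top-level owl:Class that has a direct
# rdfs:subClassOf child whose rdf:resource equals $parent_uri.
sub labels_of_direct_subclasses {
    my ($doc, $parent_uri) = @_;

    # XPath prefixes must be registered explicitly; they are not taken
    # from the xmlns declarations in the document.
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs($_, $NS{$_}) for keys %NS;

    my @labels;
    for my $class ($xpc->findnodes('/rdf:RDF/owl:Class')) {
        my $is_child = 0;
        for my $sub ($xpc->findnodes('./rdfs:subClassOf', $class)) {
            # rdf:resource is absent when the superclass is an inline restriction.
            my $target = $sub->getAttributeNS($NS{rdf}, 'resource');
            $is_child = 1 if defined $target && $target eq $parent_uri;
        }
        push @labels, $xpc->findvalue('./rdfs:label', $class) if $is_child;
    }
    return @labels;
}

# expand_entities makes sure &obo; is replaced by the full URI prefix.
my $doc = XML::LibXML->load_xml(string => $SAMPLE, expand_entities => 1);
print "$_\n"
    for labels_of_direct_subclasses($doc, 'http://purl.obolibrary.org/obo/VO_0000001');
```

For the inline sample this prints RNA vaccine. To run it against the real file, replacing the load_xml call with XML::LibXML->load_xml(location => 'VO.owl', expand_entities => 1) should be enough, provided the file keeps the same layout.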
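【Discussion】:
- I would point out that solutions that do not use an XML library should be not merely welcome but preferred; don't try to parse RDF as XML. Granted, RDF can be serialized as XML, but the same RDF graph can be serialized in XML in many different ways, and an XML-based solution that works for one of them is unlikely to work for another. RDF is a graph-based representation and should be treated as such.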
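To illustrate what treating the data as a graph could look like, here is a hedged sketch that uses the CPAN module RDF::Trine. That module is not mentioned anywhere in the post and is only one possible choice. The sketch loads the RDF/XML into a triple store and asks for every subject that has an rdfs:subClassOf triple pointing at the parent class, then reads that subject's rdfs:label. The helper name labels_of_subclasses_in_graph and the inline sample are hypothetical, and the sample spells out full IRIs instead of relying on the &obo; entity.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use RDF::Trine;

my $RDFS = 'http://www.w3.org/2000/01/rdf-schema#';
my $OBO  = 'http://purl.obolibrary.org/obo/';

# Small stand-in for VO.owl with full IRIs written out (assumption).
my $SAMPLE_RDF = <<'XML';
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/VO_0000185">
    <rdfs:label>Influenza virus gene</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/VO_0000156"/>
  </owl:Class>
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/VO_0000186">
    <rdfs:label>RNA vaccine</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/VO_0000001"/>
  </owl:Class>
</rdf:RDF>
XML

# Hypothetical helper: labels of all subjects S for which the graph holds
# the triple (S, rdfs:subClassOf, $parent_uri).
sub labels_of_subclasses_in_graph {
    my ($model, $parent_uri) = @_;
    my $sub_class_of = RDF::Trine::Node::Resource->new($RDFS . 'subClassOf');
    my $label        = RDF::Trine::Node::Resource->new($RDFS . 'label');
    my $parent       = RDF::Trine::Node::Resource->new($parent_uri);

    my @labels;
    for my $class ($model->subjects($sub_class_of, $parent)) {
        for my $object ($model->objects($class, $label)) {
            # Only literal objects carry a label string.
            push @labels, $object->literal_value if $object->is_literal;
        }
    }
    return sort @labels;
}

# Parse the RDF/XML into an in-memory model; the first argument is the base URI.
my $model  = RDF::Trine::Model->temporary_model;
my $parser = RDF::Trine::Parser->new('rdfxml');
$parser->parse_into_model($OBO, $SAMPLE_RDF, $model);

print "$_\n" for labels_of_subclasses_in_graph($model, $OBO . 'VO_0000001');
```

Because the query is phrased in terms of triples, it does not depend on whether the serializer wrote the class as an owl:Class element, as an rdf:Description with an rdf:type child, or in some other equivalent XML form.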