【发布时间】:2025-12-21 02:25:12
【问题描述】:
我有一个这样的 xml 文件:
<comment type="PTM">
<text evidence="19">Sumoylated following its interaction with PIAS1 and UBE2I.</text>
</comment>
<comment type="PTM">
<text evidence="17">Ubiquitinated, leading to proteasomal degradation.</text>
</comment>
<comment type="disease">
<text>A chromosomal aberration involving ZMYND11 is a cause of acute poorly differentiated myeloid leukemia. Translocation (10;17)(p15;q21) with MBTD1.</text>
</comment>
<comment type="disease" evidence="23">
<disease id="DI-04257">
<name>Mental retardation, autosomal dominant 30</name>
<acronym>MRD30</acronym>
<description>A disorder characterized by significantly below average general intellectual functioning associated with impairments in adaptive behavior and manifested during the developmental period. MRD30 patients manifest mild intellectual disability and subtle facial dysmorphisms, including hypertelorism, ptosis, and a wide mouth.</description>
<dbReference type="MIM" id="616083"/>
</disease>
<text>The disease is caused by mutations affecting the gene represented in this entry.</text>
</comment>
<comment type="similarity">
<text evidence="8">Contains 1 bromo domain.</text>
</comment>
<comment type="similarity">
<text evidence="9">Contains 1 MYND-type zinc finger.</text>
</comment>
我使用 stax 来提取疾病信息。这是我的代码的一部分:
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader eventReader = factory.createXMLEventReader( new FileReader(p));
while(eventReader.hasNext()){
XMLEvent event = eventReader.nextEvent();
switch(event.getEventType()){
case XMLStreamConstants.START_ELEMENT:
StartElement startElement = event.asStartElement();
String qName = startElement.getName().getLocalPart();
if (qName.equalsIgnoreCase("comment")) {
System.out.println("Start Element : comment");
Iterator<Attribute> attributes = startElement.getAttributes();
Attribute a = attributes.next();
System.out.println("ATRIBUTES " + a.getName());
type = a.getValue();
System.out.println("Roll No : " + type);
} else if(qName.equalsIgnoreCase("text") && type.equals("disease")){ text = true; }
break;
case XMLStreamConstants.CHARACTERS:
Characters characters = event.asCharacters();
if(text){ res = res + " " + characters.getData();
//System.out.println("TEXT: " + res);
text = false;
}
break;
case XMLStreamConstants.END_ELEMENT:
EndElement endElement = event.asEndElement();
if(endElement.getName().getLocalPart().equalsIgnoreCase("comment")){
//System.out.println("End Element : comment");
//System.out.println();
}
break;
对于这种类型的线:
<comment type="disease">
我可以正确提取信息,但是当我尝试在此行中查找注释类型“疾病”时:
<comment type="disease" evidence="23">
它给了我 type=evidence 而不是 type=disease 应该的。因此,它不会从这种行中保存任何内容。
【问题讨论】:
-
您会考虑使用非税基的答案吗?
-
是的,我愿意。如果我想使用 STAX,我做了很多代码。