【Posted】: 2022-01-28 01:31:38
【Question】:
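I realize there are many similar questions, but I still haven't been able to find the specific answer I'm looking for.
I'm using Perl with the XML::LibXML library to read information from an XML file. The XML file has many nodes and many child nodes (and grandchild nodes, and so on). I'm trying to extract the information from the XML file node by node, but I'm stuck trying to figure out how to do that.
Here's just an example of what I'm trying to do: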
#!/usr/bin/perl -w
use XML::LibXML;
open ($xml_fh, "<test.xml");
my $dom = XML::LibXML->load_xml(IO => $xml_fh);
close($xml_fh);
foreach $chapter ($dom->findnodes('/file/chapter')) {
    my $chapterNumber = $chapter->findvalue('@number');
    print "Chapter #$chapterNumber\n";
    #I tried $dom->findnodes('/file/chapter/section') <-- spelling out the xPath with same results..
    foreach $section ($dom->findnodes('//section')) {
        my $sectionNumber = $section->findvalue('@number');
        print " Section #$sectionNumber\n";
        foreach $subsection ($dom->findnodes('//subsection')) {
            my $subsectionNumber = $subsection->findvalue('@number');
            print " SubSection $subsectionNumber\n";
        }
    }
}
This particular XML file is laid out like this:
<file>
  <chapter number="1">
    <section number="abc123">
      There is some data here I'd like to get to
      <subsection number="abc123.(s)(4)">
        Some additional data here
        <subsection number="deeperSubSec">
          There might even be deeper subsections
        </subsection>
      </subsection>
    </section>
  </chapter>
  <chapter number="208">
    <section number="dgfj23">
      There is some data here I'd like to get to also
      <subsection number="dgfj23.(s)(4)">
        Some additional data here also
        <subsection number="deeperSubSec44">
          There might even be deeper subsections also
        </subsection>
      </subsection>
    </section>
  </chapter>
  <chapter number="998">
    <section number="xxxid">
      There is even more data here I'd like to get to also
      <subsection number="xxxid.(s)(4)">
        Some additional data also here too
        <subsection number="deeperSubSec999">
          There might even be deeper subsections also again
        </subsection>
      </subsection>
    </section>
  </chapter>
</file>
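Unfortunately, all I end up with is a list of repeated data. I'm sure this is because of how I've nested the for loops, but I really lack a basic understanding of how to work with this kind of data. I'm hoping someone can offer some resources or insight.
Here's my current output: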
Chapter #1
Section #abc123
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Section #dgfj23
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Section #xxxid
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Chapter #208
Section #abc123
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Section #dgfj23
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Section #xxxid
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Chapter #998
Section #abc123
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Section #dgfj23
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
Section #xxxid
SubSection abc123.(s)(4)
SubSection deeperSubSec
SubSection dgfj23.(s)(4)
SubSection deeperSubSec44
SubSection xxxid.(s)(4)
SubSection deeperSubSec999
So for every chapter I'm reading all of the sections, and then all of the subsections, over and over again.
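The repetition appears to come from the XPath expressions rather than from the loop nesting itself. A path that starts with `/` or `//` is evaluated from the document root no matter which node `findnodes` is called on, while a path that starts with `./` is evaluated relative to the context node. A minimal sketch of the difference, assuming a shortened inline copy of the XML (the inline document and the variable names are illustrative, not part of the original script):

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened stand-in for test.xml (assumption: same element layout).
my $xml = <<'XML';
<file>
  <chapter number="1"><section number="abc123"/></chapter>
  <chapter number="208"><section number="dgfj23"/></chapter>
  <chapter number="998"><section number="xxxid"/></chapter>
</file>
XML

my $dom = XML::LibXML->load_xml(string => $xml);
my ($first_chapter) = $dom->findnodes('/file/chapter');

# '//section' starts at the document root, so every section matches,
# even though findnodes is called on a single chapter.
my @all_sections = $first_chapter->findnodes('//section');

# './section' starts at the context node, so only its own children match.
my @own_sections = $first_chapter->findnodes('./section');

printf "absolute: %d, relative: %d\n",
    scalar @all_sections, scalar @own_sections;
```

With this three-chapter sample, the absolute path finds three sections and the relative path finds one.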
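What I want to do is read, for each chapter, its own sections, and then for each section its own subsections and any subsections nested within those.
Like this: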
Chapter #1
  Section #abc123
    Subsection #abc123.(s)(4)
      Sub-Subsection #deeperSubSec
Chapter #208
  Section #dgfj23
    Subsection #dgfj23.(s)(4)
      Sub-Subsection #deeperSubSec44
etc...
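One possible way to get an outline of this shape is to call `findnodes` on each node with a relative path and to recurse for nested subsections. This is only a sketch under stated assumptions: the XML is a shortened inline copy, and `subsection_lines` is a hypothetical helper name, not something provided by XML::LibXML.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened stand-in for test.xml (assumption: same element layout).
my $xml = <<'XML';
<file>
  <chapter number="1">
    <section number="abc123">
      <subsection number="abc123.(s)(4)">
        <subsection number="deeperSubSec"/>
      </subsection>
    </section>
  </chapter>
  <chapter number="208">
    <section number="dgfj23">
      <subsection number="dgfj23.(s)(4)">
        <subsection number="deeperSubSec44"/>
      </subsection>
    </section>
  </chapter>
</file>
XML

# Hypothetical helper: build one line per subsection below $node,
# recursing into nested subsections and indenting by depth.
sub subsection_lines {
    my ($node, $depth) = @_;
    my @lines;
    for my $sub ($node->findnodes('./subsection')) {
        my $label = ('Sub-' x ($depth - 1)) . 'Subsection';
        push @lines,
            ('  ' x ($depth + 1)) . "$label #" . $sub->getAttribute('number');
        push @lines, subsection_lines($sub, $depth + 1);
    }
    return @lines;
}

my $dom = XML::LibXML->load_xml(string => $xml);
my @outline;
for my $chapter ($dom->findnodes('/file/chapter')) {
    push @outline, 'Chapter #' . $chapter->getAttribute('number');
    # Relative path: only the sections of this chapter.
    for my $section ($chapter->findnodes('./section')) {
        push @outline, '  Section #' . $section->getAttribute('number');
        push @outline, subsection_lines($section, 1);
    }
}
print "$_\n" for @outline;
```

The recursion means the depth of nesting does not have to be known in advance; each extra level simply gets one more `Sub-` prefix and two more spaces of indentation.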
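Also, eventually, once I've figured out how the basic operations work, I'll need to access the data contained in each chapter, section, subsection, and so on. But I figure I need to walk before I can run, so I'll start by just trying to get the simple attribute values.
Thanks for your help.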
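For the later step of reading the data inside each element, one option is to select only the text nodes that are direct children of an element, so that the text of nested subsections is not included. The following is a rough sketch, assuming a shortened inline copy of the XML; `own_text` is a hypothetical helper name.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened stand-in for test.xml (assumption: same element layout).
my $xml = <<'XML';
<file>
  <chapter number="1">
    <section number="abc123">
      There is some data here I'd like to get to
      <subsection number="abc123.(s)(4)">
        Some additional data here
      </subsection>
    </section>
  </chapter>
</file>
XML

# Hypothetical helper: return only the text that belongs directly to
# $node, excluding the text of nested elements, with whitespace tidied.
sub own_text {
    my ($node) = @_;
    my $text = join ' ',
        map { $_->textContent } $node->findnodes('./text()');
    $text =~ s/\s+/ /g;       # collapse runs of whitespace
    $text =~ s/^\s+|\s+$//g;  # trim both ends
    return $text;
}

my $dom = XML::LibXML->load_xml(string => $xml);
my ($section)    = $dom->findnodes('/file/chapter/section');
my ($subsection) = $section->findnodes('./subsection');

print own_text($section), "\n";
print own_text($subsection), "\n";
```

Calling `textContent` on the element itself would instead return the text of the element together with all of its descendants, which is why the sketch goes through `./text()`.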
【Discussion】:
Tags: xml perl xml-libxml