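【Question title】: Write a for loop for XML parsing in Java
【Posted】: 2016-12-12 16:37:51
【Question】:

Can someone help me write a for loop that iterates over all of these zone nodes and gets the text of only the zone whose class is Lead?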

<zones count="13">
                    <zone type="RECT" flags="4099" class="Headline" num="1">
                        <zrect unit="pix">0,1097,2173,1303</zrect>
                        <ztext type="XML" textformat="XML">
                            <REGION>
                                <PARAGRAPH>
                                    <LINE>
                                        <WORD Rect="27,933,272,1067">ma</WORD>
                                        <BLANK/>
                                        <WORD Rect="325,933,820,1096">ekdum</WORD>
                                        <BLANK/>
                                        <WORD Rect="877,933,982,1065">gyani</WORD>
                                        <BLANK/>
                                        <WORD Rect="1040,933,1829,1096">chu</WORD>
                                        <BLANK/>
                                    </LINE>
                                </PARAGRAPH>
                            </REGION>
                        </ztext>
                        <source/>
                    </zone>
                    <zone type="RECT" flags="4099" class="Author" num="2">
                        <zrect unit="pix">0,1326,324,1372</zrect>
                        <ztext type="XML" textformat="XML">
                            <REGION>
                                <PARAGRAPH>
                                    <LINE>
                                        <WORD Rect="4,1126,44,1158">By</WORD>
                                        <BLANK/>
                                        <WORD Rect="54,1126,131,1151">Sano</WORD>
                                        <BLANK/>
                                        <WORD Rect="145,1126,272,1151">shrest</WORD>
                                        <BLANK/>
                                    </LINE>
                                </PARAGRAPH>
                            </REGION>
                        </ztext>
                        <source/>
                    </zone>
                    <zone type="RECT" flags="4099" class="Lead" num="3">
                        <zrect unit="pix">0,1384,475,1584</zrect>
                        <ztext type="XML" textformat="XML">
                            <REGION>
                                <PARAGRAPH>
                                    <LINE>
                                        <WORD Rect="5,1174,42,1192">Dherai</WORD>
                                        <BLANK/>
                                        <WORD Rect="55,1178,118,1198">years</WORD>
                                        <BLANK/>
                                        <WORD Rect="130,1178,166,1192">dekhin</WORD>
                                        <BLANK/>
                                        <WORD Rect="179,1174,263,1192">gadi</WORD>
                                        <BLANK/>
                                        <WORD Rect="277,1174,331,1192">banaune</WORD>
                                        <BLANK/>
                                        <WORD Rect="344,1174,399,1192">manche</WORD>
                                        <BLANK/>
                                    </LINE>
                                    <LINE>
                                        <WORD Rect="4,1203,91,1226">haru</WORD>
                                        <BLANK/>
                                        <WORD Rect="115,1203,147,1221">mehanat</WORD>
                                        <BLANK/>
                                        <WORD Rect="172,1207,218,1221">gardai</WORD>
                                        <BLANK/>
                                        <WORD Rect="241,1203,399,1226">chan</WORD>
                                        <BLANK/>
                                    </LINE>
                                    <LINE>
                                        <WORD Rect="3,1236,63,1255">ramro</WORD>
                                        <BLANK/>
                                        <WORD Rect="80,1233,102,1250">gadi</WORD>
                                        <BLANK/>
                                        <WORD Rect="119,1231,214,1255">nirman</WORD>
                                        <BLANK/>
                                        <WORD Rect="232,1231,323,1254">garna</WORD>
                                        <BLANK/>
                                        <WORD Rect="341,1236,400,1250">lai</WORD>
                                        <BLANK/>
                                    </LINE>
                                </PARAGRAPH>
                            </REGION>
                        </ztext>
                        <source/>
                    </zone>
                    <zone type="RECT" flags="4099" class="Paragraph" num="4">
                        <zrect unit="pix">0,1596,478,2249</zrect>
                        <ztext type="XML" textformat="XML">
                            <REGION>
                                <PARAGRAPH>
                                    <LINE>
                                        <WORD Rect="28,1352,74,1366">Ramro</WORD>
                                        <BLANK/>
                                        <WORD Rect="82,1356,114,1366">gadi</WORD>
                                        <BLANK/>
                                        <WORD Rect="122,1356,151,1369">are,</WORD>
                                        <BLANK/>
                                        <WORD Rect="158,1352,179,1366">for</WORD>
                                        <BLANK/>
                                        <WORD Rect="186,1356,196,1366">a</WORD>
                                        <BLANK/>
                                        <WORD Rect="202,1352,254,1369">variety</WORD>
                                        <BLANK/>
                                        <WORD Rect="262,1352,274,1366">of</WORD>
                                        <BLANK/>
                                        <WORD Rect="283,1356,348,1368">reasons,</WORD>
                                        <BLANK/>
                                        <WORD Rect="356,1352,400,1369">ramro</WORD>
                                        <BLANK/>
                                    </LINE>
                                </PARAGRAPH>
                            </REGION>
                        </ztext>
                        <source/>
                    </zone>

I am able to get the text of all zones, but not specifically the text of the zone with the attribute class="Lead".

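【Comments】:

  • What are you using to parse the XML?
  • I am using XPath. But I cannot select the node by position, because the structure changes from one XML file to another. For example, the Lead class is at num=3 here, but in another XML file it could be at num=1.
  • According to W3Schools, if you use //zone[@class='Lead'] you will get all zones with the class Lead. You can then loop over them to get the text you need.

Tags: xml for-loop xml-parsing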


【Solution 1】:

Here is an example from W3Schools. Source: http://www.w3schools.com/xml/xpath_syntax.asp

This is the XML:

<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <price>49.99</price>
    </book>
    <book category="web" cover="paperback">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>

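And here is the JavaScript function that extracts the nodes you want. Note that it uses //book[@category='web'] to select every node with that attribute=value pair. In the same way, you can use //zone[@class='Lead'] for your document.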

<html>
<body>

<p id="demo"></p>

<script>
function showResult(xml) {
    var txt = "";
    // XPath: the titles of all <book> elements whose category attribute is "web"
    var path = "//book[@category='web']/title";
    if (xml.evaluate) {
        var nodes = xml.evaluate(path, xml, null, XPathResult.ANY_TYPE, null);
        var result = nodes.iterateNext();
        // Append the text of each matching node until the iterator is exhausted
        while (result) {
            txt += result.childNodes[0].nodeValue + "<br>";
            result = nodes.iterateNext();
        }
    }
    document.getElementById("demo").innerHTML = txt;
}
</script>

</body>
</html>
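
Since the question asks about Java, the same attribute predicate can be sketched with the JDK's built-in javax.xml.xpath API. This is only a minimal sketch, not the asker's actual code: the class name LeadZoneText and the method leadTexts are hypothetical, the sample XML in main is a shortened stand-in for the real file, and joining the WORD elements with single spaces is an assumption about how the text should be reassembled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class LeadZoneText {

    // Returns one string per <zone class="Lead">, built from that zone's
    // WORD elements joined by single spaces (an assumed output format).
    public static List<String> leadTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select by attribute value rather than by position, so it does
            // not matter whether the Lead zone is num=3 or num=1.
            NodeList zones = (NodeList) xpath.evaluate(
                    "//zone[@class='Lead']", doc, XPathConstants.NODESET);

            List<String> texts = new ArrayList<>();
            for (int i = 0; i < zones.getLength(); i++) {
                Element zone = (Element) zones.item(i);
                // All WORD descendants of this zone, in document order.
                NodeList words = zone.getElementsByTagName("WORD");
                StringBuilder sb = new StringBuilder();
                for (int j = 0; j < words.getLength(); j++) {
                    if (j > 0) {
                        sb.append(' ');
                    }
                    sb.append(words.item(j).getTextContent());
                }
                texts.add(sb.toString());
            }
            return texts;
        } catch (Exception e) {
            // Parsing and XPath errors are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not read zones", e);
        }
    }

    public static void main(String[] args) {
        // Shortened stand-in for the document in the question.
        String xml = "<zones>"
                + "<zone class=\"Author\" num=\"1\"><ztext><REGION><PARAGRAPH><LINE>"
                + "<WORD>By</WORD><BLANK/><WORD>Sano</WORD><BLANK/>"
                + "</LINE></PARAGRAPH></REGION></ztext></zone>"
                + "<zone class=\"Lead\" num=\"2\"><ztext><REGION><PARAGRAPH><LINE>"
                + "<WORD>Dherai</WORD><BLANK/><WORD>years</WORD><BLANK/>"
                + "</LINE><LINE><WORD>dekhin</WORD><BLANK/>"
                + "</LINE></PARAGRAPH></REGION></ztext></zone>"
                + "</zones>";
        for (String text : leadTexts(xml)) {
            System.out.println(text);
        }
    }
}
```

To read from a file instead of a string, the DocumentBuilder can parse a java.io.File directly; the XPath expression and the loop stay the same.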

【Comments】:
