【Question title】: XPath always returns the first child node of the XML in Java
【Posted】: 2018-10-23 10:09:19
【Question】:

I have XML like this:

<div class="row mt-5">
<div class="col-lg-cus col-6">
    <div class="product-box lazyload-wrap">
        <div class="remove-wishlist" data-id="7080">
            <i class="fa fa-times" aria-hidden="true"></i>
        </div>
        <div class="productAvatar">
            <a href="/vi/classic-fullface-royal-m18" title="Classic FULLFACE ROYAL M18">
                <div class="img img-background text-left origin product-img lazy-bg-img lazyload-item" style="background-image:url('/Uploads/default-image.jpg')" alt="Classic FULLFACE ROYAL M18" data-pil-src="https://fanfan.vn/Uploads/t/fa/fanfan0non-bao-hiem-cafe-racer-classic-fullface-royal-m18-5_0061169_235.jpg">
                </div>
            </a>
            <a class="btn btnQuickView "
               onclick="OpenCustomBootstrapModal('/vi/_Details?productId=7080', null, 1000, 'productPopup')">
                <span class="txtOutOfStock">Hết h&#224;ng</span>
                <span class="txtQuickView">Xem nhanh</span>
            </a>
        </div>
        <p class="mb-1 brand-name">
            <a href="/vi/royal-helmet" class="a-black" title="ROYAL HELMET">ROYAL HELMET</a>
        </p>
        <p class="name text-uppercase mb-0">
            <a href="/vi/classic-fullface-royal-m18" class="a-black">Classic FULLFACE ROYAL M18</a>
        </p>
        <div class="rating my-2">
            <span class="star-raty" data-score="0" data-readOnly="true"></span>
        </div>
        <p>
            <span>1.100.000 ₫</span>
        </p>
    </div>
</div>
<div class="col-lg-cus col-6">
    <div class="product-box lazyload-wrap">
        <div class="remove-wishlist" data-id="6855">
            <i class="fa fa-times" aria-hidden="true"></i>
        </div>
        <div class="productAvatar">
            <a href="/vi/non-bao-hiem-34-royal-m01-tem" title="N&#243;n bảo hiểm 3/4 Royal M01 Tem">
                <div class="img img-background text-left origin product-img lazy-bg-img lazyload-item" style="background-image:url('/Uploads/default-image.jpg')" alt="N&#243;n bảo hiểm 3/4 Royal M01 Tem" data-pil-src="https://fanfan.vn/Uploads/t/fa/fanfan0mu-non-bao-hiem-3-4-di-xe-may-royal-m01-tem-helmet-with-texture-4-do-xam-red-si_0060108_235.jpg">
                </div>
            </a>
            <a class="btn btnQuickView "
               onclick="OpenCustomBootstrapModal('/vi/_Details?productId=6855', null, 1000, 'productPopup')">
                <span class="txtOutOfStock">Hết h&#224;ng</span>
                <span class="txtQuickView">Xem nhanh</span>
            </a>
        </div>
        <p class="mb-1 brand-name">
            <a href="/vi/royal-helmet" class="a-black" title="ROYAL HELMET">ROYAL HELMET</a>
        </p>
        <p class="name text-uppercase mb-0">
            <a href="/vi/non-bao-hiem-34-royal-m01-tem" class="a-black">N&#243;n bảo hiểm 3/4 Royal M01 Tem</a>
        </p>
        <div class="rating my-2">
            <span class="star-raty" data-score="0" data-readOnly="true"></span>
        </div>
        <p>
            <span>400.000 ₫</span>
        </p>
    </div>
</div>

I am trying to parse it, extract some data from each product, and map that data to JAXB objects. This is how I do it:

 public static void fetchFanFanData(String dataFilePath, String type) {
    try {
        Document doc = DocParser(dataFilePath);
        XPath xpath = getXPath();

        String query = "//div[@class=\"col-lg-cus col-6\"]";
        NodeList list = (NodeList) xpath.evaluate(query, doc, XPathConstants.NODESET);
        // NodeList list = doc.getDocumentElement().getChildNodes();
        System.out.println(list.getLength());

        Products products = new Products();
        for (int i = 0; i < list.getLength(); i++) {
            Node node = list.item(i);
            String url = xpath.evaluate("//p[@class=\"name text-uppercase mb-0\"]/a/@href", node, XPathConstants.STRING).toString();
            String name = xpath.evaluate("//p[@class=\"name text-uppercase mb-0\"]/a", node, XPathConstants.STRING).toString();
            String producer = xpath.evaluate("//p[@class=\"mb-1 brand-name\"]", node, XPathConstants.STRING).toString();
            String image_url = xpath.evaluate("//div[@class=\"productAvatar\"]/a/div/@data-pil-src", node, XPathConstants.STRING).toString();
            String price = xpath.evaluate("//p/span", node, XPathConstants.STRING).toString();
            Product product = new Product();
            product.setName(name);
            product.setImage(image_url);
            product.setUrl(url);
            product.setPrice(price);
            product.setProducer(producer);
            product.setStore("FanFan");
            product.setType(type);

            products.getProduct().add(product);
        }
        marshallJAXB(products, dataFilePath);
    } catch (ParserConfigurationException | SAXException | IOException | XPathExpressionException | JAXBException ex) {
        Logger.getLogger(XMLUtilities.class.getName()).log(Level.SEVERE, null, ex);
    }
}

private static void marshallJAXB(Products products, String path) throws JAXBException, FileNotFoundException {
    JAXBContext context = JAXBContext.newInstance(Products.class);
    Marshaller m = context.createMarshaller();
    m.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
    m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
    m.marshal(products, new File(ServletActionContext.getServletContext().getRealPath("/" + "WEB-INF\\result.xml")));
}

public static XPath getXPath() {
    XPathFactory factory = XPathFactory.newInstance();
    XPath xPath = factory.newXPath();
    return xPath;
}

public static Document DocParser(String filePath)
        throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(filePath);
    return doc;
}

The marshaller is only there to check that the JAXB objects are correct, but what I get is always the first node, like this:

<products xmlns="http://www.example.org/product">
<product type="helmet">
    <name>Classic FULLFACE ROYAL M18</name>
    <url>/vi/classic-fullface-royal-m18</url>
    <image>https://fanfan.vn/Uploads/t/fa/fanfan0non-bao-hiem-cafe-racer-classic-fullface-royal-m18-5_0061169_235.jpg</image>
    <price>1.100.000 ₫</price>
    <producer>ROYAL HELMET</producer>
    <store>FanFan</store>
</product>
<product type="helmet">
    <name>Classic FULLFACE ROYAL M18</name>
    <url>/vi/classic-fullface-royal-m18</url>
    <image>https://fanfan.vn/Uploads/t/fa/fanfan0non-bao-hiem-cafe-racer-classic-fullface-royal-m18-5_0061169_235.jpg</image>
    <price>1.100.000 ₫</price>
    <producer>ROYAL HELMET</producer>
    <store>FanFan</store>
</product>
<product type="helmet">
    <name>Classic FULLFACE ROYAL M18</name>
    <url>/vi/classic-fullface-royal-m18</url>
    <image>https://fanfan.vn/Uploads/t/fa/fanfan0non-bao-hiem-cafe-racer-classic-fullface-royal-m18-5_0061169_235.jpg</image>
    <price>1.100.000 ₫</price>
    <producer>ROYAL HELMET</producer>
    <store>FanFan</store>
</product>

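I have tried many approaches and am out of ideas. Does anyone know why this happens? Please help. I cannot work out how XPath behaves here, even though I evaluate each expression against a specific context node.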

【Comments】:

    Tags: java xml xpath


    【Answer 1】:

    I think the problem is inside the for loop of the fetchFanFanData() method, where the values of url, name, and so on are read. There you have to replace "//" with ".//" in every expression, e.g. replace

     String url = xpath.evaluate("//p[@class=\"name text-uppercase mb-0\"]/a/@href", node, XPathConstants.STRING).toString();

    with

     String url = xpath.evaluate(".//p[@class=\"name text-uppercase mb-0\"]/a/@href", node, XPathConstants.STRING).toString();
    

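    The difference between "//" and ".//" is:

    "//para" selects [...] all the para elements in the same document as the context node

    ".//para" selects the para element descendants of the context node

    Quoted from https://www.w3.org/TR/2017/REC-xpath-31-20170321/ in general, and the "Examples" part of section 3.3.5 in particular, and from https://docs.oracle.com/javase/10/docs/api/javax/xml/xpath/package-summary.html.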
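
    The two quoted rules can be tried out on a tiny document. The following is a minimal sketch, not the code from the question: the class name XPathContextDemo, the helper evaluatePerItem, and the sample XML are assumptions made up for illustration. It evaluates the same expression against each <item> node in turn, the way the loop in the question does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathContextDemo {
    // Hypothetical sample document, much smaller than the page in the question.
    static final String XML =
        "<root><item><para>A</para></item><item><para>B</para></item></root>";

    /** Evaluates expr as a string against every item element and collects the results. */
    public static List<String> evaluatePerItem(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items = (NodeList) xpath.evaluate("//item", doc, XPathConstants.NODESET);

        List<String> results = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Node item = items.item(i); // context node for the next evaluation
            results.add((String) xpath.evaluate(expr, item, XPathConstants.STRING));
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // "//para" starts at the document root, whatever the context node is.
        System.out.println(evaluatePerItem("//para"));  // [A, A]
        // ".//para" starts at the context node and only looks at its descendants.
        System.out.println(evaluatePerItem(".//para")); // [A, B]
    }
}
```

    With "//para" both iterations report the first para of the document, which mirrors the repeated product in the question's output; with ".//para" each item yields its own value.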

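    The reason you always get the same value is this: the expression

     "//p[@class=\"name text-uppercase mb-0\"]/a/@href" 
    

    applied to a node of the list still returns a node set containing every match in the whole document, not just the single match inside that node. That node set is also identical for every node of the list. Combined with the return type XPathConstants.STRING, only the first hit of the node set is converted to a string, and it is always the same one. As a result, every node produces the same value.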
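
    To see both effects separately (the document-wide node set, and the string conversion that keeps only its first node), here is a small sketch. The class XPathStringConversionDemo, its helper methods, and the reduced markup are hypothetical and only loosely modelled on the page in the question. Every expression is evaluated with the second box as the context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathStringConversionDemo {
    // Hypothetical, reduced stand-in for the product list in the question.
    static final String XML =
        "<div>"
      + "<div class='box'><p class='name'><a href='/first'>First</a></p></div>"
      + "<div class='box'><p class='name'><a href='/second'>Second</a></p></div>"
      + "</div>";

    static final XPath XPATH = XPathFactory.newInstance().newXPath();

    /** Parses the sample and returns the second box element, used as the context node. */
    static Node secondBox() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList boxes = (NodeList) XPATH.evaluate("//div[@class='box']", doc, XPathConstants.NODESET);
        return boxes.item(1);
    }

    /** Number of nodes that expr selects when evaluated against the second box. */
    public static int count(String expr) throws Exception {
        return ((NodeList) XPATH.evaluate(expr, secondBox(), XPathConstants.NODESET)).getLength();
    }

    /** String value of expr evaluated against the second box (first node of the node set). */
    public static String string(String expr) throws Exception {
        return (String) XPATH.evaluate(expr, secondBox(), XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        String absolute = "//p[@class='name']/a/@href";
        String relative = ".//p[@class='name']/a/@href";

        System.out.println(count(absolute));  // 2: every match in the document
        System.out.println(string(absolute)); // /first: first node of that set
        System.out.println(count(relative));  // 1: only the match inside the second box
        System.out.println(string(relative)); // /second
    }
}
```

    Requesting XPathConstants.NODESET while debugging, as count does here, is a quick way to check whether an expression is really restricted to the current context node.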

    【Comments】:
