【Question title】: PHP - parsing XML which has namespace elements
【Posted】: 2018-01-25 00:58:27
【Question description】:

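I have read other posts and solutions, but they did not work for me, or perhaps I did not understand them well enough.

I have an HP network scanner and a Perl script that interacts with it through a series of transactions so that I can initiate a scan. I am working on porting it directly to PHP, which is a better fit for the server I want to run it on. Some of the transactions work, some do not. This is about one that does not.

I get XML back from one of the queries, but it does not parse successfully (or this is the part I do not understand well enough). I am running PHP version 7.1.12, in case that is relevant.

My test output looks like this: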

> php xmltest.php
SimpleXMLElement Object
(
)
object(SimpleXMLElement)#1 (0) {
}
>

If the XML is simpler (I think: without the namespace information), then print_r() is very verbose.

Here is the full test script, with some real data to process:

<?php
error_reporting( E_ALL );
ini_set('display_errors', 1);

$test_1 = <<<EOM
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope 
    xmlns:SOAP-ENV="http://www.w3.org/2003/05/soap-envelope"
    xmlns:SOAP-ENC="http://www.w3.org/2003/05/soap-encoding"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
    xmlns:wst="http://schemas.xmlsoap.org/ws/2004/09/transfer"
    xmlns:mex="http://schemas.xmlsoap.org/ws/2004/09/mex"
    xmlns:wsdp="http://schemas.xmlsoap.org/ws/2006/02/devprof"
    xmlns:PNPX="http://schemas.microsoft.com/windows/pnpx/2005/10"
    xmlns:UNS1="http://www.microsoft.com/windows/test/testdevice/11/2005"
    xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0"
    xmlns:wprt="http://schemas.microsoft.com/windows/2006/08/wdp/print"
    xmlns:wscn="http://schemas.microsoft.com/windows/2006/08/wdp/scan">
    <SOAP-ENV:Header>
        <wsa:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
        <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/transfer/GetResponse</wsa:Action>
        <wsa:MessageID>urn:uuid:fec6e42d-5356-1f69-9c3a-001f2927cf33</wsa:MessageID>
        <wsa:RelatesTo>urn:uuid:704ccde5-6861-415d-bd65-31dd9d7a8b98</wsa:RelatesTo>
    </SOAP-ENV:Header>
    <SOAP-ENV:Body>
        <mex:Metadata>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisDevice">
                <wsdp:ThisDevice>
                    <wsdp:FriendlyName xml:lang="en">Printer (HP Color LaserJet CM1312nfi MFP)</wsdp:FriendlyName>
                    <wsdp:FirmwareVersion>20140625</wsdp:FirmwareVersion>
                    <wsdp:SerialNumber>CNB885H665</wsdp:SerialNumber>
                </wsdp:ThisDevice>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisModel">
                <wsdp:ThisModel>
                    <wsdp:Manufacturer xml:lang="en">HP</wsdp:Manufacturer>
                    <wsdp:ManufacturerUrl>http://www.hp.com/</wsdp:ManufacturerUrl>
                    <wsdp:ModelName xml:lang="en">HP Color LaserJet CM1312nfi MFP</wsdp:ModelName>
                    <wsdp:ModelNumber>CM1312nfi MFP</wsdp:ModelNumber>
                    <wsdp:PresentationUrl>http://192.168.1.20:80/</wsdp:PresentationUrl>
                    <PNPX:DeviceCategory>Printers</PNPX:DeviceCategory>
                </wsdp:ThisModel>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/Relationship">
                <wsdp:Relationship Type="http://schemas.xmlsoap.org/ws/2006/02/devprof/host">
                    <wsdp:Hosted>
                        <wsa:EndpointReference>
                            <wsa:Address>http://192.168.1.20:3910/</wsa:Address>
                            <wsa:ReferenceProperties>
                                <UNS1:ServiceIdentifier>uri:prn</UNS1:ServiceIdentifier>
                            </wsa:ReferenceProperties>
                        </wsa:EndpointReference>
                        <wsdp:Types>wprt:PrinterServiceType</wsdp:Types>
                        <wsdp:ServiceId>uri:1cd4F16e-7c8a-a7a0-3797-00145a8827ce</wsdp:ServiceId>
                        <PNPX:CompatibleId>http://schemas.microsoft.com/windows/2006/08/wdp/print/PrinterServiceType</PNPX:CompatibleId>
                    </wsdp:Hosted>
                </wsdp:Relationship>
            </mex:MetadataSection>
        </mex:Metadata>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
EOM;

$myxml1 = simplexml_load_string($test_1);
print_r($myxml1);
var_dump($myxml1);
exit;
?>

There are several parameters in there that I want to extract. One, for example, is:

<wsa:Address>http://192.168.1.20:3910/</wsa:Address>

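Can you help me close the gap in my knowledge about how to access this parameter?

Thanks!

【Comments】:

  • As you say, there are many other questions about getting elements out of namespaced XML with PHP. If you could link to the ones you tried to follow and show how you tried to adapt their solutions to get wsa:Address, that would tell us what you already know and where the actual gap is, which would make it easier for us to help.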

Tags: php xml parsing xml-namespaces


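【Solution 1】:

First of all, SOAP and namespaces only make parsing XML harder than it needs to be. I have never parsed XML where the namespaces actually made the XML easier to understand, or provided any benefit at all. I fully understand why namespaces exist, but in practice they mean jumping through some extra hoops to get at the data. The trick with namespaces is that you have to "enter" a namespaced branch by asking for the children in that namespace.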

<?php

error_reporting( E_ALL );
ini_set('display_errors', 1);

$str = <<<EOM
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope 
    xmlns:SOAP-ENV="http://www.w3.org/2003/05/soap-envelope"
    xmlns:SOAP-ENC="http://www.w3.org/2003/05/soap-encoding"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
    xmlns:wst="http://schemas.xmlsoap.org/ws/2004/09/transfer"
    xmlns:mex="http://schemas.xmlsoap.org/ws/2004/09/mex"
    xmlns:wsdp="http://schemas.xmlsoap.org/ws/2006/02/devprof"
    xmlns:PNPX="http://schemas.microsoft.com/windows/pnpx/2005/10"
    xmlns:UNS1="http://www.microsoft.com/windows/test/testdevice/11/2005"
    xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0"
    xmlns:wprt="http://schemas.microsoft.com/windows/2006/08/wdp/print"
    xmlns:wscn="http://schemas.microsoft.com/windows/2006/08/wdp/scan">
    <SOAP-ENV:Header>
        <wsa:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
        <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/transfer/GetResponse</wsa:Action>
        <wsa:MessageID>urn:uuid:fec6e42d-5356-1f69-9c3a-001f2927cf33</wsa:MessageID>
        <wsa:RelatesTo>urn:uuid:704ccde5-6861-415d-bd65-31dd9d7a8b98</wsa:RelatesTo>
    </SOAP-ENV:Header>
    <SOAP-ENV:Body>
        <mex:Metadata>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisDevice">
                <wsdp:ThisDevice>
                    <wsdp:FriendlyName xml:lang="en">Printer (HP Color LaserJet CM1312nfi MFP)</wsdp:FriendlyName>
                    <wsdp:FirmwareVersion>20140625</wsdp:FirmwareVersion>
                    <wsdp:SerialNumber>CNB885H665</wsdp:SerialNumber>
                </wsdp:ThisDevice>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisModel">
                <wsdp:ThisModel>
                    <wsdp:Manufacturer xml:lang="en">HP</wsdp:Manufacturer>
                    <wsdp:ManufacturerUrl>http://www.hp.com/</wsdp:ManufacturerUrl>
                    <wsdp:ModelName xml:lang="en">HP Color LaserJet CM1312nfi MFP</wsdp:ModelName>
                    <wsdp:ModelNumber>CM1312nfi MFP</wsdp:ModelNumber>
                    <wsdp:PresentationUrl>http://192.168.1.20:80/</wsdp:PresentationUrl>
                    <PNPX:DeviceCategory>Printers</PNPX:DeviceCategory>
                </wsdp:ThisModel>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/Relationship">
                <wsdp:Relationship Type="http://schemas.xmlsoap.org/ws/2006/02/devprof/host">
                    <wsdp:Hosted>
                        <wsa:EndpointReference>
                            <wsa:Address>http://192.168.1.20:3910/</wsa:Address>
                            <wsa:ReferenceProperties>
                                <UNS1:ServiceIdentifier>uri:prn</UNS1:ServiceIdentifier>
                            </wsa:ReferenceProperties>
                        </wsa:EndpointReference>
                        <wsdp:Types>wprt:PrinterServiceType</wsdp:Types>
                        <wsdp:ServiceId>uri:1cd4F16e-7c8a-a7a0-3797-00145a8827ce</wsdp:ServiceId>
                        <PNPX:CompatibleId>http://schemas.microsoft.com/windows/2006/08/wdp/print/PrinterServiceType</PNPX:CompatibleId>
                    </wsdp:Hosted>
                </wsdp:Relationship>
            </mex:MetadataSection>
        </mex:Metadata>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
EOM;

$xml = simplexml_load_string($str);

$namespaces = $xml->getNamespaces(true);

// Here we are saying that we want the Body node in the SOAP-ENV namespace
$body = $xml->children( $namespaces['SOAP-ENV'] )->Body;

// Inside that Body node, we want to get into the mex namespace
$mex = $body->children( $namespaces['mex'] );

// We want the MetadataSections that are in each of the mex namespaces
$metadataSections = $mex->Metadata->MetadataSection;

// Loop through each of the MetadataSections
foreach( $metadataSections as $meta )
{
    // Get inside the wsdp namespace
    $wsdp = $meta->children( $namespaces['wsdp'] );

    // Check if there is a Hosted node inside a Relationship node
    if( isset( $wsdp->Relationship->Hosted ) )
    {
        // Get the wsa namespace inside the Hosted node
        $wsa = $wsdp->Relationship->Hosted->children( $namespaces['wsa'] );

        // If there is an Address inside the EndpointReference node
        if( isset( $wsa->EndpointReference->Address ) )
        {
            // Then output it
            echo $wsa->EndpointReference->Address;
        }
    }
}
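
A small check can help confirm that the empty print_r() output in the question is not a parse failure. The sketch below uses a trimmed-down stand-in for the scanner response (an assumption: same shape, far fewer nodes) and compares children() with and without a namespace. print_r() and var_dump() only show children that have no namespace prefix, and here every child of the root carries the SOAP-ENV prefix.

```php
<?php
// Trimmed-down stand-in for the scanner response (assumed shape, fewer nodes).
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://www.w3.org/2003/05/soap-envelope"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
    <SOAP-ENV:Header>
        <wsa:MessageID>urn:uuid:fec6e42d-5356-1f69-9c3a-001f2927cf33</wsa:MessageID>
    </SOAP-ENV:Header>
    <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>
XML;

$xml = simplexml_load_string($sample);

// simplexml_load_string() returns false on a real parse failure;
// an object that merely prints as empty is not a failure.
var_dump($xml !== false);

// Local name of the root element, without its prefix.
echo $xml->getName(), PHP_EOL;                          // Envelope

// Children without a namespace prefix: this is all that print_r() shows.
echo count($xml->children()), PHP_EOL;                  // 0

// Children selected by prefix (second argument true means "this is a prefix").
echo count($xml->children('SOAP-ENV', true)), PHP_EOL;  // 2
```

Passing the prefix with true as the second argument is an alternative to looking up the URI through getNamespaces(true); it is shorter, but it ties the code to whatever prefixes the device happens to emit.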

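【Comments】:

  • This got me what I needed, and it is easy to extrapolate to some more complex scenarios in the future.
【Solution 2】:

As a very simple example, if you only want the wsa:Address element...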

$myxml1 = simplexml_load_string($test_1);
$myxml1->registerXPathNamespace("wsa", "http://schemas.xmlsoap.org/ws/2004/08/addressing");
echo "wsa:Address=".(string)$myxml1->xpath("//wsa:Address")[0];

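This just makes sure that the wsa namespace is registered with the document and available to the XPath expression. The XPath expression then simply says: fetch the element wsa:Address from anywhere in the document. Because xpath() returns a list of all matches (even if there is only one), [0] is used to get the first item. This outputs...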

wsa:Address=http://192.168.1.20:3910/
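
Indexing with [0] raises a notice when nothing matches, so for a device that may omit elements a guarded lookup could be safer. The following is only a sketch: firstXPathValue() is a hypothetical helper (not part of SimpleXML), and the sample XML is a shortened stand-in for the real response. It also shows that the prefix used in the XPath expression is bound to the namespace URI, so it does not have to be the same prefix the document uses.

```php
<?php
/**
 * Return the text of the first node matching an XPath expression, or null.
 * Hypothetical helper for illustration; $namespaces maps prefix => URI.
 */
function firstXPathValue(SimpleXMLElement $xml, string $expr, array $namespaces): ?string
{
    foreach ($namespaces as $prefix => $uri) {
        $xml->registerXPathNamespace($prefix, $uri);
    }

    $matches = $xml->xpath($expr);
    if (empty($matches)) {
        // Covers both an error result and an empty list of matches.
        return null;
    }

    return (string) $matches[0];
}

// Shortened stand-in for the scanner response (assumed shape).
$sample = <<<XML
<e:Envelope
    xmlns:e="http://www.w3.org/2003/05/soap-envelope"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
    <e:Body>
        <wsa:EndpointReference>
            <wsa:Address>http://192.168.1.20:3910/</wsa:Address>
        </wsa:EndpointReference>
    </e:Body>
</e:Envelope>
XML;

$xml = simplexml_load_string($sample);

// "addr" is our own prefix; only the URI has to match the document.
$ns = ['addr' => 'http://schemas.xmlsoap.org/ws/2004/08/addressing'];

var_dump(firstXPathValue($xml, '//addr:Address', $ns));
var_dump(firstXPathValue($xml, '//addr:NoSuchElement', $ns));
```

The first call should yield the address string and the second null, which the caller can test for instead of relying on [0] being present.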

If you need more data about (for example) the <wsdp:Hosted> element, you can do something like...

$myxml1 = simplexml_load_string($test_1);
$myxml1->registerXPathNamespace("wsdp", "http://schemas.xmlsoap.org/ws/2006/02/devprof");
$hosted = $myxml1->xpath("//wsdp:Hosted")[0];
$hostedWSA = $hosted->children("wsa", true);
echo "wsa:Address=".(string)$hostedWSA->EndpointReference->Address.PHP_EOL;
$hostedWSDP = $hosted->children("wsdp", true);
echo "wsdp:Types=".(string)$hostedWSDP->Types.PHP_EOL;

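So the idea is to get hold of the right element first, and then work with the various child nodes in the different namespaces inside that node.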
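
If the set of namespaces under a node is not known in advance, the same idea can be applied in a loop. This is a sketch under assumptions: childrenByNamespace() is a hypothetical helper name, the sample XML is a shortened stand-in for the real response, and only direct children are collected (container elements such as wsa:EndpointReference come back with empty or whitespace-only text).

```php
<?php
/**
 * Flatten the direct children of a node into ['prefix:Name' => 'text'],
 * visiting every namespace in $namespaces (prefix => URI).
 * Hypothetical helper for illustration only.
 */
function childrenByNamespace(SimpleXMLElement $node, array $namespaces): array
{
    $out = [];
    foreach ($namespaces as $prefix => $uri) {
        // children($uri) selects only the children in that namespace.
        foreach ($node->children($uri) as $name => $child) {
            $out[$prefix . ':' . $name] = trim((string) $child);
        }
    }
    return $out;
}

// Shortened stand-in for the scanner response (assumed shape).
$sample = <<<XML
<e:Envelope
    xmlns:e="http://www.w3.org/2003/05/soap-envelope"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
    xmlns:wsdp="http://schemas.xmlsoap.org/ws/2006/02/devprof"
    xmlns:PNPX="http://schemas.microsoft.com/windows/pnpx/2005/10">
    <e:Body>
        <wsdp:Hosted>
            <wsa:EndpointReference><wsa:Address>http://192.168.1.20:3910/</wsa:Address></wsa:EndpointReference>
            <wsdp:Types>wprt:PrinterServiceType</wsdp:Types>
            <PNPX:CompatibleId>http://schemas.microsoft.com/windows/2006/08/wdp/print/PrinterServiceType</PNPX:CompatibleId>
        </wsdp:Hosted>
    </e:Body>
</e:Envelope>
XML;

$xml = simplexml_load_string($sample);
$xml->registerXPathNamespace('wsdp', 'http://schemas.xmlsoap.org/ws/2006/02/devprof');

// Jump straight to the node of interest, then walk its children per namespace.
$hosted = $xml->xpath('//wsdp:Hosted')[0];
$fields = childrenByNamespace($hosted, $xml->getNamespaces(true));

print_r($fields);
```

The resulting array is keyed by prefixed element name, so a value can be read with, for example, $fields['wsdp:Types'].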

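【Comments】:

  • This example looks useful for "surgical" needs where a single element is required, as you pointed out.
  • The second example shows how to get to the starting point of a set of data, 'wsdp:Hosted'. Most importantly, it is independent of the structure of the enclosing data: even if the enclosing document changes, it will still work. When you walk a complex structure level by level, the structure changes over time and suddenly the code can no longer find the data.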