【Question title】: Flatten XML structure in PHP and add all the values into an array
【Posted】: 2023-11-12 04:19:01
【Question】:

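I have some browse nodes returned as XML from the Amazon API, shown below. How can I walk through this mess, flatten it, and pull out the data I need? Here is the input (a var_dump of the SimpleXMLElement):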

object(SimpleXMLElement)#72 (1) {
  ["BrowseNode"]=>
  array(2) {
    [0]=>
    object(SimpleXMLElement)#73 (3) {
      ["BrowseNodeId"]=>
      string(10) "1342630031"
      ["Name"]=>
      string(8) "Chargers"
      ["Ancestors"]=>
      object(SimpleXMLElement)#75 (1) {
        ["BrowseNode"]=>
        object(SimpleXMLElement)#76 (3) {
          ["BrowseNodeId"]=>
          string(9) "389516011"
          ["Name"]=>
          string(11) "Accessories"
          ["Ancestors"]=>
          object(SimpleXMLElement)#77 (1) {
            ["BrowseNode"]=>
            object(SimpleXMLElement)#78 (3) {
              ["BrowseNodeId"]=>
              string(9) "389514011"
              ["Name"]=>
              string(38) "Sat Nav, GPS, Navigation & Accessories"
              ["Ancestors"]=>
              object(SimpleXMLElement)#79 (1) {
                ["BrowseNode"]=>
                object(SimpleXMLElement)#80 (4) {
                  ["BrowseNodeId"]=>
                  string(6) "560800"
                  ["Name"]=>
                  string(10) "Categories"
                  ["IsCategoryRoot"]=>
                  string(1) "1"
                  ["Ancestors"]=>
                  object(SimpleXMLElement)#81 (1) {
                    ["BrowseNode"]=>
                    object(SimpleXMLElement)#82 (2) {
                      ["BrowseNodeId"]=>
                      string(6) "560798"
                      ["Name"]=>
                      string(19) "Electronics & Photo"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    [1]=>
    object(SimpleXMLElement)#74 (3) {
      ["BrowseNodeId"]=>
      string(9) "340328031"
      ["Name"]=>
      string(12) "Car Chargers"
      ["Ancestors"]=>
      object(SimpleXMLElement)#75 (1) {
        ["BrowseNode"]=>
        object(SimpleXMLElement)#76 (3) {
          ["BrowseNodeId"]=>
          string(9) "340327031"
          ["Name"]=>
          string(8) "Chargers"
          ["Ancestors"]=>
          object(SimpleXMLElement)#77 (1) {
            ["BrowseNode"]=>
            object(SimpleXMLElement)#78 (3) {
              ["BrowseNodeId"]=>
              string(6) "560826"
              ["Name"]=>
              string(11) "Accessories"
              ["Ancestors"]=>
              object(SimpleXMLElement)#79 (1) {
                ["BrowseNode"]=>
                object(SimpleXMLElement)#80 (3) {
                  ["BrowseNodeId"]=>
                  string(10) "1340509031"
                  ["Name"]=>
                  string(29) "Mobile Phones & Communication"
                  ["Ancestors"]=>
                  object(SimpleXMLElement)#81 (1) {
                    ["BrowseNode"]=>
                    object(SimpleXMLElement)#82 (4) {
                      ["BrowseNodeId"]=>
                      string(6) "560800"
                      ["Name"]=>
                      string(10) "Categories"
                      ["IsCategoryRoot"]=>
                      string(1) "1"
                      ["Ancestors"]=>
                      object(SimpleXMLElement)#83 (1) {
                        ["BrowseNode"]=>
                        object(SimpleXMLElement)#84 (2) {
                          ["BrowseNodeId"]=>
                          string(6) "560798"
                          ["Name"]=>
                          string(19) "Electronics & Photo"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

I would like to traverse it and flatten it into a structure I can work with, like this:

array(
    (1342630031,'Chargers'),
    (389516011,'Accessories'),
    (389514011,'Sat Nav, GPS, Navigation & Accessories'),
    (560800,'Categories'),
    (560798,'Electronics & Photo'),
    (340328031,'Car Chargers'),
    (340327031,'Chargers'),
    (560826,'Accessories'),
    (1340509031,'Mobile Phones & Communication'),
    (560800,'Categories'),
    (560798,'Electronics & Photo')
)

so that I can do:

echo $array[0][0];
echo $array[0][1];
echo $array[4][1];

which would give:

1342630031
Chargers
Electronics & Photo

and so on...

If it helps, here is the raw XML:

<?xml version="1.0" encoding="UTF-8"?>
<BrowseNodes>
   <BrowseNode>
      <BrowseNodeId>1342630031</BrowseNodeId>
      <Name>Chargers</Name>
      <Ancestors>
         <BrowseNode>
            <BrowseNodeId>389516011</BrowseNodeId>
            <Name>Accessories</Name>
            <Ancestors>
               <BrowseNode>
                  <BrowseNodeId>389514011</BrowseNodeId>
                  <Name>Sat Nav, GPS, Navigation &amp; Accessories</Name>
                  <Ancestors>
                     <BrowseNode>
                        <BrowseNodeId>560800</BrowseNodeId>
                        <Name>Categories</Name>
                        <IsCategoryRoot>1</IsCategoryRoot>
                        <Ancestors>
                           <BrowseNode>
                              <BrowseNodeId>560798</BrowseNodeId>
                              <Name>Electronics &amp; Photo</Name>
                           </BrowseNode>
                        </Ancestors>
                     </BrowseNode>
                  </Ancestors>
               </BrowseNode>
            </Ancestors>
         </BrowseNode>
      </Ancestors>
   </BrowseNode>
   <BrowseNode>
      <BrowseNodeId>340328031</BrowseNodeId>
      <Name>Car Chargers</Name>
      <Ancestors>
         <BrowseNode>
            <BrowseNodeId>340327031</BrowseNodeId>
            <Name>Chargers</Name>
            <Ancestors>
               <BrowseNode>
                  <BrowseNodeId>560826</BrowseNodeId>
                  <Name>Accessories</Name>
                  <Ancestors>
                     <BrowseNode>
                        <BrowseNodeId>1340509031</BrowseNodeId>
                        <Name>Mobile Phones &amp; Communication</Name>
                        <Ancestors>
                           <BrowseNode>
                              <BrowseNodeId>560800</BrowseNodeId>
                              <Name>Categories</Name>
                              <IsCategoryRoot>1</IsCategoryRoot>
                              <Ancestors>
                                 <BrowseNode>
                                    <BrowseNodeId>560798</BrowseNodeId>
                                    <Name>Electronics &amp; Photo</Name>
                                 </BrowseNode>
                              </Ancestors>
                           </BrowseNode>
                        </Ancestors>
                     </BrowseNode>
                  </Ancestors>
               </BrowseNode>
            </Ancestors>
         </BrowseNode>
      </Ancestors>
   </BrowseNode>
</BrowseNodes>

【Comments】:

Tags: php xml multidimensional-array flatten


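【Solution 1】:

XPath is the easiest way to read data out of an XML document. You use one expression to iterate over the items and a few more to extract the data of each item.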

// $xml holds the raw XML string from the question
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

$result = [];
// Iterate over every BrowseNode that has a BrowseNodeId child
foreach ($xpath->evaluate('//BrowseNode[BrowseNodeId]') as $browseNode) {
  $id = $xpath->evaluate('string(BrowseNodeId)', $browseNode);
  // Skip ids that were already collected
  if (array_key_exists($id, $result)) {
    continue;
  }
  $result[$id] = [
    'id' => $id,
    'name' => $xpath->evaluate('string(Name)', $browseNode)
  ];
}

var_dump($result);

Output:

array(9) {
  [1342630031]=>
  array(2) {
    ["id"]=>
    string(10) "1342630031"
    ["name"]=>
    string(8) "Chargers"
  }
  [389516011]=>
  array(2) {
    ["id"]=>
    string(9) "389516011"
    ["name"]=>
    string(11) "Accessories"
  }
  ...
}

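//BrowseNode[BrowseNodeId] selects every BrowseNode element in the document that has a BrowseNodeId child. string(BrowseNodeId) is evaluated in the context of that node: it selects all BrowseNodeId children and casts the first one to a string (or returns an empty string if no node is found).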

Using the id as the array key eliminates duplicates.
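
The question's desired output keeps the repeated ancestors and uses plain [id, name] pairs. A minimal sketch of that variant follows, assuming the same XPath expressions as above; the trimmed sample XML and the name `$rows` are made up for illustration.

```php
<?php
// Trimmed, hypothetical sample with the same shape as the XML in the question.
$xml = <<<'XML'
<BrowseNodes>
  <BrowseNode>
    <BrowseNodeId>1342630031</BrowseNodeId>
    <Name>Chargers</Name>
    <Ancestors>
      <BrowseNode>
        <BrowseNodeId>560798</BrowseNodeId>
        <Name>Electronics &amp; Photo</Name>
      </BrowseNode>
    </Ancestors>
  </BrowseNode>
  <BrowseNode>
    <BrowseNodeId>340328031</BrowseNodeId>
    <Name>Car Chargers</Name>
    <Ancestors>
      <BrowseNode>
        <BrowseNodeId>560798</BrowseNodeId>
        <Name>Electronics &amp; Photo</Name>
      </BrowseNode>
    </Ancestors>
  </BrowseNode>
</BrowseNodes>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

$rows = [];
// Nodes come back in document order, so each node is followed by its ancestors.
foreach ($xpath->evaluate('//BrowseNode[BrowseNodeId]') as $browseNode) {
    // No duplicate check here: every occurrence becomes its own [id, name] row.
    $rows[] = [
        $xpath->evaluate('string(BrowseNodeId)', $browseNode),
        $xpath->evaluate('string(Name)', $browseNode),
    ];
}

echo $rows[0][0], "\n"; // 1342630031
echo $rows[1][1], "\n"; // Electronics & Photo
```

The ids come back as strings; cast them with (int) if integers are needed.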

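【Comments】:

    【Solution 2】:

    This is a bit ugly, but it flattens the data into a structure I can use. It is not the output I wanted, but it may be close enough to work with.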

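    // $xml is the SimpleXMLElement from the question; round-trip it through JSON to get plain arrays
    $json = json_encode($xml);
    $array = json_decode($json, TRUE);
    
    // Visit every leaf value of the nested array in document order
    $it = new RecursiveIteratorIterator(new RecursiveArrayIterator($array));
    
    $values = [];
    foreach ($it as $v) {
        $values[] = $v;
    }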
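
The flat list above mixes ids, names, and IsCategoryRoot flags, so it cannot simply be chunked into pairs. One possible way to get [id, name] pairs from the same decoded array is a small recursive walk. This is only a sketch: the function name `collectBrowseNodes`, the variable `$rows`, and the trimmed sample XML are hypothetical.

```php
<?php
// Trimmed, hypothetical sample with the same shape as the XML in the question.
$source = <<<'XML'
<BrowseNodes>
  <BrowseNode>
    <BrowseNodeId>1342630031</BrowseNodeId>
    <Name>Chargers</Name>
    <Ancestors>
      <BrowseNode>
        <BrowseNodeId>560800</BrowseNodeId>
        <Name>Categories</Name>
        <IsCategoryRoot>1</IsCategoryRoot>
      </BrowseNode>
    </Ancestors>
  </BrowseNode>
  <BrowseNode>
    <BrowseNodeId>340328031</BrowseNodeId>
    <Name>Car Chargers</Name>
  </BrowseNode>
</BrowseNodes>
XML;

// Same JSON round trip as in the answer above.
$array = json_decode(json_encode(simplexml_load_string($source)), true);

/**
 * Appends [id, name] for a decoded BrowseNode, then follows its Ancestors chain.
 */
function collectBrowseNodes(array $node, array &$rows): void
{
    // A repeated element decodes to a list, a single element to an associative array.
    if (!isset($node['BrowseNodeId'])) {
        foreach ($node as $child) {
            collectBrowseNodes($child, $rows);
        }
        return;
    }
    $rows[] = [$node['BrowseNodeId'], $node['Name']];
    if (isset($node['Ancestors']['BrowseNode'])) {
        collectBrowseNodes($node['Ancestors']['BrowseNode'], $rows);
    }
}

$rows = [];
collectBrowseNodes($array['BrowseNode'], $rows);

echo $rows[1][1], "\n"; // Categories
```

Other fields such as IsCategoryRoot are ignored because only BrowseNodeId and Name are read from each node.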
    

    【Comments】:

      【Solution 3】:
      $DOM = new DOMDocument();
      // Parse as XML so element names keep their case (the XPath queries below are case-sensitive)
      $DOM->loadXML($xml);
      
      $XPATH = new DOMXPath($DOM);
      
      // Gets all BrowseNodeId elements anywhere within the document
      $r = $XPATH->query("//BrowseNodeId");
      
      // Gets only BrowseNodeIds that are directly below a BrowseNode that is itself directly below BrowseNodes
      $r = $XPATH->query("/BrowseNodes/BrowseNode/BrowseNodeId");
      

      You will probably want the first XPath query, which gets all the BrowseNodeId elements.

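      $r = $XPATH->query("//BrowseNodeId");
      
      foreach ($r as $idElement) { // $idElement will be a DOMElement object
           // Walk the following siblings until the matching Name element is found
           for ($node = $idElement->nextSibling; $node !== null; $node = $node->nextSibling) {
                // Skip whitespace text nodes, which have no tagName
                if ($node instanceof DOMElement && "Name" == $node->tagName) {
                      echo "The ID for " . $node->nodeValue . " is " . $idElement->nodeValue . "\n";
                      break;
                }
           }
      }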
      

      This should at least give you a starting point.

      It is untested.
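
The sibling lookup can also be left to XPath itself. Below is a minimal sketch, assuming each BrowseNodeId is followed by a Name element under the same parent; the short sample XML and the variable names are made up for illustration.

```php
<?php
// Short, hypothetical sample with the same shape as the XML in the question.
$source = <<<'XML'
<BrowseNodes>
  <BrowseNode>
    <BrowseNodeId>340328031</BrowseNodeId>
    <Name>Car Chargers</Name>
    <Ancestors>
      <BrowseNode>
        <BrowseNodeId>340327031</BrowseNodeId>
        <Name>Chargers</Name>
      </BrowseNode>
    </Ancestors>
  </BrowseNode>
</BrowseNodes>
XML;

$dom = new DOMDocument();
$dom->loadXML($source);
$xpath = new DOMXPath($dom);

$rows = [];
foreach ($xpath->query('//BrowseNodeId') as $idNode) {
    // following-sibling::Name[1] is the first Name element after this id
    // under the same parent; string() yields '' if there is none.
    $name = $xpath->evaluate('string(following-sibling::Name[1])', $idNode);
    $rows[] = [$idNode->nodeValue, $name];
}

foreach ($rows as [$id, $name]) {
    echo "The ID for $name is $id\n";
}
```

Whitespace text nodes between the elements do not matter here, because the following-sibling::Name step only matches elements named Name.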

        【Comments】:

        【Solution 4】:

        Consider using XSLT to flatten the source XML, then loop over the result to populate your array:

        // Load the XML source and XSLT string
        $doc = simplexml_load_file('Input.xml');
        
        $xslstr = '<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
                     <xsl:output version="1.0" encoding="UTF-8" indent="yes" />
                     <xsl:strip-space elements="*"/>      
                     <xsl:template match="/BrowseNodes">
                        <xsl:copy>            
                           <xsl:apply-templates select="descendant::BrowseNodeId"/>
                        </xsl:copy>
                     </xsl:template>      
                     <xsl:template match="BrowseNodeId">
                        <data>            
                            <xsl:copy-of select="."/>
                            <xsl:copy-of select="following-sibling::Name"/>
                        </data>
                    </xsl:template>  
                  </xsl:transform>';
        $xsl = new SimpleXMLElement($xslstr);
        
        // Configure and run the transformer
        $proc = new XSLTProcessor;
        $proc->importStyleSheet($xsl); 
        $newXML = $proc->transformToXML($doc);
        
        // Populate flattened array
        $output = new SimpleXMLElement($newXML);
        
        $values = [];
        foreach ($output->data as $line){
            $inner = [];
            $inner[] = (string)$line->BrowseNodeId;
            $inner[] = (string)$line->Name;
            $values[] = $inner;
        }
        

        New XML

        <?xml version="1.0" encoding="UTF-8"?>
        <BrowseNodes>
          <data>
            <BrowseNodeId>1342630031</BrowseNodeId>
            <Name>Chargers</Name>
          </data>
          <data>
            <BrowseNodeId>389516011</BrowseNodeId>
            <Name>Accessories</Name>
          </data>
          <data>
            <BrowseNodeId>389514011</BrowseNodeId>
            <Name>Sat Nav, GPS, Navigation &amp; Accessories</Name>
          </data>
          <data>
            <BrowseNodeId>560800</BrowseNodeId>
            <Name>Categories</Name>
          </data>
          <data>
            <BrowseNodeId>560798</BrowseNodeId>
            <Name>Electronics &amp; Photo</Name>
          </data>
          <data>
            <BrowseNodeId>340328031</BrowseNodeId>
            <Name>Car Chargers</Name>
          </data>
          <data>
            <BrowseNodeId>340327031</BrowseNodeId>
            <Name>Chargers</Name>
          </data>
          <data>
            <BrowseNodeId>560826</BrowseNodeId>
            <Name>Accessories</Name>
          </data>
          <data>
            <BrowseNodeId>1340509031</BrowseNodeId>
            <Name>Mobile Phones &amp; Communication</Name>
          </data>
          <data>
            <BrowseNodeId>560800</BrowseNodeId>
            <Name>Categories</Name>
          </data>
          <data>
            <BrowseNodeId>560798</BrowseNodeId>
            <Name>Electronics &amp; Photo</Name>
          </data>
        </BrowseNodes>
        

        Values array

        array(11) {
          [0]=>
          array(2) {
            [0]=>
            string(10) "1342630031"
            [1]=>
            string(8) "Chargers"
          }
          [1]=>
          array(2) {
            [0]=>
            string(9) "389516011"
            [1]=>
            string(11) "Accessories"
          }
          [2]=>
          array(2) {
            [0]=>
            string(9) "389514011"
            [1]=>
            string(38) "Sat Nav, GPS, Navigation & Accessories"
          }
          [3]=>
          array(2) {
            [0]=>
            string(6) "560800"
            [1]=>
            string(10) "Categories"
          }
          [4]=>
          array(2) {
            [0]=>
            string(6) "560798"
            [1]=>
            string(19) "Electronics & Photo"
          }
          [5]=>
          array(2) {
            [0]=>
            string(9) "340328031"
            [1]=>
            string(12) "Car Chargers"
          }
          [6]=>
          array(2) {
            [0]=>
            string(9) "340327031"
            [1]=>
            string(8) "Chargers"
          }
          [7]=>
          array(2) {
            [0]=>
            string(6) "560826"
            [1]=>
            string(11) "Accessories"
          }
          [8]=>
          array(2) {
            [0]=>
            string(10) "1340509031"
            [1]=>
            string(29) "Mobile Phones & Communication"
          }
          [9]=>
          array(2) {
            [0]=>
            string(6) "560800"
            [1]=>
            string(10) "Categories"
          }
          [10]=>
          array(2) {
            [0]=>
            string(6) "560798"
            [1]=>
            string(19) "Electronics & Photo"
          }
        }
        

        【Comments】:
