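【Question title】: PHP - XML import with SimpleXML
【Posted】: 2015-09-23 18:38:42
【Question】:

I want to set up a script that imports an XML file into my database. My problem is that I don't know how to write the import in a smart way so that the PHP script recognizes every piece of child information. Can anyone help me?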

<books>
   <book attribute="123" attribute2="12345">
    <basic_information>
        <name addition="fooobar">fooobar</name>
        <book_genre>
            <genre>Action</genre>
            <genre>Thriller</genre>
        </book_genre>
        <languages>
            <language>Deutsch</language>
            <language>Englisch</language>
            <language>Polnisch</language>
            <language>Russisch</language>
        </languages>
    </basic_information>
    <author_information>
        <name addition="fooabr">Mr_Ed</name>
    </author_information>
</book>
<book attribute="123" attribute2="12345">
    <basic_information>
        <name addition="fooobar">fooobar</name>

        <genres>
            <genre>Action</genre>
            <genre>Thriller</genre>
        </genres>
        <languages>
            <language>Deutsch</language>
            <language>Englisch</language>
            <language>Polnisch</language>
            <language>Russisch</language>
        </languages>
    </basic_information>
    <author_information>
        <name addition="fooabr">Mr_Ed</name>
    </author_information>
</book>
</books>

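【Comments】:

  • <genre1>Action</genre1><genre2>Thriller</genre2> Seriously?!? Who numbers elements like that in XML?
  • It was only an example, sorry about that. I have changed it.
  • I don't know how to open each element and import or display it.
  • See my answer below.
  • Your question is two-fold. You should also provide the structure of your database here. Otherwise, I strongly suggest using an XML-based database; such databases do exist. There is also a similar question here that may give you some ideas: Get XML tags from asXML()

Tags: php xml database import simplexml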


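【Answer 1】:

Although every XML file can represent a database on its own, there are usually two fundamental differences between XML and relational SQL databases.

The most obvious one is the schema. The XML you provided in your question has no schema at all. An SQL database has a schema by definition.

Not only does your XML lack a schema, you have also not shared any information about what it means. So the smartest thing to do is to ignore any schema here entirely.

So here is an example of how the XML in your question could be turned into a database table. You can create a table with two columns, path and value, and then decide to put all attributes and leaf text nodes into it: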

+-------------------------------------------------------------+--------+
|path                                                         |value   |
+-------------------------------------------------------------+--------+
|/books/book[1]/@attribute                                    |123     |
+-------------------------------------------------------------+--------+
|/books/book[1]/@attribute2                                   |12345   |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/name/@addition              |fooobar |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/name/text()                 |fooobar |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/book_genre/genre[1]/text()  |Action  |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/book_genre/genre[2]/text()  |Thriller|
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/languages/language[1]/text()|Deutsch |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/languages/language[2]/text()|Englisch|
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/languages/language[3]/text()|Polnisch|
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/languages/language[4]/text()|Russisch|
+-------------------------------------------------------------+--------+
|/books/book[1]/author_information/name/@addition             |fooabr  |
+-------------------------------------------------------------+--------+
|/books/book[1]/author_information/name/text()                |Mr_Ed   |
+-------------------------------------------------------------+--------+
|/books/book[2]/@attribute                                    |123     |
+-------------------------------------------------------------+--------+
|/books/book[2]/@attribute2                                   |12345   |
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/name/@addition              |fooobar |
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/name/text()                 |fooobar |
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/genres/genre[1]/text()      |Action  |
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/genres/genre[2]/text()      |Thriller|
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/languages/language[1]/text()|Deutsch |
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/languages/language[2]/text()|Englisch|
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/languages/language[3]/text()|Polnisch|
+-------------------------------------------------------------+--------+
|/books/book[2]/basic_information/languages/language[4]/text()|Russisch|
+-------------------------------------------------------------+--------+
|/books/book[2]/author_information/name/@addition             |fooabr  |
+-------------------------------------------------------------+--------+
|/books/book[2]/author_information/name/text()                |Mr_Ed   |
+-------------------------------------------------------------+--------+ 

Creating such a conversion is straightforward with an XML parser that supports XPath queries (such as the DOM extension in PHP):

$doc    = new DOMDocument();
$result = $doc->loadXML($buffer); // $buffer holds the XML document as a string
if (!$result) {
    throw new UnexpectedValueException('Could not load XML');
}
$xpath = new DOMXPath($doc);

$nodes = $xpath->query('(//@*|(.|.//*)[not(*)]/text())');

$table = [['path', 'value']];

foreach ($nodes as $node) {
    /** @var DOMNode $node */
    $path    = $node->getNodePath();
    $value   = $node->nodeValue;
    $table[] = [$path, $value];
}

echo new TextTable($table);
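`TextTable` is not part of PHP, and the answer does not show its source. A minimal sketch of what such a helper might look like follows, assuming it takes rows of scalar cells with the header as the first row. The class body is an assumption made for illustration, not the author's implementation.

```php
<?php
// Hypothetical stand-in for the TextTable helper used above (not part of PHP).
final class TextTable
{
    /** @var array<int, array<int, string>> */
    private array $rows = [];

    public function __construct(iterable $rows)
    {
        foreach ($rows as $row) {
            $this->rows[] = array_map('strval', array_values($row));
        }
    }

    public function __toString(): string
    {
        if ($this->rows === []) {
            return '';
        }
        // The widest cell in each column decides the column width (in bytes).
        $widths = [];
        foreach ($this->rows as $row) {
            foreach ($row as $i => $cell) {
                $widths[$i] = max($widths[$i] ?? 0, strlen($cell));
            }
        }
        $rule = '+';
        foreach ($widths as $width) {
            $rule .= str_repeat('-', $width) . '+';
        }
        $out = $rule . "\n";
        foreach ($this->rows as $row) {
            $line = '|';
            foreach ($widths as $i => $width) {
                $line .= str_pad($row[$i] ?? '', $width) . '|';
            }
            $out .= $line . "\n" . $rule . "\n";
        }
        return $out;
    }
}

echo new TextTable([['path', 'value'], ['/books/book[1]/@attribute', '123']]);
```

The sketch measures width with `strlen()`, so multibyte values would be padded by byte count rather than by display width.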

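But such data is not yet normalized. There are obviously duplicate values, and they look like an easy first target for more normalization. For example, with a store that keeps track of value identity: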

$values = new IdentityStore('value');
$table = [['path', $values->getKey()]];

foreach ($nodes as $node) {
    /** @var DOMNode $node */
    $path  = $node->getNodePath();
    $value = $values->add($node->nodeValue);

    $table[] = [$path, $value];
}

echo new TextTable($table);
echo new TextTable($values);

The values are then replaced by their IDs:

+-------------------------------------------------------------+--------+
|path                                                         |value_id|
+-------------------------------------------------------------+--------+
|/books/book[1]/@attribute                                    |1       |
+-------------------------------------------------------------+--------+
|/books/book[1]/@attribute2                                   |2       |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/name/@addition              |3       |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/name/text()                 |3       |
+-------------------------------------------------------------+--------+
|/books/book[1]/basic_information/book_genre/genre[1]/text()  |4       |
+-------------------------------------------------------------+--------+
...

and are given their own table of values:

+--------+--------+
|value_id|value   |
+--------+--------+
|1       |123     |
+--------+--------+
|2       |12345   |
+--------+--------+
|3       |fooobar |
+--------+--------+
|4       |Action  |
+--------+--------+
|5       |Thriller|
+--------+--------+
|6       |Deutsch |
+--------+--------+
|7       |Englisch|
+--------+--------+
|8       |Polnisch|
+--------+--------+
|9       |Russisch|
+--------+--------+
|10      |fooabr  |
+--------+--------+
|11      |Mr_Ed   |
+--------+--------+
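`IdentityStore` is likewise a helper whose source the answer does not include. One way it might work, sketched under the assumption that it only needs to hand out a stable 1-based ID per distinct value and to be iterable as table rows:

```php
<?php
// Hypothetical IdentityStore: gives each distinct value a stable 1-based ID.
final class IdentityStore implements IteratorAggregate
{
    private string $name;
    /** @var array<int|string, int> value => id */
    private array $ids = [];

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    // Column name for the referencing table, e.g. "value_id".
    public function getKey(): string
    {
        return $this->name . '_id';
    }

    // Returns the ID of a known value, or assigns the next free one.
    public function add(string $value): int
    {
        if (!isset($this->ids[$value])) {
            $this->ids[$value] = count($this->ids) + 1;
        }
        return $this->ids[$value];
    }

    // Yields a header row followed by one [id, value] row per stored value.
    public function getIterator(): Iterator
    {
        $rows = [[$this->getKey(), $this->name]];
        foreach ($this->ids as $value => $id) {
            $rows[] = [$id, (string) $value]; // numeric keys come back as int
        }
        return new ArrayIterator($rows);
    }
}

$values = new IdentityStore('value');
$ids = [];
foreach (['123', '12345', 'fooobar', 'fooobar', 'Action'] as $value) {
    $ids[] = $values->add($value);
}
print_r($ids);
```

The repeated `fooobar` receives the same ID both times, which is what collapses the duplicates in the tables above.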

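On its own this does not look very helpful. Even though the values are now normalized, it may be more interesting to look at how the paths, rather than the values, can be mapped.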

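The path encodes table names. Each pair of square brackets denotes a record in a table, and that table is named by the path leading up to the brackets. If the table sits inside a record of another table named by a path prefix, that constitutes a relation.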
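To make that reading of a path concrete, here is a small sketch that splits a node path at its last bracketed step. The function name `splitNodePath` and the returned array keys are assumptions for illustration; this is not the `PathTables` class used in the answer.

```php
<?php
// Hypothetical helper: split a node path at its last "[n]" step.
// Returns parent record path, table name, record number and column,
// or null when the path contains no "[n]" at all.
function splitNodePath(string $path): ?array
{
    // The greedy ".*" makes the match settle on the LAST bracketed index.
    if (!preg_match('~^(.*)\[(\d+)\](?:/(.*))?$~', $path, $m)) {
        return null;
    }
    // Everything up to the previous "]" identifies the parent record.
    $pos    = strrpos($m[1], ']');
    $parent = $pos === false ? null : substr($m[1], 0, $pos + 1);
    $local  = $pos === false ? $m[1] : substr($m[1], $pos + 1);

    return [
        'parent' => $parent,
        'table'  => str_replace('/', '_', trim($local, '/')),
        'record' => (int) $m[2],
        'column' => $m[3] ?? null,
    ];
}

print_r(splitNodePath('/books/book[1]/basic_information/languages/language[3]/text()'));
print_r(splitNodePath('/books/book[2]/@attribute2'));
```

For the first path this yields the table `basic_information_languages_language`, record 3, with `/books/book[1]` as the parent record. One limitation to keep in mind: `getNodePath()` only adds an index when an element has same-named siblings, so a document with a single `<book>` would produce paths without `[1]`.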

So this could be an interesting approach as well:

$tables = new PathTables();
foreach ($nodes as $node) {
    /** @var DOMNode $node */
    $path = $node->getNodePath();
    $tables->add($path, $node->nodeValue);
}
echo $tables;

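However, the values are not normalized here, and without a schema there is no way to know whether values should be grouped. Note the comma-separated values, which show the shortcoming: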

===  books_book  ===

+-------+----------+-----------+--------------------------------+-----------------------------+-------------------------------------------+------------------------------------------------+---------------------------------+------------------------------+---------------------------------------+
|book_id|@attribute|@attribute2|basic_information/name/@addition|basic_information/name/text()|basic_information_book_genre_genre.genre_id|basic_information_languages_language.language_id|author_information/name/@addition|author_information/name/text()|basic_information_genres_genre.genre_id|
+-------+----------+-----------+--------------------------------+-----------------------------+-------------------------------------------+------------------------------------------------+---------------------------------+------------------------------+---------------------------------------+
|1      |123       |12345      |fooobar                         |fooobar                      |1,2                                        |1,2,3,4                                         |fooabr                           |Mr_Ed                         |                                       |
+-------+----------+-----------+--------------------------------+-----------------------------+-------------------------------------------+------------------------------------------------+---------------------------------+------------------------------+---------------------------------------+
|2      |123       |12345      |fooobar                         |fooobar                      |                                           |1,2,3,4                                         |fooabr                           |Mr_Ed                         |1,2                                    |
+-------+----------+-----------+--------------------------------+-----------------------------+-------------------------------------------+------------------------------------------------+---------------------------------+------------------------------+---------------------------------------+

===  basic_information_book_genre_genre  ===

+--------+--------+
|genre_id|text()  |
+--------+--------+
|1       |Action  |
+--------+--------+
|2       |Thriller|
+--------+--------+

===  basic_information_languages_language  ===

+-----------+-----------------+
|language_id|text()           |
+-----------+-----------------+
|1          |Deutsch,Deutsch  |
+-----------+-----------------+
|2          |Englisch,Englisch|
+-----------+-----------------+
|3          |Polnisch,Polnisch|
+-----------+-----------------+
|4          |Russisch,Russisch|
+-----------+-----------------+

===  basic_information_genres_genre  ===

+--------+--------+
|genre_id|text()  |
+--------+--------+
|1       |Action  |
+--------+--------+
|2       |Thriller|
+--------+--------+

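So whichever way you go, you run into the missing schema. With a schema for both the XML document and the SQL database, you can easily map between the two using XPath expressions that define the mapping.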
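As an illustration of such a mapping, the sketch below assumes the structure of a `<book>` is known and describes each target column by an XPath expression relative to the book. The column names `attribute`, `title` and `author` are assumptions, since the question does not show the database structure.

```php
<?php
$xml = <<<'XML'
<books>
  <book attribute="123" attribute2="12345">
    <basic_information>
      <name addition="fooobar">fooobar</name>
    </basic_information>
    <author_information>
      <name addition="fooabr">Mr_Ed</name>
    </author_information>
  </book>
</books>
XML;

// Assumed mapping: column name => XPath expression relative to one <book>.
$mapping = [
    'attribute' => 'string(@attribute)',
    'title'     => 'string(basic_information/name)',
    'author'    => 'string(author_information/name)',
];

$doc = new DOMDocument();
if (!$doc->loadXML($xml)) {
    throw new UnexpectedValueException('Could not load XML');
}
$xpath = new DOMXPath($doc);

$rows = [];
foreach ($xpath->query('/books/book') as $book) {
    $row = [];
    foreach ($mapping as $column => $expression) {
        // evaluate() returns a plain string for string(...) expressions.
        $row[$column] = $xpath->evaluate($expression, $book);
    }
    $rows[] = $row;
}

print_r($rows);
```

Each entry in `$rows` then corresponds to one record of a book table.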

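But without one, it gets overly complicated. Changes in the XML will change your SQL schema. Conversion errors may go unnoticed, so the only straightforward way is to map XPath paths to values.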

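Of course it would be interesting to see how this could be normalized further in a useful way, but I would say that is more a topic for a computer science course than for a Q&A site. Two further resources are worth looking up: one focused on database technology, the other on mapping XML to SQL structures while streaming.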

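【Comments】:

    【Answer 2】:

    Once the XML data has been loaded, each child element can be reached through property access of the form $element->fieldname, and attributes through array access.

    Try this: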

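    $books = simplexml_load_file("xmlfile.xml"); // returns the root <books> element
    if ($books === false) {
        throw new UnexpectedValueException('Could not load XML');
    }

    foreach ($books->book as $book) {
        // Attributes use array syntax; cast to string to get plain values.
        $attribute       = (string) $book['attribute'];  // "123"
        $attribute2      = (string) $book['attribute2']; // "12345"
        $name            = (string) $book->basic_information->name;              // "fooobar"
        $name_addition   = (string) $book->basic_information->name['addition'];  // "fooobar"
        $author          = (string) $book->author_information->name;             // "Mr_Ed"
        $author_addition = (string) $book->author_information->name['addition']; // "fooabr"

        // Repeated children become arrays of strings. The sample uses <book_genre>
        // in the first book and <genres> in the second, so both are accepted.
        $genres    = array_map('strval', $book->xpath('basic_information/*[self::book_genre or self::genres]/genre'));
        $languages = array_map('strval', $book->xpath('basic_information/languages/language'));

        //...
    }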
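To get from that loop to the database, the values can be written with prepared statements. The following is only a sketch: it assumes PDO with the SQLite driver and an in-memory database, and the tables `book` and `book_language` are hypothetical because the question does not give the database structure.

```php
<?php
$xml = <<<'XML'
<books>
  <book attribute="123" attribute2="12345">
    <basic_information>
      <name addition="fooobar">fooobar</name>
      <languages>
        <language>Deutsch</language>
        <language>Englisch</language>
      </languages>
    </basic_information>
    <author_information>
      <name addition="fooabr">Mr_Ed</name>
    </author_information>
  </book>
</books>
XML;

$books = simplexml_load_string($xml);
if ($books === false) {
    throw new UnexpectedValueException('Could not load XML');
}

// In-memory SQLite keeps the sketch self-contained; swap the DSN for a real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)');
$pdo->exec('CREATE TABLE book_language (book_id INTEGER, language TEXT)');

$insertBook     = $pdo->prepare('INSERT INTO book (title, author) VALUES (?, ?)');
$insertLanguage = $pdo->prepare('INSERT INTO book_language (book_id, language) VALUES (?, ?)');

// One transaction for the whole import, so a failure leaves no partial data.
$pdo->beginTransaction();
foreach ($books->book as $book) {
    $insertBook->execute([
        (string) $book->basic_information->name,
        (string) $book->author_information->name,
    ]);
    $bookId = (int) $pdo->lastInsertId();
    foreach ($book->basic_information->languages->language as $language) {
        $insertLanguage->execute([$bookId, (string) $language]);
    }
}
$pdo->commit();

$count = (int) $pdo->query('SELECT COUNT(*) FROM book_language')->fetchColumn();
echo $count, " language rows imported\n";
```

Keeping languages in a separate table with a `book_id` column avoids storing comma-separated lists in a single field.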
    

    【Comments】:
