好的,我已经编写了扩展 Zend Framework DB 表、行和行集类的 PHP 类。无论如何,我一直在开发这个,因为几周后我将在PHP Tek-X 发表关于分层数据模型的演讲。
我不想将我的所有代码发布到 Stack Overflow,因为如果我这样做,它们会隐含地获得知识共享许可。 更新:我将我的代码提交给Zend Framework extras incubator,我的演示文稿是Models for Hierarchical Data with SQL and PHP at slideshare。
我将用伪代码描述解决方案。我使用动物分类学作为测试数据,从ITIS.gov 下载。表是longnames:
CREATE TABLE `longnames` (
`tsn` int(11) NOT NULL,
`completename` varchar(164) NOT NULL,
PRIMARY KEY (`tsn`),
KEY `tsn` (`tsn`,`completename`)
)
我为分类层次结构中的路径创建了一个闭包表:
CREATE TABLE `closure` (
`a` int(11) NOT NULL DEFAULT '0', -- ancestor
`d` int(11) NOT NULL DEFAULT '0', -- descendant
`l` tinyint(3) unsigned NOT NULL, -- levels between a and d
PRIMARY KEY (`a`,`d`),
CONSTRAINT `closure_ibfk_1` FOREIGN KEY (`a`) REFERENCES `longnames` (`tsn`),
CONSTRAINT `closure_ibfk_2` FOREIGN KEY (`d`) REFERENCES `longnames` (`tsn`)
)
给定一个节点的主键,你可以这样得到它的所有后代:
SELECT d.*, p.a AS `_parent`
FROM longnames AS a
JOIN closure AS c ON (c.a = a.tsn)
JOIN longnames AS d ON (c.d = d.tsn)
LEFT OUTER JOIN closure AS p ON (p.d = d.tsn AND p.l = 1)
WHERE a.tsn = ? AND c.l <= ?
ORDER BY c.l;
closure AS p 的连接是包含每个节点的父 ID。
查询很好地利用了索引:
+----+-------------+-------+--------+---------------+---------+---------+----------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+-----------------------------+
| 1 | SIMPLE | a | const | PRIMARY,tsn | PRIMARY | 4 | const | 1 | Using index; Using filesort |
| 1 | SIMPLE | c | ref | PRIMARY,d | PRIMARY | 4 | const | 5346 | Using where |
| 1 | SIMPLE | d | eq_ref | PRIMARY,tsn | PRIMARY | 4 | itis.c.d | 1 | |
| 1 | SIMPLE | p | ref | d | d | 4 | itis.c.d | 3 | |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+-----------------------------+
鉴于我在 longnames 中有 490,032 行,在 closure 中有 4,299,883 行,它运行得非常好:
+--------------------+----------+
| Status | Duration |
+--------------------+----------+
| starting | 0.000257 |
| Opening tables | 0.000028 |
| System lock | 0.000009 |
| Table lock | 0.000013 |
| init | 0.000048 |
| optimizing | 0.000032 |
| statistics | 0.000142 |
| preparing | 0.000048 |
| executing | 0.000008 |
| Sorting result | 0.034102 |
| Sending data | 0.001300 |
| end | 0.000018 |
| query end | 0.000005 |
| freeing items | 0.012191 |
| logging slow query | 0.000008 |
| cleaning up | 0.000007 |
+--------------------+----------+
现在我对上面的 SQL 查询结果进行后处理,根据层次结构将行排序为子集(伪代码):
while ($rowData = fetch()) {
$row = new RowObject($rowData);
$nodes[$row["tsn"]] = $row;
if (array_key_exists($row["_parent"], $nodes)) {
$nodes[$row["_parent"]]->addChildRow($row);
} else {
$top = $row;
}
}
return $top;
我还为行和行集定义了类。 Rowset 基本上是一个行数组。 Row 包含行数据的关联数组,还包含其子项的 Rowset。叶节点的子行集为空。
Rows 和 Rowsets 还定义了称为 toArrayDeep() 的方法,它们将其数据内容递归地转储为普通数组。
然后我可以像这样一起使用整个系统:
// Get an instance of the taxonomy table data gateway
$tax = new Taxonomy();
// query tree starting at Rodentia (id 180130), to a depth of 2
$tree = $tax->fetchTree(180130, 2);
// dump out the array
var_export($tree->toArrayDeep());
输出如下:
array (
'tsn' => '180130',
'completename' => 'Rodentia',
'_parent' => '179925',
'_children' =>
array (
0 =>
array (
'tsn' => '584569',
'completename' => 'Hystricognatha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '552299',
'completename' => 'Hystricognathi',
'_parent' => '584569',
),
),
),
1 =>
array (
'tsn' => '180134',
'completename' => 'Sciuromorpha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '180210',
'completename' => 'Castoridae',
'_parent' => '180134',
),
1 =>
array (
'tsn' => '180135',
'completename' => 'Sciuridae',
'_parent' => '180134',
),
2 =>
array (
'tsn' => '180131',
'completename' => 'Aplodontiidae',
'_parent' => '180134',
),
),
),
2 =>
array (
'tsn' => '573166',
'completename' => 'Anomaluromorpha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '573168',
'completename' => 'Anomaluridae',
'_parent' => '573166',
),
1 =>
array (
'tsn' => '573169',
'completename' => 'Pedetidae',
'_parent' => '573166',
),
),
),
3 =>
array (
'tsn' => '180273',
'completename' => 'Myomorpha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '180399',
'completename' => 'Dipodidae',
'_parent' => '180273',
),
1 =>
array (
'tsn' => '180360',
'completename' => 'Muridae',
'_parent' => '180273',
),
2 =>
array (
'tsn' => '180231',
'completename' => 'Heteromyidae',
'_parent' => '180273',
),
3 =>
array (
'tsn' => '180213',
'completename' => 'Geomyidae',
'_parent' => '180273',
),
4 =>
array (
'tsn' => '584940',
'completename' => 'Myoxidae',
'_parent' => '180273',
),
),
),
4 =>
array (
'tsn' => '573167',
'completename' => 'Sciuravida',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '573170',
'completename' => 'Ctenodactylidae',
'_parent' => '573167',
),
),
),
),
)
关于计算深度的评论 - 或每条路径的实际长度。
假设您刚刚在包含实际节点的表中插入了一个新节点(上例中的longnames),新节点的 id 由 MySQL 中的LAST_INSERT_ID() 返回,否则您可以获得它不知何故。
INSERT INTO Closure (a, d, l)
SELECT a, LAST_INSERT_ID(), l+1 FROM Closure
WHERE d = 5 -- the intended parent of your new node
UNION ALL SELECT LAST_INSERT_ID(), LAST_INSERT_ID(), 0;