【Posted】:2018-04-02 07:56:30
【Question】:
I am working on family trees (genealogy).
I adapted Bob Horton's sqldf example from https://www.r-bloggers.com/exploring-recursive-ctes-with-sqldf/
My data:
person            father
Guillou Arthur    NA
Cleach Marc       NA
Guillou Eric      Guillou Arthur
Guillou Jacques   Guillou Arthur
Cleach Franck     Cleach Marc
Cleach Leo        Cleach Marc
Cleach Herbet     Cleach Leo
Cleach Adele      Cleach Herbet
Guillou Jean      Guillou Eric
Guillou Alan      Guillou Eric
My result: the descendants of "Guillou Arthur" (a top-level person with no father), sorted by level:
name              parent_name      level
Guillou Arthur    NA               1
Guillou Eric      Guillou Arthur   2
Guillou Jacques   Guillou Arthur   2
Guillou Alan      Guillou Eric     3
Guillou Jean      Guillou Eric     3
You can build this table with a recursive sqldf query.
Data:
person <- c("Guillou Arthur",
"Cleach Marc",
"Guillou Eric",
"Guillou Jacques",
"Cleach Franck",
"Cleach Leo",
"Cleach Herbet",
"Cleach Adele",
"Guillou Jean",
"Guillou Alan" )
father <- c(NA, NA, "Guillou Arthur" , "Guillou Arthur", "Cleach Marc", "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric", "Guillou Eric")
family <- data.frame(person, father)
Wide-to-long format conversion:
library(tidyr)
long_family <- gather(family, parent, parent_name, -person)
long_family
Recursive query that finds the descendants of "Guillou Arthur" (a top-level person with no father):
library(sqldf)
descendants_sql <- "
WITH RECURSIVE descendants (name, parent_name, level) AS (
  SELECT person, parent_name, 1 FROM long_family
    WHERE person = '%s'
    AND parent = '%s'
  UNION ALL
  SELECT F.person, F.parent_name, D.level + 1
    FROM descendants D
    JOIN long_family F
    ON F.parent_name = D.name)
SELECT * FROM descendants ORDER BY level, name
"
fam <- sqldf(sprintf(descendants_sql, 'Guillou Arthur', 'father'))
fam
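My question:
How can I build, directly in R (rather than in SQL), a data.frame object that contains all of the family trees?
Each tree starts with a patriarch (a person with no father), such as "Cleach Marc". Either a pure R approach or an sqldf approach would be fine.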
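To make the goal concrete, here is a minimal base R sketch of one possible approach. It is not taken from the original post. It starts from every person whose father is NA and walks down one generation at a time. The helper name build_trees and the extra patriarch column are hypothetical names chosen for illustration, and the sketch assumes that person names are unique and that the data contains no cycles.

```r
# Same data as in the question.
person <- c("Guillou Arthur", "Cleach Marc", "Guillou Eric", "Guillou Jacques",
            "Cleach Franck", "Cleach Leo", "Cleach Herbet", "Cleach Adele",
            "Guillou Jean", "Guillou Alan")
father <- c(NA, NA, "Guillou Arthur", "Guillou Arthur", "Cleach Marc",
            "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric",
            "Guillou Eric")
family <- data.frame(person, father, stringsAsFactors = FALSE)

# build_trees() is a hypothetical helper, not part of any package.
# Assumptions: person names are unique and nobody is their own ancestor.
build_trees <- function(family) {
  # Level 1: patriarchs, i.e. people with no recorded father.
  roots <- family$person[is.na(family$father)]
  current <- data.frame(
    name        = roots,
    parent_name = rep(NA_character_, length(roots)),
    level       = rep(1L, length(roots)),
    patriarch   = roots,
    stringsAsFactors = FALSE
  )
  levels_found <- list()
  while (nrow(current) > 0) {
    levels_found[[length(levels_found) + 1L]] <- current
    # Children of the current generation: rows whose father is in current$name.
    idx  <- match(family$father, current$name)
    keep <- !is.na(idx)
    current <- data.frame(
      name        = family$person[keep],
      parent_name = family$father[keep],
      level       = rep(current$level[1] + 1L, sum(keep)),
      patriarch   = current$patriarch[idx[keep]],  # inherited from the father
      stringsAsFactors = FALSE
    )
  }
  trees <- do.call(rbind, levels_found)
  trees <- trees[order(trees$patriarch, trees$level, trees$name), ]
  rownames(trees) <- NULL
  trees
}

all_trees <- build_trees(family)
all_trees
```

Filtering all_trees on patriarch == "Guillou Arthur" should give the same rows as the sqldf result above, with one extra column identifying the tree.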
【Discussion】:
Tags: r recursion tree igraph sqldf