【Posted】: 2017-06-24 07:26:42
【Question】:
I have the following organizational data:
EmployeeID <- c(10:15)
Job.Title <- c("Program Manager", "Development Manager", "Developer" , "Developer", "Developer", "Summer Intern")
Level.1 <- c(1,1,1,1,1,1)
Level.2 <- c(2,2,2,2,2,2)
Level.3 <- c("",10,10,10,10,10)
Level.4 <- c("","",11,11,11,11)
Level.5 <- c("","","","","",12)
Level.6 <- c("","","","","","")
Pay.Type <- c("Salary", "Salary", "Salary", "Salary", "Salary", "Hourly")
acme = data.frame(EmployeeID, Job.Title, Level.1, Level.2, Level.3, Level.4, Level.5, Level.6, Pay.Type)
acme
  EmployeeID           Job.Title Level.1 Level.2 Level.3 Level.4 Level.5 Level.6 Pay.Type
1         10     Program Manager       1       2                                   Salary
2         11 Development Manager       1       2      10                           Salary
3         12           Developer       1       2      10      11                   Salary
4         13           Developer       1       2      10      11                   Salary
5         14           Developer       1       2      10      11                   Salary
6         15       Summer Intern       1       2      10      11      12           Hourly
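For each row, I need to identify the first non-NULL (that is, non-empty) value among Level.1 through Level.6, scanning from the right: Level.6 first, then Level.5, then Level.4, and so on. I also need to identify the second non-NULL value following the same pattern. The values identified for each row need to go into new columns, so the final table looks like this: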
  EmployeeID           Job.Title Level.1 Level.2 Level.3 Level.4 Level.5 Level.6 Pay.Type Supervisor Manager
1         10     Program Manager       1       2                                   Salary          2       1
2         11 Development Manager       1       2      10                           Salary         10       2
3         12           Developer       1       2      10      11                   Salary         11      10
4         13           Developer       1       2      10      11                   Salary         11      10
5         14           Developer       1       2      10      11                   Salary         11      10
6         15       Summer Intern       1       2      10      11      12           Hourly         12      11
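The right-to-left scan described above could be sketched roughly as follows. This is only one possible approach, not something from the original post: the helper `last_two` and the intermediate names `m` and `picked` are made up for illustration, and the sketch assumes empty strings (or NA) mark the missing levels, as in the sample data.

```r
# Rebuild the Level columns from the question (other columns omitted for brevity).
acme <- data.frame(
  EmployeeID = 10:15,
  Level.1 = c(1, 1, 1, 1, 1, 1),
  Level.2 = c(2, 2, 2, 2, 2, 2),
  Level.3 = c("", 10, 10, 10, 10, 10),
  Level.4 = c("", "", 11, 11, 11, 11),
  Level.5 = c("", "", "", "", "", 12),
  Level.6 = c("", "", "", "", "", ""),
  stringsAsFactors = FALSE
)

level_cols <- paste0("Level.", 1:6)

# Hypothetical helper: keep the filled-in values of one row, reverse them so
# the right-most level comes first, and return the first two (NA-padded).
last_two <- function(vals) {
  filled <- rev(vals[!is.na(vals) & vals != ""])
  filled[1:2]
}

# Convert every Level column to character so all columns compare the same way.
m <- sapply(acme[level_cols], as.character)

# apply() returns one column per row of m, so transpose back to rows.
picked <- t(apply(m, 1, last_two))

acme$Supervisor <- as.numeric(picked[, 1])
acme$Manager    <- as.numeric(picked[, 2])

acme[c("EmployeeID", "Supervisor", "Manager")]
```

For the sample data this gives Supervisor values 2, 10, 11, 11, 11, 12 and Manager values 1, 2, 10, 10, 10, 11, matching the desired table. Converting to a character matrix first sidesteps the mixed numeric and character column types in the Level columns.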
【Comments】:
- R has NA values. Using them is much better than using empty strings.
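As a rough illustration of that suggestion, the empty strings in the Level columns could be replaced with NA along these lines. This is a sketch under the assumption that every non-empty entry is a numeric ID; the name `level_cols` is introduced here for illustration and only a few of the question's columns are rebuilt.

```r
# A slice of the question's data: the Level columns that contain empty strings.
acme <- data.frame(
  EmployeeID = 10:15,
  Level.3 = c("", 10, 10, 10, 10, 10),
  Level.4 = c("", "", 11, 11, 11, 11),
  Level.5 = c("", "", "", "", "", 12),
  Level.6 = c("", "", "", "", "", ""),
  stringsAsFactors = FALSE
)

# Select the Level.* columns by name.
level_cols <- grep("^Level\\.", names(acme), value = TRUE)

# Turn "" into NA, then convert each column to numeric.
acme[level_cols] <- lapply(acme[level_cols], function(x) {
  x <- as.character(x)
  x[x == ""] <- NA
  as.numeric(x)
})

acme
```

With NA in place, missing levels can be tested with `is.na()` instead of comparing against `""`, and the Level columns become numeric rather than character.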
Tags: r hierarchy hierarchical-data