【Posted】: 2014-01-09 14:40:43
【Question】:
I have a data frame:
dput(Data1)
structure(list(Emp.ID = c(182038L, 191854L), Project.Acquired.Skill = structure(c(2L,
1L), .Label = c("Architecting (10),Cognos TM1 (4),Support Function (3)",
"SAS (76),SAS Analytics (76),SAS BI (76),SAS data modeling tool (63),ClearCase (18),SQL (18),SQL Server (18),SQL SERVER 2000 (18),SQL SERVER 2005 (18),Excel (16),Oracle (16),AS400 (10)"
), class = "factor")), .Names = c("Emp.ID", "Project.Acquired.Skill"
), class = "data.frame", row.names = c(NA, -2L))
str(Data1)
'data.frame': 2 obs. of 2 variables:
$ Emp.ID : int 182038 191854
$ Project.Acquired.Skill: Factor w/ 2 levels "Architecting (10),Cognos TM1 (4),Support Function (3)",..: 2 1
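One column is a factor with values like Architecting (10),Cognos TM1 (4),Support Function (3). I need to strip the digits (0-9), whitespace, and parentheses () to get Architecting,Cognos TM1,Support Function. I am running into trouble because the column is encoded as a factor.
My output should look like this: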
Emp ID Project Acquired Skill
182038 SAS,SAS Analytics,SAS BI,SAS data modeling tool,ClearCase,SQL,SQL Server,SQL SERVER 2000,SQL SERVER 2005,Excel,Oracle,AS400
191854 Architecting,Cognos TM1,Support Function
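For reference, one possible way to get from the input to the output shown above is sketched here. It assumes that only the parenthesized counts, together with the whitespace directly in front of them, should go, since the desired output keeps the spaces and digits inside skill names such as "Cognos TM1" and "SQL SERVER 2000". The regular expression is an illustrative choice, not the only option.

```r
# Rebuild the two-row example from the question (values copied from the dput output).
Data1 <- data.frame(
  Emp.ID = c(182038L, 191854L),
  Project.Acquired.Skill = factor(c(
    "SAS (76),SAS Analytics (76),SAS BI (76),SAS data modeling tool (63),ClearCase (18),SQL (18),SQL Server (18),SQL SERVER 2000 (18),SQL SERVER 2005 (18),Excel (16),Oracle (16),AS400 (10)",
    "Architecting (10),Cognos TM1 (4),Support Function (3)"
  ))
)

# Convert the factor to character explicitly, then delete every
# "optional whitespace + (digits)" group. Digits outside parentheses are kept.
Data1$Project.Acquired.Skill <- gsub(
  "[[:space:]]*\\([0-9]+\\)", "",
  as.character(Data1$Project.Acquired.Skill)
)

print(Data1)
```

After this step the column is a character vector rather than a factor; wrapping the result in factor() would restore the factor class if that is needed downstream.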
【Comments】: