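[Posted]: 2021-06-21 11:48:44
[Question]:
I have a data frame with 4 columns containing job titles. For each column, I want to create a new column (category1, category2, category3, category4) that assigns each job a category from 1 to 10 based on the words its title contains (for example, if the title contains "frontend", "ui", or "ux", then category1 should be 1). I managed to categorize each column by hand with the code below, but I would like to categorize all 4 columns at once. Any help is appreciated!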
data_rel$category1 <-
  ifelse(grepl("frontend|ui|ux", data$job4_clean), 1,
  ifelse(grepl("backend", data$job4_clean), 2,
  ifelse(grepl("fullstack", data$job4_clean), 3,
  ifelse(grepl("entwickler|development|application|developer|software", data$job4_clean), 4,
  ifelse(grepl("data|analytics|machine|programmer|ml|engineer|engineering|programmer|learning", data$job4_clean), 5,
  ifelse(grepl("research|teaching|akademischer|researcher", data$job4_clean), 6,
  ifelse(grepl("project|manager|product|consultant|consulting", data$job4_clean), 7,
  ifelse(grepl("it|security|technical|tech", data$job4_clean), 8,
  ifelse(grepl("margketing|sales|media|saas|business|commerce|support|development|digital|markeing|graphic|designer|graphics|design", data$job4_clean), 9,
  ifelse(grepl("founder|ceo|partner|chief|executive|cto", data$job4_clean), 10, NA))))))))))
data_rel <- structure(list(job1 = c("phd fellow", "java developer intern",
"optical engineer", " dwh bi engineer", " software engineer",
"software developer", "data engineer", "application software engineer",
"software developer", " web developer", "web developer", "web developer",
"software engineer", "software engineer", " es computer", "associate software engineer",
"fullstack ios developer", "technical delivery manager project manager",
"software architect", "software developer"), job2 = c("research scientist",
"analytics analyst", " developer", " data ml engineer", "graduate teaching assistant",
"software developer", "machine learning engineer", "akademischer mitarbeiter machine learning and analytics",
"backend develope", "lead php developer", "php system analytic software specialist",
"webcreater", "data engineer", "software engineer", "assistant network administrator",
"frontend engineer", "application infrastructor lead", "software engineer",
"application developer", "software developer"), job3 = c("data scientist",
"machine learning engineer", "application developer associate manager",
NA, "co founder cto", NA, NA, NA, NA, NA, "lead php sugarcrm developer",
" php developer", "data analysing researcher ", NA, "application developer consultance",
"manager l1 ui frontend ", " software architect", "software engineering manager solution architect",
"software developer consultance", "ai developer"), job4 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "software architect development lead",
"team leader", NA, NA, " application development specialist",
" associate experience technology", NA, " software developer",
"fullstack developer productowner", NA)), row.names = c(NA, -20L
), class = c("tbl_df", "tbl", "data.frame"))
[Comments]:
- Turn your ifelse into a function, foo <- function(x){ ifelse(x....)}, then loop over the columns with lapply(myData[, c("col1", "col2", etc)], foo)
- Read up on the switch function to avoid nested ifelse.
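The first comment's suggestion (wrap the classification in a function, then apply it to each column) might look roughly like the sketch below. It is only one possible reading of that advice: the helper name `categorize`, the `patterns` vector, and the small `jobs` data frame (three rows taken from the question's sample data) are assumptions for illustration. The patterns are copied from the question unchanged, including their original spellings, and the loop runs from the last pattern to the first so that earlier categories take priority, as in the nested ifelse.

```r
# Ordered patterns: the position in the vector is the category (1-10).
# Copied verbatim from the question's nested ifelse().
patterns <- c(
  "frontend|ui|ux",
  "backend",
  "fullstack",
  "entwickler|development|application|developer|software",
  "data|analytics|machine|programmer|ml|engineer|engineering|programmer|learning",
  "research|teaching|akademischer|researcher",
  "project|manager|product|consultant|consulting",
  "it|security|technical|tech",
  "margketing|sales|media|saas|business|commerce|support|development|digital|markeing|graphic|designer|graphics|design",
  "founder|ceo|partner|chief|executive|cto"
)

# Hypothetical helper: returns the first matching category for each title,
# or NA when nothing matches (grepl() returns FALSE for NA input).
categorize <- function(x) {
  out <- rep(NA_integer_, length(x))
  # Iterate last-to-first so earlier patterns overwrite later ones,
  # which reproduces the first-match-wins order of the nested ifelse().
  for (i in rev(seq_along(patterns))) {
    out[grepl(patterns[i], x)] <- i
  }
  out
}

# Small illustrative subset of the question's sample data.
jobs <- data.frame(
  job1 = c("phd fellow", "java developer intern", "associate software engineer"),
  job2 = c("research scientist", "analytics analyst", "frontend engineer"),
  job3 = c("data scientist", "machine learning engineer", "manager l1 ui frontend "),
  job4 = c(NA, NA, " associate experience technology"),
  stringsAsFactors = FALSE
)

# Apply the helper to all four job columns at once.
job_cols <- paste0("job", 1:4)
jobs[paste0("category", 1:4)] <- lapply(jobs[job_cols], categorize)
```

The same two final lines should work on data_rel, assuming its columns are named job1 to job4 as in the dput output. Note that the patterns match substrings anywhere in the title (so "ui" or "it" can match inside longer words); wrapping terms in `\\b` word boundaries would be one way to tighten that if it turns out to matter.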
Tags: r apply categories