R string parsing challenge: answers

【Question title】: r string parsing challenger
【Posted】: 2016-04-27 04:38:32
【Question description】:

I am working with a column that contains strings like the following:

       Col1
       ------------------------------------------------------------------
       Department of Mechanical Engineering, Department of Computer Science
       Division of Advanced Machining, Center for Mining and Metallurgy
       Department of Aerospace, Center for Science and Delivery

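What I am trying to do is create a separate column for each substring that starts with Department, Division, or Center and runs up to the comma (,). The final output should look like this: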

       Dept_Mechanical_Eng   Dept_Computer_Science   Div_Adv_Machining   Cntr_Mining_Metallurgy   Dept_Aerospace   Cntr_Science_Delivery
       1                     1                       0                   0                        0                0
       0                     0                       1                   1                        0                0
       0                     0                       0                   0                        1                1

For readability, I shortened the actual names in the expected output. Any help with parsing this string is much appreciated.

【Question comments】:

library(splitstackshape); cSplit_e(mydf, "Col1", ",", type = "character", drop = TRUE, fill = 0). Also take a look at strsplit + mtabulate from "qdapTools".
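
The split-then-tabulate idea in the comment above can also be sketched in base R alone. This is a minimal stand-in, not the `cSplit_e` or `mtabulate` calls themselves; the data frame `mydf` and the names `parts`, `units`, and `indicators` are assumptions made for illustration.

```r
# Hypothetical data frame mirroring the question's column
mydf <- data.frame(
  Col1 = c(
    "Department of Mechanical Engineering, Department of Computer Science",
    "Division of Advanced Machining, Center for Mining and Metallurgy",
    "Department of Aerospace, Center for Science and Delivery"
  ),
  stringsAsFactors = FALSE
)

# Split each row on commas and trim the surrounding whitespace
parts <- lapply(strsplit(mydf$Col1, ",", fixed = TRUE), trimws)

# Unique unit names, in order of first appearance
units <- unique(unlist(parts))

# One row per input string, one 0/1 column per unit name
indicators <- t(vapply(
  parts,
  function(p) as.integer(units %in% p),
  integer(length(units))
))
colnames(indicators) <- units

print(indicators)
```

Because each row is compared against whole, trimmed pieces with `%in%`, a unit name only counts when it appears as a complete comma-separated entry.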

Tags: r string transform

【Solution 1】:

This is very similar to another text-processing question I just answered. Are you in the same class as the person who asked that one? Count the number of times (frequency) a string occurs

 inp <- "Department of Mechanical Engineering, Department of Computer Science
        Division of Advanced Machining, Center for Mining and Metallurgy
        Department of Aerospace, Center for Science and Delivery"
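 # Split on commas; the factor levels are the six distinct unit names
 inp2 <- factor(scan(text=inp,what="",sep=","))
#Read 6 items
 # Re-read the same text as one string per row
 inp3 <- readLines(textConnection(inp))

# For each unit name, flag the rows that contain it; trimws() cleans up the column names
as.data.frame( setNames( lapply(levels(inp2), function(ll) as.numeric(grepl(ll, inp3) ) ), trimws(levels(inp2) )) )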
  Department.of.Aerospace Division.of.Advanced.Machining
1                       0                              0
2                       0                              1
3                       1                              0
  Center.for.Mining.and.Metallurgy Center.for.Science.and.Delivery
1                                0                               0
2                                1                               0
3                                0                               1
  Department.of.Computer.Science Department.of.Mechanical.Engineering
1                              1                                    1
2                              0                                    0
3                              0                                    0
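
The output above uses the full names as column headers, while the question asked for shortened ones. One possible way to get closer is sketched below. The helper `abbreviate_unit` and its prefix rules are hypothetical, and they do not reproduce hand-made abbreviations such as "Eng" or "Adv" from the question. The sketch also trims the names before matching and uses `fixed = TRUE`, so the names are treated as literal text rather than regular expressions.

```r
# Hypothetical helper: turn a unit name into a short column label
abbreviate_unit <- function(x) {
  x <- trimws(x)
  x <- sub("^Department of ", "Dept_", x)
  x <- sub("^Division of ", "Div_", x)
  x <- sub("^Center for ", "Cntr_", x)
  x <- gsub(" and ", " ", x, fixed = TRUE)  # drop the connective
  gsub(" +", "_", x)                        # remaining spaces become underscores
}

rows <- c(
  "Department of Mechanical Engineering, Department of Computer Science",
  "Division of Advanced Machining, Center for Mining and Metallurgy",
  "Department of Aerospace, Center for Science and Delivery"
)

# Distinct unit names, trimmed, in order of first appearance
units <- unique(trimws(unlist(strsplit(rows, ",", fixed = TRUE))))

# Literal substring match of each unit name against each row
flags <- lapply(units, function(u) as.integer(grepl(u, rows, fixed = TRUE)))

result <- as.data.frame(setNames(flags, abbreviate_unit(units)))
print(result)
```

Note that substring matching would also flag a longer name that merely contains a shorter one (for example, a unit called "Department of Aerospace Engineering" would match "Department of Aerospace"), so exact comparison of the split pieces is safer if such names can occur.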

【Comments】: