【Question title】: R string parsing challenge
【Posted】: 2016-04-27 04:38:32
【Question】:

I am working with a column that contains strings like the following:

       Col1
       ------------------------------------------------------------------
       Department of Mechanical Engineering, Department of Computer Science
       Division of Advanced Machining, Center for Mining and Metallurgy
       Department of Aerospace, Center for Science and Delivery

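What I am trying to do is extract each separate string that starts with Department, Division, or Center and runs up to the comma (,), then turn each one into its own indicator column. The final output should look like this: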

       Dept_Mechanical_Eng   Dept_Computer_Science   Div_Adv_Machining   Cntr_Mining_Metallurgy   Dept_Aerospace   Cntr_Science_Delivery
       1                     1                       0                   0                        0                0
       0                     0                       1                   1                        0                0
       0                     0                       0                   0                        1                1

To keep things readable, I abbreviated the actual names in the expected output. Any help parsing this string is much appreciated.

【Comments】:

  • library(splitstackshape); cSplit_e(mydf, "Col1", ",", type = "character", drop = TRUE, fill = 0). Also take a look at strsplit + mtabulate from "qdapTools".
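
For readers without those packages, the same split-then-tabulate idea can be sketched in base R. This is only an illustration under assumptions: `mydf` and `Col1` are the names used in the comment above, the sample rows are copied from the question, and `parts`, `units`, and `indicators` are hypothetical helper names. It is not the output format of `cSplit_e` or `mtabulate`.

```r
# Sample data reproduced from the question.
mydf <- data.frame(
  Col1 = c(
    "Department of Mechanical Engineering, Department of Computer Science",
    "Division of Advanced Machining, Center for Mining and Metallurgy",
    "Department of Aerospace, Center for Science and Delivery"
  ),
  stringsAsFactors = FALSE
)

# Split each row on commas and trim the surrounding whitespace.
parts <- lapply(strsplit(mydf$Col1, ",", fixed = TRUE), trimws)

# Distinct unit names, in order of first appearance.
units <- unique(unlist(parts))

# One row per input string, one 0/1 column per unit name.
indicators <- do.call(rbind, lapply(parts, function(p) as.integer(units %in% p)))
colnames(indicators) <- units

indicators
```

Because the pieces are compared with `%in%` rather than matched as regular expressions, punctuation inside a unit name would not be interpreted as a pattern.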

Tags: r string transform


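【Solution 1】:

This is very similar to a question about another text example that I just answered. Are you in the same class as the asker there? See: Count the number of times (frequency) a string occurs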

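# Sample input: one row per line, units separated by commas
inp <- "Department of Mechanical Engineering, Department of Computer Science
        Division of Advanced Machining, Center for Mining and Metallurgy
        Department of Aerospace, Center for Science and Delivery"
# Split on commas and newlines; the factor levels are the distinct unit names
# (leading whitespace is kept, which is why trimws() is applied to the names below)
inp2 <- factor(scan(text=inp,what="",sep=","))
#Read 6 items
# Keep the original rows intact, one string per line
inp3 <- readLines(textConnection(inp))

# For each unit name, flag the rows that contain it (1) or do not (0)
as.data.frame( setNames( lapply(levels(inp2), function(ll) as.numeric(grepl(ll, inp3) ) ), trimws(levels(inp2) )) )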
  Department.of.Aerospace Division.of.Advanced.Machining
1                       0                              0
2                       0                              1
3                       1                              0
  Center.for.Mining.and.Metallurgy Center.for.Science.and.Delivery
1                                0                               0
2                                1                               0
3                                0                               1
  Department.of.Computer.Science Department.of.Mechanical.Engineering
1                              1                                    1
2                              0                                    0
3                              0                                    0
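
The column names above are the full unit names, while the question asks for short ones such as Dept_Computer_Science. A rough sketch of that renaming step follows. The prefix table is taken from the expected output in the question; `prefixes` and `abbreviate_unit` are hypothetical names introduced here, and the sketch does not shorten individual words (the question's "Eng" and "Adv"), since the rule for those is not stated.

```r
# Full unit names as they appear in the question's data.
units <- c(
  "Department of Mechanical Engineering",
  "Department of Computer Science",
  "Division of Advanced Machining",
  "Center for Mining and Metallurgy",
  "Department of Aerospace",
  "Center for Science and Delivery"
)

# Prefix abbreviations read off the expected output in the question.
prefixes <- c(Department = "Dept", Division = "Div", Center = "Cntr")

# Hypothetical helper: turn a full unit name into a short column name.
abbreviate_unit <- function(x) {
  for (full in names(prefixes)) {
    # Replace a leading "Department of " / "Center for " etc. with the short prefix.
    x <- sub(paste0("^", full, " (of|for) "), paste0(prefixes[[full]], "_"), x)
  }
  x <- gsub(" and ", " ", x, fixed = TRUE)  # drop the connective "and"
  gsub(" ", "_", x, fixed = TRUE)           # remaining spaces become underscores
}

abbreviate_unit(units)
```

In the answer's one-liner, something like `abbreviate_unit(trimws(levels(inp2)))` could be passed to `setNames()` in place of `trimws(levels(inp2))`, assuming the helper has been defined first.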

【Discussion】:
