【Question title】: Comparing whether the positions of 1s match between strings in R
【Posted】: 2015-02-24 05:04:17
【Question】:

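Suppose I am reading a .csv file into R whose columns contain strings of 0s and 1s. I need to compare the positions of the 1s in two columns: every position where both strings have a 1 counts as one match, and the total count should go into a third column.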

Illustration

dput(head(string_data))
structure(list(v_1 = structure(c(1L, 1L, 1L, 1L, 3L, 1L), .Label = c("", 
"0,0,0,1", "0,0,1,0", "0,1,0,0", "1,1,0,0"), class = "factor"), 
    v_2 = structure(c(1L, 1L, 1L, 1L, 2L, 1L), .Label = c("", 
    "1,0,1,0"), class = "factor"), v_3 = structure(c(1L, 1L, 
    1L, 1L, 4L, 1L), .Label = c("", "0,0,0,1", "0,0,1,0", "1,0,0,0"
    ), class = "factor"), v_4 = structure(c(1L, 1L, 1L, 1L, 2L, 
    1L), .Label = c("", "0,0,0,1"), class = "factor"), v_5 = structure(c(1L, 
    5L, 1L, 1L, 1L, 2L), .Label = c("", "0,0,0,0,0", "0,0,0,1,0", 
    "0,0,1,0,0", "1,0,1,1,0"), class = "factor"), v_6 = structure(c(1L, 
    2L, 1L, 1L, 1L, 2L), .Label = c("", "1,0,1,1,0"), class = "factor"), 
    v_7 = structure(c(1L, 1L, 1L, 1L, 1L, 2L), .Label = c("", 
    "0,0,0,0", "0,0,0,1", "0,1,0,0", "1,0,0,0"), class = "factor"), 
    v_8 = structure(c(1L, 1L, 1L, 1L, 1L, 2L), .Label = c("", 
    "1,0,0,0"), class = "factor")), .Names = c("v_1", "v_2", 
"v_3", "v_4", "v_5", "v_6", "v_7", "v_8"), row.names = c(NA, 
6L), class = "data.frame")

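Above I have pasted the dput output for the head of the data.

I need to compare the positions of the 1s in column (2*i-1) with those in column (2*i) (i = 1,2,...,8), and put the number of matches in a third column.

For example

Suppose the first column holds the string 0,0,1,1 and the second column holds 0,1,1,1. The third column should then return 2.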
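The example above can be checked by hand in R. This is only a minimal sketch, assuming the strings are comma-separated and of equal length; the variable names `s1`, `s2`, `a`, `b` and `n_match` are illustrative and do not come from the original post.

```r
# Illustrative strings taken from the example above (names are hypothetical)
s1 <- "0,0,1,1"
s2 <- "0,1,1,1"

# Split each string on commas into a character vector of "0"/"1"
a <- strsplit(s1, ",", fixed = TRUE)[[1]]
b <- strsplit(s2, ",", fixed = TRUE)[[1]]

# A match is a position where both strings hold a "1"
n_match <- sum(a == "1" & b == "1")
n_match
# [1] 2

# The count can never exceed the number of 1s in the second string
n_match <= sum(b == "1")
# [1] TRUE
```

Positions 3 and 4 hold a 1 in both strings, which gives the expected count of 2.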

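Can anyone help me with this?

Edit

The count in the third column should be based on the number of 1s in the second column's string. In the example above the second column's string is 0,1,1,1, which means the count can range from 0 to 3.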

【Comments】:

  • Is this question too vague? Or too difficult?
  • Please provide the expected output for your example.

Tags: r string


【Solution 1】:

These two functions might help as a starting point:

# Compares two strings and computes number of '1's at matching positions
f <- function(s1, s2) {
    if (s1=='' || s2=='') return(0)
    m <- do.call(cbind,strsplit(c(s1,s2),','))
    m2 <- rowMeans(m=="1")
    sum(m2==1.0)
}

# Calls `f()` for every row of two columns i and j from a data set d and returns a vector 
# that could be used as a new column
f.cols <- function(d,i,j) {
    c1 <- as.character(d[,i])
    c2 <- as.character(d[,j])
    unname(mapply(f,c1,c2))
}

Usage example:

d$out <- f.cols(d,1,2)
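The question asks for the comparison on every pair of adjacent columns (1 and 2, 3 and 4, and so on), so the single call above could be repeated in a loop. The following is a hedged sketch, not the answerer's code: `count_matching_ones` is a hypothetical helper, the `match_...` column names are invented for illustration, and the small data frame only stands in for the posted data. It assumes that empty or missing cells should count as zero matches and that strings of unequal length are compared over their shared length.

```r
# Hypothetical helper: count positions where both comma-separated strings have "1"
count_matching_ones <- function(s1, s2) {
  # Assumption: missing or empty cells mean "no matches"
  if (is.na(s1) || is.na(s2) || s1 == "" || s2 == "") return(0L)
  a <- strsplit(s1, ",", fixed = TRUE)[[1]]
  b <- strsplit(s2, ",", fixed = TRUE)[[1]]
  # Compare only over the shared length if the strings differ in length
  n <- min(length(a), length(b))
  sum(a[seq_len(n)] == "1" & b[seq_len(n)] == "1")
}

# Small stand-in for the posted data (a few rows of v_1, v_2, v_5, v_6)
d <- data.frame(
  v_1 = c("", "0,0,1,0", ""),
  v_2 = c("", "1,0,1,0", ""),
  v_5 = c("1,0,1,1,0", "", "0,0,0,0,0"),
  v_6 = c("1,0,1,1,0", "", "1,0,1,1,0"),
  stringsAsFactors = TRUE  # mimic the factor columns seen in the dput output
)

# Pair up columns (1,2), (3,4), ... and add one count column per pair.
# The indices are computed before the loop, so appended columns are not revisited.
odd <- seq(1, ncol(d) - 1, by = 2)
for (i in odd) {
  x <- as.character(d[[i]])
  y <- as.character(d[[i + 1]])
  new_name <- paste0("match_", names(d)[i], "_", names(d)[i + 1])
  d[[new_name]] <- unname(mapply(count_matching_ones, x, y))
}

d$match_v_1_v_2
# [1] 0 1 0
d$match_v_5_v_6
# [1] 3 0 0
```

Converting with `as.character()` first matters because the columns are factors; comparing the underlying integer codes instead of the labels would give meaningless counts.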

【Comments】:
