apply() function to only certain columns

【Question title】: apply() function to only certain columns
【Posted】: 2021-01-23 01:55:25
【Question】:

I have a data frame that looks like this (with reproducible code):

# create the table
name <- c("Mary", "John", "Peter")
id1 <- c(50, 30, 25)
id2 <- c(8, 12, 90)
id3 <- c(14, 17, 34)
id4 <- c(9, 67, 89)
id5 <- c(20, 21, 22)
beep <- c(15, 20, 23)

# combine the df
df <- data.frame(name, id1, id2, id3, id4, id5, beep)

# show df
df
   name id1 id2 id3 id4 id5 beep
1  Mary  50   8  14   9  20   15
2  John  30  12  17  67  21   20
3 Peter  25  90  34  89  22   23

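I want to recode each cell in the "id#" columns to 1 if it is less than the "beep" variable in the same row, and to 0 otherwise. I tried the following: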

apply(df, 2, function(x) {
 ifelse(x < df$beep, 1, 0)})

This produces the following matrix:

     name id1 id2 id3 id4 id5 beep
[1,]    0   0   1   1   1   0    0
[2,]    0   0   1   1   0   0    0
[3,]    0   0   0   0   0   1    0

The problem with the result above is that I don't want the "name" or "beep" variables to change. Any suggestions?

【Comments】:

You can apply to a subset of the data frame's columns: apply(df[ , 2:6], 2,...)
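A minimal sketch of what this comment suggests, assuming the id columns sit in positions 2 to 6 as in the question's df. The helper name flag_below_beep is hypothetical and used only for illustration. Passing only the numeric columns also avoids apply() coercing the whole data frame to a character matrix, which is what happens when the "name" column is included.

```r
# rebuild the example data frame from the question
df <- data.frame(
  name = c("Mary", "John", "Peter"),
  id1 = c(50, 30, 25), id2 = c(8, 12, 90), id3 = c(14, 17, 34),
  id4 = c(9, 67, 89), id5 = c(20, 21, 22), beep = c(15, 20, 23)
)

# hypothetical helper: 1 where the value is below beep, 0 otherwise
flag_below_beep <- function(x) ifelse(x < df$beep, 1, 0)

# apply() sees only columns 2:6, so name and beep are never touched
res <- apply(df[, 2:6], 2, flag_below_beep)

# write the recoded matrix back into a copy of the data frame
out <- df
out[, 2:6] <- res
out
```

Hard-coded positions such as 2:6 break if the column order changes, which is why the answers below select the columns by name instead.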

Tags: r apply

【Solution 1】:

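1) mutate/across With dplyr we can use mutate/across. The first argument of across defines which columns to use, and the second is the function to apply to each such column. The right-hand side of the formula is the body of the function, and the dot is its argument. We use + to convert the logical result to numeric.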

library(dplyr)

df %>% mutate(across(starts_with("id"), ~ +(. < beep)))
##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23
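The unary + used above can look cryptic, so here is a small base R sketch of what it does to a logical vector. The variable names are made up for illustration.

```r
flags <- c(TRUE, FALSE, TRUE)

# unary plus coerces a logical vector to integer (TRUE -> 1, FALSE -> 0)
plus_result <- +flags

# as.integer() is a more explicit spelling of the same conversion
explicit_result <- as.integer(flags)

identical(plus_result, explicit_result)
```

If readability matters more than brevity, as.integer(. < beep) can be used in place of +(. < beep).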

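2) modify_if The purrr package has a function, modify_if, that modifies only those columns that satisfy the condition given by its second argument. It supports the same function shorthand as in (1).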

library(purrr)

modify_if(df, startsWith(names(df), "id"), ~ +(. < df$beep))

##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23

3) replace This is essentially the same as another answer, but uses grep and replace instead. No packages are used.

ix <- grep("^id", names(df))
replace(df, ix, +(df[ix] < df$beep))
##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23
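A short sketch of why df[ix] < df$beep lines up row by row: when a data frame is compared with a vector whose length equals the number of rows, the vector is recycled down each column, and the result is a logical matrix. The small data frame below is made up for illustration.

```r
d <- data.frame(id1 = c(50, 30, 25), id2 = c(8, 12, 90))
beep <- c(15, 20, 23)

# each column of d is compared element-wise with beep
cmp <- d < beep
cmp

# the comparison yields a matrix, not a data frame
is.matrix(cmp)

# unary plus turns the logical matrix into 0/1 values
recoded <- +cmp
recoded
```

Because the result is a matrix with the same column names, it can be assigned back to the selected columns, which is what replace does above.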

4) modifyList modifyList uses name matching to replace the columns in the first argument with those in the second. Both arguments must be lists or data frames (not matrices).

ix <- grep("^id", names(df))
modifyList(df, +as.data.frame(df[ix] < df$beep))
##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23

(modifyList used to be in the lattice package but is now in utils, which ships with base R.)
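To make the name matching concrete, here is a minimal sketch of modifyList on two plain lists. The list contents are made up for illustration.

```r
original <- list(a = 1, b = 2, c = 3)
updates  <- list(b = 20)

# elements of `updates` overwrite the elements of `original` with the same name;
# everything else is left as it was
result <- modifyList(original, updates)
result
```

In the answer above, the second argument holds only the recoded id columns, so name and beep pass through unchanged.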

【Solution 2】:

You don't need apply; you can try the code below

df[startsWith(names(df), "id")] <- +(df[startsWith(names(df), "id")] < df$beep)

which gives

> df
   name id1 id2 id3 id4 id5 beep
1  Mary   0   1   1   1   0   15
2  John   0   1   1   0   0   20
3 Peter   0   0   0   0   1   23

If you really want to use apply, below is one option

idx <- grep("^id", names(df))
df[idx] <- apply(df[idx], 2, function(x) ifelse(x < df$beep, 1, 0))
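An alternative sketch, not taken from the answer above: lapply() handles each selected column as a plain vector, so no intermediate matrix is created and the result can be assigned straight back. The data frame is rebuilt here so the block runs on its own.

```r
df <- data.frame(
  name = c("Mary", "John", "Peter"),
  id1 = c(50, 30, 25), id2 = c(8, 12, 90), id3 = c(14, 17, 34),
  id4 = c(9, 67, 89), id5 = c(20, 21, 22), beep = c(15, 20, 23)
)

# positions of the columns whose names start with "id"
idx <- grep("^id", names(df))

# recode each id column to 1 where it is below beep, 0 otherwise
df[idx] <- lapply(df[idx], function(x) as.integer(x < df$beep))
df
```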

【Solution 3】:

Your data may contain NA values, in which case a comparison with < returns NA. You can add an extra check with is.na to handle them.

cols <- grep('id', names(df))
df[cols] <- +(df[cols] < df$beep & !is.na(df[cols]))
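A small sketch of the NA behaviour described above, using a made-up vector and threshold. Note that with this guard a missing value is recoded as 0, which may or may not be what you want for your data.

```r
x <- c(10, NA, 30)
threshold <- 20

# a plain comparison propagates the missing value
plain <- x < threshold

# NA & FALSE is FALSE, so the is.na() guard turns the NA into FALSE
guarded <- x < threshold & !is.na(x)

# unary plus then gives the 0/1 recoding
recoded <- +guarded

plain
guarded
recoded
```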
