How to replace 0 in the nth column of all data frames in a list with another value?

【Question title】: How to replace 0 in the nth column of all data frames in a list with another value?
【Posted】: 2021-09-09 12:14:13
【Question】:

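I have a list of data frames, and I want to replace the 0s in the second column of every data frame in that list.

Here is a minimal working example of the list of data frames: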

> named <- c(1, 2, 3, 4, 5, 6)
> one <- c(0, 2, 0, 4, 5, 0)
> two <- c(1, 0, 3, 0, 0, 6)
> df <- data.frame(named, one, two)
> df1 <- data.frame(named, two, one)
> listed <- list(df, df1)
>
> listed
[[1]]
  named one two
1     1   0   1
2     2   2   0
3     3   0   3
4     4   4   0
5     5   5   0
6     6   0   6

[[2]]
  named two one
1     1   1   0
2     2   0   2
3     3   3   0
4     4   0   4
5     5   0   5
6     6   6   0

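I can replace one column of a specific data frame with replace(listed[[2]][2], listed[[2]][2] == 0, 1) (here, the second column of the second data frame).

But how do I do this for all data frames in the list? I tried: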

for (i in 1:2) {
  replace(listed[[i]][2], listed[[i]][2] == 0, -1)
}

But this was obviously a poor attempt.

【Question comments】:

Tags: r list dataframe subset

【Solution 1】:

You can use lapply -

listed <- lapply(listed, function(x) {x[2][x[2] == 0] <- -1;x})

Or use replace in a for loop and assign the changed data back to the list.

for (i in seq_along(listed)) {
  listed[[i]][2] <- replace(listed[[i]][2], listed[[i]][2] == 0, -1)
}

listed
#[[1]]
#  named one two
#1     1  -1   1
#2     2   2   0
#3     3  -1   3
#4     4   4   0
#5     5   5   0
#6     6  -1   6

#[[2]]
#  named two one
#1     1   1   0
#2     2  -1   2
#3     3   3   0
#4     4  -1   4
#5     5  -1   5
#6     6   6   0
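
Since the title asks about the nth column in general, the lapply approach above can be wrapped in a small function. This is only a sketch: the name replace_zero_in_col and its arguments n and value are hypothetical and do not come from the answer, and skipping NA entries is an added assumption.

```r
# Hypothetical helper: replace zeros in the n-th column of every data frame.
replace_zero_in_col <- function(dfs, n, value) {
  lapply(dfs, function(x) {
    col <- x[[n]]
    is_zero <- !is.na(col) & col == 0  # NA entries are left untouched
    col[is_zero] <- value
    x[[n]] <- col
    x                                  # return the whole data frame
  })
}

# Example data from the question
named <- c(1, 2, 3, 4, 5, 6)
one <- c(0, 2, 0, 4, 5, 0)
two <- c(1, 0, 3, 0, 0, 6)
listed <- list(data.frame(named, one, two), data.frame(named, two, one))

result <- replace_zero_in_col(listed, n = 2, value = -1)
```

Calling it with n = 2 and value = -1 should give the same result as the one-liner above, and other values of n target other columns.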

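【Comments】:

Would you mind explaining {x[2][x[2] == 0] <- -1;x} to me? Why is there a ;x at the end, and what does it do? Sorry, my basics are still weak.
x[2][x[2] == 0] <- -1 replaces the 0s with -1. The trailing x returns the modified data frame (with all its columns) as the output of each lapply call. Try removing it from the code and see what you get.
Got it. I haven't tried the for loop from your answer, but lapply works. It also looks simpler.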
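
To make the role of the trailing x concrete, here is a minimal sketch comparing the two versions on a small data frame made up for illustration (not the question's data). In R, the value of an assignment expression is its right-hand side, so without the final x each function call returns just -1.

```r
# Small illustrative data frame (hypothetical, not from the question)
df <- data.frame(named = c(1, 2, 3), one = c(0, 2, 0))
listed <- list(df, df)

# Without the trailing x: the function returns the value of the
# assignment, i.e. the right-hand side -1, not the data frame.
without_x <- lapply(listed, function(x) { x[2][x[2] == 0] <- -1 })

# With the trailing x: the modified data frame is returned.
with_x <- lapply(listed, function(x) { x[2][x[2] == 0] <- -1; x })
```

Here without_x is a list of bare numbers, while with_x is a list of data frames with the zeros replaced.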

【Solution 2】:

You can also use the column index in across

library(tidyverse)

map(listed, ~.x %>% mutate(across(2, ~replace(., .== 0, -1))))

#> [[1]]
#>   named one two
#> 1     1  -1   1
#> 2     2   2   0
#> 3     3  -1   3
#> 4     4   4   0
#> 5     5   5   0
#> 6     6  -1   6
#> 
#> [[2]]
#>   named two one
#> 1     1   1   0
#> 2     2  -1   2
#> 3     3   3   0
#> 4     4  -1   4
#> 5     5  -1   5
#> 6     6   6   0

Created on 2021-06-26 by the reprex package (v2.0.0)
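
If the column position is held in a variable rather than written as a literal, one option is to pass it through all_of(), which accepts numeric locations as well as names. The sketch below assumes dplyr and purrr are installed; the helper name replace_zero_nth and its arguments are hypothetical and not part of the answer.

```r
library(dplyr)
library(purrr)

# Hypothetical helper; the name and arguments are illustrative only.
replace_zero_nth <- function(dfs, n, value) {
  map(dfs, function(df) {
    # all_of(n) selects the column at position n
    mutate(df, across(all_of(n), function(col) replace(col, col == 0, value)))
  })
}

# Example data from the question
named <- c(1, 2, 3, 4, 5, 6)
one <- c(0, 2, 0, 4, 5, 0)
two <- c(1, 0, 3, 0, 0, 6)
listed <- list(data.frame(named, one, two), data.frame(named, two, one))

result <- replace_zero_nth(listed, n = 2, value = -1)
```

Named inner and outer functions are used here instead of nested ~ lambdas only to keep the two levels easy to tell apart.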

【Comments】:

【Solution 3】:

You can use map from the tidyverse:

library(tidyverse)

map(listed, ~ mutate_at(.x, .vars = colnames(.x)[length(colnames(.x))],
                        ~ case_when(. == 0 ~ -1, TRUE ~ as.numeric(.))))

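This code iterates over each data frame in listed, selects only the last column, and changes 0 values to -1, the replacement value used in your for loop. Note that it targets the last column rather than the second column.

Output: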

[[1]]
  named one two
1     1   0   1
2     2   2  -1
3     3   0   3
4     4   4  -1
5     5   5  -1
6     6   0   6

[[2]]
  named two one
1     1   1  -1
2     2   0   2
3     3   3  -1
4     4   0   4
5     5   0   5
6     6   6  -1
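
The same last-column replacement can also be sketched with across() and last_col(), which avoids building the column name by hand. This is an alternative formulation, not the answer's own code, and it assumes dplyr and purrr are installed.

```r
library(dplyr)
library(purrr)

# Example data from the question
named <- c(1, 2, 3, 4, 5, 6)
one <- c(0, 2, 0, 4, 5, 0)
two <- c(1, 0, 3, 0, 0, 6)
listed <- list(data.frame(named, one, two), data.frame(named, two, one))

# Replace zeros in the last column of each data frame
result <- map(listed, function(df) {
  mutate(df, across(last_col(), function(col) replace(col, col == 0, -1)))
})
```

Swapping last_col() for 2 would target the second column instead, which is what the question asks for.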

【Comments】:

【Solution 4】:

An option with na_if/replace_na

library(dplyr)
library(tidyr)
library(purrr)
map(listed, ~ .x %>%
      mutate(across(2, ~ replace_na(na_if(., 0), -1))))
[[1]]
  named one two
1     1  -1   1
2     2   2   0
3     3  -1   3
4     4   4   0
5     5   5   0
6     6  -1   6

[[2]]
  named two one
1     1   1   0
2     2  -1   2
3     3   3   0
4     4  -1   4
5     5  -1   5
6     6   6   0
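
One thing to keep in mind with the na_if/replace_na route is that it passes through NA, so any values that were already missing in the column also end up as -1. The sketch below uses a small made-up data frame (not from the question) to compare it with a direct replacement that leaves existing NA values alone.

```r
library(dplyr)
library(tidyr)

# Hypothetical data frame with a genuine NA in the second column
df <- data.frame(named = c(1, 2, 3), one = c(0, NA, 5))

# 0 -> NA -> -1, so the pre-existing NA is converted as well
via_na <- mutate(df, across(2, ~ replace_na(na_if(.x, 0), -1)))

# %in% never returns NA, so only true zeros are replaced
direct <- mutate(df, across(2, ~ replace(.x, .x %in% 0, -1)))
```

With the question's data there are no missing values, so both forms give the same result there.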

【Comments】: