【Question title】: Adding columns to data frames in double for loop
【Posted】: 2022-01-14 04:31:22
【Question】:

I have the following setup:

library(tibble)

df_names <- c("df1", "df2", "df3")
df1 <- tibble("1" = "hallo")
df2 <- tibble("1" = "hallo")
df3 <- tibble("1" = "hallo")
missing_columns <- c("2", "3")

My goal is to add the columns listed in missing_columns to each data frame.

I tried:

for(i in df_names){
  
  for(j in missing_columns){
    
    get(i)[, j] <- ""
    
  }
  
}

Error in get(i) <- `*vtmp*` : could not find function "get<-"

for(i in df_names){
  
  for(j in missing_columns){
    
    assign(get(i)[, j], "")
    
  }
  
}

Error: Can't subset columns that don't exist.
x Column `2` doesn't exist.

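Of course column 2 doesn't exist; that's why I want to add it.

【Comments】:

  • Just use df1[["2"]] <- "foo"
  • That doesn't work for me, because I'm writing something where the number of columns to add and the number of data frames both vary. That's why I have a vector of data frame names and a vector of missing columns, and why I thought a for loop might be the right choice.

Tags: r for-loop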


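【Solution 1】:

You need to assign the modified objects back to the global environment so that the changes are still visible after the code has run: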

library(tidyverse)

df_names <- c("df1", "df2", "df3")
df1 <- tibble("1" = "hallo")
df2 <- tibble("1" = "hallo")
df3 <- tibble("1" = "hallo")
missing_columns <- c("2", "3")

df1
#> # A tibble: 1 x 1
#>   `1`  
#>   <chr>
#> 1 hallo
df2
#> # A tibble: 1 x 1
#>   `1`  
#>   <chr>
#> 1 hallo

expand_grid(
  col = missing_columns,
  df = df_names
) %>%
  mutate(
    new_df = map2(col, df, ~ {
      res <- get(.y)
      res[[.x]] <- "foo"
      assign(.y, res, envir = globalenv())
    })
  )
#> # A tibble: 6 x 3
#>   col   df    new_df          
#>   <chr> <chr> <list>          
#> 1 2     df1   <tibble [1 × 2]>
#> 2 2     df2   <tibble [1 × 2]>
#> 3 2     df3   <tibble [1 × 2]>
#> 4 3     df1   <tibble [1 × 3]>
#> 5 3     df2   <tibble [1 × 3]>
#> 6 3     df3   <tibble [1 × 3]>

df1
#> # A tibble: 1 x 3
#>   `1`   `2`   `3`  
#>   <chr> <chr> <chr>
#> 1 hallo foo   foo
df2
#> # A tibble: 1 x 3
#>   `1`   `2`   `3`  
#>   <chr> <chr> <chr>
#> 1 hallo foo   foo

Created on 2021-12-09 by the reprex package (v2.0.1)
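For comparison, here is a minimal sketch of the double for loop from the question, repaired with the same get/modify/assign idea. The first attempt fails because R rewrites `get(i)[, j] <- ""` into a call to a replacement function named `get<-`, which does not exist; the second fails because `assign()` expects the object name as a string, while `get(i)[, j]` is evaluated first and tries to read a column that is not there yet. Two assumptions: base data.frames with `check.names = FALSE` stand in for the tibbles so the sketch needs no packages, and `tmp` is just an illustrative variable name.

```r
# Setup as in the question, but with base data.frames (assumption: used
# here only to avoid a package dependency; check.names = FALSE keeps "1").
df_names <- c("df1", "df2", "df3")
df1 <- data.frame("1" = "hallo", check.names = FALSE)
df2 <- data.frame("1" = "hallo", check.names = FALSE)
df3 <- data.frame("1" = "hallo", check.names = FALSE)
missing_columns <- c("2", "3")

for (i in df_names) {
  tmp <- get(i)              # fetch a copy of the data frame named by i
  for (j in missing_columns) {
    tmp[[j]] <- ""           # add the column to the copy
  }
  assign(i, tmp)             # rebind the name i to the modified copy
}

names(df1)
```

Run at top level, `assign(i, tmp)` writes into the global environment; inside a function you would likely need to pass `envir` explicitly, as the answer above does with `globalenv()`.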


    【Solution 2】:

    Depending on what your final goal is, this approach may also work for you.

    df_names <- c("df1", "df2", "df3")
    # note the small change in sample data
    df1 <- tibble("1" = "hallo")
    df2 <- tibble("2" = "hallo")
    df3 <- tibble("3" = "hallo")
    
    # I suggest working with the required columns; any that are absent are added as NA
    required <- c("1", "2", "3")
    
    dfs <- lapply(df_names, function(df) {
      t <- get(df)
      t[setdiff(required, names(t))] <- NA
      t
    })
    
    dfs
    
    [[1]]
    # A tibble: 1 x 3
      `1`   `2`   `3`  
      <chr> <lgl> <lgl>
    1 hallo NA    NA   
    
    [[2]]
    # A tibble: 1 x 3
      `2`   `1`   `3`  
      <chr> <lgl> <lgl>
    1 hallo NA    NA   
    
    [[3]]
    # A tibble: 1 x 3
      `3`   `1`   `2`  
      <chr> <lgl> <lgl>
    1 hallo NA    NA   
    
    # if you want to combine the data anyhow
    do.call("rbind", dfs)
    
    # A tibble: 3 x 3
      `1`   `2`   `3`  
      <chr> <chr> <chr>
    1 hallo NA    NA   
    2 NA    hallo NA   
    3 NA    NA    hallo
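If the data frames can stay together in a list, the name lookups can be reduced to a single `mget()` call, which returns a list named after the variables. The sketch below applies that to the original question (fixed `missing_columns`, filled with `""`). It is one possible variant rather than the answerer's code, and it uses base data.frames with `check.names = FALSE` in place of tibbles purely to avoid a package dependency.

```r
# Setup as in the question, with base data.frames standing in for tibbles.
df_names <- c("df1", "df2", "df3")
df1 <- data.frame("1" = "hallo", check.names = FALSE)
df2 <- data.frame("1" = "hallo", check.names = FALSE)
df3 <- data.frame("1" = "hallo", check.names = FALSE)
missing_columns <- c("2", "3")

# Collect the data frames into a list whose names are the variable names.
dfs <- mget(df_names)

# Add every missing column to every data frame in one pass.
dfs <- lapply(dfs, function(d) {
  d[missing_columns] <- ""   # creates the new columns, filled with ""
  d
})

names(dfs$df2)
```

Individual results are then available as `dfs$df1`, `dfs[["df2"]]`, and so on. If separate variables are still needed afterwards, `list2env(dfs, envir = globalenv())` would write them back under their original names.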
    
