【问题标题】:tidyr unnest with purrr::map2 shows error Error: object 'model' not found带有 purrr::map2 的 tidyr unnest 显示错误错误:找不到对象“模型”
【发布时间】:2020-06-16 09:13:08
【问题描述】:

运行以下代码时出现错误Error: object 'model' not found。有没有办法使用另一个库或以不同的方式使用代码来解决问题?

在显示错误之前,下面的代码是可运行的。它旨在运行 k 折验证。

K_folds_regression_function <- function(inputData, col_pos_Y, col_pos_Xn){
  # ------------------------------- User Inputs -------------------------------------------------------------------------------------------------------
  # inputFile <- 'Input_Data-MLR.csv'   #   <-- provide file name .csv
  # col_pos_Y <-  11  # position of the Y variable
  # col_pos_Xn <- 14  # position of LAST Xs

  # ------------------------------- libraries --------------------------------------------------------------------------------------------- 
  library(modelr) ; library(dplyr) ; library(purrr) ; library(ggplot2)
  library(tidyr) ; library(broom) ; library(tibble)
  library(modelr)
  library(broom)

  my_y_x_data <- inputData[   , c(1,col_pos_Y:col_pos_Xn)]; head(my_y_x_data)
  no_of_vars <- dim (my_y_x_data)[2] - 1  # including x and y and excluding the 1st ID column
  no_of_xs <- no_of_vars - 1  # only Xs

  set.seed(6)

  colnames(my_y_x_data)[2] <- 'yy'   # making sure that 2nd column which is a depedent variable is named 'yy'
  k_folds_regr <- crossv_kfold(data = my_y_x_data, k = 10)    # k is the no of folds, in this case 10 folds. around 7-11 folds will have minimum error, uually 10 is used

  colnames(my_y_x_data)[2] <- 'yy'   # making sure that 2nd column which is a dependent variable is named 'yy'
  head(my_y_x_data)
  library(tidyr)
  # fitting models
  k_folds_regr <- k_folds_regr %>% mutate(                                                          
    model = map(     train, ~ lm( yy ~ . , data = my_y_x_data[, -1])     ))
  # k_fold_regr %>% mutate --> adds a new column       # map (train, ) -> applys a function to each rows of train data

  interim_regression_result_table <- k_folds_regr  %>% unnest(
    map2(model, test, ~augment(.x, newdata = .y) )   # Error happens here 
  )
}

示例数据:

Sites,SPM,Total_Rrs
WMS04,36,1.428096339
WMS05,36,1.468391179
WMS06,35,1.480664998
WMS07,30,1.563063147
WMS09,28.25,1.575149555
WMS13,29.5,1.527970637
WMS14,36.75,1.697142346
WMS15,54.5,1.701675753
WMS16,61.75,1.647421816
WMS17,46.5,1.734592117
WMS18,43.5,1.658748318
WMS19,44,1.56844909
WMS20,46.25,1.523165904
WMS22,46.5,1.727354533
WMS23,45.75,1.583000425
WMS24,43.5,1.679314139
WMS25,40.5,1.583000425
WMS26,16,1.722078046
WMS27,68.5,1.356190243
WMS29,44.25,1.65871572
WMS30,21.5,1.55141119
WMS32,49,1.549374723
WMS33,49.75,1.623284831
WMS34,47.5,1.716614557
WMS35,55.25,1.693683697

用法:

inputFile = 'path/to/csv.csv'
inputData <- read.csv(inputFile, header=TRUE)
K_folds_regression_function(inputData, 2, 3)

【问题讨论】:

  • 你能提供一个reprex吗?
  • 对不起,我是 R 的新手。

标签: r tidyr purrr


【解决方案1】:

似乎map2 不喜欢它被传递一个列表的事实。 Base R 的mapply 不那么挑剔。

请注意,在函数中使用library 并不是一个好习惯——它可能会给函数用户带来意想不到的副作用,因为加载了用户可能不想要的包,因此某些函数可能已被覆盖.最好使用命名空间限定符package::function

对于同一个包,您还多次调用 library,所以我整理了函数以使用命名空间限定符而不是 library 调用:

K_folds_regression_function <- function(inputData, col_pos_Y, col_pos_Xn) 
{
  `%>%`       <- dplyr::`%>%`
  my_y_x_data <- inputData[ , c(1,col_pos_Y:col_pos_Xn)]
  no_of_vars  <- dim(my_y_x_data)[2] - 1  
  no_of_xs    <- no_of_vars - 1

  set.seed(6)

  k_folds_regr <- modelr::crossv_kfold(data = my_y_x_data, k = 10)  
  colnames(my_y_x_data)[2] <- 'yy'  

  k_folds_regr <- k_folds_regr %>% 
    dplyr::mutate(model = purrr::map(train, ~ lm(yy~. ,data = my_y_x_data[, -1])))

  mapply(function(.x, .y) broom::augment(.x, newdata = .y),
         k_folds_regr$model, k_folds_regr$test, SIMPLIFY = FALSE)
}

现在当我们运行我们得到的函数时:

K_folds_regression_function(inputData, 2, 3)
#> $`1`
#> # A tibble: 3 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS04  36        1.43    42.1    4.86
#> 2 WMS18  43.5      1.66    42.3    2.81
#> 3 WMS23  45.8      1.58    42.2    2.46
#> 
#> $`2`
#> # A tibble: 3 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS14  36.8      1.70    42.3    3.38
#> 2 WMS17  46.5      1.73    42.3    4.07
#> 3 WMS29  44.2      1.66    42.3    2.81
#> 
#> $`3`
#> # A tibble: 3 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS06  35        1.48    42.1    3.80
#> 2 WMS13  29.5      1.53    42.2    3.01
#> 3 WMS34  47.5      1.72    42.3    3.72
#> 
#> $`4`
#> # A tibble: 3 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS07  30        1.56    42.2    2.59
#> 2 WMS09  28.2      1.58    42.2    2.51
#> 3 WMS32  49        1.55    42.2    2.73
#> 
#> $`5`
#> # A tibble: 3 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS20  46.2      1.52    42.2    3.08
#> 2 WMS25  40.5      1.58    42.2    2.46
#> 3 WMS35  55.2      1.69    42.3    3.32
#> 
#> $`6`
#> # A tibble: 2 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS24  43.5      1.68    42.3    3.09
#> 2 WMS33  49.8      1.62    42.3    2.49
#> 
#> $`7`
#> # A tibble: 2 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS26  16        1.72    42.3    3.83
#> 2 WMS27  68.5      1.36    42.0    6.44
#> 
#> $`8`
#> # A tibble: 2 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS16  61.8      1.65    42.3    2.68
#> 2 WMS22  46.5      1.73    42.3    3.93
#> 
#> $`9`
#> # A tibble: 2 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS15  54.5      1.70    42.3    3.46
#> 2 WMS30  21.5      1.55    42.2    2.71
#> 
#> $`10`
#> # A tibble: 2 x 5
#>   Sites    yy Total_Rrs .fitted .se.fit
#>   <fct> <dbl>     <dbl>   <dbl>   <dbl>
#> 1 WMS05    36      1.47    42.1    4.04
#> 2 WMS19    44      1.57    42.2    2.55

【讨论】:

  • 感谢您的回答。是否可以使输出像k_folds_regr 这样的数据帧
  • 是的,@wondim 您可以将mapply 结果存储在my_list 中,然后do.call(rbind, my_list) 将返回单个数据框。
猜你喜欢
  • 1970-01-01
  • 2020-10-06
  • 2018-12-09
  • 2015-06-02
  • 1970-01-01
  • 2020-02-08
  • 2017-09-17
  • 1970-01-01
  • 1970-01-01
相关资源
最近更新 更多