【Question title】: How can I correct the mutate and filter errors in my R function
【Posted】: 2021-06-26 01:20:10
【Question】:

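I have a function that takes a data frame and two other variables (horse and race_date) as input. horse and race_date are used to filter the data frame passed to the function, and a summarise function is then applied to compute the desired output. When I test the function on its own, outside a pipe, everything works. When I try to run it from within mutate in a pipe, I get the following error message: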

Error: Problem with `mutate()` input `split_Lt`.
x Problem with `filter()` input `..1`.
x Input `..1` must be of size 1, not size 18.
i Input `..1` is `Horse == horse & NewSplit == "LT Races" & race_date < date`.
i The error occurred in group 2: split = "A BIT OF BOTH_var106_Track: CD".
i Input `split_Lt` is `getsplit_LT(splits, horse, race_date)`.
i The error occurred in group 2: split = "A BIT OF BOTH_var106_Track: CD".

The function is as follows:

getsplit_LT <- function(df, horse, date){

  kpi <- df %>% 
    filter(Horse == horse & NewSplit == "LT Races" & race_date < date) %>% 
    group_by(split) %>% 
    summarise_if(is.numeric, sum) %>% 
    mutate(TopAvgB = ((E + 3.439) /(R+3.439 + 25.69))) %>% 
    select(TopAvgB) 
    
  x = if(is.data.frame(kpi) && nrow(kpi)==0){0}else{kpi[[1]]}
   
  return(x)
 
}

This is the code I am trying to run:

df <- df %>%  
  mutate(split_Lt = getsplit_LT(splits, horse, race_date))

Here is the input data:

structure(list(horse = c("A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", 
"A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", 
"A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", 
"A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH", 
"A BIT OF BOTH", "A BIT OF BOTH", "A BIT OF BOTH"), race_date = structure(c(17802, 
17906, 17941, 17969, 18006, 18062, 18091, 18183, 18183, 18226, 
18244, 18286, 18454, 18502, 18546, 18581, 18601, 18664), class = "Date")), row.names = c(NA, 
-18L), groups = structure(list(horse = "A BIT OF BOTH", .rows = structure(list(
    1:18), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", 
"list"))), row.names = 1L, class = c("tbl_df", "tbl", "data.frame"
), .drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
))
structure(list(split = c("A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var102B_LifeTime: Life", 
"A BIT OF BOTH_var102B_LifeTime: Life", "A BIT OF BOTH_var106_Track: CD", 
"A BIT OF BOTH_var106_Track: CT", "A BIT OF BOTH_var106_Track: DE", 
"A BIT OF BOTH_var106_Track: FG", "A BIT OF BOTH_var106_Track: GP", 
"A BIT OF BOTH_var106_Track: GP", "A BIT OF BOTH_var106_Track: GP", 
"A BIT OF BOTH_var106_Track: GP", "A BIT OF BOTH_var106_Track: GP", 
"A BIT OF BOTH_var106_Track: GP", "A BIT OF BOTH_var106_Track: GP", 
"A BIT OF BOTH_var106_Track: GP", "A BIT OF BOTH_var106_Track: KE", 
"A BIT OF BOTH_var106_Track: MT", "A BIT OF BOTH_var106_Track: MT", 
"A BIT OF BOTH_var106_Track: OT", "A BIT OF BOTH_var106_Track: PX", 
"A BIT OF BOTH_var106_Track: PX", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Dirt", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Dirt", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Dirt", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Dirt", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Dirt", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Dirt", "A BIT OF BOTH_var107_Surface: Dirt", 
"A BIT OF BOTH_var107_Surface: Synth", "A BIT OF BOTH_var107_Surface: Turf", 
"A BIT OF BOTH_var107_Surface: Turf", "A BIT OF BOTH_var107_Surface: Turf"

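【Comments】:

  • Did you mean to pass the lowercase horse column of df into your function along with the splits data frame? That seems odd. Could you use dput() to post just a few rows of sample input in your question, so it can be copied and pasted, and show the desired result for those rows? A small example would be clearer and nicely self-contained.
  • @GregorThomas Yes, I meant to pass the lowercase horse. I will try dput(); this is the first time I have heard of it. Thanks.
  • For example, dput(df[1:10, ]) gives a copy/pasteable version of the first 10 rows of df, including all structure and class information. It is the preferred way to post sample R data.
  • By the way, the data.frame in your GitHub repository has non-standard Unicode whitespace: A BIT OF BOTH\u00a0 instead of ` `.
  • @IanCampbell Thanks, I have provided the dput() data.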
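
The whitespace issue raised in the comments can be handled before any matching on horse names. The following is a minimal sketch in base R, assuming the stray character is the non-breaking space U+00A0; the vector horse_raw is hypothetical and only stands in for the real column.

```r
# Hypothetical horse names: the first uses non-breaking spaces (U+00A0),
# the second uses ordinary spaces
horse_raw <- c("A\u00a0BIT\u00a0OF\u00a0BOTH", "A BIT OF BOTH")

# Replace every non-breaking space with an ordinary space
horse_clean <- gsub("\u00a0", " ", horse_raw, fixed = TRUE)

# Before cleaning the two names are different strings; afterwards they match
names_match_before <- horse_raw[1] == horse_raw[2]
names_match_after  <- horse_clean[1] == horse_clean[2]
```

Without a step like this, a comparison such as Horse == horse can silently return FALSE for names that look identical when printed.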

Tags: r dplyr tidyverse magrittr


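【Solution 1】:

The error arises because mutate() hands getsplit_LT() the entire horse and race_date columns (18 values each), while the filter() inside the function expects a single horse and a single date. One approach is to use the purrr::pmap function, which applies a function to each row of a data.frame.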

library(tidyverse)
pmap(df, ~ getsplit_LT(splits, horse = .x, date = .y))
[[1]]
[1] 0.2156712

[[2]]
[1] 0

[[3]]
[1] 0.1070373

[[4]]
[1] 0.1339914

[[5]]
[1] 0.1593659
...
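
In the call above, .x and .y refer to the first and second columns of df by position, so the result depends on column order. A small sketch of the difference between positional and named matching follows; the data frame toy and its columns are hypothetical and not part of the original answer.

```r
library(purrr)

# Hypothetical two-column table standing in for df
toy <- data.frame(
  horse = c("A", "B"),
  n = c(10, 20),
  stringsAsFactors = FALSE
)

# Positional: .x is the first column of each row, .y the second
by_position <- pmap_chr(toy, ~ paste(.x, .y))

# Named: pmap() matches columns to argument names, so reordering
# the columns does not change the result
by_name <- pmap_chr(toy[, c("n", "horse")], function(horse, n) paste(horse, n))
```

Applied to the question, the named form would look like pmap(df, function(horse, race_date) getsplit_LT(splits, horse, race_date)), assuming df contains exactly those two columns.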

Or, to attach the result to the original data.frame:

bind_cols(df,kpi = pmap_dbl(df, ~ getsplit_LT(splits, horse = .x, date = .y)))
# A tibble: 18 x 3
   horse         race_date    kpi
   <chr>         <date>     <dbl>
 1 A BIT OF BOTH 2020-09-28 0.216
 2 A BIT OF BOTH 2020-01-10 0    
 3 A BIT OF BOTH 2020-02-14 0.107
 4 A BIT OF BOTH 2020-03-14 0.134
 5 A BIT OF BOTH 2020-04-20 0.159
 6 A BIT OF BOTH 2020-06-15 0.183
 7 A BIT OF BOTH 2020-07-14 0.227
...
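
An alternative that stays inside the original mutate() pipe is to wrap the call in purrr::map2_dbl, which invokes the function once per pair of values. This is a sketch on made-up data, not the answerer's code: lookup, queries, and sum_before are hypothetical stand-ins for splits, df, and getsplit_LT.

```r
library(dplyr)
library(purrr)

# Hypothetical lookup table standing in for splits
lookup <- tibble(
  Horse = c("A", "A", "B"),
  race_date = as.Date(c("2020-01-01", "2020-02-01", "2020-01-15")),
  E = c(1, 2, 3)
)

# Scalar helper in the spirit of getsplit_LT: expects ONE horse and ONE date
sum_before <- function(tbl, horse, date) {
  hits <- tbl %>% filter(Horse == horse, race_date < date)
  sum(hits$E)
}

# Hypothetical query table standing in for df
queries <- tibble(
  horse = c("A", "A", "B"),
  race_date = as.Date(c("2020-01-15", "2020-03-01", "2020-01-01"))
)

# map2_dbl() feeds the helper one (horse, race_date) pair at a time,
# so filter() only ever sees length-1 values
result <- queries %>%
  mutate(total = map2_dbl(horse, race_date, ~ sum_before(lookup, .x, .y)))
```

With the real objects, the equivalent call would be mutate(split_Lt = map2_dbl(horse, race_date, ~ getsplit_LT(splits, .x, .y))), assuming getsplit_LT returns a single number for every row.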

Data:

splits <- read_csv("https://raw.githubusercontent.com/Handicappr/Rstudio_test_project/main/splits.csv")
df <- read_csv("https://raw.githubusercontent.com/Handicappr/Rstudio_test_project/main/df.csv")
splits %>% mutate(race_date = as.Date(race_date,"%m/%d/%y")) -> splits
df %>% mutate(race_date = as.Date(race_date,"%m/%d/%y")) -> df
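
One thing worth checking in the date conversion: the format code %y reads a two-digit year, while %Y reads a four-digit year. The printed results above all show 2020, whereas the dput() data in the question spans 2018 to 2021, which would be consistent with four-digit years being parsed with %y. Whether that applies depends on how the CSV files store their dates, which is an assumption here. A minimal sketch of the difference, using made-up date strings:

```r
# Hypothetical date string with a four-digit year
raw_date <- "9/28/2018"

# %y consumes only the two digits "20", giving the year 2020
parsed_two_digit <- as.Date(raw_date, format = "%m/%d/%y")

# %Y consumes all four digits, giving the year 2018
parsed_four_digit <- as.Date(raw_date, format = "%m/%d/%Y")

format(parsed_two_digit)   # "2020-09-28"
format(parsed_four_digit)  # "2018-09-28"
```

If the files do use four-digit years, switching the format string to "%m/%d/%Y" in both conversions would keep the original years.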
