【Question title】: How do you use approx() inside of mutate_at()?
【Posted】: 2017-02-02 22:57:06
【Question description】:

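I'm having trouble getting approx() to work inside mutate_at(). I did manage to get what I want with one very long mutate() call, but for future reference I'd like to know whether there is a more elegant, less copy-and-paste way to do this with mutate_at().

The overarching problem is merging a dataset with data at 1-year intervals into a dataset with data at 3-year intervals, and interpolating the years for which the 3-year dataset has no data. There are missing values between the observed years, plus one year (2016) past the last observation that requires some form of extrapolation.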
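To make the goal concrete, here is a minimal sketch of what approx() does to a single series, assuming a made-up vector (`years` and `values` are hypothetical and not part of the data below). It drops the NA pairs, interpolates linearly between the known points, and with rule = 2 repeats the nearest known value outside their range, which is constant extension rather than true extrapolation.

```r
# Hypothetical single series: values observed every 3 years, NA elsewhere.
years  <- 2000:2007
values <- c(10, NA, NA, 16, NA, NA, 22, NA)

# approx() ignores the NA pairs and interpolates between the known points.
# rule = 2 fills years outside the observed range (here 2007) with the
# nearest observed value instead of returning NA.
filled <- approx(x = years, y = values, xout = years,
                 rule = 2, method = "linear")[["y"]]

print(filled)
```

The same call, applied per group and per column, is what the long mutate() below spells out by hand.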

library("tidyverse")

demodf <- data.frame(groupvar = letters[rep(1:15, each = 6)],
                     timevar = c(2000, 2003, 2006, 2009, 2012, 2015),
                     x1 = runif(n = 90, min = 0, max = 3),
                     x2 = runif(n = 90, min = -1, max = 4),
                     x3 = runif(n = 90, min = 1, max = 12),
                     x4 = runif(n = 90, min = 0, max = 30),
                     x5 = runif(n = 90, min = -2, max = 5),
                     x6 = runif(n = 90, min = 20, max = 50),
                     x7 = runif(n = 90, min = 1, max = 37),
                     x8 = runif(n = 90, min = 0.3, max = 0.5))

demotbl <- as_tibble(demodf)  # tbl_df() is deprecated in newer dplyr versions

masterdf <- data.frame(groupvar = letters[rep(1:15, each = 17)],
                      timevar = 2000:2016,
                      z1 = runif(n = 255, min = 0, max = 1E6))

mastertbl <- as_tibble(masterdf)  # tbl_df() is deprecated in newer dplyr versions

joineddemotbls <- mastertbl %>% left_join(demotbl, by = c("groupvar", "timevar"))

View(joineddemotbls)

joineddemotblswithinterpolation <- joineddemotbls %>% group_by(groupvar) %>%
  mutate(x1i = approx(timevar, x1, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x2i = approx(timevar, x2, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x3i = approx(timevar, x3, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x4i = approx(timevar, x4, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x5i = approx(timevar, x5, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x6i = approx(timevar, x6, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x7i = approx(timevar, x7, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]],
         x8i = approx(timevar, x8, timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]])

View(joineddemotblswithinterpolation)

# this is what I want

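That works fine. But I have tried all of the following mutate_at() variations and could not get any of them to work. I'm sure there is a syntax error somewhere...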

joineddemotblswithinterpolation2 <- joineddemotblswithinterpolation %>% group_by(groupvar) %>%
  mutate_at(vars(x1, x2, x3, x4, x5, x6, x7, x8), approx(timevar, ., timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]])

# error

joineddemotblswithinterpolation2 <- joineddemotblswithinterpolation %>% group_by(groupvar) %>%
  mutate_at(vars(x1, x2, x3, x4, x5, x6, x7, x8), approxfun(timevar, ., timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]])

# error

joineddemotblswithinterpolation2 <- joineddemotblswithinterpolation %>% group_by(groupvar) %>%
  mutate_at(vars(x1, x2, x3, x4, x5, x6, x7, x8), funs(approxfun(timevar, ., timevar, rule = 2, f = 0, ties = mean, method = "linear")[["y"]]))

# error

joineddemotblswithinterpolation2 <- joineddemotblswithinterpolation %>% group_by(groupvar) %>%
  mutate_at(vars(x1, x2, x3, x4, x5, x6, x7, x8), funs(approxfun(timevar, ., rule = 2, f = 0, ties = mean, method = "linear")[["y"]]))

I even tried na.approx(), but to no avail...

library("zoo")
joineddemotblswithinterpolation2 <- joineddemotblswithinterpolation %>% group_by(groupvar) %>%
  mutate_at(vars(x1, x2, x3, x4, x5, x6, x7, x8), na.approx(., timevar, na.rm = FALSE))

I built these different attempts from the following related questions:

Using approx in dplyr

Linear Interpolation using dplyr

Using approx() with groups in dplyr

linear interpolation with dplyr but skipping groups with all missing values

R: Interpolation of NAs by group

Thanks for your help!

【Comments】:

    Tags: r dplyr tidyverse


    【Solution 1】:

    You were close. This works for me:

    joineddemotblswithinterpolation <- joineddemotbls %>%
      group_by(groupvar) %>%
      mutate_at(vars(starts_with("x")), # easier than listing each column separately
                funs("i" = approx(timevar, ., timevar, rule = 2, f = 0, ties = mean,
                                  method = "linear")[["y"]]))
    

    This creates new columns x1_i, x2_i, etc. that hold the interpolated values.
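
    The attempts in the question most likely failed because mutate_at() expects a function (or a funs()/lambda wrapper), while a bare approx(timevar, ., ...) call is evaluated immediately, outside the grouped data. The approxfun() variants have a second problem: approxfun() returns a function rather than a list, so there is no [["y"]] element to extract. Since funs() is deprecated in newer dplyr releases, the same idea can be sketched with across(), assuming dplyr 1.0.0 or later. The `toy` tibble is a hypothetical stand-in for joineddemotbls, and f = 0 is dropped because it only affects method = "constant".

    ```r
    library(dplyr)

    # Hypothetical stand-in for joineddemotbls: two groups, yearly rows,
    # x columns observed only in some years.
    toy <- tibble(
      groupvar = rep(c("a", "b"), each = 5),
      timevar  = rep(2000:2004, times = 2),
      x1 = c(0, NA, NA, 3, NA, 10, NA, NA, 40, NA),
      x2 = c(1, NA, NA, 7, NA, 5, NA, NA, 2, NA)
    )

    toy_interpolated <- toy %>%
      group_by(groupvar) %>%
      mutate(across(
        starts_with("x"),
        # .x is the current column; timevar is the current group's years.
        ~ approx(timevar, .x, timevar, rule = 2, ties = mean,
                 method = "linear")[["y"]],
        .names = "{.col}_i"   # same naming as the funs("i" = ...) version
      )) %>%
      ungroup()

    print(toy_interpolated)
    ```

    The original x columns are kept and the interpolated values land in x1_i and x2_i, matching the output of the mutate_at() version above.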

    【Comments】:

    • Beautiful! Thank you!