IF ELSE generator based on values: answers

【Question title】: IF ELSE Generator based on values
【Posted】: 2019-11-20 07:10:35
【Question description】:

I have a data frame that holds the if/else conditions in its Region, minage, and maxage columns. See df below.

   Seasoning Region minage maxage
1 mths:36-47      A     36     47
2 mths:24-35      A     24     35
3 mths:12-23      A     12     23
4 mths:36-47      B     36     47
5 mths:24-35      B     24     35
6 mths:12-23      B     12     23

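I want to generate the following IF ELSE conditions automatically inside a function, because my actual dataset has more than 40-50 conditions. In short, I don't want to type the if/else conditions by hand. The format of each if condition is: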

(seasoning >= minage) & (seasoning <= maxage) & (Region == value_Region_column)

Function

bx_is <- function(seasoning = NULL, Region = NULL) {
  Bx = if ((seasoning >= 36) & (seasoning <= 47) & (Region == 'A')) {trans1} 
  else if ((seasoning >= 24) & (seasoning <= 35) & (Region == 'A')) {trans2}
  else if ((seasoning >= 12) & (seasoning <= 23) & (Region == 'A')) {trans3}
  else if ((seasoning >= 36) & (seasoning <= 47) & (Region == 'B')) {trans4}
  else if ((seasoning >= 24) & (seasoning <= 35) & (Region == 'B')) {trans5}
  else if ((seasoning >= 12) & (seasoning <= 23) & (Region == 'B')) {trans6}
  return(data.matrix(Bx))
}

bx_is(seasoning=28, Region = 'B')

Input df

mx = structure(list(Seasoning = structure(c(3L, 2L, 1L, 3L, 2L, 1L
), .Label = c("mths:12-23", "mths:24-35", "mths:36-47"), class = "factor"), 
Region = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
minage = c(36L, 24L, 12L, 36L, 24L, 12L), maxage = c(47L, 35L, 23L, 47L, 35L, 23L)), 
class = "data.frame", row.names = c(NA, -6L))

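Note that in the bx_is() function shown above, trans1...trans6 are 6 different matrices. I want to use the matrix selected by this function's conditions inside a loop of 22M iterations. I can't apply a filter, because it requires some processing and becomes slow over 22M iterations.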

【Question comments】:

Tags: r dplyr

【Solution 1】:

Here is a solution using the dplyr package:

library(dplyr)

df = mx %>% mutate(res=paste0("trans", row_number()))

bx_is <- function(seasoning, region) {
  r = df %>% filter(Region == region, minage <= seasoning, maxage >= seasoning)
  get(r$res)
}

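It creates a data frame df with an added res column that holds the name of the result for each condition (row).

The function bx_is filters the data frame by the conditions and, for the matching row, looks up the object named by its res value.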
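As written, the lookup assumes exactly one row matches; if none does, get() receives an empty name and the error is hard to read. A minimal sketch of a guarded variant follows. The name bx_is_checked and the 2x2 stand-in matrices are assumptions made for illustration, since the real trans1...trans6 are not shown in the question.

```r
library(dplyr)

# Hypothetical stand-ins for trans1...trans6; the real matrices are not shown.
for (i in 1:6) assign(paste0("trans", i), matrix(i, nrow = 2, ncol = 2))

# Condition table from the question (the Seasoning label column is omitted).
mx <- data.frame(
  Region = rep(c("A", "B"), each = 3),
  minage = rep(c(36L, 24L, 12L), times = 2),
  maxage = rep(c(47L, 35L, 23L), times = 2),
  stringsAsFactors = FALSE
)
df <- mx %>% mutate(res = paste0("trans", row_number()))

# Same filter-based lookup, but it stops unless exactly one row matches.
bx_is_checked <- function(seasoning, region) {
  r <- df %>% filter(Region == region, minage <= seasoning, maxage >= seasoning)
  if (nrow(r) != 1) {
    stop("expected exactly one matching row, found ", nrow(r))
  }
  get(r$res)
}

bx_is_checked(seasoning = 28, region = "B")  # the trans5 stand-in
```

The function argument is called region (lower case) so that it does not collide with the Region column inside filter().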

Edit: since this is slower than using if/else statements (even with data.table), we can generate the same code and use it:

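# Build the source text of bx_is() from the rows of df.
f <- function() {
  s = "bx_is <- function(seasoning = NULL, Region = NULL) {"

  for (i in seq_len(nrow(df))) {
    # One condition per row: age range plus region, yielding that row's matrix name.
    r = paste0("((seasoning >= ",
               df[i, ]$minage,
               ") & (seasoning <= ",
               df[i, ]$maxage,
               ') & (Region == "',
               df[i, ]$Region,
               '")) {',
               df[i, ]$res,
               "}")
    if (i == 1) {
      s = paste0(s, " Bx = if ", r)
    } else {
      s = paste0(s, "\n else if ", r)
    }
  }

  s = paste0(s, "\n return(data.matrix(Bx))}")
  s
}
# Parse and evaluate the generated text; this defines bx_is().
eval(parse(text=f()))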
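The same generation step can also be sketched without a loop, because paste0 is vectorized over the columns. The version below uses base R only and is an illustration, not the answer's exact code: it uses && (assuming scalar inputs), returns the function from eval instead of assigning inside the generated text, and uses hypothetical 2x2 stand-ins for trans1...trans6.

```r
# Hypothetical stand-ins for trans1...trans6; the real matrices are not shown.
for (i in 1:6) assign(paste0("trans", i), matrix(i, nrow = 2, ncol = 2))

# Condition table from the question, plus the name of each row's result.
mx <- data.frame(
  Region = rep(c("A", "B"), each = 3),
  minage = rep(c(36L, 24L, 12L), times = 2),
  maxage = rep(c(47L, 35L, 23L), times = 2),
  stringsAsFactors = FALSE
)
mx$res <- paste0("trans", seq_len(nrow(mx)))

# One "if (...) {transN}" branch per row, built in a single vectorized call.
branches <- paste0(
  "if ((seasoning >= ", mx$minage,
  ") && (seasoning <= ", mx$maxage,
  ") && (Region == \"", mx$Region, "\")) {", mx$res, "}"
)

# Chain the branches with "else" and wrap them in a function definition.
src <- paste0(
  "function(seasoning, Region) {\n  Bx <- ",
  paste(branches, collapse = "\n  else "),
  "\n  data.matrix(Bx)\n}"
)

bx_is <- eval(parse(text = src))
cat(src, "\n")                        # inspect the generated source
bx_is(seasoning = 28, Region = "B")   # the trans5 stand-in
```

Printing src before evaluating it is a cheap way to check the generated chain against the hand-written version in the question.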

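【Comments】:

trans1...trans6 are 6 different matrices. I want to use the matrix selected by this function inside a loop of 22M iterations.
I just edited: you can fetch the matrix by name with get(r$res). For example, get("trans1") is trans1.
I can't apply a filter, because it requires some processing and becomes slow over 22M iterations.
I don't think filtering is slow; it shouldn't really be any faster with all the conditions written by hand (you only have 50 rows, and with dplyr this processing is done in C++). It is your R for loop over that many elements that makes it slow.
Thanks. I tested with and without filtering. Filtering is at least 10 times slower. The IF conditions don't need to process anything.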