Looping over lists using purrr::map produces an error

【Question title】: Looping over lists using purrr::map producing error
【Posted】: 2019-06-11 01:40:37
【Question description】:

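Problem

I have a list of historical tax rates and a vector of taxable incomes, and I need to combine them to calculate the tax liability for each income level in each year. When I iterate over the historical tax rates and the incomes, I get this error message: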

Error: Argument 2 can't be a list containing data frames

I would appreciate any suggestions on how to modify the data or the function call (below) so that the iteration completes.

Data

pit_sch <- list(`2016` = structure(list(id = c("2016", "2016", "2016", "2016"
), hh_exp_def = c(0.989, 0.989, 0.989, 0.989), `Taxable income` = c("$18,201 – $37,000", 
"$37,001 – $80,000", "$80,001 – $180,000", "$180,001 and over"
), `Tax on this income` = c("19c for each $1 over $18200", "$3572 plus 32.5c for each $1 over $37000", 
"$17547 plus 37c for each $1 over $80000", "$54547 plus 45c for each $1 over $180000"
), cumm_tax_amt = c(0, 3572, 17547, 54547), tax_rate = c(19, 
32.5, 37, 45), threshold = c(18200, 37000, 80000, 180000), real_threshold = c(18402.4266936299, 
37411.5267947422, 80889.7876643074, 182002.022244692), real_cumm_tax_amt = c(0, 
3611.72901921132, 17742.16380182, 55153.6905965622)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -4L)), `2017` = structure(list(
    id = c("2017", "2017", "2017", "2017"), hh_exp_def = c(1, 
    1, 1, 1), `Taxable income` = c("$18,201 – $37,000", "$37,001 – $87,000", 
    "$87,001 – $180,000", "$180,001 and over"), `Tax on this income` = c("19c for each $1 over $18200", 
    "$3572 plus 32.5c for each $1 over $37000", "$19822 plus 37c for each $1 over $87000", 
    "$54232 plus 45c for each $1 over $180000"), cumm_tax_amt = c(0, 
    3572, 19822, 54232), tax_rate = c(19, 32.5, 37, 45), threshold = c(18200, 
    37000, 87000, 180000), real_threshold = c(18200, 37000, 87000, 
    180000), real_cumm_tax_amt = c(0, 3572, 19822, 54232)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -4L)))

income <- seq(from = 1, to = 100000, by = 100)

Attempt

# Defining the function which will calculate tax liability for a given set of tax rates (in pit_sch) and income
nominial_tax_calc <- function(data, income) {
  i <-pmax(which(income >= data[, 7]))
  if (length(i) > 0) 
    return(tibble(income = income, 
                  tax = (income - data[i, 7]) * (data[i, 6] / 100) + data[i, 5]))
  else
    return(tibble(income = income, tax = 0))
}

# Function that results in the error
map(pit_sch,~map_df(income, nominial_tax_calc, data = .))

【Question comments】:

Tags: r purrr

【Solution 1】:

I think you need to make two changes to your function:

1) Use max instead of pmax

2) Wrap the tax calculation in as.numeric

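library(tibble)

nominial_tax_calc <- function(data, income) {
  # Rows whose threshold (column 7) the income has reached
  idx <- which(income >= data[, 7])
  # Test for a match before calling max(): max(integer(0)) returns -Inf
  # with a warning, so checking length() afterwards would always pass
  if (length(idx) > 0) {
    i <- max(idx)  # highest bracket reached
    return(tibble(income = income,
                  tax = as.numeric((income - data[i, 7]) * (data[i, 6] / 100) + data[i, 5])))
  } else {
    return(tibble(income = income, tax = 0))
  }
}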

Then call

library(purrr)
map(pit_sch,~map_df(income, nominial_tax_calc, data = .))
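To see why the first change matters, here is a minimal sketch using plain vectors that copy the 2016 figures from the question. The names thresholds, rates, base_tax and one_income are made up for illustration. With a single argument, pmax() returns its input unchanged, so every bracket the income has reached is selected; max() keeps only the highest one.

```r
# 2016 schedule from the question, held in plain vectors (illustrative names)
thresholds <- c(18200, 37000, 80000, 180000)
rates      <- c(19, 32.5, 37, 45)
base_tax   <- c(0, 3572, 17547, 54547)

one_income <- 50000
idx <- which(one_income >= thresholds)  # brackets reached: 1 2

many <- pmax(idx)  # still 1 2, because pmax() with one argument changes nothing
i    <- max(idx)   # 2, the highest bracket reached

# Tax for the single matching bracket
tax <- (one_income - thresholds[i]) * (rates[i] / 100) + base_tax[i]
tax  # 7797, up to floating point

# An income below the first threshold matches no bracket at all.
# max() on this empty result would return -Inf with a warning,
# so it is safer to check length() before calling max().
no_match <- which(10000 >= thresholds)
length(no_match)  # 0
```

Selecting two rows instead of one is what makes the tax column longer than the income it belongs to.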

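【Comments】:

【Solution 2】:

The problem is that the data argument is a tibble, but you index it with brackets as if it were a base R data frame. A tibble does not drop a single-cell selection to a bare value, so the result is still a one-column tibble that carries its column name, and that is what causes your trouble: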

pit_sch[["2016"]][2, 7]

# A tibble: 1 x 1
  threshold
      <dbl>
1     37000

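Convert data to a data frame on the first line of nominial_tax_calc(),
using data <- as.data.frame(data); you can then use whichever indexing syntax you prefer, and your function will run correctly.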
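The difference in indexing behaviour can be checked on a small example. This is a sketch on a made-up two-row schedule (sch_df and sch_tbl are illustrative names, not the asker's data), and it also shows two ways to pull a bare value out of a tibble without converting the whole object.

```r
library(tibble)

# Hypothetical two-row schedule for illustration
sch_df  <- data.frame(tax_rate = c(19, 32.5), threshold = c(18200, 37000))
sch_tbl <- as_tibble(sch_df)

# Base data frame: selecting one cell drops to a bare numeric value
cell_df <- sch_df[2, 2]
is.numeric(cell_df)      # TRUE

# Tibble: the same call stays a 1 x 1 tibble that keeps the column name
cell_tbl <- sch_tbl[2, 2]
is.data.frame(cell_tbl)  # TRUE
names(cell_tbl)          # "threshold"

# Extract the column first, then the element, to get a bare value
val_dollar  <- sch_tbl$threshold[2]
val_bracket <- sch_tbl[["threshold"]][2]
```

Extracting the column by name also avoids relying on column positions such as 5, 6 and 7, which silently break if the schedule gains or loses a column.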

【Comments】: