【Posted】: 2020-05-03 07:18:11
【Question】:
I would like R to calculate netincome for a given amount of Income:
panelID = c(1:50)
year= c(2001:2010)
country = "NLD"
n <- 2
library(data.table)
set.seed(123)
DT <- data.table(panelID = rep(sample(panelID), each = n),
                 country = rep(sample(country, length(panelID), replace = TRUE), each = n),
                 year = c(replicate(length(panelID), sample(year, n))),
                 some_NA = sample(0:5, 6),
                 some_NA_factor = sample(0:5, 6),
                 norm = round(runif(100)/10,2),
                 Income = round(rnorm(10,10,10),2),
                 Happiness = sample(10,10),
                 Sex = round(rnorm(10,0.75,0.3),2),
                 Age = sample(100,100),
                 Educ = round(rnorm(10,0.75,0.3),2))
DT[, uniqueID := .I] # Creates a unique ID
DT[DT == 0] <- NA
DT$Income[DT$Income < 0] <- NA
DT <- as.data.frame(DT)
Now the tax needs to be calculated as follows:
First five years (2001-2005): Income > 20 == 50%
Second five years (2006-2010): Income > 20 == 45%
I tried to write it like this:
for (i in DT$Income) {
  if (DT$Income[i] < 20 & DT$year[i] < 2006) {
    DT$netincome[i] <- DT$Income[i] - (DT$Income[i]*0.25)
  } else if (DT$Income[i] > 20 & DT$year[i] < 2006) {
    DT$netincome[i] <- DT$Income[i] - (20*0.25) - ((DT$Income[i]-20)*0.5)
  } else if (DT$Income[i] < 15 & DT$year[i] > 2005) {
    DT$netincome[i] <- DT$Income[i] - (DT$Income[i]*0.20)
  } else if (DT$Income[i] > 15 & DT$year[i] > 2005) {
    DT$netincome[i] <- DT$Income[i] - (15*0.20) - ((DT$Income[i]-15)*0.45)
  }
}
But I get the error:
Error in `$<-.data.frame`(`*tmp*`, "netincome", value = c(NA, NA, NA, :
replacement has 15 rows, data has 100
In addition, I would really like to rewrite this more concisely with sapply, but I am struggling to work out how.
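One likely reading of the error, offered as a sketch rather than a definitive diagnosis: for (i in DT$Income) iterates over the income values themselves, not over row positions, so DT$Income[i] indexes by a truncated income value. Because the netincome column does not exist yet, DT$netincome[i] <- ... builds a vector only i elements long, which cannot be assigned to a 100-row data frame. The example below loops over row positions instead. The data frame toy and its values are made up for illustration; the thresholds and rates are the ones used in the loop above. It also assumes that an income exactly at the threshold should be taxed at the lower rate, a case the original conditions leave out.

```r
# Hypothetical toy data (not the DT from the question), including an NA income.
toy <- data.frame(Income = c(10, 30, NA, 10, 30),
                  year   = c(2003, 2004, 2005, 2007, 2008))

# Pre-allocate the column so each assignment targets an existing row.
toy$netincome <- NA_real_

# Iterate over row positions, not over the income values.
for (i in seq_len(nrow(toy))) {
  inc <- toy$Income[i]
  if (is.na(inc)) next  # if() fails on NA conditions, so skip missing incomes

  # Pick the bracket parameters for the period (rates taken from the loop above).
  if (toy$year[i] < 2006) {
    thr <- 20; low <- 0.25; high <- 0.50
  } else {
    thr <- 15; low <- 0.20; high <- 0.45
  }

  # Lower rate on the part up to the threshold, higher rate on the excess.
  toy$netincome[i] <- inc - min(inc, thr) * low - max(inc - thr, 0) * high
}

toy$netincome  # expected values: 7.5, 20, NA, 8, 20.25
```

The same loop should work on the question's data by replacing toy with DT, since DT has already been converted to a data frame.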
【Discussion】:
- sapply would give up the benefits of vectorization. All of your calculations are on vectors.
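To illustrate the comment above, here is a minimal vectorized sketch that needs neither a loop nor sapply. The function name net_income and the toy data are hypothetical, introduced only for this example; the thresholds and rates are taken from the loop in the question. It assumes that an income exactly at the threshold is taxed at the lower rate.

```r
# Hypothetical helper (not from the question): two-bracket net income,
# computed on whole vectors at once.
net_income <- function(income, year) {
  early <- year < 2006                  # TRUE for 2001-2005
  thr   <- ifelse(early, 20, 15)        # bracket threshold per row
  low   <- ifelse(early, 0.25, 0.20)    # rate below the threshold
  high  <- ifelse(early, 0.50, 0.45)    # rate on the part above it
  # pmin/pmax work element-wise and propagate NA incomes as NA.
  income - pmin(income, thr) * low - pmax(income - thr, 0) * high
}

# Made-up data for illustration, including an NA income.
toy <- data.frame(Income = c(10, 30, NA, 10, 30),
                  year   = c(2003, 2004, 2005, 2007, 2008))
toy$netincome <- net_income(toy$Income, toy$year)
toy$netincome  # expected values: 7.5, 20, NA, 8, 20.25
```

On the question's data this would be called as DT$netincome <- net_income(DT$Income, DT$year). Missing incomes simply stay NA, so no special handling is needed.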
Tags: r for-loop if-statement sapply