【Posted】:2020-11-26 14:14:39
【Problem description】:
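I have grouped data with one variable that I would like to smooth within each group. If the absolute change from the previous value is small (e.g. less than 5), I consider it measurement error and therefore want to copy (roll forward) the old value. Within each group, I initialize the first measurement as the default, so I am assuming that the first observation of each group is always correct (which is debatable).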
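library(dplyr)

set.seed(5)
mydata = data.frame(group = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2),
                    year = seq(from = 2003, to = 2009, by = 1),  # recycled for both groups
                    variable = round(runif(14, min = -5, max = 15), 0))

mydata %>%
  filter(variable > 0) %>%   # keep positive measurements only
  group_by(group) %>%
  # compare each value with the previous one in its group;
  # the first row of a group is compared with itself
  mutate(smooth5 = ifelse( abs( lag(variable, n = 1, default = first(variable)) - variable ) <= 5 , variable, 5)) %>%
  select(group, year, variable, smooth5) %>%
  arrange(group)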
# A tibble: 10 x 4
# Groups:   group [2]
   group  year variable smooth5
   <dbl> <dbl>    <dbl>   <dbl>
 1     1  2004        9       9
 2     1  2005       13      13  # <- this change is |4|, thus it should use the old value 9
 3     1  2006        1       5  # <- here 13 changes to 1 is a reasonable change, should keep 1
 4     1  2008        9       5
 5     1  2009        6       6
 6     2  2003       11      11
 7     2  2004       14      14
 8     2  2007        5       5
 9     2  2008        1       1
10     2  2009        6       6
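One reading of the two annotations above is: compare each value with the previous raw value in its group, and roll the previous value forward when the absolute change is below the threshold. The following is only a minimal sketch of that reading. The data frame `obs` and the column `prev` are illustrative names, the rows are copied from the output above, and the strict `< 5` cut-off follows the wording of the question rather than the `<= 5` used in the attempt.

```r
library(dplyr)

# Rows copied from the output above (already filtered to variable > 0)
obs <- data.frame(
  group    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  year     = c(2004, 2005, 2006, 2008, 2009, 2003, 2004, 2007, 2008, 2009),
  variable = c(9, 13, 1, 9, 6, 11, 14, 5, 1, 6)
)

smoothed <- obs %>%
  group_by(group) %>%
  mutate(
    # previous raw value; the first row of a group falls back to itself
    prev    = lag(variable, default = first(variable)),
    # small change: roll the previous value forward; otherwise keep the new one
    smooth5 = if_else(abs(variable - prev) < 5, prev, variable)
  ) %>%
  ungroup()

print(smoothed)
```

Under these assumptions, row 2 becomes 9 (a change of 4 is rolled back) and row 3 stays 1 (a change of 12 is kept), which matches the two annotated rows.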
【Discussion】:
- Maybe you could try ifelse(abs(diff(c(0, variable))) >= 5, variable, lag(variable)).
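The suggestion above compares each value with the previous raw value. The question does not say whether the comparison should instead be made against the last value that was actually kept, so the following is a hedged sketch of that sequential variant. The helper `roll_forward`, its `tol` argument, and the data frame `obs` are hypothetical names introduced for illustration; the rows are copied from the question's output.

```r
library(dplyr)

# Rows copied from the question's output (already filtered to variable > 0)
obs <- data.frame(
  group    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  year     = c(2004, 2005, 2006, 2008, 2009, 2003, 2004, 2007, 2008, 2009),
  variable = c(9, 13, 1, 9, 6, 11, 14, 5, 1, 6)
)

# Hypothetical helper: walk through x and keep the last accepted value
# whenever the new value differs from it by less than `tol`.
roll_forward <- function(x, tol = 5) {
  Reduce(
    function(kept, new) if (abs(new - kept) < tol) kept else new,
    x,
    accumulate = TRUE  # return every intermediate value, not just the last
  )
}

sequential <- obs %>%
  group_by(group) %>%
  mutate(smooth5 = roll_forward(variable)) %>%
  ungroup()

print(sequential)
```

With this variant a run of small changes never drifts away from the last accepted value. In group 2, for example, the final 6 is compared with the kept 5 rather than with the raw 1, so it is rolled back to 5.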