【Question title】: How to use dplyr lag() to smooth minor changes in a variable
【Posted】: 2020-11-26 14:14:39
【Question】:

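I have grouped data and a variable that I would like to smooth within each group. If the absolute change is small (e.g. less than 5), I consider it measurement error and want to copy (roll forward) the old value. Within each group, I initialize the first measurement as the default, so I assume that the first observation in each group is always correct (which is debatable).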

library(dplyr)

set.seed(5)
mydata = data.frame(group = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2),
                    year = seq(from = 2003, to = 2009, by = 1),  # recycled for both groups
                    variable = round(runif(14, min = -5, max = 15), 0))
mydata %>%
  filter(variable > 0) %>%
  group_by(group) %>%
  mutate(smooth5 = ifelse( abs( lag(variable, n = 1, default = first(variable)) - variable ) <= 5 , variable, 5)) %>%       
  select(group, year, variable, smooth5) %>%
  arrange(group)

# A tibble: 10 x 4
# Groups:   group [2]
   group  year variable smooth5
   <dbl> <dbl>    <dbl>   <dbl>
 1     1  2004        9       9
 2     1  2005       13      13  # <- this change is |4|, thus it should use the old value 9
 3     1  2006        1       5  # <- here 13 changes to 1 is a reasonable change, should keep 1
 4     1  2008        9       5
 5     1  2009        6       6
 6     2  2003       11      11
 7     2  2004       14      14
 8     2  2007        5       5
 9     2  2008        1       1
10     2  2009        6       6

【Comments】:

  • Maybe you could try ifelse(abs(diff(c(0, variable))) >= 5, variable, lag(variable))
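
For reference, a minimal sketch of what the expression in that comment computes when applied per group. The data frame `obs` is not from the original post as code; it is a hypothetical name holding the filtered values copied from the printed output above, so the example does not depend on the random seed.

```r
library(dplyr)

# Filtered values copied from the question's printed output (assumed input)
obs <- data.frame(
  group    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  year     = c(2004, 2005, 2006, 2008, 2009, 2003, 2004, 2007, 2008, 2009),
  variable = c(9, 13, 1, 9, 6, 11, 14, 5, 1, 6)
)

commented <- obs %>%
  group_by(group) %>%
  # diff(c(0, variable)) pads the series with 0, so the first "change"
  # in each group is simply the first value itself
  mutate(smooth5 = ifelse(abs(diff(c(0, variable))) >= 5,
                          variable,
                          lag(variable))) %>%
  ungroup()

commented$smooth5
```

Two things to keep in mind with this variant: a change of exactly 5 is kept rather than smoothed (the test is >= 5), and because lag() is called without a default, a group whose first value is below 5 in absolute value would get NA in its first row.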

Tags: r dplyr smoothing


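【Solution 1】:

You were close, but your ifelse() call has a couple of errors: as written, it returns variable when the change is small and the constant 5 otherwise. Below, I have added a new variable, previous, for clarity. If abs(previous - variable) <= 5 you want previous; otherwise you want variable.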

mydata %>%
  filter(variable > 0) %>%
  group_by(group) %>%
  mutate(previous = lag(variable, n = 1, default = first(variable)),
         smooth5 = ifelse(abs(previous - variable) <= 5, previous, variable)) %>%       
  select(group, year, variable, smooth5) %>%
  arrange(group)

which gives

# A tibble: 10 x 4
# Groups:   group [2]
   group  year variable smooth5
   <dbl> <dbl>    <dbl>   <dbl>
 1     1  2004        9       9
 2     1  2005       13       9
 3     1  2006        1       1
 4     1  2008        9       9
 5     1  2009        6       9
 6     2  2003       11      11
 7     2  2004       14      11
 8     2  2007        5       5
 9     2  2008        1       5
10     2  2009        6       1
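
One thing worth noting about the output above: lag() looks at the previous raw value, not the previous smoothed value. In row 10, for example, 6 is replaced by 1 even though that 1 had itself been smoothed to 5 in row 9. If "roll forward" is meant to carry the last accepted value instead, a sketch along these lines might be closer. This is an assumption about the intent, not part of the original answer; `obs` and `carry_forward` are hypothetical names, and the values in `obs` are copied from the printed output.

```r
library(dplyr)

# Filtered values copied from the printed output (assumed input)
obs <- data.frame(
  group    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  year     = c(2004, 2005, 2006, 2008, 2009, 2003, 2004, 2007, 2008, 2009),
  variable = c(9, 13, 1, 9, 6, 11, 14, 5, 1, 6)
)

# Hypothetical helper: keep the last accepted value unless the new
# observation differs from it by more than `tol`
carry_forward <- function(x, tol = 5) {
  Reduce(
    function(kept, current) if (abs(current - kept) <= tol) kept else current,
    x,
    accumulate = TRUE  # return every intermediate value, starting with x[1]
  )
}

rolled <- obs %>%
  group_by(group) %>%
  mutate(smooth5_rolled = carry_forward(variable)) %>%
  ungroup()

rolled$smooth5_rolled
```

On this data the two approaches differ only in the last row of group 2 (5 here, 1 with lag()), so which one is appropriate depends on whether a smoothed value should itself serve as the reference for the next comparison.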
