【Question title】:Iteration for filtering maximum data values on a weekly basis for every year; R [duplicate]
【Posted】:2020-06-17 12:25:49
【Question description】:

I have a data frame that looks like this:

dt           GNDVI  YEAR   week
   <date>     <dbl> <chr> <dbl>
 1 2002-07-04 0.646 2002     27
 2 2002-07-07 0.627 2002     27
 3 2002-07-08 0.514 2002     27
 4 2002-07-09 0.614 2002     28
 5 2002-07-11 0.654 2002     28
 6 2002-07-14 0.64  2002     28
 7 2002-07-18 0.673 2002     29
 8 2002-07-20 0.653 2002     29

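I have already grouped the data by week. Now I want to get the maximum value of the variable GNDVI for each week of each year from 2002 to 2019. My current code returns the maximum GNDVI per week number pooled across all years 2002-2019, instead of returning it separately for each year.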

library(dplyr)
library(lubridate)
library(tidyverse)
options(stringsAsFactors = FALSE)
library(data.table)

#setting dt as dateclass column
gndvi_daily$dt<-as.Date(gndvi_daily$dt)

#selecting months of choice
GS <- gndvi_daily[month(gndvi_daily$dt) >= 6 & month(gndvi_daily$dt) <= 9, ]

#extract year from dateclass column
GS$YEAR <- substr(GS$dt, 1,4)


#group GNDVI by week 
GSWEEK = GS %>% group_by(week = week(dt))

#iterating to filter maximum GNDVI per week of all years 2002-2019
output <- vector("double", 0)
for (i in seq_along(GSWEEK$YEAR)) {
  output <- tapply(GSWEEK$GNDVI, GSWEEK$week, max)
}
output

Current output:

22    0.651
23    0.711
24    0.699
....
40    0.648

Desired output:

week   year     Max GNDVI
22     2002     0.651
23     2002     0.711
...
39     2019     0.88
40     2019     0.67

I am fairly new to coding in R, and I would greatly appreciate any help.

【Comments】:

    Tags: r iteration


    【Solution 1】:
    library(dplyr)   # group_by(), summarise(), %>%
    library(tibble)  # tribble()

    # Sample rows from the question
    df <- tribble(~dt, ~GNDVI, ~YEAR, ~week,
    "2002-07-04", 0.646, 2002, 27,
    "2002-07-07", 0.627, 2002, 27,
    "2002-07-08", 0.514, 2002, 27,
    "2002-07-09", 0.614, 2002, 28,
    "2002-07-11", 0.654, 2002, 28,
    "2002-07-14", 0.64,  2002, 28,
    "2002-07-18", 0.673, 2002, 29,
    "2002-07-20", 0.653, 2002, 29)

    # Group by year AND week so that each year gets its own weekly maximum
    df %>% 
    group_by(YEAR, week) %>% 
    summarise(Max_GNDVI = max(GNDVI))
    
    
    # A tibble: 3 x 3
    # Groups:   YEAR [1]
       YEAR  week Max_GNDVI
      <dbl> <dbl>     <dbl>
    1  2002    27     0.646
    2  2002    28     0.654
    3  2002    29     0.673
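
The same grouping idea can also be sketched in base R, without any packages. This is only an illustrative alternative, not part of the original answer: it uses `aggregate()` with a formula that names both grouping columns. The data frame `df` repeats the sample rows from the question, and the name `weekly_max` is made up for this example.

```r
# Sample rows from the question (YEAR and week already computed)
df <- data.frame(
  dt    = as.Date(c("2002-07-04", "2002-07-07", "2002-07-08", "2002-07-09",
                    "2002-07-11", "2002-07-14", "2002-07-18", "2002-07-20")),
  GNDVI = c(0.646, 0.627, 0.514, 0.614, 0.654, 0.64, 0.673, 0.653),
  YEAR  = 2002,
  week  = c(27, 27, 27, 28, 28, 28, 29, 29)
)

# One row per YEAR/week combination, holding the maximum GNDVI of that group
weekly_max <- aggregate(GNDVI ~ YEAR + week, data = df, FUN = max)

# Rename the value column to match the dplyr answer
names(weekly_max)[names(weekly_max) == "GNDVI"] <- "Max_GNDVI"
weekly_max
```

With several years in the data, the formula `GNDVI ~ YEAR + week` yields one row for every year/week pair that occurs, which is the shape the question asks for.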
    

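    【Comments】:

      【Solution 2】:

      The function you are looking for is called summarise. It comes with the tidyverse (it is part of the dplyr package). Also, if you want to distinguish between weeks and years, you have to group by both.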

      library(tidyverse)
      library(magrittr)
      # First I read in your data and format it the same way.
      # X1 is the printed row number; it is dropped by select() below.
      dat <- read_table(" 1 2002-07-04 0.646 2002     27
       2 2002-07-07 0.627 2002     27
       3 2002-07-08 0.514 2002     27
       4 2002-07-09 0.614 2002     28
       5 2002-07-11 0.654 2002     28
       6 2002-07-14 0.64  2002     28
       7 2002-07-18 0.673 2002     29
       8 2002-07-20 0.653 2002     29", col_names = FALSE) %>% 
        mutate(date = X2, GNDVI = X3, year = X4, week = X5) %>% 
        select(date, GNDVI, year, week)
      
      
      
      dat %>% 
        group_by(week, year) %>% 
        summarise(Max_Gndvi = max(GNDVI))  
      

      The result is

      # A tibble: 3 x 3
         week  year Max_Gndvi
        <dbl> <dbl>     <dbl>
      1    27  2002     0.646
      2    28  2002     0.654
      3    29  2002     0.673
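
To see why grouping by both columns matters, here is a small sketch that contrasts the two groupings. The values in `toy` are invented for illustration (the same week number occurring in two different years), and the names `by_week` and `by_week_year` are hypothetical. The `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Invented values: week 27 appears in both 2002 and 2003
toy <- data.frame(
  YEAR  = c(2002, 2002, 2003, 2003),
  week  = c(27, 27, 27, 27),
  GNDVI = c(0.646, 0.627, 0.700, 0.710)
)

# Grouping by week alone pools both years into a single row
by_week <- toy %>%
  group_by(week) %>%
  summarise(Max_Gndvi = max(GNDVI))

# Grouping by week and year keeps one row per year
by_week_year <- toy %>%
  group_by(week, YEAR) %>%
  summarise(Max_Gndvi = max(GNDVI), .groups = "drop")

by_week
by_week_year
```

The first result has a single row for week 27, which mirrors the behaviour described in the question; the second has one row per year.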
      

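      Also, you load many useful libraries and do not use them. With the pipe operator %>%, which can be read as "and then", you can chain many functions together: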

      GSWEEK <- gndvi_daily %>% 
        mutate(dt = as.Date(dt)) %>% 
        filter(month(dt) >= 6 & month(dt) <=9) %>% 
        mutate(YEAR = year(dt)) 
      

      This code does the following: take gndvi_daily AND THEN mutate dt to date format AND THEN filter the months between the sixth and the ninth AND THEN mutate the year column.
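
Putting the pieces together, the whole task can be written as one pipeline. This is a minimal sketch under stated assumptions, not the original poster's data: the `gndvi_daily` data frame below is a hypothetical stand-in in which the 2002 July rows come from the question, while the May row and the 2003 rows are invented so that the month filter and the per-year grouping are both visible. `week()`, `month()` and `year()` come from lubridate, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for gndvi_daily: the July 2002 rows are from the
# question; the May row and the 2003 rows are invented for illustration
gndvi_daily <- data.frame(
  dt    = c("2002-05-15", "2002-07-04", "2002-07-07", "2002-07-09",
            "2002-07-11", "2003-07-04", "2003-07-07", "2003-07-09"),
  GNDVI = c(0.400, 0.646, 0.627, 0.614, 0.654, 0.700, 0.710, 0.690)
)

weekly_max <- gndvi_daily %>%
  mutate(dt = as.Date(dt)) %>%                      # character -> Date
  filter(month(dt) >= 6 & month(dt) <= 9) %>%       # keep June to September
  mutate(YEAR = year(dt), week = week(dt)) %>%      # derive grouping columns
  group_by(YEAR, week) %>%                          # one group per year/week
  summarise(Max_GNDVI = max(GNDVI), .groups = "drop")

weekly_max
```

No explicit loop is needed: `group_by()` followed by `summarise()` applies `max()` once per year/week group.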

      【Comments】:
