【Question title】:R: group values in a column based on intervals and average the results for each interval
【Posted】:2017-10-11 08:59:59
【Question】:

I have two tables.

Table 1:

Dates_only <- data.frame(ID=c('1118','1118','1118','1118','1118',
                                 '1118','1118','1118','1119','1119',
                                 '1119','1119','1119','1119','1119',
                                 '1119','13PP','13PP','13PP','13PP',
                                 '13PP','13PP','13PP','13PP'),
                            Quart_y=c('2017Q3','2017Q4','2018Q1','2018Q2',
                                      '2018Q3','2018Q4','2019Q1','2019Q2',
                                      '2017Q3','2017Q4','2018Q1','2018Q2',
                                      '2018Q3','2018Q4','2019Q1','2019Q2',
                                      '2017Q3','2017Q4','2018Q1','2018Q2',
                                      '2018Q3','2018Q4','2019Q1','2019Q2'),
                            Quart=c(0.25,0.50,0.75,1.00,1.25,1.50,1.75,2.00,
                                    0.25,0.50,0.75,1.00,1.25,1.50,1.75,2.00,
                                    0.25,0.50,0.75,1.00,1.25,1.50,1.75,2.00))

and Table 2:

Values <- data.frame(ID=c('1118','1119','13PP','1118','1119','13PP',
                          '1118','1119','13PP','1118','1119','13PP',
                          '1118','1119','13PP','1118','1119','13PP',
                          '1118','1119','13PP','1118','1119','13PP',
                          '1118','1119','13PP','1118','1119','13PP'),
                     Day=c(0,0,0,0.14,0.13,0.13,0.2,0.23,0.24,0.27,0.28,
                           0.32,0.32,0.32,0.44,0.47,0.49,0.49,0.59,0.64,
                           0.61,0.72,0.71,0.73,0.95,0.86,0.78,1.1,0.93,1.15),
                     Value=c(7.6,6.2,6.8,7.1,6.2,5.9,6.8,5.8,4.6,6.5,5.4,
                             4.2,6.3,4.8,4,6,4.3,3.8,5.9,4,3.6,5.6,3.8,
                             3.4,5.4,3.2,3,5,2.9,2.9))

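What I want to do is find a way to change the values in Values$Day according to Dates_only$Quart. Specifically, Dates_only$Quart represents the quarters as numbers (2017Q3 = 0.25, 2017Q4 = 0.50, ..., 2018Q4 = 1.50, and so on), while Values$Day represents days as numbers on the same scale. I want to recode Values$Day by quarter, for example: for 0 <= Values$Day <= 0.25, Values$Day == 0.25; for 0.25 < Values$Day <= 0.50, Values$Day == 0.50; and so on.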
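The rule described here can be written as a single formula. The following is only a sketch, assuming every quarter has the same width of 0.25 (which holds for the Quart values shown); `recode_day` is a hypothetical helper name, not something from the original post.

```r
# Hypothetical helper: map each day to the upper bound of its quarter,
# assuming all quarters share one fixed width (0.25 in the question).
recode_day <- function(day, width = 0.25) {
  # ceiling() rounds up to the next multiple of `width`;
  # pmax() sends day == 0 to the first quarter instead of to 0
  pmax(ceiling(day / width), 1) * width
}

recoded <- recode_day(c(0, 0.14, 0.25, 0.27, 0.5, 1.1))
```

Because the formula never looks at Dates_only$Quart, it only applies when the quarters are evenly spaced; for irregular break points an interval lookup is the safer choice.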

What I tried is the approach below, but it produces an error message:

unique_quarters <- unique(Dates_only$Quart)
unique_quarters <- append(unique_quarters, 0, after=0)
df3 <- transform(Dates_only, 
                 Transf_Day=Values$Quart[findInterval(Values$Day, unique_quarters)])

I suspect the problem is that findInterval(Values$Day, unique_quarters) returns

1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 5 4 5

while Values$Quart has the values

0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00
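Two things stand out in this attempt. As defined in the question, Values has no Quart column (the quarters live in Dates_only$Quart), and the index vector has 30 entries while Dates_only has only 24 rows, so the result cannot simply be attached to Dates_only. In addition, findInterval() uses left-closed intervals by default, so a Day of exactly 0.25 would land in the second bin rather than the first. A minimal sketch of one possible fix, assuming the recoded values belong with Values rather than Dates_only; `quarters`, `day` and `transf_day` are illustrative stand-ins:

```r
# Stand-ins for sort(unique(Dates_only$Quart)) and a few Values$Day entries
quarters <- seq(0.25, 2, by = 0.25)
day <- c(0, 0.14, 0.25, 0.27, 0.49, 0.95, 1.15)

# left.open = TRUE makes the intervals (a, b]; days <= 0.25 get index 0,
# so adding 1 selects the upper bound of the interval containing each day
idx <- findInterval(day, quarters, left.open = TRUE) + 1
transf_day <- quarters[idx]
```

With this indexing there is no need to prepend 0 to the break points. Days above the largest quarter would come out as NA.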

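【Comments】:

  • Try cut(Values$Day, seq(0,3,0.25), include.lowest = T)
  • Thanks, but that doesn't really help, because I want to extract the numbers rather than the intervals. Thanks for the effort, though!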
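For what it's worth, cut() can also return numbers: with labels = FALSE it gives the integer bin index, which can then be used to look up the upper break point. A small sketch under that assumption; `day`, `breaks`, `bin` and `upper` are illustrative names:

```r
# A few sample days and the quarter break points, including the lower edge 0
day <- c(0, 0.14, 0.25, 0.27, 0.5, 1.1)
breaks <- seq(0, 2, by = 0.25)

# labels = FALSE returns integer bin codes instead of interval labels;
# include.lowest = TRUE puts day == 0 into the first bin [0, 0.25]
bin <- cut(day, breaks, include.lowest = TRUE, labels = FALSE)

# The upper bound of bin i is breaks[i + 1]
upper <- breaks[bin + 1]
```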

Tags: r dataframe grouping


【Solution 1】:

Try this:

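library(tidyverse)
as_tibble(Values) %>%
  # Bin Day into right-closed quarter intervals; [0,0.25] also includes 0
  mutate(Int = cut(Day, seq(0, 3, 0.25), include.lowest = TRUE)) %>%
  # Relabel the five intervals that occur in the data with their upper bounds
  mutate(Int2 = factor(Int, labels = seq(0.25, 1.25, 0.25)))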
# A tibble: 30 x 5
      ID   Day Value        Int   Int2
<fctr> <dbl> <dbl>     <fctr> <fctr>
1   1118  0.00   7.6   [0,0.25]   0.25
2   1119  0.00   6.2   [0,0.25]   0.25
3   13PP  0.00   6.8   [0,0.25]   0.25
4   1118  0.14   7.1   [0,0.25]   0.25
5   1119  0.13   6.2   [0,0.25]   0.25
6   13PP  0.13   5.9   [0,0.25]   0.25
7   1118  0.20   6.8   [0,0.25]   0.25
8   1119  0.23   5.8   [0,0.25]   0.25
9   13PP  0.24   4.6   [0,0.25]   0.25
10  1118  0.27   6.5 (0.25,0.5]    0.5
# ... with 20 more rows
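Note that Int2 above is a factor, so calling as.numeric() on it directly would return the level codes (1, 2, 3, ...) rather than 0.25, 0.5, and so on. To get the per-interval averages mentioned in the title, one option is to convert the labels to numbers and then aggregate. This is a base R sketch rather than part of the original answer; the column name Quart and the object `avg` are assumptions, and the mean is taken per ID and quarter:

```r
# Same data as Values in the question (the three IDs repeat ten times)
Values <- data.frame(
  ID = rep(c("1118", "1119", "13PP"), times = 10),
  Day = c(0, 0, 0, 0.14, 0.13, 0.13, 0.2, 0.23, 0.24, 0.27, 0.28,
          0.32, 0.32, 0.32, 0.44, 0.47, 0.49, 0.49, 0.59, 0.64,
          0.61, 0.72, 0.71, 0.73, 0.95, 0.86, 0.78, 1.1, 0.93, 1.15),
  Value = c(7.6, 6.2, 6.8, 7.1, 6.2, 5.9, 6.8, 5.8, 4.6, 6.5, 5.4,
            4.2, 6.3, 4.8, 4, 6, 4.3, 3.8, 5.9, 4, 3.6, 5.6, 3.8,
            3.4, 5.4, 3.2, 3, 5, 2.9, 2.9)
)

breaks <- seq(0, 2, by = 0.25)

# Label each interval with its upper bound, then go factor -> character -> numeric
Int2 <- cut(Values$Day, breaks, include.lowest = TRUE, labels = breaks[-1])
Values$Quart <- as.numeric(as.character(Int2))

# Mean Value for every ID and quarter combination that occurs in the data
avg <- aggregate(Value ~ ID + Quart, data = Values, FUN = mean)
```

Dropping ID from the formula (Value ~ Quart) would average across all IDs within each quarter instead.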

