【Posted】: 2020-03-06 16:57:57
【Question】:
I have a large CSV file; a sample is shown below:
> data <- fread('data.csv', sep = ",")
> data
name year value
1: Afghanistan 1800 11
2: Albania 1800 22
3: Algeria 1800 6
4: Afghanistan 1801 48
5: Albania 1801 60
6: Algeria 1801 120
---
46509: Afghanistan 2040 108
46510: Albania 2040 72
46511: Algeria 2040 36
My goal is to resample this data to monthly frequency and interpolate the value column, as shown below (Afghanistan, 1800):
name year value
1: Afghanistan Jan 1800 1
1: Afghanistan Feb 1800 2
1: Afghanistan Mar 1800 3
1: Afghanistan May 1800 4
1: Afghanistan Jun 1800 5
1: Afghanistan Jul 1800 6
1: Afghanistan Aug 1800 7
1: Afghanistan Sep 1800 8
1: Afghanistan Oct 1800 9
1: Afghanistan Nov 1800 10
1: Afghanistan Dec 1800 11
2: Albania Jan 1800 2
---
46509: Afghanistan 2040 108
46510: Albania 2040 72
46511: Algeria 2040 36
I tried several options without success; the closest attempt is shown below:
> data <- as.zoo(data)
> m <- na.approx(data(time(data), 0:11/12, "+"))
Error in approx(x[!na], y[!na], xout, ...) :
need at least two non-NA values to interpolate
In addition: Warning messages:
1: In data(time(data), 0:11/12, "+") : data set ‘time(data)’ not found
2: In data(time(data), 0:11/12, "+") : data set ‘0:11/12’ not found
3: In data(time(data), 0:11/12, "+") : data set ‘+’ not found
4: In xy.coords(x, y, setLab = FALSE) : NAs introduced by coercion
> head(m)
Afghanistan Albania Algeria
1800-01-31 11 24 6
1800-02-28 11 24 6
1800-03-31 11 24 6
1800-04-30 11 24 6
1800-05-31 11 24 6
1800-06-30 11 24 6
Any ideas on how to get the result I want?
【Discussion】:
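- 1. What do you mean by resampling? The monthly values are not part of your data yet, so there is nothing to resample; it looks like you are adding more rows to the ones you already have. 2. How does the monthly process work? Why is April missing for Afghanistan? 3. How is the value column generated?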
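- One possible reading of the goal, as a minimal sketch only: treat each yearly value as the January value of that year and fill the months in between by linear interpolation per country. That anchoring is an assumption, not something the question states, and it does not reproduce the 1 to 11 ramp in the desired output, whose rule is unclear. The toy table below copies the first six rows shown in the question; `monthly`, `n_months`, and `t_out` are names made up for illustration.

```r
library(data.table)

# Toy input with the same columns as the question's data
# (first six rows as printed in the question).
data <- data.table(
  name  = rep(c("Afghanistan", "Albania", "Algeria"), times = 2),
  year  = rep(c(1800L, 1801L), each = 3),
  value = c(11, 22, 6, 48, 60, 120)
)

# Assumption: each yearly value sits at January of its year.
# Every country needs at least two years, otherwise approx() fails
# with "need at least two non-NA values to interpolate".
monthly <- data[order(name, year), {
  n_months <- (max(year) - min(year)) * 12L   # months between first and last January
  k        <- 0:n_months                      # month offsets from the first January
  t_out    <- min(year) + k / 12              # fractional-year grid
  .(year  = min(year) + k %/% 12L,
    month = month.abb[k %% 12L + 1L],
    value = approx(x = year, y = value, xout = t_out)$y)
}, by = name]

print(head(monthly, 13))
```

On this toy input each country gets 13 rows (January 1800 through January 1801). If the yearly value should instead represent a year-end or mid-year figure, only the `t_out` grid would need to shift.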
Tags: r dplyr data.table tidyverse zoo