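【Posted】: 2020-04-13 23:17:17
【Question】:
I have three datasets with missing values. Each dataset consists of a time column and a data column. The minimum time difference between two rows is 1 second (00:00:01):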
Dataset 1:      Dataset 2:      Dataset 3:
00:00:00 81                     00:00:00 70
00:00:01 81
00:00:02 81
00:00:03 81                     00:00:03 99
00:00:04 81                     00:00:04 100
00:00:05 80     00:00:05 80     00:00:05 101
00:00:06 80     00:00:06 100
                00:00:07 92     00:00:07 88
00:00:08 83     00:00:08 80     00:00:08 88
00:00:09 84     00:00:09 83     00:00:09 87
00:00:10 86
00:00:11 89
00:00:12 90
00:00:13 92                     00:00:13 92
00:00:14 94                     00:00:14 94
00:00:15 94     00:00:15 96     00:00:15 93
00:00:16 96     00:00:16 97
00:00:17 98     00:00:17 100    00:00:17 99
00:00:18 100                    00:00:18 99
00:00:19 101                    00:00:19 101
00:00:20 103
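For clarity, the table above shows missing values as blank fields. The real data is dense (rows with no value are simply absent) and looks, for example, like this: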
Dataset 1:      Dataset 2:      Dataset 3:
00:00:00 81     00:00:05 80     00:00:00 70
00:00:01 81     00:00:06 100    00:00:03 99
00:00:02 81     00:00:07 92     00:00:04 100
00:00:03 81     00:00:08 80     00:00:05 101
00:00:04 81     00:00:09 83     00:00:07 88
00:00:05 80     00:00:15 96     00:00:08 88
00:00:06 80     00:00:16 97     00:00:09 87
00:00:08 83     00:00:17 100    00:00:13 92
00:00:09 84                     00:00:14 94
00:00:10 86                     00:00:15 93
00:00:11 89                     00:00:17 99
00:00:12 90                     00:00:18 99
00:00:13 92                     00:00:19 101
00:00:14 94
00:00:15 94
00:00:16 96
00:00:17 98
00:00:18 100
00:00:19 101
00:00:20 103
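Now I want to align the data on a common time axis so that the three datasets can be plotted together (the example plots are not reproduced here).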
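My naive approach would be:
- Find the minimum/maximum time across the datasets.
- Create a table with one row per time step and three columns, each initialized to n/a.
- Loop over each dataset and assign its values to the table.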
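The three steps above could be sketched with pandas roughly as follows. This is only an illustration under stated assumptions: each dataset is held as a Series indexed by Timedelta, the helper `to_series` and the variable names are hypothetical, and only a few rows from the dense example are used.

```python
import pandas as pd

def to_series(rows, name):
    """Hypothetical helper: build a Series indexed by Timedelta from (time, value) pairs."""
    times, values = zip(*rows)
    return pd.Series(values, index=pd.to_timedelta(list(times)), name=name)

# A few rows taken from the dense example above.
ds1 = to_series([("00:00:05", 80), ("00:00:06", 80), ("00:00:08", 83)], "Dataset 1")
ds2 = to_series([("00:00:05", 80), ("00:00:06", 100), ("00:00:07", 92), ("00:00:08", 80)], "Dataset 2")
ds3 = to_series([("00:00:04", 100), ("00:00:05", 101), ("00:00:07", 88), ("00:00:08", 88)], "Dataset 3")
datasets = [ds1, ds2, ds3]

# Step 1: smallest and largest time over all datasets.
start = min(s.index.min() for s in datasets)
end = max(s.index.max() for s in datasets)

# Step 2: one row per second, one column per dataset, every cell NaN.
full_index = pd.timedelta_range(start=start, end=end, freq=pd.Timedelta(seconds=1))
aligned = pd.DataFrame(index=full_index, columns=[s.name for s in datasets], dtype="float64")

# Step 3: assign each dataset; pandas matches rows by index label,
# so times that a dataset does not contain stay NaN.
for s in datasets:
    aligned[s.name] = s

print(aligned)
```

Because assignment aligns on the index, no explicit inner loop over rows is needed. A shorter variant might be `pd.concat(datasets, axis=1)`, which joins on the union of the indexes, although a second that is absent from every dataset would then be missing from the result unless the frame is reindexed to the full range.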
Is there a Python function or library that performs these steps efficiently? Or is there a better approach altogether?
Regards,
【Comments】:
Tags: python pandas numpy matplotlib