【Posted】: 2015-12-02 01:41:44
【Question】:
I have an array of tuples that I loaded from a csv file with np.genfromtxt().
import numpy as np
import re
from matplotlib.dates import strpdate2num

def convert_string_to_bigint(x):
    # Turn "2012/7/2 14:07:00" into the integer 201207021407 (seconds are dropped).
    p = re.compile(r'(\d{4})/(\d{1,2})/(\d{1,2}) (\d{1,2}):(\d{2}):\d{2}')
    m = p.findall(x)
    l = list(m[0])
    # Zero-pad month, day and hour to two digits so every timestamp has 12 digits.
    l[1] = ('0' + l[1])[-2:]
    l[2] = ('0' + l[2])[-2:]
    l[3] = ('0' + l[3])[-2:]
    return long("".join(l))  # Python 2 long integer

#print convert_string_to_bigint("2012/7/2 14:07:00")
csv = np.genfromtxt('sr00-1min.txt', delimiter=',', converters={0: convert_string_to_bigint})
Sample data from the csv file:
2015/9/2 14:54:00,5169,5170,5167,5168
2015/9/2 14:55:00,5168,5169,5166,5166
2015/9/2 14:56:00,5167,5170,5165,5169
2015/9/2 14:57:00,5168,5173,5167,5172
2015/9/2 14:58:00,5172,5187,5171,5182
2015/9/2 14:59:00,5182,5183,5171,5176
2015/9/2 15:00:00,5176,5183,5174,5182
After loading, it looks like this:
[(201509021455L, 5168.0, 5169.0, 5166.0, 5166.0)
(201509021456L, 5167.0, 5170.0, 5165.0, 5169.0)
(201509021457L, 5168.0, 5173.0, 5167.0, 5172.0)
(201509021458L, 5172.0, 5187.0, 5171.0, 5182.0)
(201509021459L, 5182.0, 5183.0, 5171.0, 5176.0)
(201509021500L, 5176.0, 5183.0, 5174.0, 5182.0)]
I want to convert it to a 2D numpy array. It should look like this:
[[201509021455L, 5168.0, 5169.0, 5166.0, 5166.0]
[201509021456L, 5167.0, 5170.0, 5165.0, 5169.0]
[201509021457L, 5168.0, 5173.0, 5167.0, 5172.0]
[201509021458L, 5172.0, 5187.0, 5171.0, 5182.0]
[201509021459L, 5182.0, 5183.0, 5171.0, 5176.0]
[201509021500L, 5176.0, 5183.0, 5174.0, 5182.0]]
I solved the problem with the code below, but it looks very ugly. Can anyone tell me how to do this conversion in an elegant way?
pool = np.asarray([x for x in csv if x[0] > 201508010000])
sj = np.asarray([x[0] for x in pool])
kpj = np.asarray([x[1] for x in pool])
zgj = np.asarray([x[2] for x in pool])
zdj = np.asarray([x[3] for x in pool])
spj = np.asarray([x[4] for x in pool])
output = np.column_stack((sj,kpj,zgj,zdj,spj))
print output.shape
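A more compact version of the same filter-and-stack steps might look like the sketch below. It is only an illustration under assumptions: the array is built by hand instead of being read from 'sr00-1min.txt', the field names f0..f4 are assumed (check csv.dtype.names on the real array), the first row is made-up data added so the filter has something to drop, and it is written with Python 3 style print.

```python
import numpy as np

# Hypothetical stand-in for the array returned by genfromtxt; the field
# names f0..f4 are an assumption, not taken from the original file.
dt = np.dtype([('f0', np.int64)] + [('f%d' % i, np.float64) for i in range(1, 5)])
csv = np.array([
    (201507311500, 5100.0, 5101.0, 5099.0, 5100.0),  # made-up row before the cutoff
    (201509021455, 5168.0, 5169.0, 5166.0, 5166.0),
    (201509021456, 5167.0, 5170.0, 5165.0, 5169.0),
], dtype=dt)

# A boolean mask on the timestamp field replaces the list comprehension.
pool = csv[csv['f0'] > 201508010000]

# Stack every field as one column; mixing int64 and float64 columns
# gives a 2D float64 array.
output = np.column_stack([pool[name] for name in pool.dtype.names])
print(output.shape)
```

Looping over pool.dtype.names avoids naming each column by hand, so the same two lines keep working if the file gains or loses columns.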
【Comments】:
-
What does the csv look like?
-
What do you mean by a 2D array? For the same input, what do you want the output to look like?
-
What is the expected output?
-
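You can't get a 2D array in which one column is long (the L suffix) and the other columns are float. What genfromtxt gave you instead is a 1D structured array. You can get a 2D array in which every value is a float.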
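The point in the last comment can be illustrated with a small sketch. It assumes a hand-built structured array in place of the real genfromtxt result, assumed field names f0..f4, and NumPy 1.16 or later for structured_to_unstructured; it is one possible way to get the all-float 2D array, not necessarily what the commenter had in mind.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Hypothetical stand-in for the genfromtxt result: one integer field
# and four float fields (the field names are assumed).
dt = np.dtype([('f0', np.int64), ('f1', np.float64), ('f2', np.float64),
               ('f3', np.float64), ('f4', np.float64)])
rec = np.array([(201509021455, 5168.0, 5169.0, 5166.0, 5166.0),
                (201509021456, 5167.0, 5170.0, 5165.0, 5169.0)], dtype=dt)

# The structured array is 1D: one record per csv row.
print(rec.shape)
print(rec.dtype.names)

# Cast every field to float64 and lay the fields out as columns.
arr2d = structured_to_unstructured(rec, dtype=np.float64)
print(arr2d.shape)
```

The 12-digit timestamps are far below 2**53, so they survive the cast to float64 without losing precision.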
Tags: python arrays numpy tuples