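【Posted】: 2014-09-11 15:36:18
【Question】:
I am trying to plot the CDF of one column from a multi-column data file. When the data file has only one column, the plot works fine. When I try to pick a specific column out of the data, I get an error. I also tried reading the specific column with a for loop, and the reading itself works. If I put the plot statement outside the for loop, only the last value of the column is shown, and if I keep the plot statement inside the loop, I get an error. This is not a problem with reading the file or the specific column, nor is it an indentation problem. How can I fix it?
Code with the for loop: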
import numpy as np
import matplotlib.pyplot as plt
from pylab import *
import math
from matplotlib.ticker import LogLocator
with open('input.txt', 'r') as f:
    for rows in f:
        cols = rows.split()
        data = cols[2]
        sorted_data = np.sort(data)
        cdf = np.arange(len(data))/float(len(data))
        plt.plot(sorted_data, cdf, '-bs')
plt.show()
#print data
Error:
Traceback (most recent call last):
  File "cdf_plot.py", line 13, in <module>
    plt.plot(sorted_data, cdf, '-bs')
  File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 2467, in plot
    ret = ax.plot(*args, **kwargs)
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 3893, in plot
    for line in self._get_lines(*args, **kwargs):
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 322, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 300, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 240, in _xy_from_xy
    raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension
Code without the for loop:
import numpy as np
import matplotlib.pyplot as plt
from pylab import*
import math
from matplotlib.ticker import LogLocator
data = np.loadtxt('input.txt')
data_one = [row[2] for row in data]
sorted_data = np.sort(data)
cdf = np.arange(len(data_one))/float(len(data_one))
#cumulative = np.cumsum(data)
#ccdf = 1 - cdf
#plt.plot(data, sorted_data, 'r-*')
plt.plot(sorted_data, cdf, '-bs')
#plt.xlim([0,0.5])
plt.gca().set_xscale("log")
plt.gca().set_yscale("log")
plt.show()
Error:
Traceback (most recent call last):
  File "cum_graph.py", line 7, in <module>
    data = np.loadtxt('e_p_USC_30_days.txt')
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 804, in loadtxt
    X = np.array(X, dtype)
ValueError: setting an array element with a sequence.
Input file: I am interested in computing the CDF of col[2], i.e. only the third column.
4814 2464 27 0.000627707861971 117923.0
4211 736 2 4.64968786645 05 2576.0
2075 1339 30 0.000697453179968 499822.0
2441 2381 3 6.97453179968 05 1968.0
4694 1738 1 2.32484393323 05 5702.0
4406 3008 12 0.000278981271987 8483.0
3622 1396 3 6.97453179968 05 2564.0
5425 478 1 2.32484393323 05 428.0
4489 1715 6 0.000139490635994 19045.0
3695 3387 2 4.64968786645 05 16195.0
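【Comments】:
- You are overwriting data and cdf on every iteration of the loop. Consider using numpy.loadtxt.
- @darthbith The second snippet uses numpy.loadtxt, and it fails with an error too.
- Then why not use slicing syntax? data_one = data[:,2] PS: good question, +1
- Oh, now I see that the error is raised in loadtxt... Did you try googling the error message you got? The cause is that some rows contain extra data (or the other rows do not contain enough, depending on how you look at it :-)). See: numpy-discussion.10968.n7.nabble.com/…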
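The two points raised in the comments (collect the whole column before plotting, and watch out for rows with a different number of fields) can be sketched roughly as follows. This is only a minimal sketch under some assumptions, not the asker's or the commenters' code: it assumes Python 3 and whitespace-separated rows, the helper names field_counts, read_column, empirical_cdf and plot_cdf are made up for illustration, and it parses the file by hand instead of calling numpy.loadtxt so that rows of unequal length do not abort the read. The cumulative fraction here runs from 1/n to 1 (rather than from 0, as in the question's code), so it stays positive if a log-scaled y axis is wanted.

```python
from collections import Counter


def field_counts(lines):
    """Count how many whitespace-separated fields each non-empty line has."""
    return Counter(len(line.split()) for line in lines if line.strip())


def read_column(lines, index=2):
    """Collect one column as floats, skipping lines that are too short."""
    values = []
    for line in lines:
        fields = line.split()
        if len(fields) > index:
            values.append(float(fields[index]))
    return values


def empirical_cdf(values):
    """Return the sorted values and cumulative fractions i/n for i = 1..n."""
    xs = sorted(values)
    n = len(xs)
    ps = [(i + 1) / n for i in range(n)]
    return xs, ps


def plot_cdf(path='input.txt', index=2):
    """Plot the CDF of one column; matplotlib is imported only when called."""
    import matplotlib.pyplot as plt

    with open(path) as f:
        lines = f.readlines()
    print(field_counts(lines))  # more than one key means ragged rows
    xs, ps = empirical_cdf(read_column(lines, index))
    plt.plot(xs, ps, '-bs')  # plot once, after the whole column is collected
    plt.xscale('log')
    plt.xlabel('value')
    plt.ylabel('CDF')
    plt.show()
```

Calling plot_cdf('input.txt') would read the file once, print the field-count summary, and draw a single curve. The key difference from the loop in the question is that the values are appended to a list inside the loop and sorted and plotted only once afterwards, so x and y always have the same length.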
Tags: python file numpy matplotlib plot