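【Posted】: 2020-05-03 08:04:15
【Question】:
I want to plot time series data with Matplotlib. The data is stored as CSV, and I load it into a pandas DataFrame with pd.read_csv(), which works fine. A data set consists of one timestamp column and about 10 value columns. I convert the timestamps (originally strings formatted as yyyy-MM-dd hh:mm:ss) to datetime with pd.to_datetime(dataFrame['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S').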
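For reference, the loading and conversion step described above can be sketched roughly as follows. The CSV content and the value column names Y1 and Y2 are made up for illustration; only the TIMESTAMP column name and the format string come from the question.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file
# (the value column names Y1/Y2 are assumptions).
csv_text = """TIMESTAMP,Y1,Y2
2020-01-16 08:00:00,101.5,98.2
2020-01-16 08:01:12,97.3,110.4
2020-01-16 08:02:30,120.8,87.9
"""

dataFrame = pd.read_csv(io.StringIO(csv_text))

# Parse the strings with an explicit format so that rows which do not
# match the expected layout raise an error instead of being guessed.
dataFrame['TIMESTAMP'] = pd.to_datetime(dataFrame['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S')

print(dataFrame.dtypes)
```

After the conversion the TIMESTAMP column has a datetime64 dtype, so differences between timestamps are Timedelta values.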
To plot the data I use the following code (the generation of the sample data is not part of my code):
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
N = 30
timestamps = pd.date_range('2020-01-16 8:00', periods=N, freq='72s')
# note: the original timestamps aren't evenly spaced, this is just data to test
dataFrame = pd.DataFrame({'TIMESTAMP': timestamps, 'Y1': np.random.normal(100, 30, N), 'Y2': np.random.normal(100, 30, N)})
acqFieldName = 'Y1'
fig = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75)
timestamps = dataFrame['TIMESTAMP'].dt.time
fig.axes.set_xticklabels(labels=timestamps, rotation=45)
plt.show()
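The result looks like this:
However, I would still like to change the x-axis: the ticks are too dense, so I would like, say, 10 ticks, and I would like to see the elapsed time in minutes, formatted as 'mm:ss'.
I tried the following: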
fig = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75)
timestamps = dataFrame['TIMESTAMP'].dt.time
xmin = dataFrame['TIMESTAMP'][0]
xmax = dataFrame['TIMESTAMP'][len(dataFrame['TIMESTAMP']) - 1]
timeDiff: timedelta = xmax - xmin
customTicks = np.linspace(0., timeDiff.seconds, 10)
fig.axes.set_xticklabels(labels=customTicks, rotation=45)
fig.axes.set_xticks(customTicks)
plt.show()
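The result looks like this:
Clearly not what I want.
My problem would be solved if I could reduce the number of ticks formatted as time or, even better, if the points were aligned with ticks given by the elapsed time.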
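One possible way to get points aligned with elapsed-time ticks, sketched here under the assumption that plain Matplotlib on a numeric axis is acceptable instead of sns.pointplot: plot against the seconds elapsed since the first timestamp, limit the number of ticks with MaxNLocator, and format them with a FuncFormatter. The helper names format_mmss and plot_elapsed are hypothetical, not part of any library.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter, MaxNLocator


def format_mmss(seconds, pos=None):
    """Format a number of seconds as 'mm:ss' (minutes are not wrapped at 60)."""
    total = int(round(seconds))
    sign = '-' if total < 0 else ''
    minutes, secs = divmod(abs(total), 60)
    return f"{sign}{minutes:02d}:{secs:02d}"


def plot_elapsed(dataFrame, fieldName, max_ticks=10):
    """Plot fieldName against the time elapsed since the first timestamp."""
    elapsed = (dataFrame['TIMESTAMP'] - dataFrame['TIMESTAMP'].iloc[0]).dt.total_seconds()
    fig, ax = plt.subplots()
    ax.plot(elapsed, dataFrame[fieldName], marker='o')
    # MaxNLocator chooses "nice" positions; nbins caps the number of intervals,
    # so the tick count is roughly max_ticks, not exactly max_ticks.
    ax.xaxis.set_major_locator(MaxNLocator(nbins=max_ticks))
    ax.xaxis.set_major_formatter(FuncFormatter(format_mmss))
    ax.set_xlabel('Elapsed time (mm:ss)')
    ax.tick_params(axis='x', rotation=45)
    return fig, ax


# Test data in the same shape as in the question (evenly spaced only for testing).
N = 30
rng = np.random.default_rng(0)
dataFrame = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 8:00', periods=N, freq='72s'),
    'Y1': rng.normal(100, 30, N),
})
fig, ax = plot_elapsed(dataFrame, 'Y1')
```

Calling plt.show() afterwards displays the figure. Because the x values are real numbers of seconds, unevenly spaced timestamps end up at their true positions, which a categorical plot does not give.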
Update: the suggestion by Zaraki Kenpachi yields:
fig, ax = plt.subplots()
ax.plot(dataFrame.set_index('TIMESTAMP'), dataFrame[acqFieldName])
plt.show()
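Note that the snippet above passes the DataFrame returned by set_index as the x argument, so the x values appear to be the value columns and not the timestamps. A variant that passes the timestamp column itself, with a date locator and formatter from matplotlib.dates, might look like this sketch (the test data mirrors the question; the tick limit of 10 is taken from the stated goal):

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Test data in the same shape as in the question.
N = 30
rng = np.random.default_rng(0)
dataFrame = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 8:00', periods=N, freq='72s'),
    'Y1': rng.normal(100, 30, N),
})
acqFieldName = 'Y1'

fig, ax = plt.subplots()
# Use the timestamp column as x so Matplotlib treats the axis as dates.
ax.plot(dataFrame['TIMESTAMP'], dataFrame[acqFieldName], marker='o')
# Let Matplotlib pick date-aware tick positions, capped at 10 ticks.
ax.xaxis.set_major_locator(mdates.AutoDateLocator(maxticks=10))
# Show wall-clock time only, without the date part.
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))
fig.autofmt_xdate(rotation=45)
```

This shows clock time, not elapsed time; plt.show() displays the figure.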
Working solution based on JohanC's answer:
import datetime
import glob

import matplotlib.pyplot as plt
import seaborn as sns

for fileName in glob.glob('*.csv'):
    plt.close()
    # NOTE: CsvFileProcessor is a custom class doing the readout of CSV and conversion to pandas.DataFrame
    dataFrame, acqFieldName, settingParameterCount = CsvFileProcessor.processFile(fileName)
    fig, ax = plt.subplots()
    ax: plt.Axes = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75, ax=ax)
    startTime = dataFrame['TIMESTAMP'][0]
    # time elapsed since the first sample, one entry per row
    timeProgress = [timeStamp - startTime for timeStamp in dataFrame['TIMESTAMP']]
    # pointplot places the points at positions 0..N-1, so label every 5th position
    custom_ticks = list(range(0, len(timeProgress), 5))
    timestamps = [f"{datetime.timedelta(seconds=timeProgress[t].seconds)}" for t in custom_ticks]
    # for manipulating the x-axis tick labels:
    # https://stackoverflow.com/questions/51105648/ordering-and-formatting-dates-on-x-axis-in-seaborn-bar-plot
    # set the tick positions first so the number of labels matches the number of ticks
    ax.axes.set_xticks(custom_ticks)
    ax.axes.set_xticklabels(labels=timestamps, rotation=45)
    ax.axes.set_xlabel(xlabel="Processing Time")
    plt.title('Setting Parameters: ' + str(settingParameterCount))
    outFileName = fileName.upper()
    outFileName = outFileName.replace('_DATA.CSV', '')
    outFileName = outFileName + '_READOUT.PNG'
    fig.tight_layout()
    #plt.savefig(outFileName)
    plt.show()
Result:
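The labels in the solution above come out as 'H:MM:SS' because they are the string form of datetime.timedelta. If the 'mm:ss' format from the question is wanted, the label computation could be done along these lines, as a sketch that keeps the every-5th-position scheme and uses evenly spaced test timestamps:

```python
import pandas as pd

# Test timestamps in the same shape as in the question.
timestamps = pd.Series(pd.date_range('2020-01-16 8:00', periods=30, freq='72s'))

# Seconds elapsed since the first sample, computed for the whole column at once.
elapsed_seconds = (timestamps - timestamps.iloc[0]).dt.total_seconds()

# Label every 5th categorical position, as in the solution above.
custom_ticks = list(range(0, len(elapsed_seconds), 5))

labels = []
for t in custom_ticks:
    minutes, secs = divmod(int(elapsed_seconds.iloc[t]), 60)
    labels.append(f"{minutes:02d}:{secs:02d}")

print(labels)
```

The resulting list can be passed to set_xticklabels after set_xticks(custom_ticks), in place of the timedelta strings.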
【Comments】:
-
Please post or attach a sample of the CSV data.
-
@WolfiG I added some test data to your post. Feel free to improve it.
Tags: python pandas matplotlib seaborn