【Question title】: How to plot data chronologically
【Posted】: 2021-10-06 06:14:21
【Question】:

I am using matplotlib to plot results from a .dat file.

The data looks like this:

1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75

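I want to plot one year of data in the correct order (January to December) and have my tick labels show the month instead of the long date.

Here is my code: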

import matplotlib.pyplot as plt
import csv

x = []
y = []

with open('Claims.dat','r') as csvfile:
    #bar = csv.reader(csvfile, delimiter=',')
    plot = csv.reader(csvfile, delimiter=',')

    for row in plot:
        x.append(str(row[1]))
        y.append(str(row[6]))

plt.plot(x,y, label='Travel Claim Totals!', color='red', marker="o")
plt.xlabel('Months', color="red", size='large')

plt.ylabel('Totals', color="red", size='large')
plt.title('Claims Data:   Team Bobby\n Second Place is the First Looser', color='Blue', weight='bold', size='large')

plt.xticks(rotation=45, horizontalalignment='right', size='small')
plt.yticks(weight='bold', size='small', rotation=45)

plt.legend()
plt.subplots_adjust(left=0.2, bottom=0.40, right=0.94, top=0.90, wspace=0.2, hspace=0)
plt.show()

【Comments】:

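  • FYI: answering questions thoroughly is time-consuming. If your question is solved, please say thank you by accepting the solution that best fits your needs. The ✔ is below the ▲/▼ arrows at the top left of the answer. If a better solution appears, you can accept the new one instead. If your reputation is above 15, you can also vote on the usefulness of an answer with the ▲/▼ arrows. If a solution does not answer the question, leave a comment. See "What should I do when someone answers my question?". Thank you.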

Tags: python pandas matplotlib


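【Answer 1】:

I think the simplest approach is to work with the values as dates, which you can construct with the datetime module. Here is a minimal working example based on your data: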

import datetime

def isfloat(value: str):
  try:
    float(value)
    return True
  except ValueError:
    return False

def isdatetime(value: str):
  try:
    datetime.datetime.fromisoformat(value)
    return True
  except ValueError:
    return False

data = r"""1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75"""

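# Split the raw text into one string per row; a str cannot be modified in place
data = data.splitlines()
for idx in range(len(data)):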
  data[idx] = data[idx].split(', ')
  for jdx in range(len(data[idx])):
    if data[idx][jdx].isnumeric():    # Is it an integer?
      value = int(data[idx][jdx])
    elif isfloat(data[idx][jdx]):     # Is it a float?
      value = float(data[idx][jdx])
    elif isdatetime(data[idx][jdx]):  # Is it a date?
      value = datetime.datetime.fromisoformat(data[idx][jdx])
    else:
      value = data[idx][jdx]
    data[idx][jdx] = value

data.sort(key=lambda x: x[1])

You can also sort by something more specific:

data.sort(key=lambda x: x[1].month)
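One thing worth checking when sorting by month alone: the year and the day are ignored, so rows inside the same month keep their input order. The sketch below illustrates this with three of the dates from the sample data; the variable names `stamps`, `by_month`, and `by_date` are made up for the illustration.

```python
import datetime

# Three dates taken from the sample rows in the question
stamps = [
    datetime.datetime(2021, 7, 17),
    datetime.datetime(1234, 12, 13),
    datetime.datetime(2021, 7, 12),
]

# Sorting by month only: July rows come before the December row,
# but the two July rows stay in their original (unsorted) order.
by_month = sorted(stamps, key=lambda d: d.month)

# Sorting by (year, month, day) gives a fully chronological order.
by_date = sorted(stamps, key=lambda d: (d.year, d.month, d.day))

print(by_month)
print(by_date)
```

If the file only ever holds a single year, sorting by the full datetime (as in the first `data.sort` call) is probably the simpler choice.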

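Note: you may not need all the logic in the for loop. The csv module does some of the basic preprocessing for you, such as splitting each row into fields. By default csv.reader returns every field as a string, so the conversion to numbers and dates is still up to you.

【Comments】: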
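As a rough sketch of the csv-based variant, the snippet below reads the rows with `csv.reader`, converts column 1 to a datetime and column 6 to a float, and sorts chronologically. It assumes the column layout shown in the question; `io.StringIO` stands in for `open('Claims.dat')` only so the example is self-contained, and the names `raw` and `rows` are illustrative.

```python
import csv
import datetime
import io

# Sample rows from the question; io.StringIO stands in for open('Claims.dat')
raw = """1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75"""

rows = []
# skipinitialspace=True drops the blank that follows each comma
for rec in csv.reader(io.StringIO(raw), skipinitialspace=True):
    when = datetime.datetime.fromisoformat(rec[1])  # column 1: claim date
    total = float(rec[6])                           # column 6: claim total
    rows.append((when, total))

rows.sort(key=lambda r: r[0])  # chronological order

# x and y can now be passed to plt.plot as real dates and numbers
x = [when for when, _ in rows]
y = [total for _, total in rows]
```

Because `x` holds datetime objects and `y` holds floats rather than strings, matplotlib can place the points on a true time axis instead of treating each value as a category label.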

    【Answer 2】:

    Imports and DataFrame

    import pandas as pd
    import matplotlib.dates as mdates  # used to format the x-axis
    import matplotlib.pyplot as plt
    
    # read in the data
    df = pd.read_csv('Claims.dat', header=None)
    
    # convert the column to a datetime format, which ensures the data points will be plotted in chronological order
    df[1] = pd.to_datetime(df[1], errors='coerce').dt.date
    
    # display(df)
          0           1              2                 3      4      5        6
    0  1145  2021-07-17            bob              rome  12.75   65.0   162.75
    1  1146  2021-07-12   billy larkin             italy  93.75  325.0  1043.75
    2   114  2021-07-28       beatrice              rome   1.00   10.0   100.00
    3    29  2021-07-25          Colin   italy the third  10.00   10.0    50.00
    4     5  2021-07-22       Veronica            canada  10.00  100.0  1000.00
    5  1149  2020-12-13   Billy Larkin              1123  12.75   65.0   162.75
    

    Plot the DataFrame

    # plot the dataframe, which uses matplotlib as the backend
    ax = df.plot(x=1, y=6, marker='.', color='r', figsize=(10, 7), label='Totals')
    
    # format title and labels
    ax.set_xlabel('Months', color="red", size='large')
    ax.set_ylabel('Totals', color="red", size='large')
    ax.set_title('Claims Data:   Team Bobby\n Second Place is the First Looser', color='Blue', weight='bold', size='large')
    
    # format ticks
    xt = plt.xticks(rotation=45, horizontalalignment='right', size='small')
    yt = plt.yticks(weight='bold', size='small', rotation=45)
    
    # format the dates on the xaxis
    myFmt = mdates.DateFormatter('%b')
    ax.xaxis.set_major_formatter(myFmt)
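For comparison, here is a minimal sketch of the same month formatting with plain matplotlib and no pandas. It assumes the 2021 rows from the question, typed in by hand as `points`, and it assumes you want the axis to span all of 2021; the explicit `set_xlim` call and the `MonthLocator` are additions for that purpose and are not part of the answer above.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# (date, total) pairs for the 2021 rows in the question's sample data
points = [
    (datetime.date(2021, 7, 17), 162.75),
    (datetime.date(2021, 7, 12), 1043.75),
    (datetime.date(2021, 7, 28), 100.0),
    (datetime.date(2021, 7, 25), 50.0),
    (datetime.date(2021, 7, 22), 1000.0),
]
points.sort(key=lambda p: p[0])  # chronological order
dates = [p[0] for p in points]
totals = [p[1] for p in points]

fig, ax = plt.subplots(figsize=(10, 7))
ax.plot(dates, totals, marker="o", color="red", label="Totals")

# Assumption: show the whole year, January to December
ax.set_xlim(datetime.date(2021, 1, 1), datetime.date(2021, 12, 31))
ax.xaxis.set_major_locator(mdates.MonthLocator())         # one tick per month
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))  # abbreviated month name
ax.legend()
```

In a script you would finish with `plt.show()` (and drop the `matplotlib.use("Agg")` line) or with `fig.savefig(...)` to write the figure to a file.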
    

    【Comments】:
