【Question title】: Plot for every 10 minutes in datetime
【Posted】: 2020-02-15 13:49:51
【Question】:

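The df I am using has multiple rows for each datetime. For every 10 minutes, I want to draw one scatter plot of all the coordinates that share the same datetime. df_data holds one entry per location every 10 minutes.

If I enter the times by hand as t_list = [datetime(2017, 12, 23, 6, 0, 0), datetime(2017, 12, 23, 6, 10, 0), datetime(2017, 12, 23, 6, 20, 0)], it works, but I want to replace this with something that uses the dates in df, so that I can reuse it for multiple datasets.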

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import numpy as np

df_data = pd.read_csv(r'C:\data.csv')
df_data['datetime'] = pd.to_datetime(df_data['TimeStamp'])
df = df_data[(df_data['datetime'] >= datetime(2017, 12, 23, 6, 0, 0)) &
             (df_data['datetime'] < datetime(2017, 12, 23, 7, 0, 0))]

## want a time array for all of the datetimes in the df
t_list = [datetime(2017, 12, 23, 6, 0, 0),
          datetime(2017, 12, 23, 6, 10, 0),
          datetime(2017, 12, 23, 6, 20, 0)]

for t in t_list:
    t_end = t + timedelta(minutes = 10)
    t_text = t.strftime("%d-%b-%Y (%H:%M)")

    #boolean indexing with multiple conditions, you should wrap each single condition in brackets
    df_t = df[(df['datetime']>=t) & (df['datetime']<t_end)]

    #get data into variable
    ws = df_t['Sp_mean']
    lat = df_t['x']
    lon = df_t['y']
    col = 0.75

    #calc min/max for setting scale on images
    min_ws = df['Sp_mean'].min()
    max_ws = df['Sp_mean'].max()

    plt.figure(figsize=(15,10))
    plt.scatter(lon, lat, c=ws,s=300, vmin=min_ws, vmax=max_ws)  
    plt.title('event' + t_text,fontweight = 'bold',fontsize=18)
    plt.show()

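I tried several ways of turning the datetime column into a list I can iterate over, but none gave the result I wanted. The most recent attempts were: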

date_arrray = np.arange(np.datetime64(df['datetime']))
df['timedelta'] = pd.to_timedelta(df['datetime'])

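Sample dataset

【Comments】:

  • Could you post a sample of the dataset, or at least the date format you are using?
  • Are you trying to do a "group by"? -- pandas.pydata.org/pandas-docs/stable/reference/api/…
  • @gustavovelascoh - edited the question to include a snippet of the dataset
  • @squar_o Do I understand correctly that you want one scatter plot per 10-minute interval?
  • t_list = df['TimeStamp'].unique(). Then sort it and iterate over the values to get the right slices.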
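
The suggestion in the last comment could look roughly like the sketch below. The column names follow the question, but the rows are invented for illustration, and the `slices` dictionary is a hypothetical helper rather than anything from the original post.

```python
import pandas as pd

# Invented sample rows; only the column names come from the question.
df = pd.DataFrame({
    "TimeStamp": ["2017-12-23 06:10:00", "2017-12-23 06:00:00",
                  "2017-12-23 06:10:00", "2017-12-23 06:00:00",
                  "2017-12-23 06:20:00"],
    "x": [1, 2, 3, 4, 5],
    "y": [9, 8, 7, 6, 5],
    "Sp_mean": [6.5, 5.6, 7.1, 7.9, 7.3],
})
df["datetime"] = pd.to_datetime(df["TimeStamp"])

# Sorted list of the distinct timestamps present in the data.
t_list = sorted(df["datetime"].unique())

# One slice per timestamp; each slice holds the rows for one scatter plot.
slices = {pd.Timestamp(t): df[df["datetime"] == t] for t in t_list}
for t, df_t in slices.items():
    print(t, len(df_t))
```

Because `t_list` is built from the data itself, the same loop should carry over to other datasets without editing a hard-coded list.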

Tags: python python-2.7 datetime for-loop


【Solution 1】:

It seems you are not familiar with pandas. You should check out the resample function.
Let df_data be your original data:

# make a DatetimeIndex and resample it to 10-minute intervals
df_data.index = pd.to_datetime(df_data['TimeStamp'])
resampled_data = df_data.resample('10min')

# colour scale shared by all plots
min_ws = df_data['Sp_mean'].min()
max_ws = df_data['Sp_mean'].max()

# loop over the (bin start, rows in that bin) pairs
for start_time, sampled_df in resampled_data:
    if sampled_df.empty:  # skip bins that contain no rows
        continue
    ws = sampled_df['Sp_mean']
    lat = sampled_df['x']
    lon = sampled_df['y']
    plt.figure(figsize=(15, 10))
    plt.scatter(lon, lat, c=ws, s=300, vmin=min_ws, vmax=max_ws)
    plt.title('event ' + start_time.strftime('%Y-%m-%d %H:%M:%S'), fontweight='bold', fontsize=18)
    plt.show()

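【Comments】:

  • I should have made it clear that my data is already sampled every 10 minutes - I edited the question to reflect this.
  • @squar_o That doesn't change anything. You can still use resample for grouping and statistics.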
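
To illustrate the point about grouping and statistics, here is a minimal sketch on invented data. The values and the deliberate gap at 06:10 are assumptions made for the example; only the column names come from the question.

```python
import pandas as pd

# Invented sample rows with a gap: nothing was recorded at 06:10.
df_data = pd.DataFrame({
    "TimeStamp": ["2017-12-23 06:00:00", "2017-12-23 06:00:00",
                  "2017-12-23 06:20:00"],
    "Sp_mean": [6.0, 8.0, 7.0],
})
df_data.index = pd.to_datetime(df_data["TimeStamp"])

# Row count and mean of Sp_mean for each 10-minute bin.
stats = df_data["Sp_mean"].resample("10min").agg(["count", "mean"])
print(stats)

# Bins without any rows are kept in the result, with a count of zero.
empty_bins = stats.index[stats["count"] == 0]
print(list(empty_bins))
```

On data that is already sampled every 10 minutes, each bin simply holds the rows of one timestamp, so resampling behaves like grouping by the timestamp while also revealing any missing intervals.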
【Solution 2】:

I tried it with this dataset:

           datetime  x  y
0  31/10/2017 23:50  1  9
1  31/10/2017 23:50  1  9
2  31/10/2017 23:50  1  9
3  31/10/2017 23:40  1  9
4  31/10/2017 23:40  1  9
5  31/10/2017 23:40  1  9
6  31/10/2017 23:30  1  9
7  31/10/2017 23:30  1  9
8  31/10/2017 23:20  1  9

And this code:

import pandas as pd

a = [["31/10/2017 23:50", 1, 9], ["31/10/2017 23:50", 1, 9], ["31/10/2017 23:50", 1, 9],
     ["31/10/2017 23:40", 1, 9], ["31/10/2017 23:40", 1, 9], ["31/10/2017 23:40", 1, 9],
     ["31/10/2017 23:30", 1, 9], ["31/10/2017 23:30", 1, 9],
     ["31/10/2017 23:20", 1, 9]]
df = pd.DataFrame(a, columns=["TimeStamp", "x", "y"])
# the sample dates are written day first (dd/mm/yyyy)
df["datetime"] = pd.to_datetime(df["TimeStamp"], dayfirst=True)
# the group keys are the sorted, distinct datetimes
t_list = df.groupby("datetime").all().index
print(t_list)
# DatetimeIndex(['2017-10-31 23:20:00', '2017-10-31 23:30:00',
#                '2017-10-31 23:40:00', '2017-10-31 23:50:00'],
#               dtype='datetime64[ns]', name='datetime', freq=None)

【Comments】:

【Solution 3】:

Hope this helps.

new_df = df.groupby('datetime')

# iterating a groupby yields (group key, rows of that group) pairs
for group_time, group in new_df:
    # colour scale taken from this group only
    min_ws = group['Sp_mean'].min()
    max_ws = group['Sp_mean'].max()

    lat = group['x']
    lon = group['y']
    ws = group['Sp_mean']

    plt.figure(figsize=(15, 10))
    plt.scatter(lon, lat, c=ws, s=300, vmin=min_ws, vmax=max_ws)
    plt.title('event ' + group_time.strftime('%Y-%m-%d %H:%M:%S'),
              fontweight='bold', fontsize=18)

    plt.show()
    

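【Comments】:

【Solution 4】:

If I understand correctly, you want to group your data into 10-minute intervals. If your dataset is already sampled, you only need to group the data by minute and iterate over the resulting dataframes.

minutes_dfs = df.groupby(df.datetime.map(lambda t: t.minute))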
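
One caveat with grouping by the minute alone: rows from different hours that share the same minute end up in the same group. This does not matter for the one-hour window in the question, but it would for longer ranges. The sketch below, on invented timestamps, contrasts this with flooring each datetime to a 10-minute boundary; the `dt.floor` variant is an alternative shown for comparison, not part of the original answer.

```python
import pandas as pd

# Invented timestamps spanning two different hours.
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2017-12-23 06:10:00", "2017-12-23 07:10:00",
                                "2017-12-23 07:14:00"]),
    "Sp_mean": [6.0, 7.0, 8.0],
})

# Grouping by minute-of-hour puts 06:10 and 07:10 into the same group.
by_minute = df.groupby(df["datetime"].dt.minute).size()
print(by_minute)

# Flooring to 10 minutes keeps the date and hour, so the hours stay apart.
by_bin = df.groupby(df["datetime"].dt.floor("10min")).size()
print(by_bin)
```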
      

If it has not been sampled yet, you can group it into 10-minute bins (this requires a DatetimeIndex on df):

minutes_dfs = df.groupby(pd.Grouper(freq='10min'))
      

Full code:

import pandas as pd
import matplotlib.pyplot as plt

# Example data frame: 3 rows at 23:50 and 9 rows at 23:40
timestamps = ['31/10/2017 23:50:00'] * 3 + ['31/10/2017 23:40:00'] * 9
data = {'TimeStamp': timestamps,
        'datetime': timestamps,
        'x': [1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        'y': [9, 8, 7, 9, 8, 7, 6, 5, 4, 3, 2, 1],
        'Sp_mean': [6.49, 5.63, 7.07, 7.86, 7.27, 6.59, 6.78, 8.35, 6.3, 5.82, 8.74, 8.94]}
df = pd.DataFrame(data)
# the sample dates are written day first (dd/mm/yyyy)
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'], dayfirst=True)
df['datetime'] = pd.to_datetime(df['datetime'], dayfirst=True)
# keep 'datetime' both as the index and as a column
df = df.set_index('datetime', drop=False)
print(df)

# If data is already sampled
# minutes_dfs = df.groupby(df.datetime.map(lambda t: t.minute))

# Not sampled data: group on the DatetimeIndex in 10-minute bins
minutes_dfs = df.groupby(pd.Grouper(freq='10min'))

# calc min/max once for setting a common scale on the images
min_ws = df['Sp_mean'].min()
max_ws = df['Sp_mean'].max()

for bin_start, minutes in minutes_dfs:
    t_text = str(bin_start)
    # get data into variables
    ws = minutes['Sp_mean']
    lat = minutes['x']
    lon = minutes['y']

    plt.figure(figsize=(15, 10))
    plt.scatter(lon, lat, c=ws, s=300, vmin=min_ws, vmax=max_ws)
    plt.title('event ' + t_text, fontweight='bold', fontsize=18)
    plt.show()
      

【Comments】:

【Solution 5】:

A simple solution, assuming df = df.set_index('datetime') and so on:

Uses: https://numpy.org/doc/stable/reference/arrays.datetime.html

start = df.index[0]
min_10 = np.timedelta64(10, 'm')

# assumes df.index is sorted in ascending order
for date in df.index[1:]:
    if np.timedelta64(date - start) >= min_10:
        start = date
        # do your plotting
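
As a self-contained illustration of this loop, the sketch below runs it on an invented, irregularly spaced index and records where each new interval would start. The `starts` list is a hypothetical addition for inspection, and `pd.Timedelta` is used for the threshold here instead of `np.timedelta64`, purely for simplicity.

```python
import pandas as pd

# Invented, ascending DatetimeIndex with irregular spacing.
index = pd.to_datetime(["2017-12-23 06:00", "2017-12-23 06:04",
                        "2017-12-23 06:10", "2017-12-23 06:25",
                        "2017-12-23 06:30"])
min_10 = pd.Timedelta(minutes=10)

start = index[0]
starts = [start]  # timestamps at which a new plot would begin
for date in index[1:]:
    # Start a new interval once at least 10 minutes have passed.
    if date - start >= min_10:
        start = date
        starts.append(start)

print(starts)
```

Note that the intervals are measured from the previous start rather than aligned to fixed clock boundaries, so on irregular data they can drift relative to a 10-minute grid.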
        

【Comments】:
