【Question title】: Separating a dataframe by date and calculating averages Numpy Python
【Posted】: 2021-09-15 18:44:22
【Question description】:

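The date_list and monthly_values arrays correspond element by element, so the data point '2019-09-01 00:00:00' = 15, '2019-10-01 00:00:00' = 39.6, and so on. The year_changes expression below gives the indices at which a new year starts. I am trying to write a function that shows the average of all monthly values within a given year. For example, 2019 contains 4 months (from 2019-09-01 00:00:00 up to, but not including, 2020-01-01 00:00:00), so it should sum 15., 39.6, 0.2, 34.3 and divide by the number of months in 2019, which is 4, giving the expected output 22.28. How can I write such code?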

import datetime
import numpy as np
import pandas as pd
from pandas import DataFrame

date_list = ['2019-09-01 00:00:00', '2019-10-01 00:00:00', '2019-11-01 00:00:00',
 '2019-12-01 00:00:00', '2020-01-01 00:00:00', '2020-02-01 00:00:00', 
 '2020-03-01 00:00:00', '2020-04-01 00:00:00', '2020-05-01 00:00:00', 
 '2020-06-01 00:00:00', '2020-07-01 00:00:00', '2020-08-01 00:00:00',
 '2020-09-01 00:00:00','2020-10-01 00:00:00', '2020-11-01 00:00:00', 
 '2020-12-01 00:00:00','2021-01-01 00:00:00','2021-02-01 00:00:00', '2021-03-01 00:00:00', 
 '2021-04-01 00:00:00','2021-05-01 00:00:00', '2021-06-01 00:00:00', 
 '2021-07-01 00:00:00']
monthly_values = np.array([ 15., 39.6, 0.2, 34.3, 19.6, 26.8, 15.7, 26., 12.6, 15.5, 18.6, 2.3, 6.5,
   2.5, 12.2, 11.6, 93.9, 25.5, 26.5, -16.5, -1.4, -1.8, 5.])

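data = DataFrame(date_list, columns=['Data'])
# Named `dates` rather than `datetime` so the imported datetime module is not shadowed
dates = pd.to_datetime(data['Data'])

# Positions where the year differs from the previous row, i.e. where a new year starts
year_changes = np.flatnonzero(dates.dt.year.diff().gt(0)).tolist()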

Expected output:

2019 Average: 22.28
2020 Average: 14.16
2021 Average: 21.03

【Question comments】:

    Tags: arrays python-3.x pandas dataframe numpy


    【Solution 1】:
    1. You can create a dataframe from date_list and monthly_values:
    data = pd.DataFrame({"Date": date_list, "Values": monthly_values})
    data["Date"] = pd.to_datetime(data["Date"])
    

    Prints:

             Date  Values
    0  2019-09-01    15.0
    1  2019-10-01    39.6
    2  2019-11-01     0.2
    3  2019-12-01    34.3
    4  2020-01-01    19.6
    5  2020-02-01    26.8
    6  2020-03-01    15.7
    7  2020-04-01    26.0
    8  2020-05-01    12.6
    9  2020-06-01    15.5
    10 2020-07-01    18.6
    11 2020-08-01     2.3
    12 2020-09-01     6.5
    13 2020-10-01     2.5
    14 2020-11-01    12.2
    15 2020-12-01    11.6
    16 2021-01-01    93.9
    17 2021-02-01    25.5
    18 2021-03-01    26.5
    19 2021-04-01   -16.5
    20 2021-05-01    -1.4
    21 2021-06-01    -1.8
    22 2021-07-01     5.0
    
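    2. Then use .groupby with .dt.year as the grouper:
    # Select "Values" so the mean is computed only for the numeric column
    print(data.groupby(data["Date"].dt.year)[["Values"]].mean())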
    

    Prints:

             Values
    Date           
    2019  22.275000
    2020  14.158333
    2021  18.742857
    
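Since the question asks for a function and is tagged numpy, here is a minimal sketch of a NumPy-based alternative to the groupby above. It splits the values at the positions where the year changes, which is the same idea as year_changes in the question. The function name yearly_averages is hypothetical, the sketch assumes the dates are sorted in ascending order, and pd.date_range is used only as a shortcut to rebuild the same 23 monthly dates as date_list.

```python
import numpy as np
import pandas as pd


def yearly_averages(date_list, monthly_values):
    """Return {year: mean of that year's values}.

    Hypothetical helper; assumes date_list is sorted ascending and
    has the same length as monthly_values.
    """
    years = pd.to_datetime(pd.Series(date_list)).dt.year.to_numpy()
    values = np.asarray(monthly_values, dtype=float)
    # Positions where the year differs from the previous entry
    year_changes = np.flatnonzero(np.diff(years) > 0) + 1
    # One chunk of values (and of years) per calendar year
    value_chunks = np.split(values, year_changes)
    year_chunks = np.split(years, year_changes)
    return {int(y[0]): float(v.mean()) for y, v in zip(year_chunks, value_chunks)}


# Shortcut: month-start dates from 2019-09 to 2021-07, matching date_list
date_list = (
    pd.date_range("2019-09-01", "2021-07-01", freq="MS")
    .strftime("%Y-%m-%d %H:%M:%S")
    .tolist()
)
monthly_values = np.array([15., 39.6, 0.2, 34.3, 19.6, 26.8, 15.7, 26., 12.6,
                           15.5, 18.6, 2.3, 6.5, 2.5, 12.2, 11.6, 93.9, 25.5,
                           26.5, -16.5, -1.4, -1.8, 5.])

# Print one line per year in the format requested in the question
for year, average in yearly_averages(date_list, monthly_values).items():
    print(f"{year} Average: {average:.2f}")
```

The per-year means should agree with the groupby result shown above. The groupby version is usually the simpler choice when the data already lives in a dataframe; the split-based version may be convenient when only the two arrays are available.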

    【Comments】:
