【Question Title】: How to account for weekends and holidays when using Python
【Posted】: 2017-04-06 17:00:17
【Question】:

I am trying to reduce this data to one row per month, taken on the same day of the month as the most recent date.

import pandas as pd
import datetime
import quandl
import numpy as np

start = datetime.datetime(1993, 10, 2)
end = datetime.date.today()

df = quandl.get("FRED/DGS20", collapse="daily").reset_index()
df.index=np.arange(0,len(df))

print (df)

l=[]
for i in range(0,len(df)):
    if (df['DATE'].loc[i]).day == (df['DATE'].loc[len(df)-1]).day:
        l.append(df['DATE'].loc[i])

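The problem I am running into is that if that date falls on a weekend or a holiday, the month is skipped. If the given date is N/A, how can I make Python pick the closest available date within that month?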
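
The skipped months can be reproduced without a Quandl connection. The sketch below is not part of the original post: it assumes a synthetic business-day calendar from `pd.bdate_range` as a stand-in for the `DATE` column of `FRED/DGS20`, and the names `dates`, `exact`, and `months_with_data` are hypothetical.

```python
import pandas as pd

# Assumption: weekday-only dates stand in for the real FRED/DGS20 'DATE' column.
dates = pd.Series(pd.bdate_range("2017-01-02", "2017-04-04"))

# Same rule as the loop above: keep rows whose day equals the last row's day.
target_day = dates.iloc[-1].day
exact = dates[dates.dt.day == target_day]

# Months present in the data versus months that survive the exact-match filter.
months_with_data = dates.dt.to_period("M").nunique()
print(exact.dt.strftime("%Y-%m-%d").tolist())
print(months_with_data - len(exact), "month(s) skipped")
```

Here the 4th of February and the 4th of March 2017 both fall on a Saturday, so an exact day match finds nothing for those two months.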

【Comments】:

    Tags: python datetime pandas numpy quandl


    【Solution 1】:

    It's a bit long, but there are no for loops! Insert my code after the line df.index=np.arange(0,len(df)):

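    # Split every date into year / month / day and find the distinct values of each.
    years = pd.DatetimeIndex(df['DATE']).year
    years_u = np.unique(years)
    years_u_norm = years_u - years_u[0]
    months = pd.DatetimeIndex(df['DATE']).month
    months_u = np.unique(months)
    months_u_norm = months_u - months_u[0]
    days = pd.DatetimeIndex(df['DATE']).day
    days_u = np.unique(days)
    days_u_norm = days_u - days_u[0]
    # One cell per (year, month, day) combination; NaN marks a date with no data.
    shp = (years_u_norm[-1]+1, months_u_norm[-1]+1, days_u_norm[-1]+1)
    mat = np.full(shp, np.nan).ravel()
    
    # Zero-based (year, month, day) grid position of every date in the data.
    y_ind = years - years_u[0]
    m_ind = months - months_u[0]
    d_ind = days - days_u[0]
    inds = np.vstack([y_ind[np.newaxis], m_ind[np.newaxis], d_ind[np.newaxis]])
    
    # Store the zero-based day offset in every cell that has data.
    inds2 = np.ravel_multi_index(inds, shp)
    inds_grid = np.indices(shp)[2].ravel()
    mat[inds2] = inds_grid[inds2]
    # Set everything before `start` and after `end` to inf, giving those cells infinite distance.
    start_ind = np.ravel_multi_index([[start.year - years_u[0]], [start.month - months_u[0]], [start.day - days_u[0]]], shp)
    mat[:start_ind] = np.inf
    end_ind = np.ravel_multi_index([[end.year - years_u[0]], [end.month - months_u[0]], [end.day - days_u[0] + 1]], shp)
    mat[end_ind:] = np.inf
    
    # Distance of each available day from the target day (the day of the month of `end`).
    mat = mat.reshape(shp)
    dist = np.absolute(mat - np.full(shp, end.day-1))
    
    # For each (year, month), pick the closest available day; + 1 turns the offset back into a day of the month.
    min_dist = np.nanargmin(dist, axis=2) + 1
    # Keep only the (year, month) pairs that actually occur in the data.
    inds_f = np.unique(np.ravel_multi_index(inds[:-1, :], shp[:-1]))
    res_inds = np.indices(min_dist.shape)
    res_y = res_inds[0].ravel()[inds_f] + years_u[0]
    res_m = res_inds[1].ravel()[inds_f] + months_u[0]
    res_d = min_dist.ravel()[inds_f]
    
    # Note: this overwrites df with one (year, month, day) row per month.
    df = pd.DataFrame({'year': res_y,
                       'month': res_m,
                       'day': res_d})
    print(df)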
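
The same idea (for each month, take the available date whose day of the month is nearest to the target day) can also be sketched with a pandas `groupby`. This is an alternative illustration and not the answerer's method: the helper `closest_monthly_dates` and its `target_day` parameter are hypothetical names, and the demo data is a synthetic weekday calendar rather than the real `FRED/DGS20` series.

```python
import pandas as pd

def closest_monthly_dates(dates, target_day=None):
    """Return, for each month present in `dates`, the available date whose
    day of the month is nearest to `target_day`.

    If `target_day` is None, the day of the latest date is used, which
    mirrors the comparison against the last row in the question.
    """
    s = pd.Series(pd.to_datetime(dates)).sort_values().reset_index(drop=True)
    if target_day is None:
        target_day = s.iloc[-1].day
    # Absolute distance, in days of the month, from the target day.
    dist = (s.dt.day - target_day).abs()
    # idxmin returns the first minimal label per month, so ties go to the earlier date.
    idx = dist.groupby(s.dt.to_period("M")).idxmin()
    return s.loc[idx.to_numpy()].reset_index(drop=True)

# Assumption: weekday-only dates stand in for the 'DATE' column of the real data.
demo = pd.bdate_range("2017-01-02", "2017-04-04")
print(closest_monthly_dates(demo).dt.strftime("%Y-%m-%d").tolist())
# → ['2017-01-04', '2017-02-03', '2017-03-03', '2017-04-04']
```

With the real data this would be called as `closest_monthly_dates(df['DATE'])`. Because only dates that exist in the input can be returned, weekends and holidays are skipped implicitly, without needing a holiday calendar.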
    

    【Discussion】:
