【Posted】: 2020-10-05 16:38:53
【Question】:
I have a DataFrame with a datetime column named Month and two other columns:
import pandas as pd

data = [['Canada', 10, '2020-09-01'], ['Canada', 20, '2020-10-01'], ['Canada', 30, '2020-12-01'], ['Canada', 40, '2021-01-01'],
        ['Europe', 30, '2020-09-01'], ['Europe', 20, '2020-10-01'], ['Europe', 10, '2020-12-01'], ['Europe', 40, '2021-01-01'],
        ['US', 40, '2020-09-01'], ['US', 10, '2020-10-01'], ['US', 20, '2020-12-01'], ['US', 30, '2021-01-01']]
df = pd.DataFrame(data, columns=['Region', 'sales', 'Month'])
# parse Month as datetime so that the .dt accessor used below is available
df['Month'] = pd.to_datetime(df['Month'])
Next, I convert the Month column to a string with a specific format:
df['Month'] = df['Month'].dt.strftime('%b-%Y')
Now I pivot the DataFrame and export it to Excel:
df = pd.pivot_table(df, values='sales', index=["Region"], columns="Month").reset_index()
df.to_excel(writer, sheet_name='sales', index=False, startrow=4, header=False)
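Because the Month column is now a string, the dates are sorted alphabetically when I write the DataFrame to Excel. I want them sorted by their datetime value.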
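To make the ordering problem concrete, here is a minimal sketch using the four month labels from the sample data. The variable names are only illustrative; the point is that plain string sorting and date-aware sorting disagree for `%b-%Y` labels.

```python
import pandas as pd

# Month labels as produced by strftime('%b-%Y') on the sample data
labels = ["Sep-2020", "Oct-2020", "Dec-2020", "Jan-2021"]

# Plain string sort: this is the order a pivot on string columns ends up with
alphabetical = sorted(labels)

# Date-aware sort: parse each label back to a timestamp and sort on that
chronological = sorted(labels, key=lambda s: pd.to_datetime(s, format="%b-%Y"))

print(alphabetical)
print(chronological)
```

The first list starts with Dec-2020 because "D" sorts before "J", "O" and "S", which matches the column order seen in the exported sheet.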
I tried converting the Month column to datetime before pivoting, but then the dates are not in the right format after exporting to Excel:
df['Month'] = pd.to_datetime(df['Month'], format='%b-%Y')
I even tried using ExcelWriter formatting, but it did not seem to work well.
df['Month'] = pd.to_datetime(df['Month'])
df = pd.pivot_table(df, values = 'sales', index=["Region"], columns = "Month").reset_index()
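# DataFrame.append was removed in pandas 2.0, so build the 'System' totals row with pd.concat
totals = df.drop(columns='Region').sum().to_frame().T.assign(Region='System')
df = pd.concat([df, totals], ignore_index=True)
# add a 'Total' column holding the row sums over the month columns only
df['Total'] = df.drop(columns='Region').sum(axis=1)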
# collect the datetime column labels (skips 'Region' and any non-date column such as 'Total')
dates = [v for v in df.columns if isinstance(v, pd.Timestamp)]
# reformat dates to the desired string format
dates_str = [v.strftime('%b-%Y') for v in dates]
# create a dict
updates = dict(zip(dates, dates_str))
# rename the columns, which will stay in the current, correct order
# rename returns a new frame; combining assignment with inplace=True would set df to None
df = df.rename(columns=updates)
df.to_excel(writer, sheet_name='sales', index=False, startrow=4, header=False)
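The same idea (pivot on real datetimes first, format the labels afterwards) can also be sketched as one self-contained helper. This is only an illustration under stated assumptions: `pivot_by_month` and its `fmt` parameter are hypothetical names, not part of the original post, and the month abbreviations produced by `%b` depend on the active locale.

```python
import pandas as pd


def pivot_by_month(df: pd.DataFrame, fmt: str = "%b-%Y") -> pd.DataFrame:
    """Pivot sales by Region and Month, keeping months in chronological order."""
    out = df.copy()
    out["Month"] = pd.to_datetime(out["Month"])
    wide = pd.pivot_table(out, values="sales", index="Region", columns="Month")
    # The columns are a chronologically sorted DatetimeIndex at this point,
    # so converting the labels to strings now preserves that order.
    wide.columns = wide.columns.strftime(fmt)
    return wide.reset_index()


data = [
    ["Canada", 10, "2020-09-01"], ["Canada", 20, "2020-10-01"],
    ["Canada", 30, "2020-12-01"], ["Canada", 40, "2021-01-01"],
    ["Europe", 30, "2020-09-01"], ["Europe", 20, "2020-10-01"],
    ["Europe", 10, "2020-12-01"], ["Europe", 40, "2021-01-01"],
    ["US", 40, "2020-09-01"], ["US", 10, "2020-10-01"],
    ["US", 20, "2020-12-01"], ["US", 30, "2021-01-01"],
]
sales = pd.DataFrame(data, columns=["Region", "sales", "Month"])

wide = pivot_by_month(sales)
print(list(wide.columns))
```

The returned frame has plain string headers, so it can be passed to `to_excel` in the same way as above without Excel applying its own datetime formatting to the header cells.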
【Discussion】: