Python pandas: modify dataframe according to date and column adding hours - Answers

【Question title】: Python pandas modify dataframe according to date and column adding hours
【Posted】: 2020-03-10 20:46:35
【Question description】:

I have the following dataframe:

;h0;h1;h2;h3;h4;h5;h6;h7;h8;h9;h10;h11;h12;h13;h14;h15;h16;h17;h18;h19;h20;h21;h22;h23
2017-01-01;52.72248155184351;49.2949899678983;46.57492391198069;44.087373768731766;44.14801243124734;42.17606224526609;43.18529986793594;39.58391124876044;41.63499969987035;41.40594457169249;47.58107920806581;46.56963630932529;47.377935483897694;37.99479190229543;38.53347417483357;40.62674178535282;45.81503347748674;49.0590694393733;52.73183568074295;54.37213882189341;54.737087166843295;50.224872755157314;47.874441844531056;47.8848916244788
2017-01-02;49.08874087825248;44.998912615866075;45.92457207636786;42.38001388673675;41.66922093408655;43.02027406525752;49.82151473221541;53.23401784350719;58.33805556091773;56.197239473200206;55.7686948361035;57.03099874898539;55.445563603040405;54.929102019056195;55.85170734639889;57.98929007227575;56.65821961018764;61.01309728212006;63.63384537162659;61.730431501017684;54.40180394585544;50.27375006416599;51.229656340500156;51.22066846069472
2017-01-03;50.07885876956572;47.00180020415448;44.47243045246001;42.62192562660052;40.15465704760352;43.48422695796396;50.01631022884173;54.8674584250141;60.434849010428685;61.47694796693493;60.766557330286844;59.12019178422993;53.97447369962696;51.85242030255539;53.604945764469065;56.48188852869667;59.12301823257856;72.05688032286155;74.61342126987793;70.76845988290785;64.13311592022278;58.7237387203283;55.2422389373486;52.63648285910918

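As you can see, the dates are in the first column and the hours are spread across the remaining columns (h0 to h23). I want to create a new dataframe with only two columns: the first holding the date together with the hour, and the second holding the value. Something like the following: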

2017-01-01 00:00:00 ; 52.72248
2017-01-01 01:00:00 ; 49.2949899678983
...

I could create a new dataframe and fill it with a loop. This is what I do now:

icount = 0
for idd in range(0,365):
   for ih in range(0,24):
      df.loc[df.index.values[icount]] = ecodf.iloc[idd,ih]
      icount = icount + 1

What do you think?

Thanks

【Question comments】:

Tags: python pandas sorting date dataframe

【Solution 1】:

Stack the column names into a new column, convert them to hours, and parse the result with pd.to_datetime:

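import pandas as pd

# One row per (date, hour-label) pair; assumes df is indexed by date strings
s = df.stack()
# Turn 'h13' into '13:00', join it with the date, and parse as a timestamp
stamps = pd.to_datetime(
    s.reset_index()
     .replace({'level_1': r'h(\d+)'}, {'level_1': '\\1:00'}, regex=True)
     [['level_0', 'level_1']].apply(' '.join, axis=1))
pd.concat([stamps, s.reset_index(drop=True)], axis=1, sort=False)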

                     0          1
0  2017-01-01 00:00:00  52.722482
1  2017-01-01 01:00:00  49.294990
2  2017-01-01 02:00:00  46.574924
3  2017-01-01 03:00:00  44.087374
4  2017-01-01 04:00:00  44.148012
..                 ...        ...
67 2017-01-03 19:00:00  70.768460
68 2017-01-03 20:00:00  64.133116
69 2017-01-03 21:00:00  58.723739
70 2017-01-03 22:00:00  55.242239
71 2017-01-03 23:00:00  52.636483

[72 rows x 2 columns]
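
For comparison, here is a minimal sketch of a different route to the same long format. It is not the method used in the answer above: it reshapes with `melt` and adds the hour as a timedelta instead of building strings. The sample data is a shortened, made-up version of the question's table, and the names `raw`, `wide`, `long_df`, `date`, `hour`, `timestamp` and `value` are illustrative assumptions.

```python
import io
import pandas as pd

# Hypothetical sample: two days, three hourly columns, same ';' layout as the question.
raw = """;h0;h1;h2
2017-01-01;52.7;49.3;46.6
2017-01-02;49.1;45.0;45.9
"""

# Assumption: the first column holds the dates; parse it into a DatetimeIndex.
wide = pd.read_csv(io.StringIO(raw), sep=";", index_col=0, parse_dates=True)

# Reshape to one row per (date, hour-label) pair.
long_df = (
    wide.rename_axis("date")
        .reset_index()
        .melt(id_vars="date", var_name="hour", value_name="value")
)

# Parse the hour from labels such as 'h13' and add it to the date.
hours = long_df["hour"].str[1:].astype(int)
long_df["timestamp"] = long_df["date"] + pd.to_timedelta(hours, unit="h")

# melt orders rows column by column, so sort chronologically and keep two columns.
long_df = (
    long_df.sort_values("timestamp")[["timestamp", "value"]]
           .reset_index(drop=True)
)
print(long_df)
```

Because the timestamps are computed arithmetically, this version does not depend on the index being strings, but it does assume every hour column is named `h` followed by a number.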

【Comments】: