【Question title】: How to split and concat a pandas DataFrame
【Posted】: 2018-01-26 15:08:09
【Question】:

I have a pandas DataFrame with a datetime index, like this:

                         A   B  C  A_1  B_1
2017-07-01 00:00:00  1  34  e    9    0
2017-07-01 00:05:00  2  34  e   92    2
2017-07-01 00:10:00  3  34  e   23    3
2017-07-01 00:15:00  4  34  e    2    5
2017-07-01 00:20:00  5  34  e    4    3

I want to split it and concatenate the pieces along axis=0, so that the result looks like this:

                     C  REQ  _1
2017-07-01 00:00:00  e  1    9
2017-07-01 00:05:00  e  2   92
2017-07-01 00:10:00  e  3   23
2017-07-01 00:15:00  e  4    2
2017-07-01 00:20:00  e  5    4
2017-07-01 00:00:00  e  34    0
2017-07-01 00:05:00  e  34    2
2017-07-01 00:10:00  e  34    3
2017-07-01 00:15:00  e  34    5
2017-07-01 00:20:00  e  34    3

So I have to do it like this: first select df[['C','A','A_1']] and df[['C','B','B_1']], then rename the columns to common names, and concatenate the results.
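For reference, the manual approach described above can be sketched as follows. This is only an illustration: the sample frame is rebuilt from the table in the question, and the names `parts`, `base`, and `result` are introduced here, not taken from the original post.

```python
import pandas as pd

# Rebuild the sample frame from the question (5-minute datetime index).
idx = pd.date_range("2017-07-01", periods=5, freq="5min")
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [34] * 5,
        "C": ["e"] * 5,
        "A_1": [9, 92, 23, 2, 4],
        "B_1": [0, 2, 3, 5, 3],
    },
    index=idx,
)

# Select each (C, X, X_1) slice, map its columns to common names,
# then stack the slices vertically.
parts = []
for base in ["A", "B"]:
    part = df[["C", base, base + "_1"]].rename(
        columns={base: "REQ", base + "_1": "_1"}
    )
    parts.append(part)

result = pd.concat(parts, axis=0)
print(result)
```

This works, but the list of base columns has to be maintained by hand, which is what gets painful with thousands of columns.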

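This is cumbersome. Does pandas have a built-in method for it, or is there a faster way? I have thousands of columns to concatenate to get the final result.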

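【Discussion】:

Tags: python pandas time-series

【Solution 1】:

Edit

After doing some research: lreshape is not well documented, whereas pd.wide_to_long, which is part of the current API, does the same job as lreshape with more flexibility.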

https://github.com/pandas-dev/pandas/issues/2567

https://github.com/pandas-dev/pandas/issues/15003

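Let's use the method that is documented in the API:

import pandas as pd

# Rename so that every wide column follows the <stub>_<suffix> pattern.
dict1 = {'A':'REQ_A1','B':'REQ_B1','A_1':'Value_A1','B_1':'Value_B1'}

df2 = df.rename(columns=dict1)

# j names the new suffix level; it must not clash with the existing column 'C'.
# suffix is matched against the whole suffix, so use r'\w+' rather than '.'.
(pd.wide_to_long(df2.reset_index(),['REQ','Value'],i='index',j='dropme',sep='_',suffix=r'\w+')
  .reset_index()
  .drop('dropme', axis=1)
  .rename(columns={'Value':'_1'}))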

Output:

                 index  C  REQ  _1
0  2017-07-01 00:00:00  e    1   9
1  2017-07-01 00:05:00  e    2  92
2  2017-07-01 00:10:00  e    3  23
3  2017-07-01 00:15:00  e    4   2
4  2017-07-01 00:20:00  e    5   4
5  2017-07-01 00:00:00  e   34   0
6  2017-07-01 00:05:00  e   34   2
7  2017-07-01 00:10:00  e   34   3
8  2017-07-01 00:15:00  e   34   5
9  2017-07-01 00:20:00  e   34   3
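With thousands of columns, the rename dictionary would normally be generated rather than typed out. The following is a minimal sketch under one assumption: every value column X has a companion column named X_1, as in the question. The names `bases`, `rename_map`, `source`, and `long_df` are made up for this example. Keeping the `source` column is optional; it records which original column each row came from.

```python
import pandas as pd

# Sample frame rebuilt from the question.
idx = pd.date_range("2017-07-01", periods=5, freq="5min")
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [34] * 5,
        "C": ["e"] * 5,
        "A_1": [9, 92, 23, 2, 4],
        "B_1": [0, 2, 3, 5, 3],
    },
    index=idx,
)

# Assumption: a column X is a "base" column if X_1 also exists.
bases = [c for c in df.columns if f"{c}_1" in df.columns]

# Give both members of each pair the same suffix (the base name).
rename_map = {}
for b in bases:
    rename_map[b] = f"REQ_{b}"
    rename_map[f"{b}_1"] = f"Value_{b}"

wide = df.rename(columns=rename_map).reset_index()

long_df = (
    pd.wide_to_long(
        wide, ["REQ", "Value"], i="index", j="source", sep="_", suffix=r"\w+"
    )
    .reset_index()
    .rename(columns={"Value": "_1"})
)
print(long_df)
```

If the unnamed index has a name in your data, `reset_index()` will produce a column with that name instead of `index`, and `i=` has to be adjusted accordingly.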

Using pd.lreshape:

d = {'REQ': ['A', 'B'], '_1': ['A_1', 'B_1']}
df_out = (pd.lreshape(df.reset_index(), d).set_index('index'))

Output:

                     C  REQ  _1
index                          
2017-07-01 00:00:00  e    1   9
2017-07-01 00:05:00  e    2  92
2017-07-01 00:10:00  e    3  23
2017-07-01 00:15:00  e    4   2
2017-07-01 00:20:00  e    5   4
2017-07-01 00:00:00  e   34   0
2017-07-01 00:05:00  e   34   2
2017-07-01 00:10:00  e   34   3
2017-07-01 00:15:00  e   34   5
2017-07-01 00:20:00  e   34   3
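The same idea scales to many columns if the groups dictionary is built programmatically. A rough sketch, again assuming each value column X is paired with X_1; `bases` and `groups` are names introduced for illustration. Note that pd.lreshape drops rows containing NaN in the reshaped columns by default (dropna=True), which may or may not be what you want.

```python
import pandas as pd

# Sample frame rebuilt from the question.
idx = pd.date_range("2017-07-01", periods=5, freq="5min")
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [34] * 5,
        "C": ["e"] * 5,
        "A_1": [9, 92, 23, 2, 4],
        "B_1": [0, 2, 3, 5, 3],
    },
    index=idx,
)

# Assumption: a column X is a "base" column if X_1 also exists.
bases = [c for c in df.columns if c + "_1" in df.columns]

# Each target column lists the source columns to stack, in matching order.
groups = {"REQ": bases, "_1": [b + "_1" for b in bases]}

df_out = pd.lreshape(df.reset_index(), groups).set_index("index")
print(df_out)
```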

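【Discussion】:

Awesome. What exactly does lreshape do? I can't find it in the docs.
Yes, it isn't documented, but I've seen others here use it a lot. Basically, it reshapes the DataFrame according to the dictionary you provide.
@J.Doe For lreshape, see github.com/pandas-dev/pandas/blob/… You can also check pd.wide_to_long, which should work as well.
@Vaishali I did some digging on lreshape. Wen is right, let's use pd.wide_to_long instead.
@ScottBoston BTW, getting a downvote on my answer made me figure out that lreshape is not safe at all. You can check here, and let's keep each other updated on when lreshape gets deprecated: stackoverflow.com/questions/45699653/…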