groupby list of lists of indexes

【Question title】: groupby list of lists of indexes
【Posted】: 2026-01-26 14:25:01
【Question】:

I have a list of np.arrays that hold index labels of a pandas DataFrame.

I need to group by these index labels so that each array yields one group.

Say this is the df:

index values
0     2
1     3
2     2
3     2
4     4
5     4
6     1
7     4
8     4
9     4

And this is the list of np.arrays:

[array([0, 1, 2, 3]), array([6, 7, 8])]

From this data I want to get 2 groups as a single groupby object, without looping:

Group 1:

index values
0     2
1     3
2     2
3     2

Group 2:

index values
6     1
7     4
8     4

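Let me stress again that in the end I need a groupby object.

Thanks!

【Comments】:

Why don't you want a for loop?
@peterWeNYoBen, there are millions of rows.
But what about the list? Does it hold only a few groups?

Tags: pandas numpy pandas-groupby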

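【Solution 1】:

I still use a for loop, but only to build the dict of groupby keys:

import numpy as np
import pandas as pd
from collections import ChainMap

l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]
df = pd.DataFrame([2, 3, 2, 2, 4, 4, 1, 4, 4, 4], columns=['values'])
df.index.name = 'index'

# Map every index label to the position of the array that contains it
L = dict(ChainMap(*[dict.fromkeys(y, x) for x, y in enumerate(l)]))
list(df.groupby(L))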
Out[33]: 
[(0.0,        values
  index        
  0           2
  1           3
  2           2
  3           2), (1.0,        values
  index        
  6           1
  7           4
  8           4)]
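
The key mapping can also be built without touching individual index values in Python. A minimal sketch, assuming the arrays in l do not overlap and contain only labels present in df.index; the names flat, labels and keys are illustrative and not part of the answer above:

```python
import numpy as np
import pandas as pd

l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]
df = pd.DataFrame({'values': [2, 3, 2, 2, 4, 4, 1, 4, 4, 4]})
df.index.name = 'index'

# Flatten all index labels and build one group id per label.
# The comprehension iterates over the arrays, not over the rows.
flat = np.concatenate(l)
labels = np.repeat(np.arange(len(l)), [len(a) for a in l])

# Index label -> group id, aligned to df; labels not listed in l become NaN
keys = pd.Series(labels, index=flat).reindex(df.index)

# groupby skips NaN keys by default, so the unlisted rows are left out
grouped = df.groupby(keys)
sums = grouped['values'].sum()
```

Here grouped is a regular DataFrameGroupBy object, so aggregations such as sum run once over all groups rather than once per array.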

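【Comments】:

Thanks for the reply. I need a groupby object, like the one returned by df.groupby('foo'), so I can run vectorized computations, not a list of groups.

【Solution 2】: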

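import numpy as np
import pandas as pd

df = pd.DataFrame([2, 3, 2, 2, 4, 4, 1, 4, 4, 4], columns=['values'])
df.index.name = 'index'
l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]

# Select the rows of each group by index label
group1 = df.loc[pd.Series(l[0])]
group2 = df.loc[pd.Series(l[1])]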
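
If the per-array selections are needed as one groupby object rather than as separate frames, they can be concatenated under an extra index level and grouped on that level. This is only a sketch, assuming the number of arrays is small compared with the number of rows (the comprehension loops over arrays, not rows); the level name group is made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([2, 3, 2, 2, 4, 4, 1, 4, 4, 4], columns=['values'])
df.index.name = 'index'
l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]

# Stack the selections; keys adds an outer index level holding the array position
parts = pd.concat(
    [df.loc[a] for a in l],
    keys=range(len(l)),
    names=['group', 'index'],
)

# Group on the outer level to get a single DataFrameGroupBy object
grouped = parts.groupby(level='group')
means = grouped['values'].mean()
```

Unlike a label-to-group mapping, this also tolerates arrays that share labels, at the cost of copying the selected rows.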

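【Comments】:

Thanks for the reply. I've realized that my question is really how to get a groupby object, like the one returned by df.groupby('foo'), for vectorized computations, not a list of groups. Looking forward to it.
I split the data into the two groups.

【Solution 3】:

This looks like an X-Y problem: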

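import numpy as np
import pandas as pd

# The index labels live in an ordinary 'index' column, as in the output below
df = pd.DataFrame({'values': [2, 3, 2, 2, 4, 4, 1, 4, 4, 4]}).rename_axis('index').reset_index()
l = [np.array([0,1,2,3]), np.array([6,7,8])]
# Long format: level_0 is the array's position in l, column 0 is the index label
df_indx = pd.DataFrame(l).stack().reset_index()
df_new = df.assign(foo=df['index'].map(df_indx.set_index(0)['level_0']))
for n,g in df_new.groupby('foo'):
    print(g)

Output: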

   index  values  foo
0      0       2  0.0
1      1       3  0.0
2      2       2  0.0
3      3       2  0.0
   index  values  foo
6      6       1  1.0
7      7       4  1.0
8      8       4  1.0

【Comments】: