【Question title】: Pandas split dataframes based on index values from a list of tuples
【Posted】: 2020-02-24 17:59:06
【Question】:

Suppose I have a list of tuples holding index values:

mapper= [(0,6),(9,13),(17,27)]

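I have a large master_df that I want to split into multiple DataFrames based on the index values in the tuples of the list above.

mapper[0][0] is the start point and mapper[0][1] is the end point. I also have a list of DataFrame names.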

df_list = ['df_1','df_2','df_3']

I tried the snippet below to populate multiple DataFrames based on the index values in mapper:

for x in range(len(df_list)):
    df_list[x] = master_df[mapper[x][0]:mapper[x][1]]

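But this did not work the way I envisioned. The ideal result for me would be three separate DataFrames, split from master_df according to the index values in the tuples of the list.

Here is an example of what I am trying to accomplish: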

master_df:
         Name        Role  Location
0        Gina  Assistance        NY
1        Jake     Officer  Brooklyn
2       Boyle   Detective        99
3      Scully  Assistance        NY
4        Diaz     Officer  Brooklyn
5   Hitchcock   Detective        99
6         Amy  Assistance        NY
7       Terry     Officer  Brooklyn
8        Holt   Detective        99
9        Judy  Assistance        NY
10     Adrian     Officer  Brooklyn

mapper = [(0,3),(3,6),(6,11)]
df_list = ['df_1','df_2','df_3']

Desired result:

df_1:
    Name        Role  Location
0   Gina  Assistance        NY
1   Jake     Officer  Brooklyn
2  Boyle   Detective        99

df_2:
        Name        Role  Location
3     Scully  Assistance        NY
4       Diaz     Officer  Brooklyn
5  Hitchcock   Detective        99

df_3:
      Name        Role  Location
6      Amy  Assistance        NY
7    Terry     Officer  Brooklyn
8     Holt   Detective        99
9     Judy  Assistance        NY
10  Adrian     Officer  Brooklyn

Any help or guidance is appreciated!

【Comments】:

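  • When you index a DataFrame by its index labels, you should use loc. You could try df_list = [master_df.loc(axis=0)[m[0]:m[1]] for m in mapper]
  • Some sample input and output would help make your question clearer; as written, it is hard to tell what you are actually trying to do. See: minimal reproducible example
  • @G.Anderson Thanks for the suggestion. I have edited my question to include sample input and output.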
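
One caveat about the loc suggestion above: label-based slices include the end label, while positional iloc slices stop before it. With touching ranges such as (3,6) and (6,11), loc would therefore return the boundary row in both pieces. A minimal sketch of the difference, assuming a hypothetical 11-row frame with the default integer index (the `value` column is made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for master_df with the default 0..10 index.
master_df = pd.DataFrame({"value": list(range(11))})
start, stop = 3, 6

by_label = master_df.loc[start:stop]      # label slice: end label 6 is included
by_position = master_df.iloc[start:stop]  # positional slice: stops before position 6

print(by_label.index.tolist())     # [3, 4, 5, 6]
print(by_position.index.tolist())  # [3, 4, 5]
```

For the desired output in the question, where each end value is exclusive, the positional form matches.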

Tags: python pandas dataframe for-loop tuples


【Answer 1】:

You can unpack each tuple with * and pass it to the range function, then use iloc[] to select the rows at those positions:

df_list = [master_df.iloc[range(*i), :] for i in mapper]

[     Name        Role  Location
 0   Gina  Assistance        NY
 1   Jake     Officer  Brooklyn
 2  Boyle   Detective        99,
         Name        Role  Location
 3     Scully  Assistance        NY
 4       Diaz     Officer  Brooklyn
 5  Hitchcock   Detective        99,
      Name        Role  Location
 6      Amy  Assistance        NY
 7    Terry     Officer  Brooklyn
 8     Holt   Detective        99
 9     Judy  Assistance        NY
 10  Adrian     Officer  Brooklyn]
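
For anyone who wants to reproduce this, here is a self-contained sketch. It rebuilds the sample frame from the question (the column values are copied from the question; storing Location as strings is an assumption) and uses a plain positional slice, which selects the same rows as range(*i) when the ranges are contiguous:

```python
import pandas as pd

# Sample data copied from the question; Location is assumed to be stored as strings.
master_df = pd.DataFrame({
    "Name": ["Gina", "Jake", "Boyle", "Scully", "Diaz", "Hitchcock",
             "Amy", "Terry", "Holt", "Judy", "Adrian"],
    "Role": ["Assistance", "Officer", "Detective"] * 3 + ["Assistance", "Officer"],
    "Location": ["NY", "Brooklyn", "99"] * 3 + ["NY", "Brooklyn"],
})
mapper = [(0, 3), (3, 6), (6, 11)]

# Positional slicing: start is included, stop is excluded.
parts = [master_df.iloc[start:stop] for start, stop in mapper]

for part in parts:
    print(part, end="\n\n")
```

Each piece keeps its original index labels, so the second frame is still indexed 3 to 5.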

If you want to assign them to names, you will have to use a dictionary (see How to create a variable number of variables):

df_dict = dict(zip(df_list, [master_df.iloc[range(*i), :] for i in mapper]))

{'df_1':     Name        Role  Location
 0   Gina  Assistance        NY
 1   Jake     Officer  Brooklyn
 2  Boyle   Detective        99,
 'df_2':         Name        Role  Location
 3     Scully  Assistance        NY
 4       Diaz     Officer  Brooklyn
 5  Hitchcock   Detective        99,
 'df_3':       Name        Role  Location
 6      Amy  Assistance        NY
 7    Terry     Officer  Brooklyn
 8     Holt   Detective        99
 9     Judy  Assistance        NY
 10  Adrian     Officer  Brooklyn}
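
Once the frames are in a dictionary, they can be looked up by name or looped over. A minimal sketch, assuming a hypothetical 11-row frame with a made-up `value` column; `df_names` is a hypothetical rename of the question's df_list, so that the list of names is not overwritten by a list of frames:

```python
import pandas as pd

# Hypothetical stand-in for master_df with the default 0..10 index.
master_df = pd.DataFrame({"value": list(range(11))})
mapper = [(0, 3), (3, 6), (6, 11)]
df_names = ["df_1", "df_2", "df_3"]  # hypothetical name for the question's df_list

# Pair each name with its positional slice of master_df.
df_dict = {name: master_df.iloc[start:stop]
           for name, (start, stop) in zip(df_names, mapper)}

# Look up one frame by name.
print(df_dict["df_2"])

# Or iterate over all of them.
for name, frame in df_dict.items():
    print(name, frame.shape)
```

Note that zip stops at the shorter input, so a mismatch between the number of names and the number of tuples would silently drop entries.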
