【Question Title】: Data_Frame Error - ValueError: Can only compare identically-labeled Series objects
【Posted】: 2026-02-03 03:35:01
【Question】:

I have two DataFrames, as follows:

df_name:
   Student_ID  Name        DOB
0           1  Raju 1993-02-02
1           2  Indu 1987-01-04
2           3  Laya 2000-06-24
df_marks:
    Student_ID Subject  Int1/40  Int2/40
0            1     Eng       10       35
1            1     Tam       30       38
2            1     Mat       20       30
3            1     Sci       15       20
4            2     Eng       35       25
5            2     Tam       25       15
6            2     Mat       22       30
7            2     Sci       29       23
8            3     Eng       18       17
9            3     Tam       19       16
10           3     Mat       27       26

The task is to create a DataFrame (shown below) with a new column that adds df_marks['Int1/40'] and df_marks['Int2/40'] wherever df_name['Student_ID'] == df_marks['Student_ID'].

   Student_id  Name        DOB  Tam/50
0           1  Raju 1993-02-02     NaN
1           2  Indu 1987-01-04     NaN  
2           3  Laya 2000-06-24     NaN

I tried:

df_out['Tam/50'] = df_marks[['Int1/40','Int2/40']].sum(axis=1).where(df_marks['Subject']==df_out['Student_id'])

but it raises this error:

ValueError: Can only compare identically-labeled Series objects

Is there a simple way to do this?

Regards, Deepak Dash

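【Question Comments】:

  • What is df_out? Why are you comparing df_marks['Subject']==df_out['Student_id']? Please edit your question to include the correct expected output.
  • Basically, df_out is my output DataFrame; where Student_ID matches, I need to add the columns ('Int1/40','Int2/40').

Tags: python pandas numpy dataframe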


【Solution 1】:

Use DataFrame.join with an aggregated sum to add the new column to df_name:

df_marks['Tam/50'] = df_marks[['Int1/40','Int2/40']].sum(axis=1)
df_name = df_name.join(df_marks.groupby('Student_ID')['Tam/50'].sum(), on='Student_ID')
print (df_name)
   Student_ID  Name         DOB  Tam/50
0           1  Raju  1993-02-02     198
1           2  Indu  1987-01-04     204
2           3  Laya  2000-06-24     123

Or a solution without the helper column:

s = (df_marks[['Int1/40','Int2/40']].sum(axis=1)
                                    .groupby(df_marks['Student_ID'])
                                    .sum()
                                    .rename('Tam/50'))

df_name = df_name.join(s, on='Student_ID')
print (df_name)
   Student_ID  Name         DOB  Tam/50
0           1  Raju  1993-02-02     198
1           2  Indu  1987-01-04     204
2           3  Laya  2000-06-24     123
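
For anyone who wants to run this end to end, here is a minimal sketch. It rebuilds the two frames from the tables in the question (the DOB dtype is assumed to be datetime), shows that comparing two Series with different indexes is what raises the ValueError, and then applies the aggregate-then-join approach above. The names `totals`, `raised` and `df_out` are only illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df_name = pd.DataFrame({
    'Student_ID': [1, 2, 3],
    'Name': ['Raju', 'Indu', 'Laya'],
    'DOB': pd.to_datetime(['1993-02-02', '1987-01-04', '2000-06-24']),
})
df_marks = pd.DataFrame({
    'Student_ID': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    'Subject': ['Eng', 'Tam', 'Mat', 'Sci'] * 2 + ['Eng', 'Tam', 'Mat'],
    'Int1/40': [10, 30, 20, 15, 35, 25, 22, 29, 18, 19, 27],
    'Int2/40': [35, 38, 30, 20, 25, 15, 30, 23, 17, 16, 26],
})

# Element-wise comparison needs identical indexes; here one Series has
# 11 rows and the other has 3, so pandas raises the ValueError.
try:
    df_marks['Student_ID'] == df_name['Student_ID']
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Aggregate first so there is one row per student, then align on the key.
totals = (df_marks[['Int1/40', 'Int2/40']].sum(axis=1)
          .groupby(df_marks['Student_ID'])
          .sum()
          .rename('Tam/50'))
df_out = df_name.join(totals, on='Student_ID')
print(df_out)
```

The key point is that the two frames have different shapes, so they have to be matched on the Student_ID key (join or merge) rather than compared row by row.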

【Comments】:

    【Solution 2】:

    You can use pd.merge to match the two DataFrames on Student_ID, then use groupby and sum. (In recent pandas versions, pass columns= to drop rather than a positional axis argument.)

    In [574]: res = pd.merge(df_name, df_marks, on='Student_ID')
    In [592]: r = res.groupby(['Student_ID', 'Name', 'DOB'])[['Int1/40','Int2/40']].sum().reset_index()
    
    In [594]: r['Tam/50'] = r['Int1/40'] + r['Int2/40']
    In [604]: r.drop(columns=['Int1/40', 'Int2/40'], inplace=True)
    
    In [605]: r
    Out[605]: 
       Student_ID  Name         DOB  Tam/50
    0           1  Raju  1993-02-02     198
    1           2  Indu  1987-01-04     204
    2           3  Laya  2000-06-24     123
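
A variation on the merge approach, offered as a sketch and not as the answerer's own code: aggregate df_marks down to one row per student first, then merge that small table into df_name. The frames are rebuilt from the question's tables, `totals` and `result` are illustrative names, and `how='left'` is my assumption about the desired behaviour (it keeps any student without marks, with NaN in the new column).

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df_name = pd.DataFrame({
    'Student_ID': [1, 2, 3],
    'Name': ['Raju', 'Indu', 'Laya'],
    'DOB': pd.to_datetime(['1993-02-02', '1987-01-04', '2000-06-24']),
})
df_marks = pd.DataFrame({
    'Student_ID': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    'Subject': ['Eng', 'Tam', 'Mat', 'Sci'] * 2 + ['Eng', 'Tam', 'Mat'],
    'Int1/40': [10, 30, 20, 15, 35, 25, 22, 29, 18, 19, 27],
    'Int2/40': [35, 38, 30, 20, 25, 15, 30, 23, 17, 16, 26],
})

# Per-row total of the two internal marks, then one total per student.
totals = (df_marks
          .assign(**{'Tam/50': df_marks['Int1/40'] + df_marks['Int2/40']})
          .groupby('Student_ID', as_index=False)['Tam/50']
          .sum())

# Left merge keeps every row of df_name, matched on Student_ID.
result = pd.merge(df_name, totals, on='Student_ID', how='left')
print(result)
```

Aggregating before merging avoids grouping on Name and DOB and leaves no intermediate columns to drop.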
    

    【Comments】:
