【Question title】: How to find repeating matching values in two Data Frame columns in Python?
【Posted】: 2017-02-09 12:36:31
【Question description】:

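I am trying to write a script that checks whether a manager number matches an employee number. It should keep going down the column until every number has been checked. When it finishes, it should print a list of which rows matched and which did not.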

[In:]
import pandas as pd

#reading in csv file to Data Frame
employeeData = pd.read_csv("C:/Users/Desktop/EmployeeList.csv")

#creating a Data Frame
dataF = pd.DataFrame(employeeData)

#empty list where instances of T/F will be stored
booleans = []

#256 manager numbers + 1896 empty rows
managers = pd.Series(employeeData['Manager ID Number'])

#Edit: forgot to include this line
condition = managers.equals(merge['Employee ID'])

#check each row of employee data. 2153 rows of Employee Numbers
for index, row in employeeData.iterrows():
    #Check every single Manager number for a match
    for index, row in managers.iteritems():
        if condition:
            booleans.append(True)
            print("Something matched!")
        else:
            print("Didn't match!")
            booleans.append(False)
#The length of all booleans is printed.
print(len(booleans))


[Out:] Actual 
"Didn't match!" x 2153 times. (number of employees in list)

[Out:] Desired: 
"Something matched!"
"Didn't match!"
"Something matched!"
"Something matched!"
"Didn't match!"
"Something matched!".... to line 2153

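My problem is that the index count doesn't seem to move down. It just prints "Didn't match!" for the first number hundreds of times. I want to move the row position down so that every employee number is checked against the list of managers. Some managers have more employees than others, so I have to check every single one of them (256)! I'm embarrassed to say I've been stuck on this for quite a while. I'm new to Python, so any tips would be appreciated.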
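A minimal sketch of why the output never changes, using made-up data in place of EmployeeList.csv (the four rows below are hypothetical; only the column names come from the question). `Series.equals` compares two whole columns and returns a single boolean, so `condition` stays the same on every pass through the loop. `Series.isin`, by contrast, gives one True/False per employee number.

```python
import pandas as pd

# Hypothetical stand-in for EmployeeList.csv; column names follow the question.
employeeData = pd.DataFrame({
    "Employee ID": [101, 102, 103, 104],
    "Manager ID Number": [102, 104, None, None],  # empty rows become NaN
})

# Series.equals compares the two columns as a whole and returns ONE bool,
# which is why every loop iteration in the question prints the same message.
condition = employeeData["Manager ID Number"].equals(employeeData["Employee ID"])

# Drop the empty rows and restore integer IDs before comparing.
manager_ids = employeeData["Manager ID Number"].dropna().astype("int64")

# Series.isin tests each employee number against all manager numbers at once.
booleans = employeeData["Employee ID"].isin(manager_ids).tolist()

for matched in booleans:
    print("Something matched!" if matched else "Didn't match!")

# The length of all booleans is printed.
print(len(booleans))
```

With this approach no explicit nested loop is needed; the list has exactly one entry per employee row.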

【Question comments】:

    Tags: python excel python-3.x csv pandas


    【Solution 1】:

    IIUC, you need to use the pandas merge() function:

    import pandas as pd

    # inner merge keeps only the employees whose EMP ID is also a Manager ID
    df_emp_mng = pd.merge(df_Emp, df_Mang, left_on='EMP ID', right_on='Manager ID')
    print(df_emp_mng)
    
    print('Number of managers in Employee', len(df_emp_mng))
    print('Number of managers not in Employee', len(df_Emp) - len(df_emp_mng))
    

    Input - employee data

       EMP ID name  MID
    0     123   E3    1
    1     124   E1    1
    2     125   E2    2
    3       4   X4    5
    

    Input - manager data

       Manager ID Manager name Dep
    0           1           X1   C
    1           2           X2   D
    2           3           X3   E
    3           4           X4   F
    4           5           X5   F
    

    Output

       EMP ID name  MID  Manager ID Manager name Dep
    0       4   X4    5           4           X4   F
    
    Number of managers in Employee 1
    
    Number of managers not in Employee 3
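
The inner merge above returns only the matching rows. If a per-row "matched / didn't match" listing is wanted, as in the question's desired output, one possible variation is a left merge with `indicator=True`. This is a sketch, not part of the original answer: the frames are rebuilt from the input tables shown above, and the `flagged` and `is_manager` names are made up for illustration.

```python
import pandas as pd

# Example frames copied from the input tables shown above.
df_Emp = pd.DataFrame({
    "EMP ID": [123, 124, 125, 4],
    "name": ["E3", "E1", "E2", "X4"],
    "MID": [1, 1, 2, 5],
})
df_Mang = pd.DataFrame({
    "Manager ID": [1, 2, 3, 4, 5],
    "Manager name": ["X1", "X2", "X3", "X4", "X5"],
    "Dep": ["C", "D", "E", "F", "F"],
})

# how="left" keeps every employee row; indicator=True adds a "_merge" column
# whose value is "both" when the EMP ID was found among the Manager IDs.
flagged = pd.merge(df_Emp, df_Mang, how="left",
                   left_on="EMP ID", right_on="Manager ID", indicator=True)
flagged["is_manager"] = flagged["_merge"] == "both"

# One message per employee row, in the original row order.
for emp_id, is_manager in zip(flagged["EMP ID"], flagged["is_manager"]):
    print(emp_id, "Something matched!" if is_manager else "Didn't match!")
```

Because each Manager ID appears only once in the manager table, the left merge keeps exactly one row per employee, so the flags line up with the employee list.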
    

    【Comments】:
