【Question title】: How do I get index of a specific value (in second dataframe) based on the same value in first dataframe
【Posted】: 2022-01-26 14:26:16
【Question】:

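I have 2 dataframes, df_ts and df_cmexport. I am trying to get the index of each placement id in df_cmexport for the placements in df_ts. For an explanation, see: Click here to view excel file

Once I have the indexes of those placement IDs as a list, I will iterate through them with for j in list_pe_ts_1: to read some values at index "j", for example df_cmexport['p_start_year'][j]

For some reason my code below returns an empty list: print(list_pe_ts_1) prints []

I think something is wrong with list_pe_ts_1 = df_cmexport.index[df_cmexport['Placement ID'] == pid_1].tolist(), because it returns an empty list of length 0

I even tried list_pe_ts_1 = df_cmexport.loc[df_cmexport.isin([pid_1]).any(axis=1)].index, but it still gives an empty list

Always appreciate the help :) Cheers to everyone @stackoverflow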

for i in range(0, len(df_ts)):
    pid_1 = df_ts['PLACEMENT ID'][i]
    print('for pid ', pid_1)
    list_pe_ts_1 = df_cmexport.index[df_cmexport['Placement ID'] == pid_1].tolist()
    print('len of list',len(list_pe_ts_1))
    ts_p_start_year_for_pid = df_ts['p_start_year'][i]
    ts_p_start_month_for_pid = df_ts['p_start_month'][i]
    ts_p_start_day_for_pid = df_ts['p_start_date'][i]

    print('\np_start_full_date_ts for :', pid_1, 'y:', ts_p_start_year_for_pid, 'm:', ts_p_start_month_for_pid,
          'd:', ts_p_start_day_for_pid)
    # j=list_pe_ts
    print(list_pe_ts_1)
    for j in list_pe_ts_1:
        # print(j)

        export_p_start_year_for_pid = df_cmexport['p_start_year'][j]
        export_p_start_month_for_pid = df_cmexport['p_start_month'][j]
        export_p_start_day_for_pid = df_cmexport['p_start_date'][j]
        print('\np_start_full_date_export for ', pid_1, "at row(", j, ") :", export_p_start_year_for_pid,
              export_p_start_month_for_pid, export_p_start_day_for_pid)
    if (ts_p_start_year_for_pid == export_p_start_year_for_pid) and (
            ts_p_start_month_for_pid == export_p_start_month_for_pid) and (
            ts_p_start_day_for_pid == export_p_start_day_for_pid):
        pids_p_1.add(pid_1)
        # print('pass',pids_p_1)

        # print(export_p_end_year_for_pid)
    else:
        pids_f_1.add(pid_1)
        # print("mismatch in placement end date for pid ", pids)
        # print("pids list ",pids)
        # print('fail',pids_f_1)
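
An empty result from a boolean-mask lookup like the one above generally means that no value in the column compares equal to pid_1. The post does not say why, so the following is only a sketch of one possible cause, assumed for illustration: the IDs are stored as integers in one dataframe and as strings (possibly with stray whitespace) in the other. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: integer IDs in df_ts, string IDs (one with a
# trailing space) in df_cmexport. The real data is not shown in the post.
df_ts = pd.DataFrame({"PLACEMENT ID": [101, 102]})
df_cmexport = pd.DataFrame({"Placement ID": ["101", "102 ", "101"]})

pid_1 = df_ts["PLACEMENT ID"][0]

# An integer never compares equal to a string, so nothing matches.
raw_matches = df_cmexport.index[df_cmexport["Placement ID"] == pid_1].tolist()

# Normalize both sides to stripped strings before comparing.
export_ids = df_cmexport["Placement ID"].astype(str).str.strip()
clean_matches = df_cmexport.index[export_ids == str(pid_1).strip()].tolist()

# Inspecting the dtypes is a quick way to spot this kind of mismatch.
print(df_ts["PLACEMENT ID"].dtype, df_cmexport["Placement ID"].dtype)
print(raw_matches)
print(clean_matches)
```

If the two dtypes printed here already agree in the real data, this is not the cause and the column names or the values themselves are worth checking next.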

【Comments】:

    Tags: python pandas


    【Solution 1】:

    With the following snippet you can get a list of the matching index values from the second dataframe.

    import pandas as pd
    df_ts = pd.DataFrame(data = {'index in df':[0,1,2,3,4,5,6,7,8,9,10,11,12],
                                       "pid":[1,1,2,2,3,3,3,4,6,8,8,9,9],
                                })
    
    df_cmexport = pd.DataFrame(data = {'index in df':[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],
                                      "pid":[1,1,1,2,3,3,3,3,3,4,4,4,5,5,6,7,8,8,9,9,9],
                                })
    

    Create a new dataframe by merging the two

    result = pd.merge(df_ts, df_cmexport, left_on=["pid"], right_on=["pid"], how='left', indicator='True', sort=True)
    

    Then take the unique values of the "index in df_y" column in the merged dataframe

    index_list = result["index in df_y"].unique()
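
The "index in df_y" name comes from the merge itself: when both inputs have a non-key column with the same name, pandas appends the default suffixes _x (left) and _y (right). The sketch below uses small made-up frames, not the data above, to show the suffixes and how a left merge marks rows that have no match. It passes indicator=True, which names the marker column _merge.

```python
import pandas as pd

# Small hypothetical frames; pid 7 exists only on the left side.
left = pd.DataFrame({"index in df": [0, 1, 2], "pid": [1, 2, 7]})
right = pd.DataFrame({"index in df": [0, 1, 2], "pid": [1, 1, 2]})

result = pd.merge(left, right, on="pid", how="left", indicator=True)

# The shared column name is disambiguated with _x / _y suffixes.
print(result.columns.tolist())

# Unmatched left rows get NaN on the right side and "left_only" in _merge,
# so filter on "both" before reading the right-hand index column.
matched = result.loc[result["_merge"] == "both", "index in df_y"].astype(int)
print(matched.unique())
```

Filtering on _merge matters only when some left-hand IDs are missing from the right-hand frame; without it, unique() would also return NaN for those rows.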
    

    The result you get:

    index_list
    Out[9]: 
    array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 14, 16, 17, 18, 19,
           20], dtype=int64)
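
The question iterates per placement ID, so a mapping from each ID to its row labels may be more convenient than one flat array. The following is an alternative sketch, not part of the answer above: it assumes the default integer index, uses Series.isin plus GroupBy.groups instead of a merge, and the names matches and index_by_pid are made up for illustration.

```python
import pandas as pd

df_ts = pd.DataFrame({"pid": [1, 1, 2, 2, 3, 3, 3, 4, 6, 8, 8, 9, 9]})
df_cmexport = pd.DataFrame(
    {"pid": [1, 1, 1, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 8, 9, 9, 9]}
)

# Keep only the rows of df_cmexport whose pid also occurs in df_ts.
matches = df_cmexport[df_cmexport["pid"].isin(df_ts["pid"])]

# Map each pid to the list of row labels where it occurs in df_cmexport.
index_by_pid = {
    pid: labels.tolist() for pid, labels in matches.groupby("pid").groups.items()
}

print(matches.index.tolist())
print(index_by_pid)
```

With this mapping, the inner loop from the question could read for j in index_by_pid.get(pid_1, []): which simply does nothing for an ID that has no match.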
    

    【Comments】:

    • Thanks for the help! That solved my problem! :)