【Question title】: Is there a formula in Python/pandas similar to IFERROR, IF, MATCH, SEARCH, INDEX?
【Posted】: 2020-08-07 14:54:13
【Question description】:

I have 3 sets of Excel data in 3 different files. I want to use match and search functions, then index the result and save it in a new file.

import pandas as pd

df = pd.DataFrame({'date': [16042020, 20042020, 16042020, 16042020, 17042020],
                    'no' : [230255,1755,210520, 65556,12355],
                    'des': ['ant','flower', 'happy','hate', 'okay'],
                    'des2': ['cheeeee','die', 'of','bore','sad']})

df1 = pd.DataFrame({ 'condition': ['good', 'bad', 'good',  'good',  'bad'],
                    'no': [230255,  1755,  7897, 6666, 1311],
                    'des': ['ant', 'flower', 'happy', 'hate','okay'],
                    'which no': ['1234', '5555', '3535','1359','8979']})

df2 = pd.DataFrame({ 'condition': ['bad', 'bad', 'good', 'good','good'],
                      'no': [46451,  448713, 210520, 65556, 8795],
                     'des': ['ant','flower', 'happy','hate', 'okay'],
                     'which no': [1234,  5555, 3535, 1359,8979]})

OUTPUT:
df     date      no     des     des2
0  16042020  230255     ant  cheeeee
1  20042020    1755  flower      die
2  16042020  210520   happy       of
3  16042020   65556    hate     bore
4  17042020   12355    okay      sad

df1 condition    no     des which no
0      good  230255     ant     1234
1       bad    1755  flower     5555
2      good    7897   happy     3535
3      good    6666    hate     1359
4       bad    1311    okay     8979

df2  condition   no     des  which no
0       bad   46451     ant      1234
1       bad  448713  flower      5555
2      good  210520   happy      3535
3      good   65556    hate      1359
4      good    8795    okay      8979

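My intention is to search for df's 'no' in df1's 'no', subject to 'condition' = "good". If a match is found, output 'which no'; if not, search in df2; and if there is still no match, output "NO MATCH".

If I were using a Google Sheets formula, it would look like this: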

df'result' = iferror(index(df1'which no', match(1,search(isnumber(df'no',df1'no')))*(df1'condition' = "good"),0)),iferror(index(df2'which no', match(1,search(isnumber(df'no',df2'no')))*(df2'condition' = "good"),0))),"NO MATCH")

RESULT: 
       date      no     des     des2   **result**
0  16042020  230255     ant  cheeeee     1234
1  20042020    1755  flower      die     NO MATCH
2  16042020  210520   happy       of     3535
3  16042020   65556    hate     bore     1359
4  17042020   12355    okay      sad     NO MATCH

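My output should look like the following: the result in a new column of the df Excel file, plus another new column showing which list (df1/df2) the data came from.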

**result**     **from which list**
0   1234         df1
1  NO MATCH      NONE
2  3535          df2
3  1359          df2
4  NO MATCH     NONE

【Discussion】:

    Tags: python excel pandas dataframe google-sheets


    【Solution 1】:

    The idea is to first use concat, then keep only the good rows and, if necessary, also remove duplicates by no with DataFrame.drop_duplicates:

    df3 = pd.concat([df1, df2]).query('condition == "good"').drop_duplicates('no')
    print (df3)
      condition      no    des which no
    0      good  230255    ant     1234
    2      good    7897  happy     3535
    3      good    6666   hate     1359
    2      good  210520  happy     3535
    3      good   65556   hate     1359
    4      good    8795   okay     8979
    

    Then use DataFrame.merge with a left join and replace the missing values with DataFrame.fillna:

    df = df.merge(df3[['no','which no']], on='no', how='left').fillna({'which no':'NO MATCH'})
    print (df)
           date      no     des     des2  which no
    0  16042020  230255     ant  cheeeee      1234
    1  20042020    1755  flower      die  NO MATCH
    2  16042020  210520   happy       of      3535
    3  16042020   65556    hate     bore      1359
    4  17042020   12355    okay      sad  NO MATCH
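
For readers who want something closer to the nested IFERROR(INDEX(MATCH(...))) shape of the spreadsheet formula, the following is a minimal sketch using Series.map and fillna. It is an alternative to the merge above, not part of the original answer. The helper name good_lookup and the result column name are assumptions made for illustration, and 'which no' is cast to str because df1 stores it as text while df2 stores it as integers.

```python
import pandas as pd

df = pd.DataFrame({'date': [16042020, 20042020, 16042020, 16042020, 17042020],
                   'no': [230255, 1755, 210520, 65556, 12355]})
df1 = pd.DataFrame({'condition': ['good', 'bad', 'good', 'good', 'bad'],
                    'no': [230255, 1755, 7897, 6666, 1311],
                    'which no': ['1234', '5555', '3535', '1359', '8979']})
df2 = pd.DataFrame({'condition': ['bad', 'bad', 'good', 'good', 'good'],
                    'no': [46451, 448713, 210520, 65556, 8795],
                    'which no': [1234, 5555, 3535, 1359, 8979]})


def good_lookup(frame):
    """Hypothetical helper: Series mapping 'no' -> 'which no' for 'good' rows."""
    good = frame[frame['condition'] == 'good'].drop_duplicates('no')
    # Cast to str so values from df1 (text) and df2 (integers) look the same.
    return good.set_index('no')['which no'].astype(str)


# INDEX/MATCH against df1 first; rows without a match come back as NaN.
result = df['no'].map(good_lookup(df1))
# Inner IFERROR: fall back to df2 wherever df1 had no match.
result = result.fillna(df['no'].map(good_lookup(df2)))
# Outer IFERROR: anything still unmatched becomes the literal text.
df['result'] = result.fillna('NO MATCH')
print(df)
```

Because df1 is tried before df2, a number that is "good" in both lists takes its value from df1, which matches the order of the nested IFERROR calls.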
    

    EDIT: A new column can be created by passing the keys parameter to concat and then calling DataFrame.reset_index:

    df3 = (pd.concat([df1, df2], keys=('df1','df2'))
            .reset_index()
            .rename(columns={'level_0':'from which list'})
            .query('condition == "good"'))
    print (df3)
      from which list  level_1 condition      no    des which no
    0             df1        0      good  230255    ant     1234
    2             df1        2      good    7897  happy     3535
    3             df1        3      good    6666   hate     1359
    7             df2        2      good  210520  happy     3535
    8             df2        3      good   65556   hate     1359
    9             df2        4      good    8795   okay     8979
    

    ...and then add this column to the list of columns selected from df3:

    df = (df.merge(df3[['no','which no', 'from which list']], on='no', how='left')
           .fillna({'which no':'NO MATCH'}))
    print (df)
           date      no     des     des2  which no from which list
    0  16042020  230255     ant  cheeeee      1234             df1
    1  20042020    1755  flower      die  NO MATCH             NaN
    2  16042020  210520   happy       of      3535             df2
    3  16042020   65556    hate     bore      1359             df2
    4  17042020   12355    okay      sad  NO MATCH             NaN
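
The output above leaves NaN in 'from which list', whereas the question asks for NONE, and the edited df3 no longer drops duplicates. The sketch below is one possible way to address both, under the assumption that df1 should take priority when a number is "good" in both lists. The variable name out and the renaming of 'which no' to result are choices made here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'date': [16042020, 20042020, 16042020, 16042020, 17042020],
                   'no': [230255, 1755, 210520, 65556, 12355],
                   'des': ['ant', 'flower', 'happy', 'hate', 'okay'],
                   'des2': ['cheeeee', 'die', 'of', 'bore', 'sad']})
df1 = pd.DataFrame({'condition': ['good', 'bad', 'good', 'good', 'bad'],
                    'no': [230255, 1755, 7897, 6666, 1311],
                    'which no': ['1234', '5555', '3535', '1359', '8979']})
df2 = pd.DataFrame({'condition': ['bad', 'bad', 'good', 'good', 'good'],
                    'no': [46451, 448713, 210520, 65556, 8795],
                    'which no': [1234, 5555, 3535, 1359, 8979]})

df3 = (pd.concat([df1, df2], keys=('df1', 'df2'))
         .reset_index()
         .rename(columns={'level_0': 'from which list'})
         .query('condition == "good"')
         # Keeps the first occurrence of each 'no'; df1 rows come first,
         # so df1 wins when a number is "good" in both lists (assumption).
         .drop_duplicates('no'))

out = (df.merge(df3[['no', 'which no', 'from which list']], on='no', how='left')
         # Fill each column with its own placeholder for unmatched rows.
         .fillna({'which no': 'NO MATCH', 'from which list': 'NONE'})
         .rename(columns={'which no': 'result'}))
print(out)
```

To save the result to a new file as the question describes, something like out.to_excel('result.xlsx', index=False) should work; the file name is a placeholder, and writing .xlsx files typically requires an Excel engine such as openpyxl to be installed.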
    

    【Discussion】:

    • What if I want a new column showing which list the value came from, e.g. showing df2 if it comes from the df2 list?