【Question title】:How to Create a DataFrame with Loops?
【Posted】:2019-12-29 14:11:54
【Question】:
data = {'col1':['Country', 'State', 'City', 'park' ,'avenue'],
       'col2':['County','stats','PARK','Avenue', 'cities']}



    col1     col2
0   Country   County
1   State     stats
2   City      PARK
3   park      Avenue
4   avenue    cities

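I am trying to fuzzy-match every name in col1 against every name in col2 and rank the matches by score within each col1 value.

Expected output: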

   col1     col2    score  order
0  Country  County  92     1
1  Country  stats   31     2
2  Country  PARK    18     3
3  Country  Avenue  17     4
4  Country  cities  16     5
5  State    County  80     1
6  State    stats   36     2
7  State    PARK    22     3
8  State    Avenue  18     4
9  State    cities  16     5
.....

What I have tried:

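'''

from fuzzywuzzy import fuzz
import pandas as pd
import numpy as np

df = pd.DataFrame(data)  # data is the dict defined above

# print the score for every (col1, col2) combination
for i in df.col1:
    for j in df.col2:
        print(i, j, fuzz.token_set_ratio(i, j))

'''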

This prints the scores, but I am stuck on how to collect them into a DataFrame.

【Comments】:

    Tags: python python-3.x pandas python-2.7 fuzzy-logic


    【Solution 1】:

    You can compute the score row by row with apply:

    df['score']=df.apply(lambda x : fuzz.ratio(x['col1'],x['col2']),1)
    df['score']
    0    92
    1    60
    2     0
    3     0
    4    17
    dtype: int64
    

    Then rank the scores within each col1 group (highest score first):

    df['order']=(-df['score']).groupby(df['col1']).rank(method='first')
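
The apply call above only scores the pairs that share a row, while the expected output in the question lists every col1 value against every col2 value. A minimal sketch of one way to build that full table first, using `itertools.product`, and then apply the same groupby rank. The names `pairs`, `score` and `first_seen` are illustrative and not from the original post, and the `difflib` fallback is an assumption that only stands in when fuzzywuzzy is not installed (its scores will differ from `fuzz.token_set_ratio`).

```python
from itertools import product

import pandas as pd

try:
    from fuzzywuzzy import fuzz
    score = fuzz.token_set_ratio  # same scorer as in the question
except ImportError:
    # Assumption: stdlib stand-in so the sketch still runs; scores will differ.
    from difflib import SequenceMatcher

    def score(a, b):
        return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())

data = {'col1': ['Country', 'State', 'City', 'park', 'avenue'],
        'col2': ['County', 'stats', 'PARK', 'Avenue', 'cities']}
df = pd.DataFrame(data)

# One row per (col1, col2) combination: 5 x 5 = 25 rows.
pairs = pd.DataFrame(list(product(df['col1'], df['col2'])),
                     columns=['col1', 'col2'])

# Score each pair.
pairs['score'] = [score(a, b) for a, b in zip(pairs['col1'], pairs['col2'])]

# Rank within each col1 group, highest score first; ties keep their original order.
pairs['order'] = ((-pairs['score']).groupby(pairs['col1'])
                  .rank(method='first').astype(int))

# Keep col1 in its original order and sort each group by rank.
first_seen = {name: pos for pos, name in enumerate(df['col1'])}
pairs = (pairs.assign(_pos=pairs['col1'].map(first_seen))
              .sort_values(['_pos', 'order'])
              .drop(columns='_pos')
              .reset_index(drop=True))

print(pairs.head(10))
```

The nested loop from the question would work just as well if each `(i, j, score)` tuple were appended to a list and the list passed to `pd.DataFrame`; `product` is simply a more compact way to write the same two loops.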
    

    【Discussion】:
