[Posted]: 2019-04-04 13:54:27
[Question]:
Suppose I have a list of sports like this:
sports=["futball","fitbal","football","tennis","tenis","tenisse","footbal","zennis","ping-pong"]
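I would like to build a DataFrame that maps each element of sports to its closest element, provided the fuzzy match score is better than 0.5 and the element is not matched only with itself. (I want to use the function fuzzywuzzy.fuzz.ratio(x, y) for this.)
The result should look like this: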
pd.DataFrame({"sport":sports,"closest_match":["futball","futball","football","tennis","tennis","tennis","futball","tennis","ping-pong"]})
       sport closest_match
0    futball       futball
1     fitbal       futball
2   football      football
3     tennis        tennis
4      tenis        tennis
5    tenisse        tennis
6    footbal       futball
7     zennis        tennis
8  ping-pong     ping-pong
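To make the matching rule concrete, here is a minimal sketch of the pairwise comparison described above. It is only an illustration under stated assumptions: it uses `difflib.SequenceMatcher` from the standard library as a stand-in for `fuzzywuzzy.fuzz.ratio` (which reports scores on a 0-100 scale, so 0.5 would correspond to 50 there), and the helper names `similarity` and `closest_matches` are hypothetical. It maps each item to its nearest *other* item, so it will not necessarily collapse every spelling onto one canonical form the way the table above does.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 (stand-in for fuzz.ratio / 100)."""
    return SequenceMatcher(None, a, b).ratio()


def closest_matches(items, threshold=0.5):
    """Map each item to its best-scoring other item, or to itself if none beats the threshold."""
    result = {}
    for item in items:
        # Exclude the item itself so it cannot trivially win with a perfect score.
        others = [other for other in items if other != item]
        best = max(others, key=lambda other: similarity(item, other), default=None)
        if best is not None and similarity(item, best) > threshold:
            result[item] = best
        else:
            # No sufficiently close neighbour: fall back to the item itself.
            result[item] = item
    return result


sports = ["futball", "fitbal", "football", "tennis", "tenis",
          "tenisse", "footbal", "zennis", "ping-pong"]
closest = closest_matches(sports)
for sport in sports:
    print(sport, "->", closest[sport])
```

The resulting mapping can be turned into the desired frame with something like `pd.DataFrame({"sport": sports, "closest_match": [closest[s] for s in sports]})`.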
Thanks
[Discussion]:
Tags: python pandas fuzzy fuzzywuzzy