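【Posted】:2019-01-06 08:51:42
【Question】:
I have a list of fruits. I want to check whether any of them appear in a DataFrame (in any column), find out which ones appear, and write them out.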
import pandas as pd
Fruits = ["Avocado", "Blackberry", "Black Sapote", "Fingered Citron", "Crab Apples", "Custard Apple", "Chico Fruit", "Coconut", "Damson", "Elderberry", "Goji Berry", "Grape", "Guava", "Huckleberry"]
data = {'ID': ["488", "14805", "23591", "470995", "56251", "85964", "5268", "322624", "342225", "380689", "480562", "5623"],
'Content' : ["Kalo Beruin", "this is Blackberry", "Khara Beruin", "guava and coconut", "Lapha", "Loha Sura", "Matichak", "Miniket Rice", "Mou Beruin", "Moulata", "oh Goji Berry", "purple Grape"],
'Content_1' : ["Jook-sing noodles", "grape", "Lai fun", "Damson", "Liangpi", "Custard Apple and Crab apples", "Misua", "nana Coconut Berry", "Damson", "Paomo", "Ramen", "Rice vermicelli"]}
df = pd.DataFrame(data)
df = df[['ID', 'Content', 'Content_1']]
s = pd.Series(data['Content'])
s_1 = pd.Series(data['Content_1'])
df["found_content"] = s[s.str.contains('|'.join(Fruits))]
df["found_content_1"] = s_1[s_1.str.contains('|'.join(Fruits))]
# Write the result to Excel; passing the path directly avoids ExcelWriter.save(), which was removed in pandas 2.0
df.to_excel('C:\\TEM\\22522.xlsx', sheet_name='Sheet1', index=False)
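The problems with this code are:
- It does not show just the fruit; it shows the entire cell content. For example, for row 14805 the result should be only "Blackberry", not the whole original text.
- It is case-sensitive, so some matches are missed, for example in row 14805.
- I want multiple results separated by ";", as in row 85964.
How can I achieve this? Thanks.
Here are screenshots of the current output and the desired output.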
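The three requirements listed above can be sketched as follows. The original code returns whole cells because `s[s.str.contains(...)]` only filters rows by a boolean mask; it never extracts the matched text. This is a minimal sketch, not a definitive solution: it assumes `Series.str.findall` with `re.IGNORECASE` followed by `Series.str.join(";")` is acceptable, that the matched text should be kept exactly as it is written in the cell, and that rows without a match should get an empty string. The helper name `find_fruits` and the shortened four-row frame are made up for illustration.

```python
import re

import pandas as pd

fruits = ["Avocado", "Blackberry", "Black Sapote", "Fingered Citron",
          "Crab Apples", "Custard Apple", "Chico Fruit", "Coconut",
          "Damson", "Elderberry", "Goji Berry", "Grape", "Guava",
          "Huckleberry"]

# Shortened version of the question's data (four of the original rows).
df = pd.DataFrame({
    "ID": ["488", "14805", "470995", "85964"],
    "Content": ["Kalo Beruin", "this is Blackberry",
                "guava and coconut", "Loha Sura"],
    "Content_1": ["Jook-sing noodles", "grape", "Damson",
                  "Custard Apple and Crab apples"],
})

# Escape each name and try longer names first, so that the regex
# alternation cannot stop early on a shorter overlapping name.
pattern = "|".join(
    re.escape(name) for name in sorted(fruits, key=len, reverse=True)
)


def find_fruits(column: pd.Series) -> pd.Series:
    """Return the fruit names found in each cell, joined with ';'.

    Hypothetical helper: matching ignores case, and cells without
    any match yield an empty string.
    """
    matches = column.str.findall(pattern, flags=re.IGNORECASE)
    return matches.str.join(";")


df["found_content"] = find_fruits(df["Content"])
df["found_content_1"] = find_fruits(df["Content_1"])

print(df)
```

With this sample, row 14805 gets "Blackberry" and "grape", and row 85964 gets "Custard Apple;Crab apples" in `found_content_1`. The pattern has no word boundaries, so a name embedded in a longer word would also match; wrapping the alternation in `\b(?:...)\b` is one way to tighten that if needed.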
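【Comments】:
-
This is a bit long-winded. Could you simplify the example if possible?
-
@coldspeed, thanks for the comment. The extra rows were meant to provide more samples for testing. I will keep that in mind next time.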