For example, suppose the CSV contains the following:
CSV
Size,Color,Shape,Accept
small,blue,oval,yes
small,green,oval,yes
big,green,oval,no
big,red,square,no
small,red,square,no
small,blue,square,yes
big,red,circle,yes
To find out whether nltk's Naive Bayes classifier would accept a small-red-oval item, we can use the following code:
Python
import csv
import nltk

# Read the CSV and build (features, label) pairs for the classifier
with open('C:/Users/Amrit/Documents/Data/exp.csv', newline='') as f:
    csv_f = csv.reader(f)
    next(csv_f)  # skip the header line (csv_f.next() only works in Python 2)
    dataset = []
    for row in csv_f:
        dataset.append(({'size': row[0], 'color': row[1], 'shape': row[2]}, row[3]))
print(dataset)

# Train on the labelled rows, then classify the unseen item
classifier = nltk.NaiveBayesClassifier.train(dataset)
mydata = {'size': 'small', 'color': 'red', 'shape': 'oval'}
print(mydata, classifier.classify(mydata))
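To see roughly what the classifier is doing with those rows, here is a minimal hand-rolled sketch using only the standard library. It is not NLTK's implementation: the helper names (`load_dataset`, `score`) and the in-memory `CSV_TEXT` are made up for illustration, and it assumes simple add-one smoothing, whereas NLTK uses its own probability estimator, so the exact probabilities will differ from what `classifier.prob_classify` would report.

```python
import csv
import io
from collections import Counter, defaultdict

# Same rows as the CSV above, kept in memory so no file path is needed
CSV_TEXT = """Size,Color,Shape,Accept
small,blue,oval,yes
small,green,oval,yes
big,green,oval,no
big,red,square,no
small,red,square,no
small,blue,square,yes
big,red,circle,yes
"""


def load_dataset(text):
    """Turn CSV text into (feature dict, label) pairs (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        ({'size': r['Size'], 'color': r['Color'], 'shape': r['Shape']}, r['Accept'])
        for r in reader
    ]


def score(dataset, item):
    """Return normalized P(label | item) under the naive independence assumption.

    Assumption: add-one (Laplace) smoothing, which is NOT what NLTK uses.
    """
    label_counts = Counter(label for _, label in dataset)
    value_counts = defaultdict(Counter)  # (label, feature) -> value -> count
    values_seen = defaultdict(set)       # feature -> distinct values
    for features, label in dataset:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
            values_seen[name].add(value)

    scores = {}
    for label, n in label_counts.items():
        p = n / len(dataset)  # prior P(label)
        for name, value in item.items():
            # smoothed P(feature = value | label)
            p *= (value_counts[(label, name)][value] + 1) / (n + len(values_seen[name]))
        scores[label] = p

    total = sum(scores.values())
    return {label: p / total for label, p in scores.items()}


dataset = load_dataset(CSV_TEXT)
probs = score(dataset, {'size': 'small', 'color': 'red', 'shape': 'oval'})
print(max(probs, key=probs.get), probs)
```

The point is only that each label's score is its prior multiplied by one per-feature likelihood; the label with the largest score is the prediction.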
Note: I am still learning too. Thanks to @Francisco Couzo and @Milad M for the links.