【Posted】: 2019-11-15 15:20:44
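【Question】:
What are the key factors that could increase or stabilize the accuracy score (so that it does not vary significantly) of this basic KNN model on the Iris data?
Attempt: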
from sklearn import neighbors, datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
iris = datasets.load_iris()
X, y = iris.data[:, :], iris.target
Xtrain, Xtest, y_train, y_test = train_test_split(X, y)
scaler = preprocessing.StandardScaler().fit(Xtrain)
Xtrain = scaler.transform(Xtrain)
Xtest = scaler.transform(Xtest)
knn = neighbors.KNeighborsClassifier(n_neighbors=4)
knn.fit(Xtrain, y_train)
y_pred = knn.predict(Xtest)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
Sample accuracy scores
0.9736842105263158
0.9473684210526315
1.0
0.9210526315789473
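A note on these numbers: each one is a multiple of 1/38 (35/38, 36/38, 37/38, 38/38), so a single misclassified test sample moves the score by about 0.026. One common way to get a steadier estimate is to average over many splits instead of reading a single one. The following is a minimal sketch, not the asker's method; the choice of 5 folds, 10 repeats and `random_state=0` is an assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Keeping the scaler inside the pipeline means it is re-fitted on the
# training part of every fold, so no test-fold statistics leak in.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=4))

# 5 folds x 10 repeats = 50 accuracy values (assumed settings);
# random_state makes the sequence of splits reproducible.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print(f"mean accuracy: {scores.mean():.3f}")
print(f"std across splits: {scores.std():.3f}")
```

The mean is the figure to compare between model variants; the standard deviation shows how much a single split can be expected to wander.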
Classification report
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        12
           1       0.79      1.00      0.88        11
           2       1.00      0.80      0.89        15

    accuracy                           0.92        38
   macro avg       0.93      0.93      0.92        38
weighted avg       0.94      0.92      0.92        38
Sample confusion matrix
[[12  0  0]
 [ 0 11  0]
 [ 0  3 12]]
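The support column (12, 11, 15) suggests the split was not stratified, and since `train_test_split` is called without `random_state`, each run draws a different test set. If the goal is simply a repeatable single split with balanced classes, something like the sketch below may help. The values `test_size=0.25` and `random_state=42` are assumptions chosen for illustration, not settings from the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in train and test;
# random_state fixes the shuffle so repeated runs give the same split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Number of test samples per class (0, 1, 2).
test_counts = np.bincount(y_test)
print(test_counts)
```

A fixed seed makes the score repeatable, but it does not make it more representative; it only hides the variation between splits.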
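【Comments】:
-
What do you mean by stabilizing the accuracy? Do you want to find a good value of "k" for this problem?
-
Do you mean across multiple runs? If so, why would you want to do that?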
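If the aim is indeed to pick a good "k", one possible approach is a cross-validated grid search over `n_neighbors`. This is only a sketch under assumptions: the candidate range 1 to 20, the 5-fold stratified split and `random_state=0` are illustrative choices, and the step names `"scale"` and `"knn"` are arbitrary labels.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scaling and the classifier are tuned together as one estimator.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Candidate values of k (assumed range); the "knn__" prefix routes the
# parameter to the pipeline step named "knn".
param_grid = {"knn__n_neighbors": list(range(1, 21))}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

best_k = search.best_params_["knn__n_neighbors"]
print(f"best k: {best_k}, mean CV accuracy: {search.best_score_:.3f}")
```

On a dataset this small, several values of k usually score within noise of each other, so the selected k should be read as one reasonable choice and not as a uniquely best one.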
Tags: python algorithm machine-learning scikit-learn knn