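Saving a prediction model and inserting a new dataset into it: answers

【Question title】: Saving a prediction model and inserting a new dataset into it
【Posted】: 2021-03-25 22:56:12
【Question】:

I have built a prediction model that uses KNN to show how likely each customer is to get a credit card, as shown below: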

import pandas as pd
from sklearn.metrics import accuracy_score, f1_score, fbeta_score, precision_score, recall_score
from sklearn.neighbors import KNeighborsClassifier

# Fit KNN to the training set
classifier = KNeighborsClassifier(n_neighbors=7, metric='minkowski', p=1, leaf_size=1)
classifier.fit(X_train, y_train)

# Predict the test set results
y_pred = classifier.predict(X_test)

# Evaluate results
acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred)
rec = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
f2 = fbeta_score(y_test, y_pred, beta=2.0)

results = pd.DataFrame([['K-Nearest Neighbours', acc, prec, rec, f1, f2]],
               columns=['Model', 'Accuracy', 'Precision', 'Recall', 'F1 Score', 'F2 Score'])

print(results)

# Predict the test set results
# (best_model is not defined in the post; presumably it is the tuned KNN estimator)
y_pred = best_model.predict(X_test)

# Probability of the positive class (Approved = 1)
y_pred_probs = best_model.predict_proba(X_test)[:, 1]

# Step 20: Format final results
final_results = pd.concat([y_test], axis=1).dropna()
final_results['predictions'] = y_pred

# Convert the probability to a percentage rounded to two decimals
final_results["propensity_to_pay(%)"] = (y_pred_probs * 100).round(2)

final_results = final_results[['Approved', 'predictions', 'propensity_to_pay(%)']]

# Decile ranking: 1 = highest propensity, 10 = lowest
final_results['Ranking'] = pd.qcut(final_results['propensity_to_pay(%)'].rank(method='first'), 10, labels=range(10, 0, -1))

print(final_results)

The output is:

        Approved  predictions  propensity_to_pay(%) Ranking
134         0            1                 57.14      10
212         0            0                 28.57      10
655         1            1                 85.71       2
83          1            1                 71.43       7
297         1            1                 71.43       7
..        ...          ...                   ...     ...
271         1            1                 71.43       2
517         0            1                 71.43       2
28          0            1                 57.14       7
377         1            1                 85.71       1
244         0            1                 71.43
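
A note on the Ranking column in the output above: identical propensity values can land in different deciles because `rank(method='first')` breaks ties by row order before `pd.qcut` bins the ranks. A minimal sketch with made-up scores (not the poster's data) that reproduces the effect:

```python
import pandas as pd

# Made-up propensity scores containing ties, for illustration only.
scores = pd.Series([57.14, 28.57, 85.71, 71.43, 71.43, 71.43, 71.43, 57.14, 85.71, 71.43])

# method="first" gives tied values distinct ranks in order of appearance.
ranks = scores.rank(method="first")

# Ten equal-sized bins over the ranks; label 10 = lowest scores, 1 = highest.
deciles = pd.qcut(ranks, 10, labels=list(range(10, 0, -1)))

print(pd.DataFrame({"score": scores, "rank": ranks, "decile": deciles}))
```

If tied customers should share a decile, one option is to bin the scores themselves, or to rank with a tie-aware method, although `pd.qcut` may then complain about duplicate bin edges.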

I have saved the model with pickle, as shown below:

import pickle

# Save to a file in the current working directory
pkl_filename = "pickle_modelKNN2332.pkl"
with open(pkl_filename, 'wb') as file:
    pickle.dump(classifier, file)

# Load from file
with open(pkl_filename, 'rb') as file:
    pickle_model = pickle.load(file)

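How can I feed a new record into it? For example, I have a new customer with this data:

gender  YearsEmployed  Income  Employed  CreditScore  Age  Debt   Married  Approved
b       1.25           30.83   t         6            30   4.460  y

How do I get the value of Approved and the percentage chance of getting the credit card?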

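【Comments】:

Please provide as much information about your problem as possible to establish context. How is anyone supposed to know what this pkl serialized file contains? See stackoverflow.com/help/how-to-ask for a guideline template.
I am trying to feed new data into a KNN model that has already been trained and tested. The goal is to predict, based on this model, whether a new customer should be given a credit card.

Tags: python classification data-science prediction knn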

【Solution 1】:

import pickle 
  
# Save the trained model as a pickle string.
# (knn stands for the fitted estimator, called classifier in the question)
saved_model = pickle.dumps(knn) 
  
# Load the pickled model 
knn_from_pickle = pickle.loads(saved_model) 
  
# Use the loaded pickled model to make predictions 
knn_from_pickle.predict(X_test)

Reference => https://www.geeksforgeeks.org/saving-a-machine-learning-model/
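
The snippet above only covers the pickle round trip. To score the new customer from the question, the new row also has to go through the same preprocessing as the training data, and `predict_proba` gives the percentage. The question does not show its preprocessing, so the following is a minimal sketch under assumptions: the one-hot encoder, the scaler, the column grouping, and the random training rows are all hypothetical stand-ins, and only the KNN settings and the new customer's values come from the post. Wrapping preprocessing and classifier in one `Pipeline` means a single pickle carries both.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

categorical = ["gender", "Employed", "Married"]
numeric = ["YearsEmployed", "Income", "CreditScore", "Age", "Debt"]

# Random stand-in training data so the sketch runs; replace with the real X_train / y_train.
rng = np.random.default_rng(0)
n = 40
X_train = pd.DataFrame({
    "gender": rng.choice(["a", "b"], n),
    "YearsEmployed": rng.uniform(0, 10, n),
    "Income": rng.uniform(0, 100, n),
    "Employed": rng.choice(["t", "f"], n),
    "CreditScore": rng.integers(0, 20, n),
    "Age": rng.integers(18, 70, n),
    "Debt": rng.uniform(0, 15, n),
    "Married": rng.choice(["y", "n"], n),
})
y_train = np.arange(n) % 2  # arbitrary 0/1 labels, illustration only

# Assumed preprocessing: one-hot encode categoricals, scale numerics, dense output.
preprocess = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
     ("num", StandardScaler(), numeric)],
    sparse_threshold=0,
)
model = Pipeline([
    ("preprocess", preprocess),
    ("knn", KNeighborsClassifier(n_neighbors=7, metric="minkowski", p=1, leaf_size=1)),
])
model.fit(X_train, y_train)

# Round-trip through pickle, as in the answer above.
loaded = pickle.loads(pickle.dumps(model))

# The new customer from the question, as a one-row DataFrame with the training column names.
new_customer = pd.DataFrame([{
    "gender": "b", "YearsEmployed": 1.25, "Income": 30.83, "Employed": "t",
    "CreditScore": 6, "Age": 30, "Debt": 4.460, "Married": "y",
}])

approved = loaded.predict(new_customer)[0]
# Columns of predict_proba follow loaded.classes_ (here [0, 1]); column 1 is Approved = 1.
propensity = loaded.predict_proba(new_customer)[0, 1] * 100

print(f"Approved: {approved}, propensity: {propensity:.2f}%")
```

With `n_neighbors=7` and uniform weights, the probability is always a multiple of 1/7, which matches values such as 57.14 and 71.43 in the question's output.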

【Discussion】: