Prediction from a Deployed SCIKITLEARN model at Google Cloud ML-Engine

【Question title】: Prediction from a Deployed SCIKITLEARN model at Google Cloud ML-Engine
【Posted】: 2018-10-03 01:59:21
【Question description】:

I created a machine learning model for fraud detection.

A small snippet of the actual model code:

from sklearn.metrics import classification_report, accuracy_score
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# define a random state
state = 1

# define the outlier detection method
classifiers = {
    "Isolation Forest": IsolationForest(max_samples=len(X),
                                       contamination=outlier_fraction,
                                       random_state=state),
    "Local Outlier Factor": LocalOutlierFactor(
    n_neighbors = 20,
    contamination = outlier_fraction)
}
import pickle
# fit the model
n_outliers = len(Fraud)

for i, (clf_name, clf) in enumerate(classifiers.items()):

    # fit the data and tag outliers
    if clf_name == "Local Outlier Factor":
        y_pred = clf.fit_predict(X)
        # Export the classifier to a file
        with open('model.pkl', 'wb') as model_file:
            pickle.dump(clf, model_file)
        scores_pred = clf.negative_outlier_factor_
    else:
        clf.fit(X)
        scores_pred = clf.decision_function(X)
        y_pred = clf.predict(X)
        # Export the classifier to a file
        with open('model.pkl', 'wb') as model_file:
            pickle.dump(clf, model_file)

    # Reshape the prediction values to 0 for valid and 1 for fraudulent
    y_pred[y_pred == 1] = 0
    y_pred[y_pred == -1] = 1

    n_errors = (y_pred != Y).sum()

    # run classification metrics 
    print('{}:{}'.format(clf_name, n_errors))
    print(accuracy_score(Y, y_pred ))
    print(classification_report(Y, y_pred ))

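I have successfully created a storage bucket, an ML model, and a version on Google Cloud Platform. As a beginner in machine learning, though, I am confused about how to pass input to this model to get real predictions, now that it is deployed on Google's ML-Engine.

Update: As described in N3da's answer, I am now using this code for online prediction: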

import os
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

PROJECT_ID = "PROJECT_ID"
VERSION_NAME = "VERSION"
MODEL_NAME = "MODEL_NAME"
credentials = GoogleCredentials.get_application_default()
service = discovery.build('ml', 'v1', credentials=credentials)
name = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
name += '/versions/{}'.format(VERSION_NAME)

data = {
    "instances": [
      [265580, 7, 68728, 8.36, 4.76, 84.12, 79.36, 3346, 1, 11.99, 1.14,
        655012, 0.65, 258374, 0, 84.12],
    ]
}

response = service.projects().predict(
    name=name,
    body={'instances': data}
).execute()

if 'error' in response:
  print (response['error'])
else:
  online_results = response['predictions']
  print(online_results)

But it returns an access error:

googleapiclient.errors.HttpError: https://ml.googleapis.com/v1/projects/PROJECT_ID/models/MODEL_NAME/versions/VERSION:predict?alt=json returned "Access to model denied."

Please help me!

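【Comments】:

Can you verify that the model exists in the GCS bucket and that your user has access to it? An additional (and unrelated) note: in the code above, you can simply write data = [[265580, 7, 68728, 8.36, 4.76, 84.12, 79.36, 3346, 1, 11.99, 1.14, 655012, 0.65, 258374, 0, 84.12]]. The "instances" key is added a few lines further down (body={'instances': data}), so you do not need to repeat it.
Please see N3da's update below.
Hi @rhaertel80, you are right, I have updated my code.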
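
The point made in the first comment, that the "instances" key should appear only once in the request body, can be sketched as a small helper. The function name build_predict_body is hypothetical and not part of googleapiclient; it only normalizes the Python object that is later passed as body= to predict().

```python
def build_predict_body(data):
    """Return a request body that contains exactly one "instances" key."""
    # If the rows were already wrapped in {"instances": [...]}, unwrap them
    # first so the key is not nested twice.
    if isinstance(data, dict) and "instances" in data:
        data = data["instances"]
    return {"instances": list(data)}


# One instance with the 16 feature values used in the question.
row = [265580, 7, 68728, 8.36, 4.76, 84.12, 79.36, 3346, 1, 11.99, 1.14,
       655012, 0.65, 258374, 0, 84.12]

# Both call styles produce the same body.
body = build_predict_body({"instances": [row]})
print(body == build_predict_body([row]))
```

With this helper, the call in the question would pass body=build_predict_body(data) instead of wrapping data a second time.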

Tags: python machine-learning scikit-learn prediction google-cloud-ml

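【Solution 1】:

After you have successfully created a Version, you can get online predictions either with the gcloud tool or by sending an HTTP request. Here is an example, taken from the linked page, of sending an HTTP request from Python code: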

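import googleapiclient.discovery

# PROJECT_ID, MODEL_NAME, VERSION_NAME and data must be defined by the caller
service = googleapiclient.discovery.build('ml', 'v1')
name = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
name += '/versions/{}'.format(VERSION_NAME)

# Send the instances to the deployed version for online prediction
response = service.projects().predict(
    name=name,
    body={'instances': data}
).execute()

if 'error' in response:
    print(response['error'])
else:
    online_results = response['predictions']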

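In the example above, data is a list in which each element is one instance that the model accepts. See here for more information about the prediction request and response.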
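
For the gcloud route mentioned at the start of this answer, the same list of instances can be written to a file and passed on the command line. The following is a minimal sketch, assuming the --json-instances flag of gcloud ml-engine predict, which expects one JSON-encoded instance per line. The helper name write_json_instances and the file name input.json are illustrative only.

```python
import json


def write_json_instances(instances, path):
    """Write one JSON-encoded instance per line (newline-delimited JSON)."""
    with open(path, "w") as f:
        for instance in instances:
            f.write(json.dumps(instance) + "\n")


# A list of instances; each inner list holds the feature values of one row.
data = [
    [265580, 7, 68728, 8.36, 4.76, 84.12, 79.36, 3346, 1, 11.99, 1.14,
     655012, 0.65, 258374, 0, 84.12],
]

write_json_instances(data, "input.json")

# The file can then be passed to the CLI, for example:
#   gcloud ml-engine predict --model MODEL_NAME --version VERSION \
#       --json-instances input.json
```

Newer gcloud releases may expose the same command under a different command group, so check gcloud's help output for the installed version.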

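Update: For the permission issue you mention, it would help to know how and where you originally created the model and version (through gcloud, the UI console, on your laptop, etc.). The error message suggests that your user has access to your project but not to the model. Try running gcloud auth login from wherever you run the Python code, and confirm that the project it shows as the default matches your PROJECT_ID.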
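
The project check described above can also be done from Python. This is only a sketch under stated assumptions: the helper names are hypothetical, and default_project() assumes the google-auth package is installed and that application default credentials are configured in the environment. The question itself uses oauth2client instead.

```python
def model_version_name(project_id, model_name, version_name):
    """Build the resource name that is passed to predict(name=...)."""
    return "projects/{}/models/{}/versions/{}".format(
        project_id, model_name, version_name)


def project_matches(resource_name, default_project_id):
    """Return True if the resource name belongs to the given project."""
    parts = resource_name.split("/")
    return (len(parts) > 1 and parts[0] == "projects"
            and parts[1] == default_project_id)


def default_project():
    """Return the project ID of the application default credentials."""
    # Assumption: google-auth is installed. It is imported lazily so the
    # two helpers above can be used without it.
    import google.auth
    _, project_id = google.auth.default()
    return project_id  # may be None if no default project is configured


name = model_version_name("PROJECT_ID", "MODEL_NAME", "VERSION")
print(project_matches(name, "PROJECT_ID"))
```

In a real environment, the second argument would be default_project(), so a mismatch between the credentials' project and PROJECT_ID shows up before the predict call is made.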

【Discussion】:

Hi @N3da, could you please take a look at this question: stackoverflow.com/q/50013828/7644562?
Answered (hopefully).