【Question title】: Invoking Endpoint in AWS SageMaker for Scikit Learn Model
【Posted】: 2019-02-08 04:46:57
【Question】:

After deploying a scikit-learn model on AWS SageMaker, I invoke it like this:

import io
import json

import boto3
import pandas as pd

# Read the test rows and serialize them as header-less CSV in memory
payload = pd.read_csv('test3.csv')
payload_file = io.StringIO()
payload.to_csv(payload_file, header=None, index=None)

# endpoint_name is the name of the deployed SageMaker endpoint
client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=payload_file.getvalue(),
    ContentType='text/csv')
result = json.loads(response['Body'].read().decode())
print(result)

The code above works perfectly, but when I try:

payload = np.array([[100,5,1,2,3,4]])

I get this error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from container-1 with message 
"<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>500 Internal Server Error</title> <h1>
Internal Server Error</h1> <p>The server encountered an internal error and was unable to complete your request.  
Either the server is overloaded or there is an error in the application.</p> 

The "Scikit-learn SageMaker Estimators and Models" documentation says:

The SageMaker Scikit-learn model server provides a default implementation of input_fn. This function deserializes JSON, CSV, or NPY encoded data into a NumPy array.
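Since the default input_fn is described as accepting JSON, CSV, or NPY, one option is to encode the 2D NumPy array into one of those formats on the client before calling invoke_endpoint. The sketch below is only an illustration: `encode_array` is a hypothetical helper, not part of boto3 or the SageMaker SDK, and the content-type strings are the ones the Scikit-learn container is assumed to recognize.

```python
import io
import json

import numpy as np


def encode_array(array, content_type):
    """Serialize a 2D NumPy array into a request body (hypothetical helper)."""
    array = np.asarray(array)
    if content_type == "text/csv":
        buffer = io.StringIO()
        # One row per sample, comma-separated, no header or index.
        np.savetxt(buffer, array, delimiter=",", fmt="%s")
        return buffer.getvalue()
    if content_type == "application/json":
        # Nested lists: one inner list per sample.
        return json.dumps(array.tolist())
    if content_type == "application/x-npy":
        buffer = io.BytesIO()
        # Binary .npy format, which preserves dtype and shape.
        np.save(buffer, array)
        return buffer.getvalue()
    raise ValueError(f"Unsupported content type: {content_type}")


payload = np.array([[100, 5, 1, 2, 3, 4]])
csv_body = encode_array(payload, "text/csv")

# Sending it (not executed here; needs AWS credentials and a live endpoint):
# client = boto3.client("sagemaker-runtime")
# response = client.invoke_endpoint(
#     EndpointName=endpoint_name, Body=csv_body, ContentType="text/csv")
```

The CSV branch produces the same kind of body as the DataFrame-based code that already works, so it is probably the safest of the three to try first.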

I would like to know how to modify the default so that it accepts a 2D NumPy array, which I could then use for real-time prediction.

Any suggestions? I tried using "Inference Pipeline with Scikit-learn and Linear Learner" as a reference, but could not replace the Linear Learner with a Scikit-learn model. I got the same error.

【Comments】:

    Tags: python amazon-web-services numpy machine-learning scikit-learn


    【Answer 1】:

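    If anyone finds a way to change the default input_fn, predict_fn, and output_fn so that they accept a NumPy array or a string, please share it.

    I did, however, find a way that works with the defaults.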

    import io

    import numpy as np
    import pandas as pd

    # Wrap the 2D array in a DataFrame so it can be written out as CSV
    df = pd.DataFrame(np.array([[100.0,0.08276299999999992,77.24,0.0008276299999999992,43.56,
                                 6.6000000000000005,69.60699488825647,66.0,583.0,66.0,6.503081996847735,44.765133295284,
                                 0.4844340723821271,21.35599999999999],
                                [100.0,0.02812099999999873,66.24,0.0002855600000003733,43.56,6.6000000000000005,
                                 1.6884635296354735,66.0,78.0,66.0,6.754543287329573,47.06480204081666,
                                 0.42642318733140017,0.4703999999999951],
                                [100.0,4.374382,961.36,0.043743819999999996,25153.96,158.6,649.8146514292529,120.0,1586.0
                                 ,1512.0,-0.25255116297020636,1.2255274408634853,-2.5421402801039323,614.5056]]),
                      columns=['a', 'b', 'c','d','e','f','g','h','i','j','k','l','m','n'])

    # Serialize to an in-memory CSV with no header row and no index column
    test_file = io.StringIO()
    df.to_csv(test_file, header=None, index=None)
    

    Then:

    import boto3
    client = boto3.client('sagemaker-runtime')
    response = client.invoke_endpoint(
        EndpointName= endpoint_name,
        Body= test_file.getvalue(),
        ContentType = 'text/csv')
    import json
    result = json.loads(response['Body'].read().decode())
    print(result)
    

    If there is a better solution, though, it would be really helpful.

    【Comments】:

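    • For anyone who finds this question through Google: if you look here: sagemaker.readthedocs.io/en/stable/… you can create custom input/predict/output functions. I was able to override the default predict_fn using this approach.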
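As a rough illustration of the custom-function approach the comment refers to, the sketch below shows what overrides in the training/inference entry-point script might look like. The function names and argument order follow the SageMaker Scikit-learn model server convention as commonly documented; the bodies are assumptions, the required model_fn (which loads the fitted model) is omitted, and the exact return type the container expects from output_fn may differ between container versions.

```python
import io
import json

import numpy as np


def input_fn(request_body, request_content_type):
    """Deserialize the request body into a 2D NumPy array."""
    if request_content_type == "application/x-npy":
        return np.atleast_2d(np.load(io.BytesIO(request_body)))
    if request_content_type == "text/csv":
        if isinstance(request_body, bytes):
            request_body = request_body.decode("utf-8")
        # ndmin=2 keeps a single row as shape (1, n_features).
        return np.loadtxt(io.StringIO(request_body), delimiter=",", ndmin=2)
    if request_content_type == "application/json":
        return np.atleast_2d(np.array(json.loads(request_body), dtype=float))
    raise ValueError(f"Unsupported content type: {request_content_type}")


def predict_fn(input_data, model):
    """Run the fitted scikit-learn model on the deserialized array."""
    return model.predict(input_data)


def output_fn(prediction, accept):
    """Serialize predictions as a JSON list (assumed response format)."""
    return json.dumps(np.asarray(prediction).tolist())
```

With overrides like these, a client could send either CSV text or NPY bytes and get a JSON list back, which matches the json.loads call used on the response in the answer above.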
    【Answer 2】:

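    You should be able to set the serializer/deserializer on the predictor returned by model.deploy(). There is an example of doing this in the FM example notebook here: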

    https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/factorization_machines_mnist/factorization_machines_mnist.ipynb

    Please try this and let me know whether it works for you!
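A minimal sketch of that idea follows. It assumes the predictor object exposes settable `serializer` and `deserializer` attributes, as the predictors in the SageMaker Python SDK are understood to; `configure_predictor` is a hypothetical helper, and the class names in the commented usage come from SDK v2 and may not match older SDK versions (the linked notebook uses the older style).

```python
def configure_predictor(predictor, serializer, deserializer):
    """Attach a request serializer and response deserializer to a predictor.

    Hypothetical helper: it only sets the two attributes that the SDK
    predictor is assumed to consult when predict() is called.
    """
    predictor.serializer = serializer
    predictor.deserializer = deserializer
    return predictor


# Usage with SageMaker Python SDK v2 (not executed here; needs AWS access):
# from sagemaker.serializers import CSVSerializer
# from sagemaker.deserializers import JSONDeserializer
# predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
# configure_predictor(predictor, CSVSerializer(), JSONDeserializer())
# result = predictor.predict(np.array([[100, 5, 1, 2, 3, 4]]))
```

The advantage over calling invoke_endpoint directly is that the array-to-CSV conversion and the JSON decoding happen inside predict(), so the calling code can pass the NumPy array as-is.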

    【Comments】:
