【Posted】: 2019-08-07 23:45:12
【Question】:
I want to deploy a trained TensorFlow model to Amazon SageMaker. I am following the official guide here: https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/ to deploy my model from a Jupyter notebook.
But when I run this code:
predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='ml.t2.medium')
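it gives me the following error message:
ValueError: Error hosting endpoint sagemaker-tensorflow-2019-08-07-22-57-59-547: Failed Reason: The image '520713654638.dkr.ecr.us-west-1.amazonaws.com/sagemaker-tensorflow:1.12-cpu-py3' does not exist.
I don't think the tutorial told me to create an image, and I don't know how to do that.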
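To make the error easier to read, the image name it reports can be split into its parts. The following is a minimal sketch using a plain regular expression; `parse_ecr_image` and `ECR_IMAGE_RE` are hypothetical names introduced here for illustration and are not part of the SageMaker SDK.

```python
import re

# Hypothetical helper (not a SageMaker API): splits an ECR image URI of the
# form <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>.
ECR_IMAGE_RE = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_image(uri):
    """Return the account, region, repository and tag of an ECR image URI."""
    match = ECR_IMAGE_RE.match(uri.strip())
    if match is None:
        raise ValueError(f"not an ECR image URI: {uri!r}")
    return match.groupdict()


# Image name taken from the error message above.
image = "520713654638.dkr.ecr.us-west-1.amazonaws.com/sagemaker-tensorflow:1.12-cpu-py3"
parts = parse_ecr_image(image)
print(parts)
```

The tag `1.12-cpu-py3` lines up with the `framework_version='1.12'` and `py_version='py3'` arguments passed to `TensorFlowModel`, plus a CPU instance type, so the error appears to say that no image was found for that particular combination rather than that I have to build one myself.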
import boto3, re
from sagemaker import get_execution_role
role = get_execution_role()
# make a tar ball of the model data files
import tarfile
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)
# create a new s3 bucket and upload the tarball to it
import sagemaker
sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')
from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data='s3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role=role,
                                  framework_version='1.12',
                                  entry_point='train.py',
                                  py_version='py3')
%%time
# here I fail to deploy the model and get the error message
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
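Independently of the image error, it can help to check what actually ended up in `model.tar.gz` before uploading it. The sketch below uses only the standard library; `list_archive_members` is a hypothetical helper, and the `export/Servo/1/saved_model.pb` layout is a stand-in created purely for the demo, not a statement about what my real `export` directory contains.

```python
import os
import tarfile
import tempfile


def list_archive_members(archive_path):
    """Return the member names stored in a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()


# Self-contained demo: build a stand-in 'export' directory (assumed layout),
# archive it the same way as in the question, then list the archive contents.
with tempfile.TemporaryDirectory() as workdir:
    export_dir = os.path.join(workdir, "export")
    os.makedirs(os.path.join(export_dir, "Servo", "1"))
    with open(os.path.join(export_dir, "Servo", "1", "saved_model.pb"), "wb") as f:
        f.write(b"placeholder")

    archive_path = os.path.join(workdir, "model.tar.gz")
    with tarfile.open(archive_path, mode="w:gz") as archive:
        archive.add(export_dir, arcname="export", recursive=True)

    members = list_archive_members(archive_path)
    print(members)
```

Running `list_archive_members('model.tar.gz')` in the notebook would show whether the `export` directory was archived with the expected top-level name.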
【Comments】:
- For future readers: in my case, I had to specify py_version='py2' to get it to work.
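The change described in this comment could look roughly like the sketch below. It assumes the 1.x SageMaker Python SDK used in the question, where `TensorFlowModel` accepts `py_version`; `build_model_kwargs` and `deploy_model` are hypothetical helper names, and the bucket and role values are placeholders to be replaced with real ones.

```python
def build_model_kwargs(model_data, role, py_version="py2"):
    """Collect the TensorFlowModel arguments from the question,
    with py_version switched to 'py2' as the comment suggests."""
    return {
        "model_data": model_data,
        "role": role,
        "framework_version": "1.12",
        "entry_point": "train.py",
        "py_version": py_version,
    }


def deploy_model(model_kwargs, instance_type="ml.m4.xlarge"):
    """Create and deploy the model; needs the sagemaker package and AWS credentials."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from sagemaker.tensorflow.model import TensorFlowModel

    model = TensorFlowModel(**model_kwargs)
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Placeholder values (assumptions): substitute the real bucket and execution role.
kwargs = build_model_kwargs("s3://<bucket>/model/model.tar.gz", "<execution-role-arn>")
print(kwargs)
# predictor = deploy_model(kwargs)  # uncomment inside a SageMaker notebook
```

Whether 'py2' is the right value depends on which prebuilt images exist for the chosen framework version and region, so treat the default here as what worked for this commenter, not as a general rule.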
Tags: python amazon-web-services tensorflow amazon-s3 amazon-sagemaker