Deployment to Amazon SageMaker
This is a quick guide to deploying your trained models using the Amazon SageMaker model hosting service.
Deploying a model in SageMaker is a three-step process:
- Create a model in SageMaker
- Create an endpoint configuration
- Create an endpoint
For more information on how to deploy a model to Amazon SageMaker, check out the documentation here.
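For illustration, here is a minimal sketch of those three steps using the low-level boto3 API. The container image URI, S3 path, role ARN, and resource names are placeholders; the SDK we use below handles all of this for us.

```python
import boto3

sm = boto3.client('sagemaker')

# 1. Create a model in SageMaker (placeholder image/S3/role values)
sm.create_model(
    ModelName='fastai-pets-model',
    PrimaryContainer={
        'Image': '<inference container image URI>',
        'ModelDataUrl': 's3://<bucket>/<prefix>/model.tar.gz',
    },
    ExecutionRoleArn='<IAM role ARN>',
)

# 2. Create an endpoint configuration pointing at the model
sm.create_endpoint_config(
    EndpointConfigName='fastai-pets-config',
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': 'fastai-pets-model',
        'InstanceType': 'ml.m4.xlarge',
        'InitialInstanceCount': 1,
    }],
)

# 3. Create the endpoint itself
sm.create_endpoint(EndpointName='fastai-pets-endpoint',
                   EndpointConfigName='fastai-pets-config')
```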
We will be using the Amazon SageMaker Python SDK, which makes this easy and automates several of these steps.
Pricing
Pricing information for SageMaker deployment can be found here. In short: you pay an hourly rate depending on the instance type you choose. Be careful, as this can add up quickly; for example, the smallest P3 instance costs more than $2,000/month. Also note that the AWS Free Tier only provides enough hours to run an m4.xlarge instance for 5 days.
Set up your SageMaker notebook instance
Set up the SageMaker notebook instance where you trained your fastai model. To set up a new SageMaker notebook instance with fastai installed, follow the steps outlined here.
Make sure the Amazon SageMaker Python SDK is installed in the Python 3 kernel. An example command to run is:
pip install sagemaker
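To confirm the SDK is available in the kernel, you can check its version:

```python
import sagemaker
print(sagemaker.__version__)
```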
Per-project setup
Train your model on your notebook instance
Create a Jupyter notebook for your project on the SageMaker notebook instance and train your fastai model there.
An example based on the Pets lesson 1 exercise looks as follows:
from fastai.vision import *

# Download and extract the Oxford-IIIT Pet dataset
path = untar_data(URLs.PETS)
path_img = path/'images'
fnames = get_image_files(path_img)
pat = re.compile(r'/([^/]+)_\d+.jpg$')   # extract the class label from the filename
bs = 64

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(),
                                   size=299, bs=bs//2).normalize(imagenet_stats)
learn = create_cnn(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(8)
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-6,1e-4))
Export your model
Now that you have trained your `learn` object, you can export the data object and save the model weights with the following command:
learn.export(path_img/'models/resnet50.pkl')
Zip the model artefacts and upload to S3
Now that we have exported our model artefacts, we can zip them up and upload them to S3.
import tarfile

# SageMaker expects model artefacts packaged as a gzipped tarball
with tarfile.open(path_img/'models/model.tar.gz', 'w:gz') as f:
    # add an empty 'models' directory entry to the archive
    t = tarfile.TarInfo('models')
    t.type = tarfile.DIRTYPE
    f.addfile(t)
    # add the exported learner at the top level of the archive
    f.add(path_img/'models/resnet50.pkl', arcname='resnet50.pkl')
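As an optional sanity check, you can list the archive members to confirm the layout SageMaker will see:

```python
import tarfile

with tarfile.open(path_img/'models/model.tar.gz', 'r:gz') as f:
    print(f.getnames())   # expect ['models', 'resnet50.pkl']
```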
Now we can upload it to S3 with the following commands:
import sagemaker
from sagemaker.utils import name_from_base
sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
prefix = f'sagemaker/{name_from_base("fastai-pets-model")}'
model_artefact = sagemaker_session.upload_data(path=str(path_img/'models/model.tar.gz'), bucket=bucket, key_prefix=prefix)
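The returned value is the S3 URI of the uploaded artefact, which we hand to SageMaker in the deployment step. The exact bucket name varies per account:

```python
print(model_artefact)  # e.g. s3://sagemaker-<region>-<account>/sagemaker/fastai-pets-model-.../model.tar.gz
```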
Create a model serving script
We are now ready to deploy our model to the SageMaker model hosting service. We will use the SageMaker Python SDK with the Amazon SageMaker open-source PyTorch container, as this container supports the fast.ai library. Using one of the predefined Amazon SageMaker containers makes it easy to write a serving script and then run it in Amazon SageMaker in just a few steps.
To serve models in SageMaker, we need a script that implements four methods: `model_fn`, `input_fn`, `predict_fn` and `output_fn`.
- The `model_fn` method loads the PyTorch model from the weights saved on disk.
- The `input_fn` method deserializes the invoke request body into an object we can perform prediction on.
- The `predict_fn` method takes the deserialized request object and performs inference against the loaded model.
- The `output_fn` method takes the result of prediction and serializes it according to the response content type.
The methods `input_fn` and `output_fn` are optional; if omitted, SageMaker will assume the input and output objects are in NPY format with Content-Type `application/x-npy`.
For more information on how PyTorch models are served, check out the project page here.
An example script to serve a vision ResNet model can be found below:
import logging, requests, os, io, glob, time, json
from fastai.vision import *

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

JSON_CONTENT_TYPE = 'application/json'
JPEG_CONTENT_TYPE = 'image/jpeg'

# loads the model into memory from disk and returns it
def model_fn(model_dir):
    logger.info('model_fn')
    path = Path(model_dir)
    learn = load_learner(path, fname='resnet50.pkl')
    return learn

# Deserialize the Invoke request body into an object we can perform prediction on
def input_fn(request_body, content_type=JPEG_CONTENT_TYPE):
    logger.info('Deserializing the input data.')
    # process an image uploaded to the endpoint
    if content_type == JPEG_CONTENT_TYPE:
        return open_image(io.BytesIO(request_body))
    # process a URL submitted to the endpoint as a JSON body, e.g. {"url": "https://..."}
    if content_type == JSON_CONTENT_TYPE:
        data = json.loads(request_body)
        img_request = requests.get(data['url'], stream=True)
        return open_image(io.BytesIO(img_request.content))
    raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type))

# Perform prediction on the deserialized object, with the loaded model
def predict_fn(input_object, model):
    logger.info("Calling model")
    start_time = time.time()
    predict_class, predict_idx, predict_values = model.predict(input_object)
    print("--- Inference time: %s seconds ---" % (time.time() - start_time))
    print(f'Predicted class is {str(predict_class)}')
    print(f'Predict confidence score is {predict_values[predict_idx.item()].item()}')
    return dict(class_name=str(predict_class),
                confidence=predict_values[predict_idx.item()].item())

# Serialize the prediction result into the desired response content type
def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    logger.info('Serializing the generated output.')
    if accept == JSON_CONTENT_TYPE:
        return json.dumps(prediction), accept
    raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept))
Save the script to a Python file, e.g. serve.py.
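Before deploying, you can optionally smoke-test the four functions directly in the notebook. This is a sketch that assumes serve.py sits next to the notebook and that a local JPEG named test.jpg is available (both names are assumptions for illustration):

```python
from serve import model_fn, input_fn, predict_fn, output_fn

# load the exported learner the same way the container would
model = model_fn(str(path_img/'models'))

# run a local JPEG through the full request/response path
with open('test.jpg', 'rb') as f:   # any test image you have to hand
    payload = f.read()
img = input_fn(payload, 'image/jpeg')
prediction = predict_fn(img, model)
body, content_type = output_fn(prediction, 'application/json')
print(body)
```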
Deploy to SageMaker
First we need to create a Predictor class that accepts JPEG images as input and outputs JSON; the default behaviour is to accept a numpy array.
from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer
from sagemaker.deserializers import JSONDeserializer

class ImagePredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        # pass the raw JPEG bytes through unchanged and parse the JSON response
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session,
                         serializer=IdentitySerializer(content_type='image/jpeg'),
                         deserializer=JSONDeserializer())
We need to get the IAM role ARN that gives SageMaker permission to read our model artefact.
role = sagemaker.get_execution_role()
In this example we will deploy our model to the instance type `ml.m4.xlarge`. We will pass in the name of our serving script, e.g. `serve.py`, as well as the S3 path of the model artefact that we uploaded earlier.
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(model_data=model_artefact, name=name_from_base("fastai-pets-model"),
                     role=role, framework_version='1.0.0', py_version='py3',
                     entry_point='serve.py', predictor_cls=ImagePredictor)
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
It takes a while for SageMaker to provision the endpoint and make it ready for inference.
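If you want to check the endpoint status from another cell or session, one way is to describe it with the low-level boto3 client (a sketch; `predictor.endpoint_name` holds the generated endpoint name):

```python
import boto3

sm = boto3.client('sagemaker')
status = sm.describe_endpoint(EndpointName=predictor.endpoint_name)['EndpointStatus']
print(status)  # 'Creating' while provisioning, 'InService' once ready
```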
Test the endpoint
You can now make inference calls against the deployed endpoint, for example:
import requests

url = <some url of an image to test>
img_bytes = requests.get(url).content
response = predictor.predict(img_bytes); response
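Because our `input_fn` also accepts a JSON body containing an image URL, you can likewise invoke the endpoint with Content-Type `application/json`. A sketch using the low-level runtime client:

```python
import json, boto3

runtime = boto3.client('sagemaker-runtime')
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,
    ContentType='application/json',
    Body=json.dumps({'url': url}),
)
print(json.loads(response['Body'].read()))
```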
Local testing
In case you want to test the endpoint before deploying to SageMaker, you can run the same `deploy` command, changing the `instance_type` parameter value to `local`.
predictor = model.deploy(initial_instance_count=1, instance_type='local')
You can call `predictor.predict()` the same as earlier, but it will call the local endpoint.
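When you are done testing, remember to delete the endpoint so you stop paying for the underlying instance:

```python
predictor.delete_endpoint()
```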