Amazon SageMaker Deployment

This is a quick guide to deploying your trained models using the Amazon SageMaker model hosting service.

Deploying a model in SageMaker is a three-step process (see the boto3 sketch after this list):

  1. Create a model in SageMaker
  2. Create an endpoint configuration
  3. Create an endpoint
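
For reference, the three steps above map onto the following low-level boto3 calls. This is only a minimal sketch: the container image URI, S3 path, IAM role ARN and resource names are placeholders, and the SageMaker Python SDK used in the rest of this guide performs these calls for you.

import boto3

sm = boto3.client('sagemaker')

# 1. Create a model in SageMaker (points at the inference container and the model artefact in S3)
sm.create_model(
    ModelName='fastai-pets-model',
    PrimaryContainer={
        'Image': '<inference container image URI>',
        'ModelDataUrl': 's3://<bucket>/<prefix>/model.tar.gz',
    },
    ExecutionRoleArn='<IAM role ARN>',
)

# 2. Create an endpoint configuration that maps the model to an instance type
sm.create_endpoint_config(
    EndpointConfigName='fastai-pets-config',
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': 'fastai-pets-model',
        'InstanceType': 'ml.m4.xlarge',
        'InitialInstanceCount': 1,
    }],
)

# 3. Create an endpoint from the endpoint configuration
sm.create_endpoint(
    EndpointName='fastai-pets-endpoint',
    EndpointConfigName='fastai-pets-config',
)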

For more information on how to deploy a model to Amazon SageMaker, check out the documentation here.

We will be using the Amazon SageMaker Python SDK, which makes this simple and automates a few of the steps.

Pricing

SageMaker deployment pricing information can be found here. In short: you pay an hourly rate that depends on the instance type you choose. Be careful, as this can add up quickly; for example, the smallest P3 instance costs more than $2,000/month. Also note that the AWS Free Tier only provides enough hours to run an m4.xlarge instance for 5 days.

Set up your SageMaker notebook instance

Set up the notebook instance on which you will train your fastai model, i.e. a SageMaker notebook instance. To set up a new SageMaker notebook instance with fastai installed, follow the steps outlined here.

Make sure you have the Amazon SageMaker Python SDK installed in the Python 3 kernel. An example command to run is:

pip install sagemaker

Per-project setup

Train your model on your notebook instance

Create a Jupyter notebook for your project on the SageMaker notebook instance and train your fastai model.

An example based on the Pets lesson 1 exercise is shown below:

from fastai.vision import *
path = untar_data(URLs.PETS)
path_img = path/'images'
fnames = get_image_files(path_img)
pat = re.compile(r'/([^/]+)_\d+.jpg$')
bs=64
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(),
                                   size=299, bs=bs//2).normalize(imagenet_stats)
learn = create_cnn(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(8)
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-6,1e-4))

Export your model

Now that you have trained your learn object, you can export the data object and save the model weights with the following command:

learn.export(path_img/'models/resnet50.pkl')
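
To check that the export succeeded, you can load the exported file back and run a quick prediction. This is only a sanity-check sketch; it assumes the file was written to path_img/'models' as above, and a fastai version whose load_learner takes an fname argument (the same signature the serving script later in this guide relies on).

# load the exported Learner back and predict on a single validation image
learn_check = load_learner(path_img/'models', fname='resnet50.pkl')
pred_class, pred_idx, pred_values = learn_check.predict(data.valid_ds[0][0])
print(pred_class)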

Zip the model artefacts and upload to S3

Now that we have exported our model artefacts, we can zip them up and upload them to S3.

import tarfile
with tarfile.open(path_img/'models/model.tar.gz', 'w:gz') as f:
    t = tarfile.TarInfo('models')
    t.type = tarfile.DIRTYPE
    f.addfile(t)
    f.add(path_img/'models/resnet50.pkl', arcname='resnet50.pkl')

Now we can upload it to S3 with the following commands.

import sagemaker
from sagemaker.utils import name_from_base
sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
prefix = f'sagemaker/{name_from_base("fastai-pets-model")}'
model_artefact = sagemaker_session.upload_data(path=str(path_img/'models/model.tar.gz'), bucket=bucket, key_prefix=prefix)

Create a model serving script

We are now ready to deploy our model to the SageMaker model hosting service. We will use the SageMaker Python SDK with the Amazon SageMaker open-source PyTorch container, as this container has support for the fastai library. Using one of the predefined Amazon SageMaker containers makes it easy to write a serving script and then run it in Amazon SageMaker in just a few steps.

To serve models in SageMaker, we need a script that implements 4 methods: model_fn, input_fn, predict_fn & output_fn.

  • The model_fn method needs to load the PyTorch model from the saved weights from disk.
  • The input_fn method needs to deserialize the invoke request body into an object we can perform prediction on.
  • The predict_fn method takes the deserialized request object and performs inference against the loaded model.
  • The output_fn method takes the result of prediction and serializes this according to the response content type.

The methods input_fn and output_fn are optional; if omitted, SageMaker will assume the input and output objects are in NPY format with Content-Type application/x-npy.

For more information on how PyTorch models are served, check out the project page here.

An example script for serving a vision ResNet model can be found below:

import logging, requests, os, io, glob, time, json
from fastai.vision import *

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

JSON_CONTENT_TYPE = 'application/json'
JPEG_CONTENT_TYPE = 'image/jpeg'

# loads the model into memory from disk and returns it
def model_fn(model_dir):
    logger.info('model_fn')
    path = Path(model_dir)
    learn = load_learner(model_dir, fname='resnet50.pkl')
    return learn

# Deserialize the Invoke request body into an object we can perform prediction on
def input_fn(request_body, content_type=JPEG_CONTENT_TYPE):
    logger.info('Deserializing the input data.')
    # process an image uploaded to the endpoint
    if content_type == JPEG_CONTENT_TYPE: return open_image(io.BytesIO(request_body))
    # process a URL submitted to the endpoint
    if content_type == JSON_CONTENT_TYPE:
        # the raw request body is a JSON document such as {"url": "https://..."}
        img_request = requests.get(json.loads(request_body)['url'], stream=True)
        return open_image(io.BytesIO(img_request.content))
    raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type))

# Perform prediction on the deserialized object, with the loaded model
def predict_fn(input_object, model):
    logger.info("Calling model")
    start_time = time.time()
    predict_class,predict_idx,predict_values = model.predict(input_object)
    print("--- Inference time: %s seconds ---" % (time.time() - start_time))
    print(f'Predicted class is {str(predict_class)}')
    print(f'Predict confidence score is {predict_values[predict_idx.item()].item()}')
    return dict(class_name = str(predict_class),
        confidence = predict_values[predict_idx.item()].item())

# Serialize the prediction result into the desired response content type
def output_fn(prediction, accept=JSON_CONTENT_TYPE):        
    logger.info('Serializing the generated output.')
    if accept == JSON_CONTENT_TYPE: return json.dumps(prediction), accept
    raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept))    

Save the script to a Python file, e.g. serve.py.
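
If you want to sanity-check the serving functions before deploying, you can import them directly in the notebook. This is a rough sketch with a few assumptions: the script was saved as serve.py next to the notebook, the exported resnet50.pkl sits in path_img/'models', and sample.jpg is a hypothetical local test image.

from serve import model_fn, input_fn, predict_fn, output_fn

learn = model_fn(str(path_img/'models'))               # directory containing resnet50.pkl
img_bytes = open('sample.jpg', 'rb').read()            # any local test image
img = input_fn(img_bytes, content_type='image/jpeg')
prediction = predict_fn(img, learn)
body, content_type = output_fn(prediction, accept='application/json')
print(content_type, body)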

Deploy to SageMaker

First we need to create a Predictor class that accepts JPEG images as input and outputs JSON. The default behaviour is to accept a NumPy array.

from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer
from sagemaker.deserializers import JSONDeserializer

class ImagePredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        # send raw JPEG bytes to the endpoint and parse the JSON response
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session,
                         serializer=IdentitySerializer(content_type='image/jpeg'),
                         deserializer=JSONDeserializer())

We need to get the IAM role ARN to give SageMaker the permissions to read our model artefact.

role = sagemaker.get_execution_role()

In this example we will deploy our model to the ml.m4.xlarge instance type. We will pass in the name of our serving script, e.g. serve.py, as well as the S3 path of the model artefact that we uploaded earlier.

from sagemaker.pytorch import PyTorchModel

model=PyTorchModel(model_data=model_artefact, name=name_from_base("fastai-pets-model"),
    role=role, framework_version='1.0.0', py_version='py3', entry_point='serve.py', predictor_cls=ImagePredictor)

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

It will take a while for SageMaker to provision the endpoint and make it ready for inference.

Test the endpoint

You can now make inference calls against the deployed endpoint, e.g.:

url = <some url of an image to test>
img_bytes = requests.get(url).content
response = predictor.predict(img_bytes); response
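
Clients that do not use the SageMaker Python SDK can call the same endpoint through the low-level runtime API. Below is a minimal boto3 sketch, reusing img_bytes from above; predictor.endpoint_name is the SDK v2 attribute, and you can equally pass the endpoint name as a plain string.

import boto3

runtime = boto3.client('sagemaker-runtime')
response = runtime.invoke_endpoint(EndpointName=predictor.endpoint_name,
                                   ContentType='image/jpeg',
                                   Accept='application/json',
                                   Body=img_bytes)
print(response['Body'].read().decode())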

Test locally

In case you want to test the endpoint before deploying to SageMaker, you can run the same deploy command, changing the instance_type parameter value to local.

predictor = model.deploy(initial_instance_count=1, instance_type='local')

You can call predictor.predict() the same way as before, but it will call the local endpoint.
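
Given the hourly pricing noted earlier, remember to tear down the endpoint once you are done experimenting, e.g.:

# delete the endpoint so it stops incurring charges
predictor.delete_endpoint()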