Kubernetes Deployment

This guide demonstrates how to deploy the pet classification model from lesson one as a REST API server to a Kubernetes cluster using BentoML.

Setup

Before starting this tutorial, make sure you have the following:

  • A cluster or machine with Kubernetes enabled.
    • This guide uses Kubernetes’ recommended learning environment, minikube. minikube installation: //kubernetes.io/docs/setup/learning-environment/minikube/
    • Learn more about Kubernetes installation: //kubernetes.io/docs/setup/
      • A Kubernetes cluster managed by a cloud provider:
        • AWS: //aws.amazon.com/eks/
        • Google: //cloud.google.com/kubernetes-engine/
        • Azure: //docs.microsoft.com/en-us/azure/aks/intro-kubernetes
    • kubectl CLI tool: //kubernetes.io/docs/tasks/tools/install-kubectl/
  • Docker and Docker Hub properly installed and configured on your local system
    • Docker installation instruction: //www.docker.com/get-started
    • Docker Hub: //hub.docker.com
  • Python (3.6 or above) and required packages: bentoml, fastai, torch, and torchvision
    • pip install bentoml fastai==1.0.57 torch==1.4.0 torchvision==0.5.0

Build a model API server with BentoML

The following code defines a model server using the Fastai model, and asks BentoML to figure out the required PyPI packages automatically. It also defines an API called predict, which is the entry point for accessing this model server. The API expects a Fastai ImageData object as its input data.

# pet_classification.py file

from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import FastaiModelArtifact
from bentoml.handlers import FastaiImageHandler

@artifacts([FastaiModelArtifact('pet_classifier')])
@env(auto_pip_dependencies=True)
class PetClassification(BentoService):

    @api(FastaiImageHandler)
    def predict(self, image):
        result = self.artifacts.pet_classifier.predict(image)
        return str(result)

Run the following code to create a BentoService SavedBundle with the pet classification model from the Fastai lesson one notebook. A BentoService SavedBundle is a versioned file archive that is ready for production deployment. The archive contains the model service defined above, its Python code dependencies, PyPI dependencies, and the trained pet classification model:

from fastai.vision import *

# Download the Oxford-IIIT Pet dataset and collect the image files
path = untar_data(URLs.PETS)
path_img = path/'images'
fnames = get_image_files(path_img)
bs = 64
np.random.seed(2)
# Extract the label from each file name, e.g. 'shiba_inu_122.jpg' -> 'shiba_inu'
pat = r'/([^/]+)_\d+.jpg$'
data = ImageDataBunch.from_name_re(
    path_img,
    fnames,
    pat,
    num_workers=0,
    ds_tfms=get_transforms(),
    size=224,
    bs=bs
).normalize(imagenet_stats)
# Train a ResNet-50 classifier on the head, then unfreeze and fine-tune
learn = create_cnn(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(8)
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-6,1e-4))

from pet_classification import PetClassification

# Create a PetClassification instance
service = PetClassification()

#  Pack the newly trained model artifact
service.pack('pet_classifier', learn)

# Save the prediction service to disk for model serving
service.save()

After saving the BentoService instance, you can now start a REST API server with the trained model and test the API server locally:

# Start BentoML API server:
bentoml serve PetClassification:latest

# In another terminal window, send a test request.
# Replace PATH_TO_TEST_IMAGE_FILE with one of the images from {path_img}
# An example path: /Users/user_name/.fastai/data/oxford-iiit-pet/images/shiba_inu_122.jpg
curl -i \
    --request POST \
    --header "Content-Type: multipart/form-data" \
    -F "[email protected]_TO_TEST_IMAGE_FILE" \
    localhost:5000/predict
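
If you prefer to test the endpoint programmatically, the following is a minimal sketch of the same request using Python's requests library (an additional dependency, not listed in the setup above; the image path is a placeholder to replace, as with the curl example):

# test_request.py - send a test image to the local BentoML API server
import requests

# Replace with one of the images from {path_img}
IMAGE_PATH = "PATH_TO_TEST_IMAGE_FILE"

with open(IMAGE_PATH, "rb") as f:
    # FastaiImageHandler reads the upload from the "image" form field,
    # matching the -F "image=@..." flag in the curl command above
    response = requests.post(
        "http://localhost:5000/predict",
        files={"image": f},
    )

print(response.status_code)
print(response.text)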

Deploy the model server to Kubernetes

Build the model server image

BentoML provides a convenient way to containerize the model API server with Docker:

  1. Find the SavedBundle directory with the bentoml get command

  2. Run docker build with the SavedBundle directory, which contains a generated Dockerfile

  3. Run the generated Docker image to start a Docker container that serves the model

# Download and install jq, the JSON processor: //stedolan.github.io/jq/download/
saved_path=$(bentoml get PetClassification:latest -q | jq -r ".uri.uri")

# Replace {docker_username} with your Docker Hub username
docker build -t {docker_username}/pet-classifier $saved_path
docker push {docker_username}/pet-classifier

Use the docker run command to test the Docker image locally:

docker run -p 5000:5000 {docker_username}/pet-classifier

In another terminal window, use the curl command from above to get the prediction result.

Deploy to Kubernetes

The following is an example YAML file specifying the resources required to run and expose a BentoML model server in a Kubernetes cluster. Replace {docker_username} with your Docker Hub username and save it to a pet-classifier.yaml file:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: pet-classifier
  name: pet-classifier
spec:
  ports:
  - name: predict
    port: 5000
    targetPort: 5000
  selector:
    app: pet-classifier
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pet-classifier
  name: pet-classifier
spec:
  selector:
    matchLabels:
      app: pet-classifier
  template:
    metadata:
      labels:
        app: pet-classifier
    spec:
      containers:
      - image: {docker_username}/pet-classifier
        name: pet-classifier
        ports:
        - containerPort: 5000

Use the kubectl apply command to deploy the model server to the Kubernetes cluster.

kubectl apply -f pet-classifier.yaml

Check deployment status with kubectl:

kubectl get svc pet-classifier

Send a prediction request

Make a prediction request with curl:

# If you are not using minikube, replace $(minikube ip) with your Kubernetes cluster's IP

# Replace PATH_TO_TEST_IMAGE_FILE
curl -i \
    --request POST \
    --header "Content-Type: multipart/form-data" \
    -F "[email protected]_TO_TEST_IMAGE_FILE" \
    $(minikube ip):5000/predict

Delete the deployment from the Kubernetes cluster

kubectl delete -f pet-classifier.yaml

Monitor model server metrics with Prometheus

Setup

Before starting this section, make sure you have the following:

  • A cluster with Prometheus installed.
    • For Kubernetes installation: //github.com/coreos/kube-prometheus
    • For Prometheus installation in other environments: //prometheus.io/docs/introduction/first_steps/#starting-prometheus

The BentoML model server has a built-in Prometheus metrics endpoint. Users can also customize metrics to fit their needs when building a model server with BentoML.
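
As a rough illustration of a custom metric, the sketch below extends the service definition with a request counter from the prometheus_client library. This assumes the BentoML server exposes the default prometheus_client registry on its metrics endpoint; the counter name and integration point are illustrative, not part of BentoML's documented API:

# pet_classification.py - variant with a hypothetical custom counter
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import FastaiModelArtifact
from bentoml.handlers import FastaiImageHandler
from prometheus_client import Counter

# Illustrative metric: total number of predictions served
PREDICTION_COUNTER = Counter(
    'pet_classifier_predictions_total',
    'Total number of pet classification predictions served',
)

@artifacts([FastaiModelArtifact('pet_classifier')])
@env(auto_pip_dependencies=True)
class PetClassification(BentoService):

    @api(FastaiImageHandler)
    def predict(self, image):
        PREDICTION_COUNTER.inc()  # count every prediction request
        result = self.artifacts.pet_classifier.predict(image)
        return str(result)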

To monitor metrics with a Prometheus-enabled Kubernetes cluster, update the annotations in the deployment spec with prometheus.io/scrape: "true" and prometheus.io/port: "5000" (annotation values must be quoted strings in Kubernetes).

For example:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pet-classifier
  name: pet-classifier
spec:
  selector:
    matchLabels:
      app: pet-classifier
  template:
    metadata:
      labels:
        app: pet-classifier
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "5000"
    spec:
      containers:
      - image: {docker_username}/pet-classifier
        name: pet-classifier
        ports:
        - containerPort: 5000

To monitor metrics in other environments, update the Prometheus scrape configuration.

An example scrape job in the Prometheus configuration:

scrape_configs:
  - job_name: pet-classifier
    static_configs:
      - targets: ['MODEL_SERVER_IP:5000']

Additional information

  • BentoML documentation: //docs.bentoml.org/en/latest
  • Deployment tutorials to other platforms or services: //docs.bentoml.org/en/latest/deployment/index.html