【Question Title】: MLflow PyTorch Model
【Posted】: 2022-05-16 15:30:55
【Question Description】:

I have a trained YOLO model in model.pt format, and I can upload the model to create an artifact in MLflow. However, when I look at the generated yaml file, it lists some dependencies, and I am fairly sure I am logging the model the wrong way:

    channels:
      - conda-forge
    dependencies:
      - python=3.6.13
      - pip:
        - mlflow
        - scikit-learn==0.24.2
        - cloudpickle==1.6.0
    name: mlflow-env

Can anyone tell me how to push a pretrained model to MLflow to create an artifact, and then containerize it with its dependencies (Docker) and push the image to AWS ECR?

【Question Discussion】:

  • Please share the source code or the method you used to log the model to MLflow.

Tags: amazon-web-services data-science amazon-ecr mlflow mlops


【Solution 1】:

Chassis (https://www.chassis.ml) is an open-source project that does what you need.

You give Chassis your PyTorch file. It wraps it in an MLflow model plus a gRPC server, builds a container, and pushes everything to Docker Hub. You can then pull the image from Docker Hub and push it to ECR yourself.
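That last step (pull from Docker Hub, push to ECR) can be sketched with standard Docker and AWS CLI commands. The account ID, region, repository, and image names below are all placeholders; the script only prints the commands so you can review them before running with your own values:

```shell
#!/bin/sh
# Placeholder values -- replace with your own account, region, and image names.
ACCOUNT_ID="123456789012"
REGION="us-east-1"
REPO="yolo-model"
TAG="0.0.2"
SRC_IMAGE="<dockerhub-user>/${REPO}:${TAG}"   # image Chassis pushed to Docker Hub
ECR_URI="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REPO}:${TAG}"

# Print each command instead of executing it, so the sequence can be reviewed first.
echo "aws ecr create-repository --repository-name ${REPO} --region ${REGION}"
echo "aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
echo "docker pull ${SRC_IMAGE}"
echo "docker tag ${SRC_IMAGE} ${ECR_URI}"
echo "docker push ${ECR_URI}"
```

Once the values are filled in, the printed commands can be run directly (the ECR repository must exist before the push, hence the `create-repository` step).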

PyTorch examples are here: https://github.com/modzy/chassis/tree/main/chassisml_sdk/examples/pytorch. There is no YOLO example, but they include a Faster R-CNN notebook that you can adapt as needed.

You need access to a Chassis server. You can set one up locally by following the instructions on the Chassis site (https://chassis.ml/getting-started/deploy-manual/), or use the publicly hosted one by signing up at https://chassis.modzy.com.

The basic code is below:

#import modules
import chassisml
import pickle
import cv2
import torch
import getpass
import numpy as np
import torchvision.models as models
from torchvision import transforms

# provide Docker Hub credentials
dockerhub_user = getpass.getpass('docker hub username')
dockerhub_pass = getpass.getpass('docker hub password')

#pull model and define pre / post processing of data
model = models.detection.fasterrcnn_mobilenet_v3_large_fpn(pretrained=True)
model.eval()

COCO_INSTANCE_CATEGORY_NAMES = [
    '__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
    'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A', 'stop sign',
    'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
    'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella', 'N/A', 'N/A',
    'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
    'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
    'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
    'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
    'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table',
    'N/A', 'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
    'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book',
    'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'
]

transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
])       

device = 'cpu'

def preprocess(input_bytes):
    decoded = cv2.imdecode(np.frombuffer(input_bytes, np.uint8), -1)
    img_t = transform(decoded)
    batch_t = torch.unsqueeze(img_t, 0).to(device)
    return batch_t

def postprocess(num_detections, predictions):
    detections = []
    for i in range(num_detections):
        # convert each box tensor to a plain list once instead of four times
        box = predictions["boxes"][i].detach().cpu().numpy().tolist()
        detections.append({
            "xMin": box[0],
            "yMin": box[1],
            "xMax": box[2],
            "yMax": box[3],
            "class": COCO_INSTANCE_CATEGORY_NAMES[predictions["labels"][i].detach().cpu().item()],
            "classProbability": predictions["scores"][i].detach().cpu().item(),
        })
    inference_result = {"detections": detections}

    structured_output = {
        "data": {
            "result": inference_result,
            "explanation": None,
            "drift": None,
        }
    }    
    return structured_output

def process(input_bytes):
    
    # preprocess
    batch_t = preprocess(input_bytes)
    
    # run inference
    predictions = model(batch_t)[0]
    num_detections = len(predictions["boxes"])
    
    # postprocess
    structured_output = postprocess(num_detections, predictions)
    
    return structured_output

#create chassis client
chassis_client = chassisml.ChassisClient("<chassis_server_url>:<chassis service port>")

# create Chassis model (wraps the PyTorch model in an MLflow model)
chassis_model = chassis_client.create_model(process_fn=process)

# test Chassis model (can pass filepath, bufferedreader, bytes, or text here):
sample_filepath = './data/airplane.jpg'
results = chassis_model.test(sample_filepath)
print(results)

# have chassis containerize model
response = chassis_model.publish(
    model_name="PyTorch Faster R-CNN Object Detection",
    model_version="0.0.2",
    registry_user=dockerhub_user,
    registry_pass=dockerhub_pass
)

# wait for packaging to complete.
job_id = response.get('job_id')
final_status = chassis_client.block_until_complete(job_id)

【Discussion】:

【Solution 2】:

Since you already know the dependencies when creating the model, why not pass the list of required libraries manually?

    conda_env = {
        "channels": ["conda-forge"],
        "dependencies": [
            "python=<your-python-version>",
            "pip",
            {
                "pip": [
                    "<your-pip-dependency>==<version>"
                ]
            },
        ],
        "name": "mlflow-env"
    }
    

conda_env should be passed as one of the arguments to your log-model call:

    mlflow.pytorch.log_model(
        pytorch_model=model,     # the loaded nn.Module
        artifact_path="model",
        conda_env=conda_env
    )
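Putting this together, a minimal end-to-end sketch might look like the following. It assumes `model` is your network already loaded from model.pt, and it defers the `mlflow` import into the function so the environment definition can be inspected on its own; the names and pinned versions are illustrative placeholders:

```python
# A minimal sketch: log a pretrained PyTorch model to MLflow with an
# explicit conda environment instead of the inferred one.
conda_env = {
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.6.13",
        "pip",
        {"pip": ["mlflow", "torch", "scikit-learn==0.24.2", "cloudpickle==1.6.0"]},
    ],
    "name": "mlflow-env",
}

def log_pretrained_model(model, artifact_path="model"):
    # mlflow is imported here so the conda_env definition above stays
    # usable even where mlflow is not installed
    import mlflow
    import mlflow.pytorch

    with mlflow.start_run() as run:
        mlflow.pytorch.log_model(
            pytorch_model=model,          # the loaded nn.Module (e.g. from torch.load("model.pt"))
            artifact_path=artifact_path,
            conda_env=conda_env,          # overrides MLflow's inferred dependencies
        )
    return run.info.run_id
```

After logging, the artifact can be containerized with `mlflow models build-docker -m runs:/<run_id>/model -n <image-name>`, and the resulting image tagged and pushed to ECR.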
    

【Discussion】:
