Generate a JSON Schema specification from Python classes: answers

【Question title】: Generate a JSON Schema specification from Python classes
【Posted】: 2021-12-05 14:40:54
【Question description】:

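Introduction

Hello everyone!

I am trying to develop a multi-agent model in Python 3. My approach is to create base classes and derive increasingly specific, concrete classes from them. For example, a Bike class inherits from Vehicle, which itself inherits from a base Agent class.

Problem

I want to use JSON Schema to provide a clear specification of my classes' initialisation parameters (and to use the schemas for validation), but I am struggling to generate them automatically. Let's look at an example: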

class Agent:
    
    SCHEMA = {
        "properties": {
            "agent_id": {
                "type": "string",
                "description": "unique identifier"
            },
            "network": {
                "type": "string",
                "description": "road network used by the agent to move"
            },
            "origin": {
                "type": "integer",
                "description": "origin position id",
            },
            "icon": {
                "type": "string",
                "description": "display icon"
            }
        },
        "required": ["agent_id", "network", "origin", "icon"]
    }
    
    def __init__(self, agent_id, network, origin, icon):
        self.id = agent_id
        self.network = network
        self.position = origin
        self.icon = icon
        
    def move(self, position):
        self.position = position


class User(Agent):
    SCHEMA = {
        # that's what i want in the end, but i don't want to duplicate the common properties
        "properties": {
            "agent_id": {
                "type": "string",
                "description": "unique identifier"
            },
            # notice that there is no "network" property
            "origin": {
                "type": "integer",
                "description": "origin position id",
            },
            "destination": {
                "type": "integer",
                "description": "destination position id",
            },
            "icon": {
                "type": "string",
                "description": "display icon"
            }
        },
        "required": ["agent_id", "origin", "destination", "icon"]
    }

    def __init__(self, agent_id, origin, destination, icon="user"):
        super().__init__(agent_id, "walk", origin, icon)
        self.destination = destination


class Vehicle(Agent):
    
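    SCHEMA = {
        # one way to inherit the schema could be like this, but it has its flaws
        # (super() cannot be called in a class body, so the parent is named explicitly)
        **Agent.SCHEMA,
        "properties": {
            **Agent.SCHEMA["properties"],
            "seats": {
                "type": "integer",
                "description": "capacity of the vehicle"
            }
        },
        "required": [*Agent.SCHEMA["required"], "seats"]
    }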
    
    def __init__(self, agent_id, network, origin, seats, icon):
        super().__init__(agent_id, network, origin, icon)
        self.seats = seats
      
        
class Bike(Vehicle):

    # i want a schema here too, but without the "seats" prop
    # and maybe specify the default value for "icon" ?

    def __init__(self, agent_id, origin, icon="bike"):
        # Vehicle.__init__ expects (agent_id, network, origin, seats, icon)
        super().__init__(agent_id, "bike", origin, 1, icon)
        
        
class Car(Vehicle):
    
    # quite same question here

    def __init__(self, agent_id, origin, seats, icon="car"):
        # Vehicle.__init__ expects (agent_id, network, origin, seats, icon)
        super().__init__(agent_id, "drive", origin, seats, icon)

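As you can see, I would like to write and extend the specification as I add new parameters to a class. I would like to reuse the higher-level schemas to reduce code duplication, but this is proving difficult. I have a few ideas, such as the one shown above, but it does not allow a parameter's schema to be stripped out of what is inherited from the parent class. Perhaps using methods would give me more control over how the schema is built.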
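
The "use methods" idea could look roughly like the sketch below. It is only an illustration, not an established pattern or a library API: the attribute names `PROPS` and `EXCLUDE` and the `schema()` classmethod are hypothetical. Each class declares only the properties it adds and the inherited ones it drops, and `schema()` walks the class hierarchy from the base class down to assemble the result. Treating a property with a `default` as not required is an assumption of this sketch.

```python
import copy


class Agent:
    # Hypothetical convention: each class lists only the properties it introduces.
    PROPS = {
        "agent_id": {"type": "string", "description": "unique identifier"},
        "network": {"type": "string", "description": "road network used by the agent to move"},
        "origin": {"type": "integer", "description": "origin position id"},
        "icon": {"type": "string", "description": "display icon"},
    }
    # Inherited properties that a class removes from its own schema.
    EXCLUDE = ()

    @classmethod
    def schema(cls):
        props = {}
        # Walk the MRO from the base class down so subclasses override parents.
        for klass in reversed(cls.__mro__):
            for name in vars(klass).get("EXCLUDE", ()):
                props.pop(name, None)
            props.update(vars(klass).get("PROPS", {}))
        # Copy so callers cannot mutate the class-level declarations.
        props = copy.deepcopy(props)
        return {
            "type": "object",
            "properties": props,
            # Assumption: a property without a default must be supplied.
            "required": [n for n, p in props.items() if "default" not in p],
        }


class Vehicle(Agent):
    PROPS = {"seats": {"type": "integer", "description": "capacity of the vehicle"}}


class Bike(Vehicle):
    EXCLUDE = ("network", "seats")
    PROPS = {"icon": {"type": "string", "description": "display icon", "default": "bike"}}
```

With this layout, `Bike.schema()` contains only `agent_id`, `origin` and `icon`, and only the first two are listed as required, while `Vehicle.schema()` still carries `seats`.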

Question

I would like to know whether a library exists that allows this; otherwise, I would welcome any advice on how to achieve it.

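【Comments】:

If you are not using SCHEMA for any logic, I think standard PEP 8 documentation would be enough!
I want to use the JSON schemas for input validation, and to generate the forms used to run the model in a web application. I find JSON Schema very practical because I can use the schemas for specification, validation and documentation, and export them for any other use I might have!

Tags: python multi-agent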

【Solution 1】:

Pydantic can do this for you:

Example from the Pydantic documentation:

from enum import Enum
from pydantic import BaseModel, Field


class FooBar(BaseModel):
    count: int
    size: float = None


class Gender(str, Enum):
    male = 'male'
    female = 'female'
    other = 'other'
    not_given = 'not_given'


class MainModel(BaseModel):
    """
    This is the description of the main model
    """

    foo_bar: FooBar = Field(...)
    gender: Gender = Field(None, alias='Gender')
    snap: int = Field(
        42,
        title='The Snap',
        description='this is the value of snap',
        gt=30,
        lt=50,
    )

    class Config:
        title = 'Main'


# this is equivalent to json.dumps(MainModel.schema(), indent=2)
# (Pydantic v1 API; Pydantic v2 deprecates it in favour of MainModel.model_json_schema())
print(MainModel.schema_json(indent=2))

Output:

{
  "title": "Main",
  "description": "This is the description of the main model",
  "type": "object",
  "properties": {
    "foo_bar": {
      "$ref": "#/definitions/FooBar"
    },
    "Gender": {
      "$ref": "#/definitions/Gender"
    },
    "snap": {
      "title": "The Snap",
      "description": "this is the value of snap",
      "default": 42,
      "exclusiveMinimum": 30,
      "exclusiveMaximum": 50,
      "type": "integer"
    }
  },
  "required": [
    "foo_bar"
  ],
  "definitions": {
    "FooBar": {
      "title": "FooBar",
      "type": "object",
      "properties": {
        "count": {
          "title": "Count",
          "type": "integer"
        },
        "size": {
          "title": "Size",
          "type": "number"
        }
      },
      "required": [
        "count"
      ]
    },
    "Gender": {
      "title": "Gender",
      "description": "An enumeration.",
      "enum": [
        "male",
        "female",
        "other",
        "not_given"
      ],
      "type": "string"
    }
  }
}

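【Comments】:

Also, there are libraries such as pyhump that can help you with casing differences between JSON and POJOs.
Thank you for your answer! I looked at Pydantic, but it seems to me that I would have to create a separate Pydantic data model for each of my agent classes, which looks like a lot of duplicated code. I also do not really see how this would work with inherited classes. But maybe I do not know Pydantic well enough, since most of their examples contain only data models and no actual code. I do need to write code in my classes.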
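
On the inheritance question in the last comment: Pydantic models are ordinary Python classes, so they can carry methods and be subclassed, and a subclass can redeclare an inherited field with a default value. The sketch below reuses the field names from the question and is only one possible layout. The helper `json_schema` is hypothetical and exists only to bridge the method names of Pydantic v1 (`schema()`) and v2 (`model_json_schema()`). Note that redeclaring a field does not remove it from the schema, it only makes it optional, so this covers the "strip a parent parameter" requirement only partially.

```python
from pydantic import BaseModel, Field


class Agent(BaseModel):
    agent_id: str = Field(..., description="unique identifier")
    network: str = Field(..., description="road network used by the agent to move")
    origin: int = Field(..., description="origin position id")
    icon: str = Field(..., description="display icon")

    def move(self, position: int) -> None:
        # Ordinary methods can live next to the field declarations.
        self.origin = position


class Vehicle(Agent):
    # Only the new field is declared; the others are inherited.
    seats: int = Field(..., description="capacity of the vehicle")


class Bike(Vehicle):
    # Redeclaring inherited fields with defaults makes them optional.
    network: str = Field("bike", description="road network used by the agent to move")
    seats: int = Field(1, description="capacity of the vehicle")
    icon: str = Field("bike", description="display icon")


def json_schema(model):
    # Hypothetical helper: use the v2 method when present, else the v1 one.
    builder = getattr(model, "model_json_schema", None) or model.schema
    return builder()


bike_schema = json_schema(Bike)
bike = Bike(agent_id="b1", origin=3)
bike.move(7)
```

Here `bike_schema["required"]` lists only `agent_id` and `origin`, because every other field of `Bike` now has a default, while `json_schema(Vehicle)` still requires `seats`.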