【问题标题】:How to disable JSON schema in Kafka Source Connector (e.g. Debezium)如何在 Kafka 源连接器(例如 Debezium)中禁用 JSON 模式
【发布时间】:2021-05-03 08:50:17
【问题描述】:

我遵循了 Debezium 教程 (https://github.com/debezium/debezium-examples/tree/master/tutorial#using-postgres),所有从 Postgres 收到的 CDC 数据都以 JSON 格式和模式发送到 Kafka 主题 - 如何摆脱模式?

这里是连接器的配置(在 Docker 容器中启动)

{
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "tasks.max": "1",
        "key.converter.schemas.enable": "false",
        "value.converter.schemas.enable": "false",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname" : "postgres",
        "database.server.name": "dbserver1",
        "schema.include": "inventory"
    }
}

JSON 模式仍在消息中。 只有在使用以下环境变量启动 Docker 容器时,我才设法摆脱它:

 - CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
 - CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false

为什么我无法通过连接器配置实现完全相同的效果?

带有架构的 Kafka 消息示例:

{"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"}],"optional":false,"name":"dbserver1.inventory.customers.Key"},"payload":{"id":1001}}    {"schema":{"type":"struct","fields":[{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"first_name"},{"type":"string","optional":false,"field":"last_name"},{"type":"string","optional":false,"field":"email"}],"optional":true,"name":"dbserver1.inventory.customers.Value","field":"before"},{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"first_name"},{"type":"string","optional":false,"field":"last_name"},{"type":"string","optional":false,"field":"email"}],"optional":true,"name":"dbserver1.inventory.customers.Value","field":"after"},{"type":"struct","fields":[{"type":"string","optional":false,"field":"version"},{"type":"string","optional":false,"field":"connector"},{"type":"string","optional":false,"field":"name"},{"type":"int64","optional":false,"field":"ts_ms"},{"type":"string","optional":true,"name":"io.debezium.data.Enum","version":1,"parameters":{"allowed":"true,last,false"},"default":"false","field":"snapshot"},{"type":"string","optional":false,"field":"db"},{"type":"string","optional":false,"field":"schema"},{"type":"string","optional":false,"field":"table"},{"type":"int64","optional":true,"field":"txId"},{"type":"int64","optional":true,"field":"lsn"},{"type":"int64","optional":true,"field":"xmin"}],"optional":false,"name":"io.debezium.connector.postgresql.Source","field":"source"},{"type":"string","optional":false,"field":"op"},{"type":"int64","optional":true,"field":"ts_ms"},{"type":"struct","fields":[{"type":"string","optional":false,"field":"id"},{"type":"int64","optional":false,"field":"total_order"},{"type":"int64","optional":false,"field":"data_collection_order"}],"optional":true,"field":"transaction"}],"optional":false,"name":"dbserver1.inventory.customers.Envelope"},"payload":{"before":null,"after":{"id":1001,"first_name":"Sally","last_name":"Thomas","email":"sally.thomas@acme.com"},"source":{"version":"1.4.1.Final","connector":"postgresql","name":"dbserver1","ts_ms":1611918971029,"snapshot":"true","db":"postgres","schema":"inventory","table":"customers","txId":602,"lsn":34078720,"xmin":null},"op":"r","ts_ms":1611918971032,"transaction":null}}

示例(我需要)无架构:

{"id":1001} {"before":null,"after":{"id":1001,"first_name":"Sally","last_name":"Thomas","email":"sally.thomas@acme.com"},"source":{"version":"1.4.1.Final","connector":"postgresql","name":"dbserver1","ts_ms":1611920304594,"snapshot":"true","db":"postgres","schema":"inventory","table":"customers","txId":597,"lsn":33809448,"xmin":null},"op":"r","ts_ms":1611920304596,"transaction":null}

Debezium 容器使用以下命令运行:

docker run -it --name connect -p 8083:8083 -e GROUP_ID=1 -e CONFIG_STORAGE_TOPIC=my_connect_configs -e OFFSET_STORAGE_TOPIC=my_connect_offsets -e STATUS_STORAGE_TOPIC=my_connect_statuses -e CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false -e CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false --link zookeeper:zookeeper --link kafka:kafka --link mysql:mysql debezium/connect:1.3

或作为 docker-compose

  connect:
    image: debezium/connect:${DEBEZIUM_VERSION}
    ports:
     - 8083:8083
    links:
     - kafka
     - postgres
    environment:
     - BOOTSTRAP_SERVERS=kafka:9092
     - GROUP_ID=1
     - CONFIG_STORAGE_TOPIC=my_connect_configs
     - OFFSET_STORAGE_TOPIC=my_connect_offsets
     - STATUS_STORAGE_TOPIC=my_connect_statuses
     - CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
     - CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false

CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=falseCONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false 是我后来添加的,但没有它们我无法摆脱架构。

connect docker 容器(Kafka 连接器服务器集群 - 如果我理解正确的话)在没有任何连接器的情况下启动。 我手动创建它。

创建连接器时来自 docker-compose 的用于连接的日志

connect_1    | 2021-01-29 18:04:57,395 INFO   ||  JsonConverterConfig values: 
connect_1    |  converter.type = key
connect_1    |  decimal.format = BASE64
connect_1    |  schemas.cache.size = 1000
connect_1    |  schemas.enable = true
connect_1    |    [org.apache.kafka.connect.json.JsonConverterConfig]
connect_1    | 2021-01-29 18:04:57,396 INFO   ||  Set up the key converter class org.apache.kafka.connect.json.JsonConverter for task inventory-connector-0 using the worker config   [org.apache.kafka.connect.runtime.Worker]
connect_1    | 2021-01-29 18:04:57,396 INFO   ||  JsonConverterConfig values: 
connect_1    |  converter.type = value
connect_1    |  decimal.format = BASE64
connect_1    |  schemas.cache.size = 1000
connect_1    |  schemas.enable = true
connect_1    |    [org.apache.kafka.connect.json.JsonConverterConfig]
...
connect_1    | 2021-01-29 18:04:57,458 INFO   ||  Starting PostgresConnectorTask with configuration:   [io.debezium.connector.common.BaseSourceTask]
connect_1    | 2021-01-29 18:04:57,460 INFO   ||     key.converter.schemas.enable = false   [io.debezium.connector.common.BaseSourceTask]
connect_1    | 2021-01-29 18:04:57,460 INFO   ||     value.converter.schemas.enable = false   [io.debezium.connector.common.BaseSourceTask]

这里是获取连接器命令输出:

$ curl -i http://localhost:8083/connectors/inventory-connector

{"name":"inventory-connector","config":{"connector.class":"io.debezium.connector.postgresql.PostgresConnector",**"key.converter.schemas.enable":"false"**,"database.user":"postgres","database.dbname":"postgres","tasks.max":"1","database.hostname":"postgres","database.password":"postgres",**"value.converter.schemas.enable":"false"**,"name":"inventory-connector","database.server.name":"dbserver1","database.port":"5432","schema.include":"inventory"},"tasks":[{"connector":"inventory-connector","task":0}],"type":"source"}

【问题讨论】:

  • 请提供您的 docker run 命令
  • @IskuskovAlexander,完成
  • 如果您打算在连接器中使用启用模式的配置,也许您应该显式添加 json 转换器。以我的经验,这很好用
  • 是的,"value.converter.schemas.enable": "false" 应该可以正常工作。也许配置没有采取,或者您查看的消息来自不同的主题?无论哪种方式,您都可以在 Kafka Connect 工作人员日志中验证哪些设置有效。但它们绝对可以在连接器配置 JSON 本身中被覆盖。
  • @RobinMoffatt,我什至从一开始就更改了禁用模式的 docker,但它不起作用:(如果我通过 REST http.../connectors/connector-name 获取 c 连接器配置返回我禁用模式,但我仍然在主题中看到它们(我只有一个主题):) 我将重新检查日志并更新

标签: apache-kafka jsonschema apache-kafka-connect debezium


【解决方案1】:

我转载了this 的例子。正如评论中提到的@OneCricketeer,您必须明确添加JsonConverter

{
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "tasks.max": "1",
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "key.converter.schemas.enable": "false",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "false",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname" : "postgres",
        "database.server.name": "dbserver1",
        "schema.include": "inventory"
    }
}

【讨论】:

  • 是的,谢谢,但在过去我已经添加了“key.converter”:“org.apache.kafka.connect.json.JsonConverter”(价值相同),它也没用(对于 MySql DB - 但我认为 DB 不是问题) - 也许我上次做错了什么。现在我又试了一次,它成功了
猜你喜欢
  • 2020-04-27
  • 2020-05-17
  • 2019-07-22
  • 2019-01-17
  • 2022-10-20
  • 2020-03-07
  • 2020-04-27
  • 2019-10-14
  • 2020-11-28
相关资源
最近更新 更多