【问题标题】:Flink 1.12 serialize Avro Generic Record to Kafka failed with com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationExceptionFlink 1.12 将 Avro Generic Record 序列化到 Kafka 失败,出现 com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
【发布时间】:2021-07-01 00:00:17
【问题描述】:

我有一个 DataStream[GenericRecord]:

val consumer = new FlinkKafkaConsumer[String]("input_csv_topic", new SimpleStringSchema(), properties)
val stream = senv.
    addSource(consumer).
    map(line => {
        val arr = line.split(",")

        val schemaUrl = "" // avro schema link, standard .avsc file format
        val schemaStr = scala.io.Source.fromURL(schemaUrl).mkString.toString().stripLineEnd

        import org.codehaus.jettison.json.{JSONObject, JSONArray}
        val schemaFields: JSONArray = new JSONObject(schemaStr).optJSONArray("fields")

        val genericDevice: GenericRecord = new GenericData.Record(new Schema.Parser().parse(schemaStr))

        for(i <- 0 until arr.length) {
            val fieldObj: JSONObject = schemaFields.optJSONObject(i)
            val columnName = fieldObj.optString("name")
            var columnType = fieldObj.optString("type")

            if (columnType.contains("string")) {
                genericDevice.put(columnName, arr(i))
            } else if (columnType.contains("int")) {
                genericDevice.put(columnName, toInt(arr(i)).getOrElse(0).asInstanceOf[Number].intValue)
            } else if (columnType.contains("long")) {
                genericDevice.put(columnName, toLong(arr(i)).getOrElse(0).asInstanceOf[Number].longValue)
            }
        }

        genericDevice
    })

val kafkaSink = new FlinkKafkaProducer[GenericRecord](
    "output_avro_topic",
    new MyKafkaAvroSerializationSchema[GenericRecord](classOf[GenericRecord], "output_avro_topic", "this is the key", schemaStr),
    properties,
    FlinkKafkaProducer.Semantic.AT_LEAST_ONCE)

stream.addSink(kafkaSink)

这是 MyKafkaAvroSerializationSchema 的实现:

class MyKafkaAvroSerializationSchema[T](avroType: Class[T], topic: String, key: String, schemaStr: String) extends KafkaSerializationSchema[T]  {

    lazy val schema: Schema = new Schema.Parser().parse(schemaStr)

    override def serialize(element: T, timestamp: lang.Long): ProducerRecord[Array[Byte], Array[Byte]] = {

        val cl = Thread.currentThread().getContextClassLoader()
        val genericData = new GenericData(cl)
        val writer = new GenericDatumWriter[T](schema, genericData)

        // val writer = new ReflectDatumWriter[T](schema)
        // val writer = new SpecificDatumWriter[T](schema)
        val out = new ByteArrayOutputStream()
        val encoder: BinaryEncoder = EncoderFactory.get().binaryEncoder(out, null)
        writer.write(element, encoder)
        encoder.flush()
        out.close()

        new ProducerRecord[Array[Byte], Array[Byte]](topic, key.getBytes, out.toByteArray)
    }
}

Here's stack trace screenshot:

    com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
    Serialization trace:
    reserved (org.apache.avro.Schema$Field)
    fieldMap (org.apache.avro.Schema$RecordSchema)
    schema (org.apache.avro.generic.GenericData$Record)

如何使用 Flink 将 Avro Generic Record 序列化到 Kafka?我测试了不同的编写器,但仍然得到 com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException,感谢您的输入。

【问题讨论】:

    标签: apache-kafka apache-flink avro flink-streaming


    【解决方案1】:

    您可以简单地将flink-avro 模块添加到您的项目中,并在提供架构后使用已经提供的AvroSerializationSchema,该SpecificRecordGenericRecord 都可以使用。

    【讨论】:

    • 我之前也有过,但没用,遇到了同样的问题,这就是我添加 MyKafkaAvroSerializationSchema.scala 的原因
    猜你喜欢
    • 2021-04-26
    • 1970-01-01
    • 2021-10-23
    • 2021-04-23
    • 2020-09-11
    • 1970-01-01
    • 2021-10-20
    • 2021-04-15
    • 1970-01-01
    相关资源
    最近更新 更多