【Question Title】: Not able to fetch records from SAP HANA using Kafka Connect
【Posted】: 2020-12-19 04:28:05
【Question】:

I am new to Kafka Connect, and I am trying to replicate/fetch data from SAP S/4 HANA and store it on HDFS using Kafka Connect. So far I have tried many things by following these links:

https://github.com/SAP/kafka-connect-sap

https://medium.com/@schh/confluent-platform-connecting-sap-hana-to-kafka-ec8a8a5be44

My configuration looks like this:

connect-standalone.properties

bootstrap.servers=10.0.4.146:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=true
internal.value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
plugin.path=/home/user1/ngdbc.jar,/home/user1/kafka-connect-hana-1.0-SNAPSHOT.jar
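
As a side note, `plugin.path` is documented as a comma-separated list of top-level plugin directories (or uber-jars) rather than a driver jar plus a connector jar. Placing the connector jar and the HANA JDBC driver (`ngdbc.jar`) together in one subdirectory makes Connect load them under the same plugin classloader, so the connector can find the driver. A sketch of that layout (`/home/user1/connect-plugins` is an assumed path, not from the original post):

```properties
# Directory layout (assumed):
#   /home/user1/connect-plugins/kafka-connect-sap/kafka-connect-hana-1.0-SNAPSHOT.jar
#   /home/user1/connect-plugins/kafka-connect-sap/ngdbc.jar
# Then point plugin.path at the parent directory:
plugin.path=/home/user1/connect-plugins
```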

hana-source.properties

name=saptohive-source
connector.class=com.sap.kafka.connect.source.hana.HANASourceConnector
tasks.max=1
topics=saptohive
connection.url=jdbc:sap://34.169.244.241:30041/
connection.user="MYUSER"
connection.password="MYPASS"
saptohive.table.name="SAPHANADB"."MARA"
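
One thing worth double-checking in this file: Java `.properties` values are taken literally, so if the quotes around the credentials are really in the file, the user sent to HANA is the eight-character string `"MYUSER"`, quotes included. The schema-qualified table name, by contrast, is expected to keep its quotes per the kafka-connect-sap README's `{topic}.table.name` format. A corrected fragment might look like:

```properties
connection.user=MYUSER
connection.password=MYPASS
saptohive.table.name="SAPHANADB"."MARA"
```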

hdfs-sink.properties

name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=saptohive
hdfs.url=hdfs://10.0.1.244:8020/warehouse/tablespace/external/hive/
flush.size=3
hive.integration=true
hive.metastore.uris=thrift://10.0.1.244:9083/
hive.database=dev_ingestion_raw
schema.compatibility=BACKWARD

Error

[2020-08-31 10:58:35,969] ERROR Task saptohive-source-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:148)
    java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction0$mcV$sp
            at com.sap.kafka.connect.source.GenericSourceTask.start(GenericSourceTask.scala:34)
            at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:167)
            at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
            at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction0$mcV$sp
            ... 9 more
    Caused by: java.lang.ClassNotFoundException: scala.runtime.java8.JFunction0$mcV$sp
            at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
            at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
            ... 9 more
    [2020-08-31 10:58:35,971] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:149)

I am not sure what exactly the problem is. The whole flow gets stuck at:

 (io.confluent.connect.hdfs.HdfsSinkConnectorConfig:223)
[2020-08-31 10:58:36,186] INFO AvroDataConfig values:
        schemas.cache.config = 1000
        enhanced.avro.schema.support = false
        connect.meta.data = true
 (io.confluent.connect.avro.AvroDataConfig:170)
[2020-08-31 10:58:36,190] INFO Hadoop configuration directory  (io.confluent.connect.hdfs.DataWriter:93)
[2020-08-31 10:58:36,467] WARN Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (org.apache.hadoop.util.NativeCodeLoader:62)
[2020-08-31 10:58:37,326] INFO Trying to connect to metastore with URI thrift://10.0.1.244:9083/ (hive.metastore:376)
[2020-08-31 10:58:37,362] INFO Connected to metastore. (hive.metastore:472)
[2020-08-31 10:58:37,437] INFO Sink task WorkerSinkTask{id=hdfs-sink-0} finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:260)
[2020-08-31 10:58:37,523] INFO Discovered coordinator 10.0.1.33:9092 (id: 2147483646 rack: null) for group connect-hdfs-sink. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:607)
[2020-08-31 10:58:37,536] INFO Revoking previously assigned partitions [] for group connect-hdfs-sink (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:419)
[2020-08-31 10:58:37,537] INFO (Re-)joining group connect-hdfs-sink (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:442)
[2020-08-31 10:58:37,547] INFO Successfully joined group connect-hdfs-sink with generation 3 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:409)
[2020-08-31 10:58:37,550] INFO Setting newly assigned partitions [saptohive-0] for group connect-hdfs-sink (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:262)
[2020-08-31 10:58:37,562] INFO Started recovery for topic partition saptohive-0 (io.confluent.connect.hdfs.TopicPartitionWriter:208)
[2020-08-31 10:58:37,570] INFO Finished recovery for topic partition saptohive-0 (io.confluent.connect.hdfs.TopicPartitionWriter:223)

All the files, i.e. connect-standalone.properties, hana-source.properties, hdfs-sink.properties, and both ".jar" files, ngdbc.jar and kafka-connect-hana-1.0-SNAPSHOT.jar, are in the same directory.

The command I am using is:

connect-standalone connect-standalone.properties hana-source.properties hdfs-sink.properties

I need to know what I am doing wrong. Any help would be appreciated. Thanks.

【Comments】:

  • Did you compile that project yourself? Which Java and Scala versions are you using?
  • My CentOS machine has Scala 2.11.8 and Java 1.8
  • Which Kafka version are you running? One built for Scala 2.11 or 2.12? Usually a NoClassDefFoundError: scala/runtime indicates a Scala version problem
  • The whole HDP cluster actually uses Scala 2.11.8.
  • OK, but I was asking whether you built that GitHub repository with the same version

Tags: java apache-kafka hdfs apache-kafka-connect confluent-platform


【Solution 1】:

It turned out that the Kafka and Scala versions in our cluster were older than the ones used to build the SAP HANA Kafka connector. Creating a new cluster with Kafka 2.x.x resolved the problem.
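This matches the stack trace: `scala.runtime.java8.JFunction0$mcV$sp` belongs to the lambda-encoding runtime support that first shipped with Scala 2.12, so a connector compiled against 2.12 cannot run on a worker whose `scala-library` is 2.11.x. As a rough diagnostic sketch (the helper name and paths are illustrative, not from the original post), one can check whether a given `scala-library` jar provides those classes:

```python
import zipfile

def has_scala_212_lambda_classes(jar_path):
    """Return True if the jar contains the scala/runtime/java8/* classes
    that Scala 2.12-compiled code (like the failing connector) links
    against. A 2.11 scala-library lacks them, which produces exactly the
    NoClassDefFoundError shown in the question."""
    with zipfile.ZipFile(jar_path) as jar:
        return any(name.startswith("scala/runtime/java8/")
                   for name in jar.namelist())

# Hypothetical usage against the worker's Kafka libs directory:
# has_scala_212_lambda_classes("/opt/kafka/libs/scala-library-2.11.8.jar")  # -> False
```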

【Comments】:
