【Posted】: 2021-01-05 01:36:51
【Question】:
I am using an AWS Glue developer endpoint with Spark version 2.4 and Python version 3.
Code:
df=spark.read.format("avro").load("s3://dataexport/users/prod-users.avro")
When I try to read the Avro file, I get the following error message:
Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".;
I found the following link, but it did not help me solve the problem:
https://spark.apache.org/docs/latest/sql-data-sources-avro.html (Apache Avro Data Source Guide)
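For context, the deployment section of that guide describes pulling in the external spark-avro module by its Maven coordinate (for example via `--packages` or the `spark.jars.packages` setting). Below is a minimal sketch of what that might look like from PySpark. The Scala version (2.11), the patch version (2.4.3), and the helper names `spark_avro_coordinate` and `build_session` are assumptions for illustration, not something confirmed for this Glue endpoint. The setting generally only takes effect if it is applied before the Spark session is created, so on an endpoint where Zeppelin has already started the session this may have to go into the interpreter configuration instead.

```python
def spark_avro_coordinate(spark_version: str, scala_version: str = "2.11") -> str:
    """Build the Maven coordinate of the external spark-avro module.

    Assumption: the Scala and Spark versions must match the cluster's own build.
    """
    return f"org.apache.spark:spark-avro_{scala_version}:{spark_version}"


def build_session(spark_version: str = "2.4.3"):
    """Create a SparkSession that asks Spark to fetch spark-avro at startup.

    Hypothetical helper: "2.4.3" is an assumed patch version, adjust as needed.
    """
    # Imported lazily so the coordinate helper can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("read-avro")
        # Has no effect if a session is already running in this process.
        .config("spark.jars.packages", spark_avro_coordinate(spark_version))
        .getOrCreate()
    )


if __name__ == "__main__":
    spark = build_session()
    # Same read call as in the question, once the module is on the classpath.
    df = spark.read.format("avro").load("s3://dataexport/users/prod-users.avro")
    df.printSchema()
```

The equivalent on the command line would be passing the same coordinate to `--packages` when launching `pyspark` or `spark-submit`.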
【Discussion】:
Tags: apache-spark pyspark aws-glue apache-zeppelin