【Posted】: 2019-04-14 20:56:08
【Question】:
I have an S3 bucket that contains two partition layouts:
- S3://bucketname/tablename/year/month/day
- S3://bucketname/tablename/device/year/month/day
The files are in Avro format.
I tried to read the table with val df = spark.read.format("com.databricks.spark.avro").load("s3://bucketname/tablename").
The error message is:
java.lang.AssertionError: assertion failed: Conflicting partition column names detected:
Partition column name list #0: xx, yy
Partition column name list #1: xx
For partitioned table directories, data files should only live in leaf directories.
And directories at the same level should have the same partition column name.
Please check the following directories for unexpected files or inconsistent partition column names:
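One way to sidestep the conflict, sketched here only as a possibility, is to read each layout with its own glob and then combine the results. The sketch assumes the directories use Hive-style key=value names (for example year=2019/month=04/day=14, with device=... in the second layout), which the question does not confirm; the object name, the glob patterns, and the string type chosen for device are all hypothetical. It relies on the basePath read option and on unionByName, which requires Spark 2.3 or later.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object ReadMixedLayouts {
  // Assumed table root, taken from the paths in the question.
  val basePath = "s3://bucketname/tablename"

  // Hypothetical globs: they assume Hive-style key=value directory names.
  val withoutDeviceGlob = s"$basePath/year=*/month=*/day=*"
  val withDeviceGlob = s"$basePath/device=*/year=*/month=*/day=*"

  // Read a single layout. basePath tells partition discovery where the
  // partition directories start, so every leaf directory matched by one glob
  // yields the same list of partition column names.
  def readLayout(spark: SparkSession, glob: String): DataFrame =
    spark.read
      .format("com.databricks.spark.avro")
      .option("basePath", basePath)
      .load(glob)

  // Read both layouts separately and align their schemas before the union.
  def readAll(spark: SparkSession): DataFrame = {
    // The layout without a device directory gets a null device column.
    val withoutDevice = readLayout(spark, withoutDeviceGlob)
      .withColumn("device", lit(null).cast("string"))
    // Cast device to string so both sides agree on the column type.
    val withDevice = readLayout(spark, withDeviceGlob)
      .withColumn("device", col("device").cast("string"))
    // unionByName matches columns by name rather than by position.
    withDevice.unionByName(withoutDevice)
  }
}
```

With an existing SparkSession named spark, val df = ReadMixedLayouts.readAll(spark) would return one DataFrame covering both layouts. Because each read sees only one layout, partition discovery never has to reconcile two different column lists under the same root.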
【Discussion】:
Tags: apache-spark amazon-s3 apache-spark-sql avro