【Question title】: pyspark sql: AttributeError: 'NoneType' object has no attribute 'join'
【Posted】: 2018-10-11 15:52:11
【Question description】:
def main(inputs, output):

    sdf = spark.read.csv(inputs, schema=observation_schema)
    sdf.registerTempTable('filtertable')

    result = spark.sql("""
    SELECT * FROM filtertable WHERE qflag IS NULL
    """).show()

    temp_max = spark.sql(""" SELECT date, station, value FROM filtertable WHERE (observation = 'TMAX')""").show()
    temp_min = spark.sql(""" SELECT date, station, value FROM filtertable WHERE (observation = 'TMIN')""").show()

    result = temp_max.join(temp_min, condition1).select(temp_max('date'), temp_max('station'), ((temp_max('TMAX')-temp_min('TMIN'))/10)).alias('Range'))

Error:

Traceback (most recent call last):
  File "/Users/syedikram/Documents/temp_range_sql.py", line 96, in <module>
    main(inputs, output)
  File "/Users/syedikram/Documents/temp_range_sql.py", line 52, in main
    result = temp_max.join(temp_min, condition1).select(temp_max('date'), temp_max('station'), ((temp_max('TMAX')-temp_min('TMIN')/10)).alias('Range'))
AttributeError: 'NoneType' object has no attribute 'join'

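Performing the join gives me a NoneType object error. Searching online did not help, because there is very little documentation for pyspark sql. What am I doing wrong here?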

【Comments】:

    Tags: pyspark pyspark-sql


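    【Solution 1】:

    Remove .show() from temp_max and temp_min. show only prints the rows and returns nothing (None), so both variables end up holding None instead of a DataFrame, which is why you get AttributeError: 'NoneType' object has no attribute 'join'. If you still want to see the output, call .show() as a separate statement after the assignment.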
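
The mechanism can be illustrated without a Spark session. The following is only a minimal sketch in plain Python: Frame is a hypothetical stand-in class (not the PySpark API) whose show() prints and returns None, as DataFrame.show() does, and whose join() is a toy key match. The row values are made up for illustration.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame; this is not the PySpark API."""

    def __init__(self, rows):
        self.rows = list(rows)

    def show(self):
        # Prints the rows and, like DataFrame.show(), implicitly returns None.
        for row in self.rows:
            print(row)

    def join(self, other):
        # Toy join: pair up rows whose first element (the key) matches.
        return Frame(
            (left[0], left[1], right[1])
            for left in self.rows
            for right in other.rows
            if left[0] == right[0]
        )


# Broken pattern from the question: the variable stores show()'s return value.
broken = Frame([("s1", 250)]).show()
print(broken is None)  # broken.join(...) would raise AttributeError

# Fixed pattern: keep the frame itself, and call show() as its own statement.
temp_max = Frame([("s1", 250), ("s2", 300)])
temp_min = Frame([("s1", 100)])
temp_max.show()
result = temp_max.join(temp_min)
print(result.rows)
```

In the question's code the same change means assigning the return value of spark.sql(...) to temp_max and temp_min directly, then calling temp_max.show() on a separate line if the printed output is wanted.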

    【Discussion】:
