【发布时间】:2018-08-30 18:07:25
【问题描述】:
我有一个函数,它试图将广播变量传递给 UDF。
函数如下:
def generate_lookup_code(self, lookup_map):
lookup_map_broadcast = spark_session.sparkContext.broadcast(lookup_map)
print("lookup_map has been broadcasted")
#### UDF function only return a constant string###
def _generate_code(bc_reasoncode_lookup_map):
reasoncode_lookup_map = bc_reasoncode_lookup_map.value
return "hello"
udfGenerateCode = F.udf(_generate_code, StringType())
input_df = input_df.withColumn('code', udfGenerateCode(lookup_map_broadcast))
input_df.show()
我的意图只是试图将广播变量传递给 UDF,但是,我得到了错误:
'Broadcast' object has no attribute '_get_object_id'
不知道哪里错了?
【问题讨论】:
标签: apache-spark pyspark broadcast