【Posted】: 2021-02-26 21:40:06
【Question】:
I am applying ltrim and rtrim to multiple columns of a DataFrame, but so far I can only do it one column at a time, like this:
# selected_colums = selected_colums.withColumn("last_name", ltrim(selected_colums.last_name))
# selected_colums = selected_colums.withColumn("last_name", rtrim(selected_colums.last_name))
# selected_colums = selected_colums.withColumn("email", ltrim(selected_colums.email))
# selected_colums = selected_colums.withColumn("email", rtrim(selected_colums.email))
# selected_colums = selected_colums.withColumn("phone_number", ltrim(selected_colums.phone_number))
# selected_colums = selected_colums.withColumn("phone_number", rtrim(selected_colums.phone_number))
But I want to do it in a loop, like this:
sdk = ['first_name','last_name','email','phone_number','email_alt','phone_number_alt']
for x in sdk:
selected_colums = selected_colums.withColumn(x, ltrim(selected_colums.last_name))
This gives me a syntax error. Please help me optimize this code so that I can apply ltrim or rtrim to any number of columns supplied as a list.
【Discussion】:
Tags: python-3.x database dataframe apache-spark pyspark