round() function not working in Databricks Python

【Question title】: round() function not working for databricks-Python
【Posted】: 2020-11-27 17:05:43
【Question】:

I am trying to use the round() function in Databricks to round some float values to 2 decimal places. However, Python in Databricks does not behave like regular Python.

Please help me with the cause and a solution, if there is one.

lis = [-12.1334, 12.23433, 1.2343, -104.444]
lis2 = [round(val,2)  for val in lis]
print(lis2)


TypeError: Invalid argument, not a string or column: -12.1334 of type <type 'float'>. For column literals, use 'lit', 'array', 'struct' or 'create_map' function.

Image Proof of Code

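【Comments】:

This code does not produce the claimed error. Please post your real code.
I can't reproduce the error either.
I've added an image as proof for convenience.
Also, the code must be run in a Databricks notebook, not in plain Python. In plain Python it runs without any error. The problem is with Python on Databricks.

Tags: python databricks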

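【Answer 1】:

This is only reproducible if you have imported the Spark round function from the functions module of pyspark.sql.

The Spark round function expects a string (a column name) or a Column, not a Python float. That explains the error.

You can alias the import, for example import pyspark.sql.functions as F instead of from pyspark.sql.functions import *

You can also get the original round function back this way: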

import builtins
round = getattr(builtins, "round")

Then you can run:

lis = [-12.1334, 12.23433, 1.2343, -104.444]
lis2 = [round(val, 2) for val in lis]
print(lis2)

【Comments】:

【Answer 2】:

Hi, the problem is most likely a namespace conflict. I ran something similar:

from pyspark.sql.functions import *

which brings in a function named round. You can easily check which round is in use by running help:

help(round)
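
A shorter check than reading the help() output, assuming Python 3, is to compare the name against the builtins module. This is only a sketch of one way to inspect it:

```python
import builtins

# True while `round` still refers to the Python built-in;
# False once something else has been bound to the name.
print(round is builtins.round)

# The module that defines the function currently bound to `round`.
# For the built-in this is 'builtins'; a shadowing import reports its own module.
print(round.__module__)
```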

A simple fix is to import the pyspark functions under a separate namespace, so that round keeps referring to the built-in:

import pyspark.sql.functions as F
lis = [-12.1334, 12.23433, 1.2343, -104.444]
lis2 = [round(val,2)  for val in lis]
print(lis2)

[-12.13, 12.23, 1.23, -104.44]
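
To see why the aliased import avoids the clash, here is a small sketch that reproduces the effect without Spark. The module name fake_functions and its round are hypothetical stand-ins for pyspark.sql.functions, built on the fly so the snippet runs anywhere.

```python
import builtins
import sys
import types

# Hypothetical stand-in module that exports its own `round`,
# the way pyspark.sql.functions does.
fake = types.ModuleType("fake_functions")

def _column_round(col, scale=0):
    raise TypeError("Invalid argument, not a string or column: %r" % (col,))

fake.round = _column_round
sys.modules["fake_functions"] = fake

# Star import: every public name, including `round`, lands in the namespace.
star_ns = {}
exec("from fake_functions import *", star_ns)
print(star_ns["round"] is builtins.round)   # False

# Aliased import: only `F` is added, so `round` still resolves to the built-in.
alias_ns = {}
exec("import fake_functions as F", alias_ns)
print("round" in alias_ns)                  # False
print(eval("round(1.2343, 2)", alias_ns))   # 1.23
```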

【Comments】:

【Answer 3】:

Try this:

lis = [-12.1334, 12.23433, 1.2343, -104.444]
list_em = []
for row in lis:
    list_em.append(round(row,2))
print(list_em)

[-12.13, 12.23, 1.23, -104.44]

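【Comments】:

This doesn't explain the problem at all. The code the OP posted doesn't even have a reproducible problem.
This doesn't work either. The round function still fails when I run it in a Databricks notebook. You can check the attached image proof.

【Answer 4】:

I believe this is the source code of the function you are actually calling: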

def round(col, scale=0):
    """
    Round the given value to `scale` decimal places using HALF_UP rounding mode if `scale` >= 0
    or at integral part when `scale` < 0.

    >>> spark.createDataFrame([(2.5,)], ['a']).select(round('a', 0).alias('r')).collect()
    [Row(r=3.0)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.round(_to_java_column(col), scale))

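It clearly expects a Column to be passed in, not a decimal number. Did you import *? That would override the built-in function.

【Comments】:

Yes. I removed all the Spark imports and now it works. The problem was that the round function had been overridden. Thanks.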
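
One more difference worth knowing once both functions are in play: the docstring above says the Spark function uses HALF_UP, while Python 3's built-in round() rounds ties to the nearest even digit. The sketch below uses the standard decimal module to illustrate HALF_UP; it does not call Spark, so treat it as an illustration of the rounding modes rather than of Spark's own code path.

```python
from decimal import Decimal, ROUND_HALF_UP

# Python 3's built-in round() sends ties to the nearest even digit.
print(round(2.5))  # 2
print(round(3.5))  # 4

# HALF_UP, the mode named in the docstring, sends ties away from zero.
half_up = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)
print(half_up)     # 3
```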