【Question title】: Why am I getting an error when using a lambda within apply?
【Posted】: 2021-07-27 03:59:34
【Question】:

I would like help understanding the cause of the following error:

import numpy as np
from pydataset import data
mtcars = data('mtcars')

mtcars.apply(['mean', lambda x: max(x)-min(x), lambda x: np.percentile(x, 0.15)])

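I am trying to build a DataFrame containing the mean, the range (max minus min), and the 15th percentile for every column of the mtcars dataset.

Error message: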

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/aggregation.py in agg_list_like(obj, arg, _axis)
    674     try:
--> 675         return concat(results, keys=keys, axis=1, sort=False)
    676     except TypeError as err:

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    284     """
--> 285     op = _Concatenator(
    286         objs,

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/reshape/concat.py in __init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    369                 )
--> 370                 raise TypeError(msg)
    371 

TypeError: cannot concatenate object of type '<class 'float'>'; only Series and DataFrame objs are valid

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-645-51b8f1de1855> in <module>
----> 1 mtcars.apply(['mean', lambda x: max(x)-min(x), lambda x: np.percentile(x, 0.15)])

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   7766             kwds=kwds,
   7767         )
-> 7768         return op.get_result()
   7769 
   7770     def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/apply.py in get_result(self)
    145             # pandas\core\apply.py:144: error: "aggregate" of "DataFrame" gets
    146             # multiple values for keyword argument "axis"
--> 147             return self.obj.aggregate(  # type: ignore[misc]
    148                 self.f, axis=self.axis, *self.args, **self.kwds
    149             )

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in aggregate(self, func, axis, *args, **kwargs)
   7576         result = None
   7577         try:
-> 7578             result, how = self._aggregate(func, axis, *args, **kwargs)
   7579         except TypeError as err:
   7580             exc = TypeError(

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in _aggregate(self, arg, axis, *args, **kwargs)
   7607             result = result.T if result is not None else result
   7608             return result, how
-> 7609         return aggregate(self, arg, *args, **kwargs)
   7610 
   7611     agg = aggregate

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/aggregation.py in aggregate(obj, arg, *args, **kwargs)
    584         # we require a list, but not an 'str'
    585         arg = cast(List[AggFuncTypeBase], arg)
--> 586         return agg_list_like(obj, arg, _axis=_axis), None
    587     else:
    588         result = None

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/aggregation.py in agg_list_like(obj, arg, _axis)
    651             colg = obj._gotitem(col, ndim=1, subset=selected_obj.iloc[:, index])
    652             try:
--> 653                 new_res = colg.aggregate(arg)
    654             except (TypeError, DataError):
    655                 pass

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in aggregate(self, func, axis, *args, **kwargs)
   3972             func = dict(kwargs.items())
   3973 
-> 3974         result, how = aggregate(self, func, *args, **kwargs)
   3975         if result is None:
   3976 

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/aggregation.py in aggregate(obj, arg, *args, **kwargs)
    584         # we require a list, but not an 'str'
    585         arg = cast(List[AggFuncTypeBase], arg)
--> 586         return agg_list_like(obj, arg, _axis=_axis), None
    587     else:
    588         result = None

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/aggregation.py in agg_list_like(obj, arg, _axis)
    683         result = Series(results, index=keys, name=obj.name)
    684         if is_nested_object(result):
--> 685             raise ValueError(
    686                 "cannot combine transform and aggregation operations"
    687             ) from err

ValueError: cannot combine transform and aggregation operations

However, the following works:

mtcars.apply(['mean', lambda x: max(x)-min(x)])

Both type(mtcars.apply(lambda x: np.percentile(x, 0.15))) and type(mtcars.apply(lambda x: max(x)-min(x))) return a pandas Series. So why does only the percentile cause a problem?

Thanks

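【Comments】:

  • Please include the full error traceback.
  • What is mtcars? Is it a list, an instance of a custom class, a pandas DataFrame, etc.? If it is pandas, you may want to tag this question with pandas.

Tags: python pandas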


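【Answer 1】:

Having read the answer by @James, my guess is that you need to write custom functions so that each function is applied to the whole Series rather than to each element. Perhaps someone more familiar with the underlying pandas code can chime in: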

def min_max(x):
    return max(x)-min(x)
def perc(x):
    return x.quantile(0.15)

mtcars.agg(['mean',min_max,perc])

               mpg     cyl        disp        hp      drat       wt      qsec      vs       am    gear    carb
mean     20.090625  6.1875  230.721875  146.6875  3.596563  3.21725  17.84875  0.4375  0.40625  3.6875  2.8125
min_max  23.500000  4.0000  400.900000  283.0000  2.170000  3.91100   8.40000  1.0000  1.00000  2.0000  7.0000
perc     14.895000  4.0000  103.485000   82.2500  3.070000  2.17900  16.24300  0.0000  0.00000  3.0000  1.0000
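
A possible explanation, consistent with the traceback above, is that in pandas 1.x Series.agg first tries a callable element by element and only falls back to calling it on the whole Series if that attempt raises. The sketch below does not call pandas at all; it only shows how each of the three callables behaves when handed a single float (the name `value` and the helper `raises` are made up for illustration). Under that assumption, np.percentile would "succeed" per element and look like a transform, while the other two would fail and trigger the fallback.

```python
import numpy as np

value = 21.0  # a single cell, standing in for one element of a column


def raises(func, arg):
    """Return the exception type raised by func(arg), or None if it succeeds."""
    try:
        func(arg)
    except Exception as exc:
        return type(exc)
    return None


# np.percentile accepts a scalar and simply hands the same value back.
elementwise_percentile = np.percentile(value, 0.15)

# max()/min() need an iterable, and a float has no .quantile attribute,
# so both of these fail when given a single element.
range_error = raises(lambda x: max(x) - min(x), value)
quantile_error = raises(lambda x: x.quantile(0.15), value)

print(elementwise_percentile, range_error, quantile_error)
```

If this reading is right, it would also explain why replacing np.percentile with x.quantile makes the aggregation work.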

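【Comments】:

  • Thanks for the answer. But it is still not clear to me why the lambda expression does not work.
  • I think that if you write lambda x: x.quantile(0.15) it should work. I defined the functions as above only so that the final result is more readable.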
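
The lambda form suggested in the last comment can be sketched as follows, using a small made-up DataFrame as a stand-in for mtcars (the column names and values are assumptions, not the real data). The sketch also shows that np.percentile expects its argument on a 0 to 100 scale while Series.quantile expects 0 to 1, so np.percentile(x, 0.15) in the question asks for the 0.15th percentile rather than the 15th.

```python
import numpy as np
import pandas as pd

# Small stand-in for mtcars; the values are invented for illustration.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# A lambda that uses a Series method cannot run on a single float,
# so it is evaluated once per column and yields one value per column.
summary = df.agg(["mean", lambda x: x.quantile(0.15)])
print(summary)

# The same 15th percentile expressed with each library's scale.
p15_numpy = np.percentile(df["a"], 15)   # q given on a 0-100 scale
p15_pandas = df["a"].quantile(0.15)      # q given on a 0-1 scale
print(p15_numpy, p15_pandas)
```

The unnamed function appears in the result under a generic label, which is one reason the answer's named functions may be easier to read.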