【Title】:groupby for pandas Series not working
【Posted】:2013-07-29 13:28:00
【Question】:

I can't group a pandas Series object. DataFrames work fine, but I can't seem to do a groupby on a Series. Has anyone gotten this to work?

>>> import pandas as pd
>>> a = pd.Series([1,2,3,4], index=[4,3,2,1])
>>> a
4    1
3    2
2    3
1    4
dtype: int64
>>> a.groupby()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 153, in groupby
    sort=sort, group_keys=group_keys)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 537, in groupby
    return klass(obj, by, **kwds)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 195, in __init__
    level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1326, in _get_grouper
    ping = Grouping(group_axis, gpr, name=name, level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1203, in __init__
    self.grouper = self.index.map(self.grouper)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 878, in map
    return self._arrmap(self.values, mapper)
  File "generated.pyx", line 2200, in pandas.algos.arrmap_int64 (pandas/algos.c:61221)
TypeError: 'NoneType' object is not callable

【Comments】:

    Tags: python pandas


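    【Answer 1】:

    Calling groupby() with no arguments gives pandas nothing to group on. You need to pass some kind of mapping (a dict, a function, or an index):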

    In [6]: a
    Out[6]: 
    4    1
    3    2
    2    3
    1    4
    dtype: int64
    
    In [7]: a.groupby(a.index).sum()
    Out[7]: 
    1    4
    2    3
    3    2
    4    1
    dtype: int64
    
    In [3]: a.groupby(lambda x: x % 2 == 0).sum()
    Out[3]: 
    False    6
    True     4
    dtype: int64
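
The three kinds of key mentioned above can be compared side by side. This is a minimal sketch, not part of the original answer: the group names in `parity` and `keys` are made up for illustration, and it assumes a reasonably recent pandas with NumPy installed.

```python
import numpy as np
import pandas as pd

a = pd.Series([1, 2, 3, 4], index=[4, 3, 2, 1])

# A dict maps each index label to a group key (the names are illustrative).
parity = {4: "even", 3: "odd", 2: "even", 1: "odd"}
by_dict = a.groupby(parity).sum()

# A function is called once per index label and returns the group key.
by_func = a.groupby(lambda label: label % 2 == 0).sum()

# An array of the same length as the Series supplies one key per row, by position.
keys = np.array(["x", "x", "y", "y"])
by_array = a.groupby(keys).sum()

print(by_dict)
print(by_func)
print(by_array)
```

In each case the key decides which rows end up together; the values of the Series are only used by the aggregation that follows.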
    

    【Comments】:

      【Answer 2】:

      If you need to group by the values of the Series, either of the following works:

      grouped = a.groupby(a)
      

      grouped = a.groupby(lambda x: a[x])
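
To see that the two forms agree, here is a small sketch on a Series with repeated values. The Series `s` and its labels are invented for the example. Note that the lambda form looks each value up by its index label, so it presumably only behaves this way when the index labels are unique.

```python
import pandas as pd

s = pd.Series([10, 20, 10, 30, 20], index=["a", "b", "c", "d", "e"])

# Pass the Series itself: each row is keyed by its own value.
by_values = s.groupby(s).count()

# Pass a function: pandas calls it with each index label,
# and the function looks up the value stored under that label.
by_lookup = s.groupby(lambda label: s[label]).count()

print(by_values)
print(by_lookup)
```

Passing the Series directly is the shorter of the two and does not depend on the label lookup.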
      

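      【Comments】:

        【Answer 3】:

        Don't take this answer too seriously ;) I'm not saying it's a good idea.

        If you really want to do this inline, or in a "fluent" style, you can do the following.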

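        import pandas as pd
        
        # Keep a reference to the original method; calling self.groupby inside
        # the patched version would recurse into the patch itself.
        _original_groupby = pd.Series.groupby
        
        def smart_groupby(self, by=None, *args, **kwargs):
            # With no key given, group the Series by its own values.
            if by is None:
                return _original_groupby(self, self, *args, **kwargs)
            return _original_groupby(self, by, *args, **kwargs)
        
        # Monkeypatch: replaces groupby on every Series in the process.
        pd.Series.groupby = smart_groupby
        
        pd.Series(['a', 'a', 'a', 'b', 'b']).groupby().count()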
        

        The result is

        a    3
        b    2
        dtype: int64
        

        It should behave as usual, with the added benefit that if you omit by, the Series is grouped by its own values.
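
A similar chained style is possible without replacing a pandas method. The sketch below is an alternative to the monkeypatch, not the approach from this answer: `group_by_self` is a hypothetical helper, and it is chained in through `Series.pipe`.

```python
import pandas as pd

def group_by_self(series, *args, **kwargs):
    """Hypothetical helper: group a Series by its own values."""
    return series.groupby(series, *args, **kwargs)

s = pd.Series(["a", "a", "a", "b", "b"])

# pipe passes the Series as the first argument to the helper,
# so the call still reads left to right.
counts = s.pipe(group_by_self).count()

print(counts)
```

Because nothing on `pd.Series` is overwritten, other code that calls `groupby` keeps the stock behavior.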

        【Comments】:
