How to use pandas at? Answers

【Question title】: How to use pandas at?
【Posted】: 2018-12-22 23:42:54
【Question description】:

I am often confused by pandas slicing operations. For example,

import pandas as pd
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'], 
    'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'], 
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'], 
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
df = pd.DataFrame(raw_data, columns = ['regiment', 'company', 'name', 'preTestScore', 'postTestScore'])

def get_stats(group):
    return {'min': group.min(), 'max': group.max(), 'count': group.count(), 'mean': group.mean()}
bins = [0, 25, 50, 75, 100]
group_names = ['Low', 'Okay', 'Good', 'Great']
df['categories'] = pd.cut(df['postTestScore'], bins, labels=group_names)
des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()
des.at['Good','mean']

I got:

TypeError Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
in ()
----> 1 des.at['Good','mean']

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1867
   1868         key = self._convert_key(key)
-> 1869         return self.obj._get_value(*key, takeable=self._takeable)
   1870
   1871     def __setitem__(self, key, value):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _get_value(self, index, col, takeable)
   1983
   1984         try:
-> 1985             return engine.get_value(series._values, index)
   1986         except (TypeError, ValueError):
   1987

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: 'Good'

What should I do?

Thanks in advance.

【Question discussion】:

What exactly are you trying to do?

Tags: python pandas dataframe

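【Solution 1】:

The problem is in this line:

des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()

After grouping by 'postTestScore', you get a "Series" rather than a "DataFrame", as shown below.

Now, when you try to access a scalar by label on des with ".at", the label 'Good' is not recognized, because it does not exist in the Series.

des.at['Good','mean']

Just try printing des and you will see the result.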

           count   max   mean   min
categories
Low           2.0  25.0  25.00  25.0
Okay          0.0   NaN    NaN   NaN
Good          8.0  70.0  63.75  57.0
Great         2.0  94.0  94.00  94.0
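
A quick way to see what kind of object des is, and what kind of index it carries, is to print the class names rather than rely on how the output looks. The sketch below is only an illustration: it reuses the question's postTestScore values but builds the table with agg(), which is an assumption swapped in for the question's apply(get_stats).unstack() so that the example runs on its own.

```python
import pandas as pd

# Same postTestScore values as in the question.
scores = [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]
df = pd.DataFrame({'postTestScore': scores})
df['categories'] = pd.cut(df['postTestScore'], [0, 25, 50, 75, 100],
                          labels=['Low', 'Okay', 'Good', 'Great'])

# Assumption: agg() stands in for the question's apply(get_stats).unstack().
# observed=False keeps the empty 'Okay' category as a row in the result.
des = (df.groupby('categories', observed=False)['postTestScore']
         .agg(['min', 'max', 'count', 'mean']))

print(type(des).__name__)        # class of the result object
print(type(des.index).__name__)  # class of its row index
print(des.loc['Good', 'mean'])   # label-based lookup with .loc
```

Checking the index class this way makes it easier to tell whether a failed lookup comes from the shape of the object or from the type of its index.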

【Discussion】:

【Solution 2】:

It does not work because of the categorical index:

des.index
# Out[322]: CategoricalIndex(['Low', 'Okay', 'Good', 'Great'], categories=['Low', 'Okay', 'Good', 'Great'], ordered=True, name='categories', dtype='category')

Try changing it like this:

des.index = des.index.tolist()
des.at['Good','mean']
# Out[326]: 63.75
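
The same idea can be tried in isolation. The sketch below is a minimal illustration, assuming a hand-built frame (a hypothetical stand-in for des, filled with the values printed in Solution 1) and using astype(object) as one possible alternative to tolist(). Whether the original KeyError still occurs depends on the pandas version, so the sketch only shows the conversion and the lookups, not the failure.

```python
import pandas as pd

nan = float('nan')

# Hypothetical stand-in for `des`, built by hand with a CategoricalIndex.
idx = pd.CategoricalIndex(['Low', 'Okay', 'Good', 'Great'],
                          categories=['Low', 'Okay', 'Good', 'Great'],
                          ordered=True, name='categories')
des = pd.DataFrame({'count': [2.0, 0.0, 8.0, 2.0],
                    'max': [25.0, nan, 70.0, 94.0],
                    'mean': [25.0, nan, 63.75, 94.0],
                    'min': [25.0, nan, 57.0, 94.0]}, index=idx)

# Label-based lookup with .loc on the categorical index.
value_loc = des.loc['Good', 'mean']

# Replace the categorical index with plain Python labels, then use .at.
plain = des.copy()
plain.index = plain.index.astype(object)
value_at = plain.at['Good', 'mean']

print(value_loc, value_at)
```

Working on a copy keeps the categorical ordering available on des itself, which can still be useful for sorting the rows from 'Low' to 'Great'.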

【Discussion】: