【Question title】: Pandas pivot table without multi-index return
【Posted】: 2020-04-07 14:53:58
【Question】:

I have the following data:

import io
import pandas as pd
s = '{"j":{"0":"{}","1":"{}","2":"{}","3":"{}","4":"{}"},"l":{"0":"some","1":"some","2":"some","3":"some","4":"some"},"l_t":{"0":"thing","1":"thing","2":"thing","3":"thing","4":"thing"},"o_l":{"0":"one","1":"one","2":"two","3":"one","4":"one"},"s":{"0":"y","1":"y","2":"y","3":"y","4":"y"},"val":{"0":4,"1":4,"2":3,"3":4,"4":4},"v_text":{"0":"L","1":"L","2":"NLH","3":"L","4":"L"},"v_text_2":{"0":"light","1":"light","2":"neither heavy or light","3":"light","4":"light"},"v":{"0":"x","1":"x","2":"x","3":"x","4":"x"},"year":{"0":2020,"1":2020,"2":2020,"3":2020,"4":2020}}'
dt_test = pd.read_json(io.StringIO(s))  # StringIO: newer pandas versions deprecate passing a literal JSON string

It looks like this:

    j     l    l_t  o_l  s  val v_text                v_text_2  v  year
0  {}  some  thing  one  y    4      L                   light  x  2020
1  {}  some  thing  one  y    4      L                   light  x  2020
2  {}  some  thing  two  y    3    NLH  neither heavy or light  x  2020
3  {}  some  thing  one  y    4      L                   light  x  2020
4  {}  some  thing  one  y    4      L                   light  x  2020


I want to create a pivot table, but I don't understand why the pivot table I get has a MultiIndex as its columns.

This is what I tried:

dt_test.pivot_table(index="v_text_2", columns="l_t", aggfunc="count")

It looks like this:

                           j     l   o_l     s     v v_text   val  year
l_t                    thing thing thing thing thing  thing thing thing
v_text_2                                                               
light                      4     4     4     4     4      4     4     4
neither heavy or light     1     1     1     1     1      1     1     1

I would like it to look like this:

l_t                    thing
v_text_2                    
light                      4
neither heavy or light     1

Ultimately I want to aggregate this data so that I can plot it.

【Comments】:

  • Pass it as values: dt_test.pivot_table(index="v_text_2", values="l_t", aggfunc="count")?
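To see what the call suggested in the comment above returns, here is a minimal sketch. It assumes a trimmed-down, hypothetical stand-in for dt_test that keeps only a few of the original columns; it is an illustration, not the asker's full data.

```python
import pandas as pd

# Hypothetical, trimmed-down stand-in for dt_test (only a few of the original columns).
dt_test = pd.DataFrame({
    "v_text_2": ["light", "light", "neither heavy or light", "light", "light"],
    "l_t": ["thing"] * 5,
    "val": [4, 4, 3, 4, 4],
    "year": [2020] * 5,
})

# Count the non-null entries of "l_t" per "v_text_2" group.
# No `columns=` argument is given, so the result has one flat column.
by_values = dt_test.pivot_table(index="v_text_2", values="l_t", aggfunc="count")
print(by_values)
```

Note that the single result column is labelled l_t rather than thing, so this matches the desired counts but not the exact column label shown in the question.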

Tags: python pandas pivot pivot-table aggregate


【Solution 1】:

Alternatively, you can use pandas.crosstab:

pd.crosstab(dt_test['v_text_2'], dt_test['l_t'])

l_t                     thing
v_text_2                     
light                       4
neither heavy or light      1

This produces the expected output.
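A self-contained version of the crosstab approach might look like the sketch below. The small DataFrame is a hypothetical stand-in for dt_test with only some of its columns, used so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical, trimmed-down stand-in for dt_test (only a few of the original columns).
dt_test = pd.DataFrame({
    "v_text_2": ["light", "light", "neither heavy or light", "light", "light"],
    "l_t": ["thing"] * 5,
    "val": [4, 4, 3, 4, 4],
    "year": [2020] * 5,
})

# crosstab counts how often each (v_text_2, l_t) pair occurs.
counts = pd.crosstab(dt_test["v_text_2"], dt_test["l_t"])
print(counts)

# The columns are a plain Index named "l_t", not a MultiIndex.
print(counts.columns.nlevels)
```

Because crosstab only looks at the two Series passed in, the other columns of the frame never enter the result, which is why no extra column level appears.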

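【Comments】:

    【Solution 2】:

    This is actually rather strange behavior: with pivot_table, besides the aggregation function you want to use, you should also specify the column to apply it to via values. If you leave values out, every remaining column is aggregated, which is where the extra column level comes from.

    For example: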

    dt_test.pivot_table(index="v_text_2", aggfunc="count", columns="l_t", values="year")
    

    Output:

    l_t                     thing
    v_text_2
    light                       4
    neither heavy or light      1
    

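    【Comments】:

    • Thanks. By "*Actually this is a very strange behavior*", do you mean how I am using the function, or how the function returns its values?
    • I mean what the function returns. Technically, with aggfunc="sum" it is unclear which column should be summed, but for count, picking any column by default would probably be fine. The actual default behavior is to take all columns except those already used for index and columns: github.com/pandas-dev/pandas/blob/v1.0.3/pandas/core/reshape/…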
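The default described in the last comment can be checked with a short sketch. It assumes a hypothetical, trimmed-down stand-in for dt_test, and compares the call without values against the call with a single values column.

```python
import pandas as pd

# Hypothetical, trimmed-down stand-in for dt_test (only a few of the original columns).
dt_test = pd.DataFrame({
    "v_text_2": ["light", "light", "neither heavy or light", "light", "light"],
    "l_t": ["thing"] * 5,
    "val": [4, 4, 3, 4, 4],
    "year": [2020] * 5,
})

# Without `values`, every column not used for `index`/`columns` is aggregated,
# so the result columns get an extra level holding those column names.
wide = dt_test.pivot_table(index="v_text_2", columns="l_t", aggfunc="count")
print(wide.columns.nlevels)

# With a single scalar `values` column, only that column is counted
# and the result columns stay flat.
flat = dt_test.pivot_table(index="v_text_2", columns="l_t", values="year", aggfunc="count")
print(flat.columns.nlevels)
print(flat)
```

Any fully populated column works as values for a count here; year is used only because Solution 2 uses it.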