Counting consecutive duplicate elements in a DataFrame and storing them in a new column: answers

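【Question title】: Counting consecutive duplicate elements in a DataFrame and storing them in a new column
【Posted】: 2021-07-31 09:18:27
【Question description】:

I am trying to count consecutive elements in a DataFrame and store the counts in a new column. I do not want the total number of times an element appears in the list overall, only how many times it appears in a row. This is what I used: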

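import pandas as pd

a = [1, 1, 3, 3, 3, 5, 6, 3, 3, 0, 0, 0, 2, 2, 2, 0]
df = pd.DataFrame({'Patch': a})
# Counts every occurrence of each value in the column, not each consecutive run
df['count'] = df.groupby('Patch')['Patch'].transform('size')
print(df)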

This gives me the following result:

    Patch  count
0       1      2
1       1      2
2       3      5
3       3      5
4       3      5
5       5      1
6       6      1
7       3      5
8       3      5
9       0      4
10      0      4
11      0      4
12      2      3
13      2      3
14      2      3
15      0      4

But I want the result to look like this:

    Patch  count
0       1      2
1       3      3
2       5      1
3       6      1
4       3      2
5       0      3
6       2      3
7       0      1

【Comments】:

Tags: python dataframe count pandas-groupby drop-duplicates

【Solution 1】:

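# Assumes pandas is imported as pd and df is the frame built in the question.
df = (
    # A new group starts wherever a value differs from the row above it
    df.groupby((df.Patch != df.Patch.shift(1)).cumsum())
    # Per run: its value ("first") and its length ("count")
    .agg({"Patch": ["first", "count"]})
    .reset_index(drop=True)
    # Drop the outer "Patch" column level, leaving "first" and "count"
    .droplevel(level=0, axis=1)
    .rename(columns={"first": "Patch"})
)
print(df)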

Output:

   Patch  count
0      1      2
1      3      3
2      5      1
3      6      1
4      3      2
5      0      3
6      2      3
7      0      1
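
To see why the grouping key works, the chain can be unpacked one step at a time. The sketch below reuses the list `a` from the question; the intermediate names `starts_run`, `run_id`, and `result` are introduced here for illustration and do not come from the answer. It also swaps in named aggregation in place of the dict form, which should give the same table without the `droplevel` and `rename` steps.

```python
import pandas as pd

a = [1, 1, 3, 3, 3, 5, 6, 3, 3, 0, 0, 0, 2, 2, 2, 0]
df = pd.DataFrame({"Patch": a})

# True wherever a value differs from the row directly above it,
# i.e. on the first row of every consecutive run (row 0 is compared with NaN).
starts_run = df["Patch"] != df["Patch"].shift(1)

# The cumulative sum of those booleans gives each run its own id:
# 1, 1, 2, 2, 2, 3, 4, 5, 5, ...
run_id = starts_run.cumsum().rename("run_id")

# Named aggregation (a substitute for the dict form used in the answer):
# one row per run, holding the run's value and its length.
result = (
    df.groupby(run_id)
    .agg(Patch=("Patch", "first"), count=("Patch", "size"))
    .reset_index(drop=True)
)
print(result)
```

Because the key is a run id and not the value itself, the two separate runs of 3 and the two separate runs of 0 stay in separate groups, which is what the original `groupby('Patch')` could not do.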

【Discussion】:

I tried to plot a histogram of just the counts using df.hist(), but that did not work.
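
The comment does not say how `df.hist()` failed, so what follows is only a guess at the goal: the distribution of run lengths. A minimal sketch, assuming `df` is the aggregated frame from the solution (rebuilt inline here so the block runs on its own), tabulates the `count` column with `value_counts`, which needs no plotting backend. The name `run_length_freq` is hypothetical.

```python
import pandas as pd

# Aggregated result from the solution, rebuilt inline.
df = pd.DataFrame(
    {"Patch": [1, 3, 5, 6, 3, 0, 2, 0], "count": [2, 3, 1, 1, 2, 3, 3, 1]}
)

# How many runs have each length: index = run length, value = number of runs.
run_length_freq = df["count"].value_counts().sort_index()
print(run_length_freq)

# With matplotlib installed, the same column can be drawn on its own:
# df["count"].plot(kind="hist")   # or: df.hist(column="count")
```

One possible cause of the problem is that `df.hist()` on the whole frame draws a histogram for every numeric column, `Patch` included; selecting the `count` column first restricts the plot to the run lengths.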