Remove similar rows in pandas dataframe: answers

【Question title】: remove similar rows in pandas dataframe
【Posted】: 2022-01-09 09:13:36
【Question description】:

This is my sample result (I have 1000 rows):

      xCode  xYear   Repeated
1.    100    1900    3
2.    100    1900    3
3.    100    1934    3
4.    200    1921    1
5.    157    1945    1
       .
       . 
999.  ...    ....    .
1000. ...    ....    .

How can I stop counting identical rows (see the Repeated column), that is, similar rows in the dataframe?

      xCode  xYear   Repeated
1.    100    1900    2
2.    100    1900    2
3.    100    1934    2
4.    200    1921    1
5.    157    1945    1
       .
       . 
999.  ...    ....    .
1000. ...    ....    .

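【Question discussion】:

For the first question, see pandas.pydata.org/pandas-docs/dev/reference/api/…
Could you elaborate on the second part of the question? I don't understand what you mean by "xCode 100 is not connected with xyear 1945, so multiply". It looks like you are just multiplying IsntConnectWith by repeated.
In many rows, xcode and xyear are repeated.

Tags: python python-3.x pandas openpyxl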

【Solution 1】:

For the first part, you could use something like this:

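import pandas as pd

# Small example frame; rows 0 and 1 are identical.
d = {
    'xCode': [1, 1, 1, 2, 3],
    'IsntConnectWith': [2, 2, 3, 4, 5]
}
df = pd.DataFrame(d)
# For each row, count how many rows of df match it in every column.
# This compares each row against the whole frame, so it is quadratic in the row count.
df['repeated'] = df.apply(lambda x: (df == x).all(axis=1).sum(), axis=1)
print(df)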

Output:

   xCode  IsntConnectWith  repeated
0      1                2         2
1      1                2         2
2      1                3         1
3      2                4         1
4      3                5         1
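
The row-by-row comparison above can also be written with groupby, which is usually faster on larger frames. The following is only a sketch, using the five sample rows from the question; the column name same_row_count and the variable deduped are made up for illustration. It also assumes one possible reading of the desired Repeated column, namely the number of distinct xYear values per xCode, so that duplicated rows are counted only once.

```python
import pandas as pd

# Sample data copied from the first five rows shown in the question.
df = pd.DataFrame({
    "xCode": [100, 100, 100, 200, 157],
    "xYear": [1900, 1900, 1934, 1921, 1945],
})

# Same idea as the apply-based answer: how many rows share this exact
# (xCode, xYear) pair. "same_row_count" is a hypothetical column name.
df["same_row_count"] = df.groupby(["xCode", "xYear"])["xCode"].transform("size")

# Assumed reading of the question: count distinct xYear values per xCode,
# so a duplicated (xCode, xYear) row does not inflate the count.
df["Repeated"] = df.groupby("xCode")["xYear"].transform("nunique")

# If the goal is to actually drop the duplicated rows, keep the first
# occurrence of each (xCode, xYear) pair.
deduped = df.drop_duplicates(subset=["xCode", "xYear"])

print(df)
print(deduped)
```

With the sample rows, Repeated comes out as 2 for the three xCode 100 rows and 1 for the others, which matches the second table in the question. If the intended rule is different, the grouping keys would need to change accordingly.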

【Discussion】: