【Question title】: Sum values of one column for each occurrence in another column [duplicate]
【Posted】: 2019-08-12 18:05:16
【Question description】:

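I can't seem to find the right wording to Google this problem; the answers I find are very similar but not quite what I need.

I'm working with the Titanic dataset and want to count the surviving members of each family. The dataset looks like this: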

+-------------+----------+-----------+-------------+
| PassengerId | Survived | Surname   | NumSurvived |
+-------------+----------+-----------+-------------+
| 1           | 0        | Braund    |             |
| 2           | 1        | Cumings   |             |
| 3           | 1        | Heikkinen |             |
| 4           | 1        | Futrelle  |             |
| 5           | 0        | Braund    |             |
| 6           | 0        | Moran     |             |
| 7           | 0        | Futrelle  |             |
| 8           | 0        | Braund    |             |
| 9           | 1        | Cumings   |             |
+-------------+----------+-----------+-------------+

For each surname, I need to sum the Survived values and put the result in the NumSurvived column, like so:

+-------------+----------+-----------+-------------+
| PassengerId | Survived | Surname   | NumSurvived |
+-------------+----------+-----------+-------------+
| 1           | 0        | Braund    | 0           |
| 2           | 1        | Cumings   | 2           |
| 3           | 1        | Heikkinen | 1           |
| 4           | 1        | Futrelle  | 1           |
| 5           | 0        | Braund    | 0           |
| 6           | 0        | Moran     | 0           |
| 7           | 0        | Futrelle  | 1           |
| 8           | 0        | Braund    | 0           |
| 9           | 1        | Cumings   | 2           |
+-------------+----------+-----------+-------------+

【Comments】:

  • @ZackJoubert No problem
  • Yes, I only just realized. How do I extract the summed "Survived" values?

Tags: python pandas dataframe


【Solution 1】:

Try:

df['NumSurvived']=df.groupby('Surname')['Survived'].transform(lambda x: x.eq(1).sum())

print(df)

   PassengerId  Survived    Surname  NumSurvived
0            1         0     Braund            0
1            2         1    Cumings            2
2            3         1  Heikkinen            1
3            4         1   Futrelle            1
4            5         0     Braund            0
5            6         0      Moran            0
6            7         0   Futrelle            1
7            8         0     Braund            0
8            9         1    Cumings            2
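
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the nine sample rows from the question and assumes `Survived` only ever holds 0 or 1, in which case the built-in `"sum"` aggregation should give the same result as the lambda above. The `per_family` name is just an illustrative choice, not something from the original post; it shows one way to get a single summed value per surname, which is what the comment under the question asks about.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 0, 1],
    "Surname": ["Braund", "Cumings", "Heikkinen", "Futrelle", "Braund",
                "Moran", "Futrelle", "Braund", "Cumings"],
})

# Assumption: Survived is strictly 0/1, so summing it per group counts
# the survivors. transform() broadcasts the group total back to every row.
df["NumSurvived"] = df.groupby("Surname")["Survived"].transform("sum")

# One total per surname (hypothetical name), indexed by Surname.
per_family = df.groupby("Surname")["Survived"].sum()

print(df)
print(per_family)
```

If `Survived` could contain values other than 0 and 1, the `x.eq(1).sum()` form from the answer is the safer choice, since it counts only exact matches.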

【Discussion】:
