【Question Title】: Choose various entries from a list if its key occurs in another list and add strings together
【Posted】: 2026-02-05 14:35:01
【Question Description】:

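I have a question about my dataframe. For each row, one column holds a list of relevant persons (personlist) and another column holds a list of speeches (speech), made by both relevant and irrelevant persons. I want to select the speeches of the relevant persons, where relevance is given by the list in personlist, and join all of their speeches together while ignoring the irrelevant ones. So one column provides the list of last names I am looking for, and the other provides a list of all speakers (first and last name) together with their speeches. I want to create a new column in which the speeches of the relevant persons are joined (separated by a space) and stored in the corresponding row.

So my initial dataset looks like this: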

ticker  year    quarter personlist              jobposition speech
xx      2009    1       ("Angle", "Barth")      CEO         [("Mike Angle", "Thank you"), ("Barbara Barth", "It is"), ("Will Cook", "Yes, true")]
xx      2009    1       ("Angle", "Barth")      CFO         [("Mike Angle", "Thank you"), ("Barbara Barth", "It is"), ("Will Cook", "Yes, true")]
xx      2009    2       ("Angle", "Barth")      CEO         [("Mike Angle", "I am surprised"), ("Barbara Barth", "So am I"), ("Will Cook", "Me too")]
xx      2009    2       ("Angle", "Barth")      CFO         [("Mike Angle", "I am surprised"), ("Barbara Barth", "So am I"), ("Will Cook", "Me too")]
yy      2008    3       ("Cruz", "Dolm")        CEO         [("Damien Cruz", "Hello"), ("Lara Dolm", "Nice to meet you"), ("Lara Bel", "You too")]
yy      2008    3       ("Cruz", "Dolm")        CFO         [("Damien Cruz", "Hello"), ("Lara Dolm", "Nice to meet you"), ("Lara Bel", "You too")]

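For example, for the first row, I want to check for each key-value pair whether the first entry (the name) ends with one of the last names in personlist. If not, continue; if so, take the speech part (i.e. the value of the entry) and store it in the new column, then repeat this for the remaining entries and join the matches together. So I would like the following dataset (I have hidden the original speech column here, but it should still be kept; I do not want to replace it, just create a new column).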

ticker  year    quarter personlist               relevantspeeches
xx      2009    1       ("Angle", "Barth")       "Thank you It is"
xx      2009    1       ("Angle", "Barth")       "Thank you It is"
xx      2009    2       ("Angle", "Barth")       "I am surprised So am I"
xx      2009    2       ("Angle", "Barth")       "I am surprised So am I"
yy      2008    3       ("Cruz", "Dolm")         "Hello Nice to meet you"
yy      2008    3       ("Cruz", "Dolm")         "Hello Nice to meet you"

Can someone help me with this?

Thanks!! Julia

【Question Discussion】:

Tags: python list pandas


【Solution 1】:

With a list comprehension and the apply method:

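def select(row):
    # Collect every speech whose speaker name contains one of the relevant last names.
    return " ".join([said for person in row.personlist
                     for name, said in row.speech if person in name])

# axis=1 passes each row to select(); the result is stored in a new column.
df['relevant'] = df.apply(select, axis=1)

df.relevant is then: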

"""
0           Thank you It is
1           Thank you It is
2    I am surprised So am I
3    I am surprised So am I
4    Hello Nice to meet you
5    Hello Nice to meet you
Name: relevant, dtype: object
"""
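
Two details of the answer above may matter on real data: `person in name` is a substring test (so "Angle" would also match a speaker called "Tom Angleton"), and the outer loop runs over personlist, so the speeches come out grouped by person rather than in transcript order. Below is a minimal sketch of a stricter variant, assuming the last name is always the last word of the speaker's name. The function name `select_by_last_name` and the `Row` namedtuple are made up for this illustration; `Row` only stands in for a DataFrame row so the snippet runs without pandas.

```python
from collections import namedtuple


def select_by_last_name(row):
    """Join the speeches of speakers whose last word is in row.personlist."""
    wanted = set(row.personlist)
    return " ".join(
        said
        for name, said in row.speech                      # keeps transcript order
        if name.strip().rpartition(" ")[2] in wanted      # compare the last word only
    )


# Hypothetical stand-in for one DataFrame row (attribute access like row.speech).
Row = namedtuple("Row", ["personlist", "speech"])

row = Row(
    personlist=("Angle", "Barth"),
    speech=[
        ("Mike Angle", "Thank you"),
        ("Barbara Barth", "It is"),
        ("Will Cook", "Yes, true"),
    ],
)

print(select_by_last_name(row))  # Thank you It is
```

Since the function only reads `row.personlist` and `row.speech`, it should be usable the same way as `select`, i.e. `df['relevantspeeches'] = df.apply(select_by_last_name, axis=1)`, provided both columns hold real tuples/lists and not their string representations.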

【Discussion】:
