【Question title】: Split a column in df by another column value
【Posted】: 2023-04-10 10:55:01
【Question description】:

In Python, I have the following df (column headers in the first row):

FullName          FirstName
'MichaelJordan'   'Michael'
'KobeBryant'      'Kobe'
'LeBronJames'     'LeBron'  
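
For anyone who wants to reproduce this, here is a minimal sketch of that frame. It assumes both columns hold plain Python strings and that the quote marks in the table above are display notation only, not part of the stored values.

```python
import pandas as pd

# Sample frame from the question. Assumption: the quotes shown in the
# table are notation only and are not part of the stored strings.
df = pd.DataFrame(
    {
        "FullName": ["MichaelJordan", "KobeBryant", "LeBronJames"],
        "FirstName": ["Michael", "Kobe", "LeBron"],
    }
)
print(df)
```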

I am trying to split each record in "FullName" based on the value in "FirstName" of the same row, but I'm not having any luck...

Here is what I tried:

df['Names'] = df['FullName'].str.split(df['FirstName'])

This produces the error:

'Series' objects are mutable, thus they cannot be hashed

Desired output:

print(df['Names'])

['Michael', 'Jordan']
['Kobe', 'Bryant']
['LeBron', 'James']

【Question discussion】:

Tags: python pandas


【Solution 1】:
>>> df.assign(names=[[firstname, fullname[len(firstname):]] 
                     for fullname, firstname in df[['FullName', 'FirstName']].values])
        FullName FirstName              names
0  MichaelJordan   Michael  [Michael, Jordan]
1     KobeBryant      Kobe     [Kobe, Bryant]
2    LeBronJames    LeBron    [LeBron, James]

【Discussion】:

  • Clever... given that first names are usually... uh.. first (-:
【Solution 2】:

str.replace

lastnames = [full.replace(first, '') for full, first in zip(df.FullName, df.FirstName)]
df.assign(LastName=lastnames)

        FullName FirstName LastName
0  MichaelJordan   Michael   Jordan
1     KobeBryant      Kobe   Bryant
2    LeBronJames    LeBron    James

Same idea, but using map

df.assign(LastName=[*map(lambda a, b: a.replace(b, ''), df.FullName, df.FirstName)])

        FullName FirstName LastName
0  MichaelJordan   Michael   Jordan
1     KobeBryant      Kobe   Bryant
2    LeBronJames    LeBron    James
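
One caveat worth noting: str.replace removes every occurrence of the first name, not only the leading one. Below is a minimal sketch of a variant that passes a count of 1 so that only the first match is removed. The "JoJohnson" row is a hypothetical example added for illustration and is not from the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        # The last row is hypothetical, added only to show the difference.
        "FullName": ["MichaelJordan", "KobeBryant", "LeBronJames", "JoJohnson"],
        "FirstName": ["Michael", "Kobe", "LeBron", "Jo"],
    }
)

# Unbounded replace strips every "Jo", including the one inside the surname.
all_removed = [
    full.replace(first, "") for full, first in zip(df.FullName, df.FirstName)
]

# Passing 1 as the third argument removes only the first occurrence.
first_removed = [
    full.replace(first, "", 1) for full, first in zip(df.FullName, df.FirstName)
]

result = df.assign(LastName=first_removed)
print(result)
```

This still assumes the first name appears at the start of the full name; if it might not, a startswith check is the safer route.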

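【Discussion】:

  • This is exactly what I wanted. Thanks!
【Solution 3】:

Here is a one-liner using apply. It splits FullName at the length of FirstName, and returns an empty string for rows where FullName does not start with FirstName: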

df['Names'] = df.apply(lambda row: [row['FullName'][:len(row['FirstName'])], row['FullName'][len(row['FirstName']):]] if row['FullName'].startswith(row['FirstName']) else '', axis=1)
        FullName FirstName              Names
0  MichaelJordan   Michael  [Michael, Jordan]
1     KobeBryant      Kobe     [Kobe, Bryant]
2    LeBronJames    LeBron    [LeBron, James]
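
The same logic may be easier to read when unpacked into a named helper. This is only a sketch: `split_name` is a hypothetical name, and it mirrors the one-liner above by returning an empty string when the full name does not start with the first name.

```python
import pandas as pd


def split_name(full, first):
    """Split `full` at len(first) if it starts with `first`, else return ''."""
    if full.startswith(first):
        return [full[:len(first)], full[len(first):]]
    return ""


df = pd.DataFrame(
    {
        "FullName": ["MichaelJordan", "KobeBryant", "LeBronJames"],
        "FirstName": ["Michael", "Kobe", "LeBron"],
    }
)

# Pair the two columns row by row instead of using apply(axis=1).
df["Names"] = [
    split_name(full, first) for full, first in zip(df["FullName"], df["FirstName"])
]
print(df)
```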


【Discussion】:

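【Solution 4】:

Since this is a row-wise operation, we can use apply.

The idea is to replace the first name with itself plus a comma and a space, then split on that separator:

df["SplitName"] = df.apply(
    lambda x: x["FullName"].replace(x["FirstName"], f"{x['FirstName']}, "), axis=1
)


# Split on ', ' (comma plus space) so the last name has no leading space
print(df['SplitName'].str.split(', ', expand=True))

         0       1
0  Michael  Jordan
1     Kobe  Bryant
2   LeBron   James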
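
If the two parts are wanted as named columns on the original frame, something along these lines might work. It is a sketch under assumptions: the column names "First" and "Last" are hypothetical, and it assumes no name contains a comma.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "FullName": ["MichaelJordan", "KobeBryant", "LeBronJames"],
        "FirstName": ["Michael", "Kobe", "LeBron"],
    }
)

# Insert a bare comma right after the first occurrence of the first name.
joined = [
    full.replace(first, first + ",", 1)
    for full, first in zip(df["FullName"], df["FirstName"])
]

# Split on the comma and expand the two pieces into separate columns.
parts = pd.Series(joined, index=df.index).str.split(",", expand=True)
parts.columns = ["First", "Last"]  # hypothetical column names

result = df.join(parts)
print(result)
```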
    

【Discussion】:
