【Question Title】: Split dataframe column with second column as delimiter
【Posted】: 2021-05-01 12:24:21
【Question Description】:

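I want to split one column into two, using the value of a second column in the same row as the delimiter for that row.

I get the error TypeError: 'Series' objects are mutable, thus they cannot be hashed. That makes sense, since str.split receives a whole Series rather than a single value, but I'm not sure how to pass only the second column's value for each individual row.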

Sample data:

    title_location                    delimiter
0   Doctor - ABC - Los Angeles, CA    - ABC -
1   Lawyer - ABC - Atlanta, GA        - ABC -
2   Athlete - XYZ - Jacksonville, FL  - XYZ -

Code:

bigdata[['title', 'location']] = bigdata['title_location'].str.split(bigdata['delimiter'], expand=True)

Expected output:

    title_location                    delimiter    title    location
0   Doctor - ABC - Los Angeles, CA    - ABC -      Doctor   Los Angeles, CA
1   Lawyer - ABC - Atlanta, GA        - ABC -      Lawyer   Atlanta, GA
2   Athlete - XYZ - Jacksonville, FL  - XYZ -      Athlete  Jacksonville, FL

【Discussion】:

    Tags: python pandas string split delimiter


    【Solution 1】:

    Try apply with axis=1, so that each row is split on its own delimiter:

    bigdata[['title', 'location']] = bigdata.apply(lambda row: row['title_location'].split(row['delimiter']), axis=1, result_type="expand")
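
For reference, here is a minimal self-contained sketch of the apply approach. The frame is rebuilt from the sample data in the question; the exact whitespace stored in the delimiter column is an assumption, so the hypothetical helper split_row trims the pieces after splitting. It also limits the split to one occurrence, which is a precaution not found in the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question. Whether the stored delimiter
# includes surrounding spaces is an assumption; here it does not.
bigdata = pd.DataFrame({
    "title_location": [
        "Doctor - ABC - Los Angeles, CA",
        "Lawyer - ABC - Atlanta, GA",
        "Athlete - XYZ - Jacksonville, FL",
    ],
    "delimiter": ["- ABC -", "- ABC -", "- XYZ -"],
})


def split_row(row):
    """Hypothetical helper: split one row's text on that row's delimiter."""
    # maxsplit=1 keeps the result at two pieces when the delimiter is present.
    parts = row["title_location"].split(row["delimiter"], 1)
    # Trim the spaces left over on either side of the delimiter.
    return [part.strip() for part in parts]


# axis=1 passes each row to split_row; result_type="expand" turns the
# returned lists into columns, which are then assigned by position.
bigdata[["title", "location"]] = bigdata.apply(
    split_row, axis=1, result_type="expand"
)
print(bigdata[["title", "location"]])
```

Note that apply with axis=1 calls the function once per row in Python, so it may be slow on large frames.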
    

    【Discussion】:

      【Solution 2】:

      Let's try zip to pair each string with its delimiter, then join the result back:

      df = df.join(pd.DataFrame([x.split(y) for x, y in zip(df.title_location, df.delimiter)], index=df.index, columns=['Title', 'Location']))
      df
      Out[200]: 
                           title_location delimiter     Title           Location
      0    Doctor - ABC - Los Angeles, CA   - ABC -   Doctor     Los Angeles, CA
      1        Lawyer - ABC - Atlanta, GA   - ABC -   Lawyer         Atlanta, GA
      2  Athlete - XYZ - Jacksonville, FL   - XYZ -  Athlete    Jacksonville, FL
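
A variation on the zip approach, offered as a sketch rather than as the answer's own code: it swaps str.split for str.partition, which always returns three parts, so a row whose delimiter is missing still yields exactly two values instead of breaking the column count. The function name split_on_row_delimiter and its parameters are hypothetical, and the stripping of whitespace assumes the delimiter is stored without surrounding spaces.

```python
import pandas as pd


def split_on_row_delimiter(frame, text_col="title_location", delim_col="delimiter"):
    """Hypothetical helper: return frame joined with 'Title' and 'Location'."""
    rows = []
    for text, delim in zip(frame[text_col], frame[delim_col]):
        # partition returns (before, separator, after); when the delimiter
        # is absent, 'after' is an empty string rather than a missing value.
        before, _, after = text.partition(delim)
        rows.append((before.strip(), after.strip()))
    # Reuse the original index so that join aligns the rows correctly.
    parts = pd.DataFrame(rows, index=frame.index, columns=["Title", "Location"])
    return frame.join(parts)


# Sample data reconstructed from the question.
df = pd.DataFrame({
    "title_location": [
        "Doctor - ABC - Los Angeles, CA",
        "Lawyer - ABC - Atlanta, GA",
        "Athlete - XYZ - Jacksonville, FL",
    ],
    "delimiter": ["- ABC -", "- ABC -", "- XYZ -"],
})

result = split_on_row_delimiter(df)
print(result[["Title", "Location"]])
```

The list comprehension with zip avoids the per-row overhead of DataFrame.apply, which is the main reason to prefer this form on larger frames. An empty delimiter string would raise a ValueError in str.partition, so this sketch assumes every row has a non-empty delimiter.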
      

      【Discussion】:
