【Question title】: How to create new columns with column names from one column and column values from another
【Posted】: 2021-02-23 03:17:06
【Question】:

My DataFrame has two columns whose values are lists, as shown below:

salary.labels   salary.percentages
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [29, 0.9, 2.2, 11.3, 56.6]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [74.5, 1.1, 1.4, 12, 11]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [63.4, 1.9, 2.2, 11.2, 21.3]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [58.3, 0.6, 1.9, 7.9, 31.3]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [80.4, 1.4, 2.2, 4.7, 11.3]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [71.2, 0.9, 1.2, 6.3, 20.4]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [39.9, 1.6, 5.8, 15.8, 36.9]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [56.5, 0.8, 2.3, 9.8, 30.6]
['Not Impacted', 'Salary Not Paid', 'Salary Cut', 'Variables Impacted', 'Appraisal Delayed']    [42.9, 2.3, 5.1, 14.1, 35.6]

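I want to create new columns whose labels come from the values in the salary.labels column, with each row holding the corresponding values from the salary.percentages column.

The expected output DataFrame looks like this: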

'Not Impacted' 'Salary Not Paid' 'Salary Cut' 'Variables Impacted' 'Appraisal Delayed'
29, 0.9, 2.2, 11.3, 56.6
74.5, 1.1, 1.4, 12, 11
63.4, 1.9, 2.2, 11.2, 21.3
58.3, 0.6, 1.9, 7.9, 31.3
80.4, 1.4, 2.2, 4.7, 11.3
71.2, 0.9, 1.2, 6.3, 20.4
39.9, 1.6, 5.8, 15.8, 36.9
56.5, 0.8, 2.3, 9.8, 30.6
42.9, 2.3, 5.1, 14.1, 35.6

How can I do this with pandas operations?

【Comments】:

    Tags: python pandas data-cleaning feature-extraction feature-engineering


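    【Solution 1】:

    If every list in salary.labels is identical, convert the salary.percentages column to a list of lists, pass it to the DataFrame constructor, and use the list from the first row of salary.labels as the column names: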

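    import pandas as pd
    # One output row per percentages list; column names come from the first row's labels
    df = pd.DataFrame(df['salary.percentages'].tolist(), columns=df['salary.labels'].iloc[0])
    print(df)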
       Not Impacted  Salary Not Paid  Salary Cut  Variables Impacted  \
    0          29.0              0.9         2.2                11.3   
    1          74.5              1.1         1.4                12.0   
    2          63.4              1.9         2.2                11.2   
    3          58.3              0.6         1.9                 7.9   
    4          80.4              1.4         2.2                 4.7   
    5          71.2              0.9         1.2                 6.3   
    6          39.9              1.6         5.8                15.8   
    7          56.5              0.8         2.3                 9.8   
    8          42.9              2.3         5.1                14.1   
    
       Appraisal Delayed  
    0               56.6  
    1               11.0  
    2               21.3  
    3               31.3  
    4               11.3  
    5               20.4  
    6               36.9  
    7               30.6  
    8               35.6  
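
    The constructor approach above relies on every row having the same labels in the same order. If that cannot be guaranteed, one possible sketch is to pair each row's labels with that row's own values before building the frame. The two-row sample and the names `records` and `wide` are made up for illustration and are not from the original answer:

```python
import pandas as pd

# Illustrative two-row sample in the shape shown in the question;
# the second row deliberately lists the labels in a different order.
df = pd.DataFrame({
    "salary.labels": [
        ["Not Impacted", "Salary Not Paid", "Salary Cut",
         "Variables Impacted", "Appraisal Delayed"],
        ["Salary Cut", "Not Impacted", "Salary Not Paid",
         "Appraisal Delayed", "Variables Impacted"],
    ],
    "salary.percentages": [
        [29, 0.9, 2.2, 11.3, 56.6],
        [1.4, 74.5, 1.1, 11, 12],
    ],
})

# Build one {label: value} dict per row, so each value follows its own label.
records = [
    dict(zip(labels, values))
    for labels, values in zip(df["salary.labels"], df["salary.percentages"])
]

# Reuse the original index so the result lines up with df row by row.
wide = pd.DataFrame(records, index=df.index)
print(wide)
```

    Because `wide` shares the index of `df`, something like `df.join(wide)` should keep the original columns alongside the new ones. A label that is missing from a row would show up as NaN in that row.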
    

    【Comments】:
