【Posted】:2018-04-08 14:17:31
【Question】:
           A          B          C           D          E
0  165349.20  136897.80  471784.10    New York  192261.83
1  162597.70  151377.59  443898.53  California  191792.06
2  153441.51  101145.55  407934.54     Florida  191050.39
3  144372.41  118671.85  383199.62    New York  182901.99
4  142107.34   91391.77  366168.42     Florida  166187.94
After running df = pd.get_dummies(df, columns=['D']), I get:
           A          B          C          E  D_California  D_Florida  D_New York
0  165349.20  136897.80  471784.10  192261.83             0          0           1
1  162597.70  151377.59  443898.53  191792.06             1          0           0
2  153441.51  101145.55  407934.54  191050.39             0          1           0
3  144372.41  118671.85  383199.62  182901.99             0          0           1
4  142107.34   91391.77  366168.42  166187.94             0          1           0
Is there a way to make the output look like this without writing out df[['A','B','C','D_California','D_New York','D_Florida','E']] by hand?
           A          B          C  D_California  D_Florida  D_New York          E
0  165349.20  136897.80  471784.10             0          0           1  192261.83
1  162597.70  151377.59  443898.53             1          0           0  191792.06
2  153441.51  101145.55  407934.54             0          1           0  191050.39
3  144372.41  118671.85  383199.62             0          0           1  182901.99
4  142107.34   91391.77  366168.42             0          1           0  166187.94
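For anyone who wants to reproduce this, here is a minimal sketch of the setup. The variable name `encoded` and the `dtype=int` argument are assumptions added for illustration: recent pandas versions return boolean dummy columns by default, so `dtype=int` is used to get the 0/1 values shown in the question. It illustrates that `get_dummies` drops the encoded column and appends the new dummy columns after all remaining columns, which is why `E` ends up in front of them.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": [165349.20, 162597.70, 153441.51, 144372.41, 142107.34],
    "B": [136897.80, 151377.59, 101145.55, 118671.85, 91391.77],
    "C": [471784.10, 443898.53, 407934.54, 383199.62, 366168.42],
    "D": ["New York", "California", "Florida", "New York", "Florida"],
    "E": [192261.83, 191792.06, 191050.39, 182901.99, 166187.94],
})

# dtype=int is an assumption here, used only to match the 0/1 output above.
encoded = pd.get_dummies(df, columns=["D"], dtype=int)

# The dummy columns are appended at the end, in sorted category order.
print(list(encoded.columns))
# ['A', 'B', 'C', 'E', 'D_California', 'D_Florida', 'D_New York']
```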
【Discussion】:
- It looks like you need df.sort_index(axis=1)
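A minimal sketch of the suggestion above, using the first three rows from the question. Note that `sort_index(axis=1)` sorts the column labels alphabetically, so it only gives the desired layout here because `D_...` happens to sort between `C` and `E`. As an alternative that does not depend on the names, the helper `dummies_in_place` below is a hypothetical function (not part of pandas) that splices the dummy columns in at the original column's position; it assumes the column name is unique in the frame. `dtype=int` is also an assumption, used to get 0/1 values on recent pandas versions.

```python
import pandas as pd

# First three rows from the question.
df = pd.DataFrame({
    "A": [165349.20, 162597.70, 153441.51],
    "B": [136897.80, 151377.59, 101145.55],
    "C": [471784.10, 443898.53, 407934.54],
    "D": ["New York", "California", "Florida"],
    "E": [192261.83, 191792.06, 191050.39],
})

# Option 1: the suggestion from the comment. Sorts column labels
# alphabetically, which works here only because of how the names sort.
by_name = pd.get_dummies(df, columns=["D"], dtype=int).sort_index(axis=1)


def dummies_in_place(frame, column):
    """Hypothetical helper: replace `column` with its dummy columns
    at the same position. Assumes `column` appears exactly once."""
    dummies = pd.get_dummies(frame[column], prefix=column, dtype=int)
    pos = frame.columns.get_loc(column)   # integer position of the column
    left = frame.iloc[:, :pos]            # columns before it
    right = frame.iloc[:, pos + 1:]       # columns after it
    return pd.concat([left, dummies, right], axis=1)


# Option 2: independent of how the column names sort.
in_place = dummies_in_place(df, "D")

print(list(by_name.columns))
print(list(in_place.columns))
```

Both calls give the column order `A, B, C, D_California, D_Florida, D_New York, E` for this data; the second one would keep working if the columns were named so that alphabetical order no longer matched the original layout.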
Tags: python-3.x pandas one-hot-encoding