【Posted】: 2019-06-15 06:19:12
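【Problem description】:
I can't resolve this error. I think it comes from my misunderstanding of DataFrames and indexing, and possibly of for loops as well. (I'm used to MATLAB for loops... iterating there is, intuitively, easier :D)
Here is the error:
KeyError: "['United States' 'Canada' 'Mexico'] not found in axis"
It is raised on this line: as_df=as_df.drop(as_df[column])
But that doesn't make any sense... I'm referring to a single column, not the whole set of dummy variables.
The code below can be copied and run as-is. I made sure of that.
My code: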
import pandas as pd
import numpy as np
df=pd.DataFrame({"country": ['United States','Canada','Mexico'], "price": [23,32,21], "points": [3,4,4.5]})
df=df[['country','price','points']]
df2=df[['country']]
features=df2.columns
print(features)
target='points'
#------_-__-___---____________________
as_df=pd.concat([df[features],df[target]],axis=1)
#Now for Column Check
for column in as_df[features]:
    col=as_df[[column]]
    #Categorical Data Conversion
    #This will split the countries into their own column with 1 being when it
    #is true and 0 being when it is false
    col.select_dtypes(include='object')
    dummies=pd.get_dummies(col)
    #ML Check:
    dumcols=dummies.drop(dummies.columns[1],axis=1)
    if dumcols.shape[1] > 1:
        print(column)
        as_df=as_df.drop(as_df[column])
    else:
        dummydf=col
        as_df=pd.concat([as_df,dummydf],axis=1)
as_df.head()
【Discussion】:
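A likely reading of the error, offered as a sketch rather than a definitive answer: iterating over a DataFrame yields column names, so `column` is the string 'country', but `as_df[column]` is the column's values. `drop` interprets its first argument as labels on the row axis by default, so it goes looking for rows labelled 'United States', 'Canada' and 'Mexico' in an index that only holds 0, 1, 2. Passing the column name with `columns=` (or `axis=1`) removes the column instead. The small frame below mirrors the one in the question; the names `raised` and `without_country` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["United States", "Canada", "Mexico"],
    "price": [23, 32, 21],
    "points": [3, 4, 4.5],
})
as_df = df[["country", "points"]]

# Passing the column's values: drop() treats them as row labels (axis=0).
# The row index is 0, 1, 2, so none of the country names can be found.
try:
    as_df.drop(as_df["country"])
    raised = False
except KeyError:
    raised = True

# Passing the column's name: drop() removes that column.
without_country = as_df.drop(columns="country")

print(raised)
print(list(without_country.columns))
```

In the loop from the question, the equivalent change would presumably be `as_df=as_df.drop(columns=column)`, assuming the intent is to discard the whole column.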
Tags: python dataframe for-loop indexing