KeyError building decision tree in Jupyter: answers

【Question title】: KeyError building decision tree in jupyter
【Posted】: 2021-12-22 05:07:09
【Question description】:

I am building a scikit-learn decision tree in Python, in a Jupyter notebook, with the following code:

from pandas import read_csv
from sklearn import tree
data = read_csv("data.csv")
print(data.head())
       A;B;C;D;E;F;Class
0     1;1;1;0;0;0;0
1     0;1;1;0;0;1;0
2     1;1;1;0;0;0;0
3     0;0;1;0;0;0;0
4     0;1;1;0;0;0;0
predictors = ['A','B','C','D','E','F']
X = data[predictors]
Y = data.Class
decisionTreeClassifier = tree.DecisionTreeClassifier(criterion="entropy")
dTree = decisionTreeClassifier.fit(X, Y)
dotData = tree.export_graphviz(dTree, out_file=None)
print(dotData)

My predictor columns are A;B;C;D;E;F. But I get this error:

KeyError                                  Traceback (most recent call last)
<ipython-input-24-9ecbffecc41d> in <module>
      1 predictors = ['A','B','C','D','E','F']
----> 2 X = data[predictors]
      3 Y = data.Class 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3028             if is_iterator(key):
   3029                 key = list(key)   
-> 3030             indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
   3031 
   3032         # take() does not accept boolean indexers

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1264             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
   1265 
-> 1266         self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
   1267         return keyarr, indexer
   1268 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1306             if missing == len(indexer):
   1307                 axis_name = self.obj._get_axis_name(axis)
-> 1308                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   1309 
   1310             ax = self.obj._get_axis(axis)

KeyError: "None of [Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')] are in the [columns]"

I have already converted my dataset to boolean (0/1) values, but I still cannot solve it. Please help me.

【Comments on the question】:

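Your column A appears to have a leading space, which makes 'A' an invalid key when you try to access that column. Try removing the leading space from the csv, or add it to predictors ([' A', 'B', ...]).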
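A quick way to test a hypothesis like this one is to print the column labels as a list, so that any stray whitespace becomes visible. The snippet below is only a sketch: `csv_text` is a made-up two-column file with a leading space in its header, not the asker's data.

```python
from io import StringIO

import pandas as pd

# Hypothetical file whose first header has a leading space (not the asker's data).
csv_text = " A,B\n1,2\n0,1\n"
df = pd.read_csv(StringIO(csv_text))

# Printing the labels as a list quotes each one, so the extra space shows up.
print(df.columns.tolist())

# Strip surrounding whitespace from every column label.
df.columns = df.columns.str.strip()
print(df.columns.tolist())
```

Run against the asker's file, the same `print(data.columns.tolist())` check would also show whether the header was parsed into separate columns at all.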

Tags: python pandas compiler-errors decision-tree

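【Solution 1】:

The problem was that the .csv dataset was read as a single column, because the values are separated by ";" rather than ",". I solved it by using an online .xlsx/.csv converter.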
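Converting the file is not the only option. As a sketch, assuming data.csv looks like the rows printed in the question, `read_csv` can be told about the separator directly through its `sep` parameter. Here `csv_text` is a stand-in for the real file, read through `StringIO` so the example is self-contained.

```python
from io import StringIO

from pandas import read_csv

# Stand-in for data.csv, built from the rows printed in the question.
csv_text = """A;B;C;D;E;F;Class
1;1;1;0;0;0;0
0;1;1;0;0;1;0
1;1;1;0;0;0;0
0;0;1;0;0;0;0
0;1;1;0;0;0;0
"""

# With the default separator "," the whole header is parsed as one column name.
wrong = read_csv(StringIO(csv_text))
print(wrong.columns.tolist())

# With sep=";" the fields are split into seven separate columns.
data = read_csv(StringIO(csv_text), sep=";")
print(data.columns.tolist())

# The selection from the question now finds its columns.
predictors = ['A', 'B', 'C', 'D', 'E', 'F']
X = data[predictors]
Y = data.Class
```

With the real file, the equivalent call would be `read_csv("data.csv", sep=";")`, and the rest of the question's code can stay as it is.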

【Comments】: