Why does my random forest and decision tree keep displaying 100% accuracy? [duplicate]

【Question title】: Why does my random forest and decision tree keep displaying 100% accuracy? [duplicate]
【Posted】: 2019-11-22 14:59:02
【Question】:

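I'm stuck: my output always shows 100% accuracy for the random forest and the decision tree, but not for the support vector machine.

I believe the problem is in how I train or test on the data. I think I'm testing on the training data rather than on the test data, but I don't know how to fix it.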

import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, classification_report
import sklearn.metrics as metrics
import seaborn as sns
import warnings
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

warnings.filterwarnings("ignore")

heart_data = pd.read_csv('data1.csv')

heart_data.head()
y = heart_data.target.values
x_data = heart_data.drop(['target'], axis = 1)
x = (x_data - np.min(x_data)) / (np.max(x_data) - np.min(x_data)).values
n_cols = x.shape[1]

#Splitting Data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20)




def regression_model():
    # create model
    model = Sequential()
    #inputs
    model.add(Dense(50, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(50, activation='relu')) # activation function
    model.add(Dense(1))

    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    #loss measures the results and figures out how bad it did. Optimizer generates next guess.
    return model


# build the model
model = regression_model()
print (model)
# fit the model
history=model.fit(x_train, y_train, validation_data=(x_test,y_test), epochs=10, batch_size=10)



# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

#Decision Tree
print ("Processing Decision Tree")
dtc = DecisionTreeClassifier()
dtc.fit(x_test,y_test)
print("Decision Tree Test Accuracy {:.2f}%".format(dtc.score(x_test, y_test)*100))


#Support Vector Machine
print ("Processing Support Vector Machine")
svm = SVC(random_state = 1)
svm.fit(x_test, y_test)
print("Test Accuracy of SVM Algorithm: {:.2f}%".format(svm.score(x_test,y_test)*100))

#Random Forest
print ("Processing Random Forest")
rf = RandomForestClassifier(n_estimators = 1000, random_state = 1)
rf.fit(x_test, y_test)
print("Random Forest Algorithm Accuracy Score : {:.2f}%".format(rf.score(x_test,y_test)*100))

I expect to get 90%+ with the random forest. Any suggestions or changes to the syntax would be greatly appreciated.

【Comments】:

Please don't repost the same question. You already asked this yesterday, and the comments pointed you to the solution.

Tags: python keras

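【Solution 1】:

You should train your model on x_train and y_train, and evaluate it on the test data (x_test, y_test). In the question, every classifier is fit on x_test and then scored on that same x_test, so the reported accuracy only measures how well the model memorized the rows it was trained on.

For example: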

#Random Forest
print ("Processing Random Forest")
rf = RandomForestClassifier(n_estimators = 1000, random_state = 1)
# Train on the training split only
rf.fit(x_train, y_train)
# Predict and score on the held-out test split
y_test_pred = rf.predict(x_test)
print("Random Forest Algorithm Accuracy Score : {:.2f}%".format(rf.score(x_test,y_test)*100))
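The same change applies to the decision tree and the SVM from the question. Below is a minimal, self-contained sketch of the difference. It assumes a synthetic dataset from `make_classification` as a stand-in for `data1.csv` (which is not available here), so the printed numbers will not match the original heart data. The dictionary names and the smaller `n_estimators` are also illustrative choices, not part of the original post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for data1.csv (an assumption for illustration only).
x, y = make_classification(n_samples=300, n_features=13, n_informative=5,
                           random_state=1)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.20, random_state=1)

# What the question does: fit and score on the same rows.
# An unpruned tree keeps splitting until every leaf is pure,
# so it reproduces the labels it was fit on.
memorized = DecisionTreeClassifier(random_state=1).fit(x_test, y_test)
memorized_acc = memorized.score(x_test, y_test)
print("Tree fit and scored on x_test: {:.2f}%".format(memorized_acc * 100))

# The fix: fit on the training split, score on the held-out split.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=1),
    "SVM": SVC(random_state=1),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=1),
}
held_out_acc = {}
for name, clf in models.items():
    clf.fit(x_train, y_train)
    held_out_acc[name] = clf.score(x_test, y_test)
    print("{} held-out accuracy: {:.2f}%".format(name, held_out_acc[name] * 100))
```

This also suggests why the SVM did not show 100% in the question: with its default regularization, `SVC` generally does not fit every training point exactly, whereas fully grown trees (and forests built from them) can.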

【Comments】: