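【Posted】: 2020-01-09 12:22:48
【Question】:
I am a beginner in Python and ML. I am practicing on the Iris dataset to build an ML model with TensorFlow 2.0.
I parsed the CSV and trained the model on the dataset. With the model below I was able to reach 90% training accuracy and 91% validation accuracy.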
import tensorflow as tf
import numpy as np
from sklearn import preprocessing
csv_data = np.loadtxt('iris_training.csv',delimiter=',')
target_all = csv_data[:,-1]
csv_data = csv_data[:,0:-1]
# Shuffling the input
shuffled_indices = np.arange(csv_data.shape[0])
np.random.shuffle(shuffled_indices)
shuffled_inputs = csv_data[shuffled_indices]
shuffled_targets = target_all[shuffled_indices]
# Standardize the Inputs
shuffled_inputs = preprocessing.scale(shuffled_inputs)
# Split data into train, validation and test
total_count = shuffled_inputs.shape[0]
train_data_count = int(0.8*total_count)
validation_data_count = int(0.1*total_count)
test_data_count = total_count - train_data_count - validation_data_count
train_inputs = shuffled_inputs[:train_data_count]
train_targets = shuffled_targets[:train_data_count]
validation_inputs = shuffled_inputs[train_data_count:train_data_count+validation_data_count]
validation_targets = shuffled_targets[train_data_count:train_data_count+validation_data_count]
test_inputs = shuffled_inputs[train_data_count+validation_data_count:]
test_targets = shuffled_targets[train_data_count+validation_data_count:]
print(len(train_inputs))
print(len(validation_inputs))
print(len(test_inputs))
# Model Creation
input_size = 4
hidden_layer_size = 100
output_size = 3
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(hidden_layer_size, input_dim=input_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(hidden_layer_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(output_size, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_inputs,train_targets, epochs=10, validation_data=(validation_inputs, validation_targets), verbose=2)
prediction = model.predict(test_inputs)
If anything in my code could be changed to improve the model's accuracy on this simple Iris dataset, please point it out.
File used to train my model: Iris Csv
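One detail of the preprocessing above that may be worth a look: preprocessing.scale is applied to the whole array before the split, so the validation and test rows contribute to the mean and standard deviation used for scaling. The following is a minimal sketch, not the poster's code, of computing those statistics from the training rows only and reusing them for the other splits. The helper name standardize_with_train_stats and the random stand-in data are assumptions made for illustration, and whether this changes accuracy on Iris is not something the post establishes.

```python
import numpy as np


def standardize_with_train_stats(train, *others):
    """Scale every split using the mean and std of the training split only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return tuple((part - mean) / std for part in (train, *others))


# Hypothetical stand-in for the 4 Iris feature columns (random numbers, not real data).
rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=2.0, size=(120, 4))

# Same 80/10/10 split as in the question, done before scaling.
n_train = int(0.8 * len(features))
n_val = int(0.1 * len(features))
train_raw = features[:n_train]
val_raw = features[n_train:n_train + n_val]
test_raw = features[n_train + n_val:]

train_x, val_x, test_x = standardize_with_train_stats(train_raw, val_raw, test_raw)

# The training split ends up with zero mean and unit std per column;
# the other splits are only approximately standardized, which is expected.
print(train_x.mean(axis=0))
print(train_x.std(axis=0))
```

With this ordering the validation and test accuracy are measured on rows the scaling step has never seen, which makes them a slightly more honest estimate.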
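【Comments】:
- This is not a PyCharm problem. It is just an index that falls outside the range of target and/or outs.
- Your use of evaluate is wrong. You are probably looking for predict.
- @luigigi If I write test_loss, test_accuracy = model.predict(test_inputs) I get: line 78, in <module> test_loss, test_accuracy = model.predict(test_inputs) ValueError: too many values to unpack. Can you help me?
- Yes, because predict has only one return value. Use something like predictions = model.predict(test_inputs).
- @luigigi Yes, I was looking for predict, which has only one return value. Thanks for your help. Now I get a probability distribution from my model. Is there any way to improve my model's accuracy?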
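To make the exchange above concrete: predict returns a single array with one row of class probabilities per input, so unpacking it into two names fails, whereas model.evaluate(test_inputs, test_targets) returns the loss followed by the compiled metrics and can be unpacked as test_loss, test_accuracy. The sketch below shows one way to turn the probabilities from predict into class labels and an accuracy figure. The arrays are hypothetical illustrative values, not output from the poster's model, and plain NumPy stands in for the model so the snippet runs on its own.

```python
import numpy as np

# Hypothetical softmax output for 4 test rows and 3 classes
# (illustrative numbers, not results from the question's model).
probabilities = np.array([
    [0.90, 0.07, 0.03],
    [0.10, 0.70, 0.20],
    [0.05, 0.25, 0.70],
    [0.20, 0.50, 0.30],
])

# Hypothetical true labels for the same rows.
true_labels = np.array([0, 1, 2, 2])

# The predicted class is the index of the largest probability in each row.
predicted_labels = np.argmax(probabilities, axis=1)

# Fraction of rows where the predicted class matches the true class.
accuracy = np.mean(predicted_labels == true_labels)

print(predicted_labels)
print(accuracy)
```

In the poster's script the same two steps would be applied to the array returned by model.predict(test_inputs), compared against test_targets.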
Tags: python tensorflow machine-learning