[Posted]: 2022-01-17 11:35:03
[Question]:
I have a dataset with the following shapes: (2400, 2), (2400,), (1600, 2), (1600,). My task is to perform non-linearly separable classification via binary logistic regression, but I get the following error in the visualization part:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-19-2754b9327868> in <module>()
4
5 # Plot different regions and color them
----> 6 output = output.reshape(x_vals.shape)
7 plt.imshow(output, interpolation='nearest',
8 extent=(x_min, x_max, y_min, y_max),
ValueError: cannot reshape array of size 2880000 into shape (1200,1200)
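For reference, the arithmetic behind this error: reshape never changes the total number of elements, and 2,880,000 is exactly twice 1200 × 1200, which hints that the prediction array carries two values per grid point. A minimal plain-NumPy sketch (independent of the TensorFlow code below):

```python
import numpy as np

# A 1200 x 1200 grid holds 1,440,000 points ...
grid_points = 1200 * 1200
print(grid_points)              # 1440000
# ... but the failing array holds 2,880,000 values: two per grid point.
print(2880000 // grid_points)   # 2

# reshape preserves the element count, so an (N, 2) array of
# per-class predictions cannot become a (1200, 1200) image directly:
preds = np.zeros((grid_points, 2), dtype=np.float32)
try:
    preds.reshape((1200, 1200))
except ValueError as e:
    print(e)                    # cannot reshape array of size 2880000 into shape (1200,1200)

# Reducing to one value per point (e.g. argmax over the class axis) works:
labels = preds.argmax(axis=-1).reshape((1200, 1200))
print(labels.shape)             # (1200, 1200)
```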
How can I reshape the array into a matrix that matches the grid? Here is my implementation for reference:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

num_features = 2
learning_rate = 0.0001
training_steps = 4000
batch_size = 32
display_step = 50

# Keep only the two classes of interest
x_train, y_train = map(list, zip(*[(x, y) for x, y in zip(x_train, y_train) if y == 0 or y == 1]))
x_test, y_test = map(list, zip(*[(x, y) for x, y in zip(x_test, y_test) if y == 0 or y == 1]))

x_train, x_test = np.array(x_train, np.float32), np.array(x_test, np.float32)
y_train, y_test = np.array(y_train, np.int64), np.array(y_test, np.int64)
x_train, x_test = x_train.reshape([-1, num_features]), x_test.reshape([-1, num_features])
x_train, x_test = x_train / 255., x_test / 255.

train_data = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_data = train_data.repeat().shuffle(5000).batch(batch_size).prefetch(1)

b = tf.Variable(tf.ones((num_features, 2)) * 0.000001, name="weight")
b0 = tf.Variable(0., name="bias")

def logistic_regression(x, b, b0):
    return 1. / (1. + tf.exp(-tf.matmul(x, b) - b0))

def loglikelihood(p, y_true):
    return tf.reduce_sum(tf.one_hot(y_true, 2) * tf.math.log(p), axis=-1)

def accuracy(y_pred, y_true):
    correct_prediction = tf.equal(tf.argmax(y_pred, axis=-1), y_true)
    return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

optimizer = tf.optimizers.Adam()

for step, (batch_x, batch_y) in enumerate(train_data.take(training_steps), 1):
    with tf.GradientTape() as g:
        g.watch([b, b0])
        p = logistic_regression(batch_x, b, b0)
        ll = loglikelihood(p, batch_y)
        ll_sum = tf.reduce_mean(ll)
    grad_b, grad_b0 = g.gradient(ll_sum, [b, b0])
    optimizer.apply_gradients(zip([grad_b, grad_b0], [b, b0]))
    if step % display_step == 0:
        p = logistic_regression(batch_x, b, b0)
        acc = accuracy(p, batch_y)
        p = logistic_regression(x_test, b, b0)
        val_acc = accuracy(p, y_test)
        print("step: %i, acc: %f, val_acc %f" % (step, acc, val_acc))

def predict(x_test):
    return tf.round(logistic_regression(x_test, b, b0))

x_min, y_min = -12, -12
x_max, y_max = 12, 12
x_vals, y_vals = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
xy_grid = pd.DataFrame(zip(x_vals.ravel(), y_vals.ravel()), dtype=np.float32)

# Predict output labels for all the points on the grid
output = predict(xy_grid.to_numpy()).numpy()

fig, ax = plt.subplots(1, 1)

# Plot different regions and color them
output = output.reshape(x_vals.shape)
plt.imshow(output, interpolation='nearest',
           extent=(x_min, x_max, y_min, y_max),
           cmap=plt.cm.Paired,
           aspect='auto',
           origin='lower')
pd.DataFrame(np.concatenate([x_train,
                             np.expand_dims(y_train, axis=-1)],
                            axis=1)).plot.scatter(0, 1, c=2, colormap='viridis', ax=ax)
The expected result should look like this: expected image
But I get the following image instead: resulting image
[Discussion]:
- It gives you an error because, when you do it that way, your output array contains more elements than a (1200, 1200) matrix can hold. You need to choose a matrix size with len(output) elements.
- @QuantumMecha Do you mean I have to adjust the x_min, y_min, x_max, y_max values?
- Yes, either enlarge the x range or decrease the step size to get more x_vals.
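The comments above can be illustrated with a short sketch. Because the weight matrix b has shape (2, 2), predict returns one value per class, i.e. shape (N, 2); collapsing that class axis (e.g. with argmax) before reshaping makes the output match the grid. This is a plain-NumPy sketch with a stand-in predictor, not the poster's TensorFlow model:

```python
import numpy as np

x_min, y_min, x_max, y_max = -12, -12, 12, 12
x_vals, y_vals = np.meshgrid(np.arange(x_min, x_max, 0.02),
                             np.arange(y_min, y_max, 0.02))
grid = np.stack([x_vals.ravel(), y_vals.ravel()], axis=1)   # (1_440_000, 2)

def predict(x):
    """Stand-in predictor: returns one rounded score per class,
    mimicking tf.matmul with a (2, 2) weight matrix."""
    scores = np.stack([x[:, 0], x[:, 1]], axis=1)           # shape (N, 2)
    return (scores > 0).astype(np.float32)

output = predict(grid)                  # (N, 2): cannot be reshaped to the grid
labels = output.argmax(axis=-1)         # (N,): one label per grid point
image = labels.reshape(x_vals.shape)    # now matches (1200, 1200)
print(image.shape)                      # (1200, 1200)
```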
[Tags]: python-3.x machine-learning logistic-regression