How to predict the output value using logistic regression? Answers

【Question title】: How to predict the output value using logistic regression?
【Posted】: 2020-11-26 13:40:48
【Question description】:

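I have successfully modeled my classification function with an ANN, using the pandas and sklearn libraries, to predict a single binary output value. Now I want my model to predict another feature that is not binary. In my dataset the input columns are (0,1,4,6,7,8,11,12,13,14) and the output column is (15). A typical row is [4096,0.07324,1.7,20,5.2,64,0.142,0.5,35,30,584.232]; some of the values are floats. How can I use logistic regression to predict 584.232 from the first ten numbers? Thanks, everyone.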

import pandas as pd

dataset = pd.read_csv("DataSet.csv")
X = dataset.iloc[:, [0,1,4,6,7,8,11,12,13,14]].values
y = dataset.iloc[:, 15].values

To avoid type errors, I convert the input values to float like this:

# 'ColumnsName' is a placeholder for each column being converted
dataset['ColumnsName'] = dataset['ColumnsName'].astype(float)

import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
# encode columns 1 and 2 of X as integer labels
labelEncoder_X_1 = LabelEncoder()
X[:, 1] = labelEncoder_X_1.fit_transform(X[:, 1])
labelEncoder_X_2 = LabelEncoder()
X[:, 2] = labelEncoder_X_2.fit_transform(X[:, 2])
# normalizing the input: divide each column by its own maximum
X = X / np.amax(X, axis=0)
# splitting the dataset into the training set and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# fitting logistic regression to the training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)
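
On the normalization step: dividing each feature by its own maximum can be written with a single broadcast, without transposing. The following is a minimal NumPy sketch; `X_demo` is made-up illustration data and is not taken from DataSet.csv.

```python
import numpy as np

# Hypothetical data: 3 samples, 2 features (not from the real dataset)
X_demo = np.array([[4096.0, 0.5],
                   [2048.0, 1.0],
                   [1024.0, 2.0]])

# Maximum of each column, shape (2,)
col_max = np.amax(X_demo, axis=0)

# Broadcasting a (3, 2) array against a (2,) vector scales column by column
X_scaled = X_demo / col_max

print(X_scaled)
```

After this step the largest value in every column is 1, provided the column maxima are positive.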

But after running the code up to this point, it raises the following error:

from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)
Traceback (most recent call last):

  File "<ipython-input-5-f18c8875152f>", line 3, in <module>
    classifier.fit(X_train, y_train)

  File "C:\Users\ali\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py", line 1528, in fit
    check_classification_targets(y)

  File "C:\Users\ali\anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 169, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)

ValueError: Unknown label type: 'continuous'

I have already converted the predefined columns from string to float!
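
The traceback points at `check_classification_targets(y)`, so the complaint is about the target `y` rather than the input columns. As a small sketch of how scikit-learn infers the target type, the `type_of_target` helper from `sklearn.utils.multiclass` can be called directly; the two lists below are made-up examples.

```python
from sklearn.utils.multiclass import type_of_target

# Made-up targets for illustration only
y_binary = [0, 1, 1, 0]                 # two discrete classes
y_continuous = [584.232, 12.5, 0.142]   # real-valued, like column 15

binary_kind = type_of_target(y_binary)
continuous_kind = type_of_target(y_continuous)

print(binary_kind)
print(continuous_kind)
```

A classifier such as `LogisticRegression` accepts the first kind of target and rejects the second, whatever the dtype of the feature columns.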

【Comments】:

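Maybe the scaler is not needed? The output, y, in logistic regression should be 0 or 1: whether a sample belongs to one class or to the other.
Is there more to the error message? ...a stack trace?
Dear rickhg12hs, I have updated the error section with the full message.
Dear aerijman, I inserted the appropriate "relu" activation function code before using the StandardScaler model. It works fine. If necessary, I will update the code for others.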
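
To illustrate the point made in the comments, here is a hedged sketch on synthetic data (the arrays and coefficients are invented for the example): `LogisticRegression` rejects a continuous target, while a regressor such as `LinearRegression` fits the same data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic data: 50 samples, 3 features, continuous target
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + 10.0

# A classifier expects discrete class labels, so this raises ValueError
try:
    LogisticRegression().fit(X_demo, y_demo)
    classifier_error = None
except ValueError as exc:
    classifier_error = str(exc)

# A regressor is designed for real-valued targets
regressor = LinearRegression().fit(X_demo, y_demo)
r2 = regressor.score(X_demo, y_demo)

print(classifier_error)
print(r2)
```

Despite its name, logistic regression is a classification model, so a value like 584.232 calls for a regression estimator instead.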

Tags: python scikit-learn regression logistic-regression

【Solution 1】:

import numpy as np
import pandas as pd
# Sequential and Dense are assumed to come from Keras
from keras.models import Sequential
from keras.layers import Dense

dataset = pd.read_csv("DataSet.csv")
X = dataset.iloc[:, [0,1,4,6,7,8,11,12,13,14]].values
y = dataset.iloc[:, 15].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
# encode columns 1 and 2 of X as integer labels
labelEncoder_X_1 = LabelEncoder()
X[:, 1] = labelEncoder_X_1.fit_transform(X[:, 1])
labelEncoder_X_2 = LabelEncoder()
X[:, 2] = labelEncoder_X_2.fit_transform(X[:, 2])
# normalizing the input: divide each column by its own maximum
X = X / np.amax(X, axis=0)
# Activation Function
# input_dim matches the 10 selected input columns
# NOTE: the model is only defined here; compiling and training it is not shown
model = Sequential()
model.add(Dense(6, input_dim=10, activation="relu"))
model.add(Dense(6, activation="relu"))
model.add(Dense(6, activation="relu"))
model.add(Dense(1))

# splitting the dataset into the training set and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# fitting logistic regression to the training set
# NOTE: LogisticRegression is a classifier; with a continuous y this call
# still raises "ValueError: Unknown label type: 'continuous'"
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)
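
As one possible alternative for a continuous target like column 15, the sketch below swaps in scikit-learn's `MLPRegressor` inside a pipeline. This is not the method from the answer above: the data is synthetic and stands in for DataSet.csv, the coefficients are invented, and the hidden layer sizes (6, 6, 6) are chosen only to mirror the three `Dense(6)` layers.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dataset: 200 rows, 10 inputs, 1 continuous output
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 10))
y = 500.0 + 100.0 * X[:, 0] - 50.0 * X[:, 3] + rng.normal(0.0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# The scaler is fitted on the training split only, as part of the pipeline
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(6, 6, 6), activation="relu",
                 solver="lbfgs", max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

# Predictions are real numbers, so a regression metric is used
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("test MSE:", mse)
```

Putting the scaler and the regressor in one pipeline keeps the test split out of the scaling statistics, and `model.predict` returns real-valued estimates rather than class labels.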

【Discussion】: