【Question title】: How to predict the output value using logistic regression?
【Posted】: 2020-11-26 13:40:48
【Question description】:

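I successfully built a classification model with an ANN, using the pandas and sklearn libraries, that predicts a single binary output value. Now I want my model to predict another feature that is not binary: the input columns of my dataset are (0,1,4,6,7,8,11,12,13,14) and the output column is (15). A typical row is [4096,0.07324,1.7,20,5.2,64,0.142,0.5,35,30,584.232]; some of the values are floats. How can I use logistic regression to predict 584.232 from the first ten numbers? Thanks, everyone.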

import numpy as np
import pandas as pd
dataset = pd.read_csv("DataSet.csv")
X = dataset.iloc[:, [0,1,4,6,7,8,11,12,13,14]].values
y = dataset.iloc[:, 15].values

To avoid type errors, I converted the input values to float as follows:

dataset['ColumnsName'] = dataset['ColumnsName'].astype(float)
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
labelEncoder_X_delay_1 = LabelEncoder()
X[:, 1] = labelEncoder_X_delay_1.fit_transform(X[:, 1])
labelEncoder_X_delay_2 = LabelEncoder()
X[:, 2] = labelEncoder_X_delay_2.fit_transform(X[:, 2])
# normalizing the input: divide each feature column by its maximum
X = X / np.amax(X, axis=0)
# splitting the dataset into the training set and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# fitting logistic regression to the training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)
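
The max-normalization step in the code above is meant to divide each feature column by its largest value. A minimal sketch on a made-up array (the names `X_demo`, `col_max` and `X_scaled` are illustrative, not from the post) shows that taking the maximum along `axis=0` does this directly, without transposing:

```python
import numpy as np

# Made-up data: 3 samples (rows) x 2 features (columns).
X_demo = np.array([[2.0, 10.0],
                   [4.0, 20.0],
                   [8.0, 40.0]])

# Maximum of each column, shape (2,); broadcasting divides every row by it.
col_max = np.amax(X_demo, axis=0)
X_scaled = X_demo / col_max

print(X_scaled)
```

After this step the largest entry in every column is 1. Since `StandardScaler` is applied afterwards anyway, the max-normalization is arguably optional.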

But running the code up to this point gives the following error:

from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)
Traceback (most recent call last):

  File "<ipython-input-5-f18c8875152f>", line 3, in <module>
    classifier.fit(X_train, y_train)

  File "C:\Users\ali\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py", line 1528, in fit
    check_classification_targets(y)

  File "C:\Users\ali\anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 169, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)

ValueError: Unknown label type: 'continuous'

I have already converted the predefined columns from string to float!
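
The traceback points at `check_classification_targets(y)`, so the complaint is about the target `y` and not about the dtype of the inputs. A small sketch with made-up target arrays shows how scikit-learn's `type_of_target` helper labels float-valued targets versus class labels:

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

y_continuous = np.array([584.232, 12.5, 3.75])  # made-up float targets
y_binary = np.array([0, 1, 1, 0])               # made-up class labels

print(type_of_target(y_continuous))  # continuous
print(type_of_target(y_binary))      # binary
```

A classifier such as `LogisticRegression` rejects the first kind of target, which is why converting the columns to float does not make the error go away.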

【Comments】:

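  • Maybe the scaler is not needed? In logistic regression the output y should be 0 or 1: the sample either belongs to a class or it does not (that is, it belongs to the other class).
  • Is there more to the error message? ...a stack trace?
  • Dear rickhg12hs, I have updated the error section with the full message.
  • Dear aerijman, I inserted the appropriate code for the "relu" activation function before using the StandardScaler model. It works fine. If necessary, I will post the updated code for others.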
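
As the first comment notes, logistic regression expects class labels. For a continuous target such as 584.232, a regression estimator is the usual choice. The following is a minimal sketch, assuming synthetic data in place of DataSet.csv and `LinearRegression` swapped in for `LogisticRegression`; the names `X_demo`, `y_demo` and `regressor` are illustrative and not from the post:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dataset: 200 rows, 10 float features,
# and a continuous target that depends linearly on them plus noise.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 100.0, size=(200, 10))
true_coef = np.arange(1.0, 11.0)
y_demo = X_demo @ true_coef + rng.normal(0.0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Same scaling step as in the question, but followed by a regressor.
regressor = make_pipeline(StandardScaler(), LinearRegression())
regressor.fit(X_train, y_train)

y_pred = regressor.predict(X_test)    # continuous predictions
r2 = regressor.score(X_test, y_test)  # R^2 on the held-out rows
print(y_pred[:3])
print(r2)
```

Whether a linear model fits the real data well is a separate question; the point is only that a regressor accepts float targets where a classifier does not.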

Tags: python scikit-learn regression logistic-regression


【Solution 1】:
import numpy as np
import pandas as pd
dataset = pd.read_csv("DataSet.csv")
X = dataset.iloc[:, [0,1,4,6,7,8,11,12,13,14]].values
y = dataset.iloc[:, 15].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
labelEncoder_X_delay_1 = LabelEncoder()
X[:, 1] = labelEncoder_X_delay_1.fit_transform(X[:, 1])
labelEncoder_X_delay_2 = LabelEncoder()
X[:, 2] = labelEncoder_X_delay_2.fit_transform(X[:, 2])
# normalizing the input: divide each feature column by its maximum
X = X / np.amax(X, axis=0)
# defining the network: relu activations in the hidden layers, linear output
# (assuming Keras; the post does not show where Sequential and Dense come from)
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(6, input_dim=X.shape[1], activation="relu"))  # one input per selected column
model.add(Dense(6, activation="relu"))
model.add(Dense(6, activation="relu"))
model.add(Dense(1))

# splitting the dataset into the training set and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
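# training the network as a regressor; LogisticRegression accepts only class labels
# (optimizer, loss, epochs and batch_size are assumed values, not from the original post)
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(X_train, y_train, epochs=100, batch_size=32)
y_pred = model.predict(X_test)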
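
The network defined in this answer has three hidden layers of six relu units and a single linear output, which is a regression architecture. For readers who prefer to stay within scikit-learn, a roughly comparable sketch uses `MLPRegressor`. This is not the Keras model from the answer; the synthetic data, the `max_iter` value and the names `X_demo`, `y_demo` and `mlp` are assumptions made for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dataset: 10 float features, continuous target.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 10))
y_demo = X_demo.sum(axis=1) + rng.normal(0.0, 0.05, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Three hidden layers of 6 relu units; MLPRegressor's output is linear.
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(6, 6, 6), activation="relu",
                 max_iter=2000, random_state=0),
)
mlp.fit(X_train, y_train)

y_pred_mlp = mlp.predict(X_test)  # continuous predictions, one per test row
print(y_pred_mlp[:3])
```

On real data the layer sizes and iteration count would likely need tuning, and a convergence warning is possible with small networks.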
