【Posted】:2018-06-18 14:51:13
【Question】:
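Given the difference between one-hot encoding and dummy coding, does pandas.get_dummies perform one-hot encoding when called with its default parameters (i.e. drop_first=False)?
If so, does it make sense to remove the intercept from my logistic regression model? Here is an example: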
# Assume the dataset is already in a DataFrame X and the true labels are in y
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
X = pd.get_dummies(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.80)
clf = LogisticRegression(fit_intercept=False)
clf.fit(X_train, y_train)
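To make the two encodings concrete, here is a minimal sketch on a toy column. The `color` column and its values are hypothetical and only serve as an illustration; they are not part of the dataset above. It compares the columns produced with the default drop_first=False against those produced with drop_first=True.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the two encodings.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Default behaviour (drop_first=False): one indicator column per category.
one_hot = pd.get_dummies(df)

# drop_first=True: the first category level is dropped, leaving k - 1 columns.
dummy = pd.get_dummies(df, drop_first=True)

print(list(one_hot.columns))
print(list(dummy.columns))

# With all k indicator columns present, every row sums to exactly 1.
row_sums = one_hot.astype(int).sum(axis=1)
print(row_sums.tolist())
```

With the full set of indicator columns, each row sums to 1, so together they carry the same information as a constant column. That overlap is what the question about fit_intercept=False is getting at.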
【Discussion】:
Tags: python pandas scikit-learn