What is the difference between LinearSVC and SVC(kernel="linear")? Answers

[Question title]: What is the difference between LinearSVC and SVC(kernel="linear")?
[Posted]: 2018-01-05 03:42:31
[Question]:

I came across sklearn.svm.LinearSVC and sklearn.svm.SVC(kernel='linear'). They look very similar to me, but I get very different results on the Reuters dataset.

sklearn.svm.LinearSVC: 81.05% in   28.87s train /    9.71s test
sklearn.svm.SVC      : 33.55% in 6536.53s train / 2418.62s test

Both use a linear kernel. The default tolerance of LinearSVC is tighter (a smaller tol value) than that of SVC:

LinearSVC(C=1.0, tol=0.0001, max_iter=1000, penalty='l2', loss='squared_hinge', dual=True, multi_class='ovr', fit_intercept=True, intercept_scaling=1)
SVC      (C=1.0, tol=0.001,    max_iter=-1, shrinking=True, probability=False, cache_size=200, decision_function_shape=None)

How do the two differ? Even when I set kernel='linear', tol=0.0001, max_iter=1000 and decision_function_shape='ovr', SVC takes much longer than LinearSVC. Why?

I use sklearn 0.18, and both are wrapped in a OneVsRestClassifier. I am not sure whether that is the same as multi_class='ovr' / decision_function_shape='ovr'.

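[Comments]:

Could you upgrade to 0.18.2 and check whether the results still differ?
I don't believe the version is the cause. The sklearn documentation includes examples that fit these classifiers. The results differ because of the methods the models use.
There have already been some discussions about this; maybe check these: stackoverflow.com/questions/33843981/… and stackoverflow.com/questions/35076586/…
The documentation states that they use different implementations; even calling the methods through sklearn versus accessing the low-level implementations directly results in different scores.

Tags: scikit-learn svm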

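[Answer 1]:

Indeed, LinearSVC and SVC(kernel='linear') yield different results, i.e. different metric scores and decision boundaries, because they use different approaches. The toy example below demonstrates it: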

from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC, SVC

X, y = load_iris(return_X_y=True)

clf_1 = LinearSVC().fit(X, y)  # possible to state loss='hinge'
clf_2 = SVC(kernel='linear').fit(X, y)

score_1 = clf_1.score(X, y)
score_2 = clf_2.score(X, y)

print('LinearSVC score %s' % score_1)
print('SVC score %s' % score_2)

--------------------------
>>>    0.96666666666666667
>>>    0.98666666666666669

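The key reasons for this difference are the following:

By default, LinearSVC minimizes the squared hinge loss, while SVC minimizes the regular hinge loss. You can pass loss='hinge' to LinearSVC to make it use the regular hinge loss.
LinearSVC uses the One-vs-All (also known as One-vs-Rest) multiclass reduction, while SVC uses the One-vs-One multiclass reduction. The scikit-learn documentation notes this as well. Consequently, for a multiclass problem SVC fits N * (N - 1) / 2 models, where N is the number of classes, whereas LinearSVC fits only N models. If the problem is binary, a single model is fit in both cases. The multi_class and decision_function_shape parameters have nothing in common. The latter is only an aggregator that reshapes the output of the decision function into a convenient (n_samples, n_classes) array; multi_class selects the algorithmic approach used to build the solution.
The underlying estimator of LinearSVC is liblinear, which does in fact penalize the intercept. SVC uses the libsvm estimator, which does not. liblinear is optimized for the linear (special) case and therefore converges faster than libsvm on large amounts of data. That is why LinearSVC takes less time to solve the problem.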
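
The model counts and the role of decision_function_shape described above can be checked on a small example. This is a minimal sketch on a synthetic 4-class dataset (the data, sizes and variable names are assumptions for illustration, not the asker's Reuters setup); it only inspects array shapes, not accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

# Synthetic 4-class problem (illustrative assumption, not the asker's data).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_redundant=0,
    n_classes=4, random_state=0,
)
n_classes = len(np.unique(y))               # N = 4
n_pairs = n_classes * (n_classes - 1) // 2  # N * (N - 1) / 2 = 6

# LinearSVC trains one binary model per class (one-vs-rest).
linear_svc = LinearSVC(dual=True, max_iter=10000, random_state=0).fit(X, y)

# SVC trains one binary model per pair of classes (one-vs-one),
# whatever decision_function_shape is set to.
svc_ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
svc_ovr = SVC(kernel="linear", decision_function_shape="ovr").fit(X, y)

print(linear_svc.coef_.shape)              # (N, n_features)
print(svc_ovo.coef_.shape)                 # (N * (N - 1) / 2, n_features)
print(svc_ovr.coef_.shape)                 # same as 'ovo': the fitted model is unchanged
print(svc_ovo.decision_function(X).shape)  # (n_samples, N * (N - 1) / 2)
print(svc_ovr.decision_function(X).shape)  # (n_samples, N)
```

Only the shape of the decision_function output changes between the two SVC instances; the number of fitted weight vectors stays at N * (N - 1) / 2 in both.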

In fact, LinearSVC is not actually linear after the intercept scaling, as was stated in the comments section.
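
The effect of intercept regularization can be observed through the intercept_scaling parameter of LinearSVC. According to the LinearSVC documentation, liblinear appends a constant synthetic feature equal to intercept_scaling and regularizes its weight like any other, so a larger value weakens the penalty on the intercept. The sketch below uses synthetic blobs placed far from the origin (an assumption chosen so that a large intercept is needed); the exact numbers it prints depend on the data and the library version.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Two blobs far from the origin, so the separating hyperplane needs a
# large intercept (synthetic data, for illustration only).
X, y = make_blobs(
    n_samples=200, centers=[[8.0, 8.0], [12.0, 12.0]],
    cluster_std=1.0, random_state=0,
)

intercepts = {}
for scaling in (1.0, 100.0):
    # Larger intercept_scaling means weaker regularization of the intercept.
    clf = LinearSVC(dual=True, intercept_scaling=scaling,
                    max_iter=100000, random_state=0)
    clf.fit(X, y)
    intercepts[scaling] = float(clf.intercept_[0])
    print(scaling, clf.coef_.ravel(), clf.intercept_, clf.score(X, y))
```

If the fitted intercepts differ noticeably between the two settings, that is consistent with the intercept being part of the regularized weight vector in liblinear.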

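[Discussion]:

In the official scikit-learn documentation, the mathematical formulation does not seem to indicate that the intercept is penalized. Or am I misunderstanding something?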

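[Answer 2]:

The main difference between them is that LinearSVC only lets you choose a linear classifier, whereas SVC lets you choose from a variety of non-linear classifiers. However, SVC is not recommended for non-linear problems, because it is very slow. Try other libraries for non-linear classification.

Now, even after setting kernel='linear' we do not get the same output, because LinearSVC and SVC take different approaches to the underlying math. In addition, LinearSVC works on the one-vs-rest principle, while SVC works on the one-vs-one principle.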
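
If the goal is to compare the two estimators under the same multiclass strategy, one option is to wrap them with the meta-estimators from sklearn.multiclass. The sketch below is an illustration on synthetic data (dataset and variable names are assumptions); it only shows how many binary models each wrapper fits, and it does not make the two solvers or their losses identical.

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC, SVC

# Synthetic 4-class problem (illustrative assumption).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_redundant=0,
    n_classes=4, random_state=0,
)

# Wrapping SVC in OneVsRestClassifier replaces its built-in one-vs-one
# scheme with N binary SVC models, one per class.
ovr_svc = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

# Wrapping LinearSVC in OneVsOneClassifier fits one binary LinearSVC
# per pair of classes, i.e. N * (N - 1) / 2 models.
ovo_linear = OneVsOneClassifier(
    LinearSVC(dual=True, max_iter=10000, random_state=0)
).fit(X, y)

print(len(ovr_svc.estimators_))     # 4
print(len(ovo_linear.estimators_))  # 6
```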

I hope this answers your question.

[Discussion]: