I believe the sklearn library will help here. According to the scikit-learn documentation (https://scikit-learn.org/stable/modules/svm.html), the sklearn.svm.SVC class is "capable of performing binary and multi-class classification on a dataset."
The labels can actually be any set of integers, as long as they are distinct (e.g. {-1, 1, 2}, {0, 1, 2}, and {1, 2, 3} are all valid). That said, I think using {0, 1, 2, ..., N} for your label assignments is best practice.
Take a look at the code example below:
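To see the "any distinct integers" claim in action, here is a minimal sketch that fits an SVC on a nonstandard label set like {-1, 1, 2} (the toy clusters and their locations are my own choice for illustration):

```python
import numpy as np
from sklearn.svm import SVC

# Tiny toy dataset: three well-separated clusters of 10 points each
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.1, size=(10, 2)),
    rng.normal(loc=[-5, 5], scale=0.1, size=(10, 2)),
    rng.normal(loc=[5, -5], scale=0.1, size=(10, 2)),
])
# Labels drawn from a nonstandard integer set {-1, 1, 2}
y = np.hstack([-1 * np.ones(10), 1 * np.ones(10), 2 * np.ones(10)])

clf = SVC(kernel='linear').fit(X, y)
print(clf.classes_)           # the distinct labels SVC discovered
print(clf.predict([[0, 0]]))  # prediction comes back in the original label set
```

SVC records the label set it saw during fit in clf.classes_ and always predicts values from that set, so no relabeling to {0, 1, 2} is required.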
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
N = 1000 # Number of samples
# Create synthetic dataset
X1 = np.random.normal(loc=[0, 0], scale=1, size=(N, 2))
Y1 = 0 * np.ones(shape=(N,))  # LABEL = 0
X2 = np.random.normal(loc=[-5, 5], scale=1, size=(N, 2))
Y2 = 1 * np.ones(shape=(N,))  # LABEL = 1
X3 = np.random.normal(loc=[5, -5], scale=1, size=(N, 2))
Y3 = 2 * np.ones(shape=(N,))  # LABEL = 2
# Create stacked dataset
X = np.vstack((X1, X2, X3))
Y = np.hstack((Y1, Y2, Y3))
# TRAIN SVM LEARNING ALGORITHM
clf = SVC(kernel='linear')
clf = clf.fit(X, Y)
# create decision boundary plot
xx, yy = np.meshgrid(
np.arange(-10, 10, 0.2),
np.arange(-10, 10, 0.2))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
# PLOT EVERYTHING
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)  # decision regions first
plt.scatter(X1[:, 0], X1[:, 1], color='r')  # then overlay the data points
plt.scatter(X2[:, 0], X2[:, 1], color='b')
plt.scatter(X3[:, 0], X3[:, 1], color='y')
plt.title("SVM With Linear Kernel and Three Labels (0, 1, 2)")
plt.show()
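If you want a quantitative check rather than just the plot, you can hold out part of the data and score the fitted model. A quick sketch (the split ratio and random seeds are arbitrary choices on my part):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same three-cluster setup as above, just smaller
rng = np.random.default_rng(42)
N = 200
X = np.vstack([
    rng.normal(loc=[0, 0], scale=1, size=(N, 2)),
    rng.normal(loc=[-5, 5], scale=1, size=(N, 2)),
    rng.normal(loc=[5, -5], scale=1, size=(N, 2)),
])
y = np.hstack([np.zeros(N), np.ones(N), 2 * np.ones(N)])

# Hold out 25% of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
clf = SVC(kernel='linear').fit(X_train, y_train)
acc = clf.score(X_test, y_test)  # mean accuracy on the held-out set
print(f"held-out accuracy: {acc:.3f}")
```

With clusters this well separated, a linear kernel should classify the held-out points almost perfectly.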
Hope this helps!