Iterate through rows of numpy array to find mode

【Question title】: Iterate through rows of numpy array to find mode
【Posted】: 2017-03-22 16:18:26
【Question description】:

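I am trying to create a decision tree classifier function that builds an ensemble of decision trees and makes a final prediction based on the majority vote across all of the trees. My approach is to build a matrix that holds each tree's predictions in a separate column, and then, for every row (one row per data point), find the modal value to use as the final prediction for that data point.

So far my function is: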

import numpy as np
from scipy.stats import mode
from sklearn import tree

def majority_classify(x_train, y_train, x_test, y_test, num_samples):

    n = x_train.shape[0]
    c = len(np.unique(y_train))

    votes = np.zeros((n, c))
    predictions_train = np.empty((n, num_samples+1))
    predictions_test = np.empty((n, num_samples))

    for i in range(0, num_samples):
        # Randomly sample n points from the training set, with replacement
        indices = np.random.choice(np.arange(0, n), size=n)

        x_train_sample = x_train[indices, :]
        y_train_sample = y_train[indices]

        # Fit each tree on its own bootstrap sample
        dt_major = tree.DecisionTreeClassifier(max_depth = 2)
        model_major = dt_major.fit(x_train_sample, y_train_sample)

        # Column i holds the predictions of tree i
        predictions_train[:,i] = model_major.predict(x_train)

    # Attempt at the majority vote, one row at a time
    for r in predictions_train:
        predict_train = mode(r)[0][0]

However, the problem I am having is how to iterate through each row and find the mode. Any suggestions?

Thanks!

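【Comments】:

The documentation is a good place to start. You should include a minimal example in your question, along with the desired result.
I want to iterate over each row as a unit, rather than over the items within each row. I don't think I saw how to do that in that documentation.
docs.scipy.org/doc/numpy/user/…
Can you use any package, or are you restricted?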

Tags: python numpy

【Solution 1】:

Take a look at scipy.stats.mode:

import numpy as np
from scipy.stats import mode

>>> a = np.array([[1,1,0],[1,2,2],[2,0,0]])
>>> mode(a, axis=1)[0]
array([[1],
       [2],
       [0]])
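As far as I know, the shape of this result depends on the installed SciPy version: older releases return a column like the one above, while newer ones may return a flat 1-D array. A minimal sketch that should behave the same either way, assuming a small made-up `predictions` matrix (one row per data point, one column per tree), is to flatten the result with `np.ravel`:

```python
import numpy as np
from scipy.stats import mode

# Hypothetical prediction matrix: one row per data point, one column per tree
predictions = np.array([[1, 1, 0],
                        [1, 2, 2],
                        [2, 0, 0]])

# Row-wise mode; np.ravel flattens the result whether SciPy
# hands back shape (3, 1) or shape (3,)
majority = np.ravel(mode(predictions, axis=1)[0])
print(majority)  # [1 2 0]
```

This yields one majority-vote label per row without an explicit Python loop.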

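【Discussion】:

【Solution 2】:

Use np.unique with the return_counts parameter.
Use argmax on the counts array to pick the corresponding value from the unique array.
Use np.apply_along_axis with a custom function mode.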

def mode(a):
    u, c = np.unique(a, return_counts=True)
    return u[c.argmax()]

a = np.array([
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [2, 5, 6],
        [4, 1, 7],
        [5, 4, 8],
        [6, 6, 3]
    ])

np.apply_along_axis(mode, 0, a)

array([2, 4, 3])
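Note that axis 0 gives the mode of each column. The question asks for the mode of each row, which the same approach handles by passing axis 1 instead. A minimal sketch on the same array follows; the helper name `row_mode` is my own, chosen to avoid shadowing scipy.stats.mode:

```python
import numpy as np

def row_mode(values):
    # np.unique returns the sorted distinct values and how often each occurs
    uniques, counts = np.unique(values, return_counts=True)
    # argmax picks the first maximum, so ties go to the smallest value
    return uniques[counts.argmax()]

a = np.array([
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [2, 5, 6],
    [4, 1, 7],
    [5, 4, 8],
    [6, 6, 3],
])

# axis=1 passes each row to row_mode, giving one value per row
per_row = np.apply_along_axis(row_mode, 1, a)
print(per_row)  # [1 2 3 2 1 4 6]
```

Most rows here have no repeated value, so the tie-breaking rule decides the result; only the last row has a true majority (6).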

【Discussion】: