How to compare two 2D arrays item by item in lexicographic order? Answers

【Question title】: How to compare two 2D arrays item by item in lexicographic way?
【Posted】: 2021-04-02 14:30:47
【Question】:

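I know how to compare two arrays by their first item, then their second item, and so on. For example, [ 2 3 10 9 6 -1] ranks higher than [ 2 3 2 10 -1 -1]. I need a vectorized way to do this for a pair of 2D arrays like the following: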

a = [[ 1  3 10  9  6 -1]
 [ 5 12  6  5  4  3]
 [ 2  9  5  6 -1 -1]
 [ 2  9  4  8 -1 -1]
 [ 1  5 12 11  9 -1]
 [ 0 12  9  6  5  3]
 [ 4  9 -1 -1 -1 -1]
 [ 1  5  9  6  2 -1]
 [ 2  9  5 12 -1 -1]
 [ 1  8 11  9  5 -1]]

versus

b = [[ 2  3  2 10 -1 -1]
 [ 1  3 12  6  4 -1]
 [ 0 10  9  7  6  5]
 [ 2  6  4 12 -1 -1]
 [ 1  6 12 11 10 -1]
 [ 1  3 12  8  6 -1]
 [ 4  9 -1 -1 -1 -1]
 [ 0 12  6  5  4  2]
 [ 0 12 10  9  6  5]
 [ 1  8 11  9  5 -1]]

How can I get the indices of the rows of the first array that win, lose, or tie? The expected output is:

{'win': [1, 2, 3, 7, 8],
 'lose': [0, 4, 5],
 'tie': [6, 9]}

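【Comments】:

Would code in JavaScript help you?
It has a numpy tag, which ties my question specifically to Python. In that case, I will edit my tags.
I already have a solution; I am sharing it voluntarily.

Tags: python arrays algorithm numpy comparison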

【Solution 1】:

Fortunately, I was able to implement a method that works as expected:

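import numpy as np

def lexcompare(a, b):
    diff = a - b
    # Coordinates of every mismatch, ordered by row and then by column.
    rows, cols = np.where(diff != 0)
    # Keep only the first mismatch of each row.
    idx = np.r_[True, np.diff(rows).astype(bool)]
    # A positive difference at the first mismatch means the row of a wins.
    checks = diff[rows[idx], cols[idx]] > 0
    wins, loses = rows[idx][checks], rows[idx][~checks]
    # Rows without any mismatch are ties.
    tie = np.setdiff1d(np.arange(len(diff)), rows[idx])
    return {'win': wins, 'lose': loses, 'tie': tie}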

Output:

{'win': array([1, 2, 3, 7, 8], dtype=int64),
 'lose': array([0, 4, 5], dtype=int64),
 'tie': array([6, 9])}
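
The same idea (find the first column where two rows differ, then look at the sign of the difference there) can also be sketched with argmax instead of np.where. This is only an illustration, not part of the original answer: the name lexcompare_argmax is hypothetical, and it assumes two signed-integer arrays of the same 2D shape with at least one column.

```python
import numpy as np


def lexcompare_argmax(a, b):
    """Hypothetical variant; assumes signed-integer arrays of equal 2D shape."""
    diff = a - b
    nonzero = diff != 0
    # argmax on a boolean row gives the index of the first True (0 if there is none).
    first = nonzero.argmax(axis=1)
    # Difference at the first mismatching column; stays 0 for identical rows.
    lead = diff[np.arange(len(diff)), first]
    return {
        'win': np.flatnonzero(lead > 0),
        'lose': np.flatnonzero(lead < 0),
        'tie': np.flatnonzero(lead == 0),
    }


a = np.array([[1, 3, 10], [5, 12, 6], [4, 9, -1]])
b = np.array([[2, 3, 2], [1, 3, 12], [4, 9, -1]])
result = lexcompare_argmax(a, b)
print(result)  # row 0 loses, row 1 wins, row 2 ties
```

Because identical rows simply end up with a leading difference of 0, this sketch also returns a result when no row differs at all.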

【Comments】:

Why write something this complicated when a linear pass over each row would do?
Then write your own solution ;)

【Solution 2】:

Even if it may not be very numpy-esque, a simple manual row-by-row comparison does the trick:

import numpy as np


def lex_comp(a: np.ndarray, b: np.ndarray) -> dict:
    d = {'win': [], 'tie': [], 'lose': []}
    
    for i, (a_i, b_i) in enumerate(zip(a, b)):
        for a_ij, b_ij in zip(a_i, b_i):
            if a_ij > b_ij:
                d['win'].append(i)
                break
            if b_ij > a_ij:
                d['lose'].append(i)
                break
        else:
            d['tie'].append(i)
    
    return d


a = np.asarray([
    [1, 3, 10, 9, 6, -1],
    [5, 12, 6, 5, 4, 3],
    [2, 9, 5, 6, -1, -1],
    [2, 9, 4, 8, -1, -1],
    [1, 5, 12, 11, 9, -1],
    [0, 12, 9, 6, 5, 3],
    [4, 9, -1, -1, -1, -1],
    [1, 5, 9, 6, 2, -1],
    [2, 9, 5, 12, -1, -1],
    [1, 8, 11, 9, 5, -1]
])

b = np.asarray([
    [2, 3, 2, 10, -1, -1],
    [1, 3, 12, 6, 4, -1],
    [0, 10, 9, 7, 6, 5],
    [2, 6, 4, 12, -1, -1],
    [1, 6, 12, 11, 10, -1],
    [1, 3, 12, 8, 6, -1],
    [4, 9, -1, -1, -1, -1],
    [0, 12, 6, 5, 4, 2],
    [0, 12, 10, 9, 6, 5],
    [1, 8, 11, 9, 5, -1]
])
    
results = lex_comp(a, b) # {'win': [1, 2, 3, 7, 8], 'tie': [6, 9], 'lose': [0, 4, 5]}
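
Python lists already compare lexicographically, so the inner loop could also be replaced by comparing each row as a plain list. A minimal sketch, assuming both inputs have the same shape; lex_comp_lists is a hypothetical name, not something from the answer above.

```python
import numpy as np


def lex_comp_lists(a, b):
    """Hypothetical variant of lex_comp that relies on built-in list comparison."""
    d = {'win': [], 'tie': [], 'lose': []}
    rows_a = np.asarray(a).tolist()
    rows_b = np.asarray(b).tolist()
    # Plain Python lists compare item by item, stopping at the first difference.
    for i, (row_a, row_b) in enumerate(zip(rows_a, rows_b)):
        if row_a > row_b:
            d['win'].append(i)
        elif row_a < row_b:
            d['lose'].append(i)
        else:
            d['tie'].append(i)
    return d


demo = lex_comp_lists(
    [[2, 3, 10, 9, 6, -1], [4, 9, -1, -1, -1, -1]],
    [[2, 3, 2, 10, -1, -1], [4, 9, -1, -1, -1, -1]],
)
print(demo)  # {'win': [0], 'tie': [1], 'lose': []}
```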

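【Comments】:

lex_comp is 4 times slower than lexcompare. That is why I was looking for a vectorized solution (for loops are slow).
That depends on the dimensions (N, K) of the input matrices, and I tested this as well. If K is large, lex_comp performs better; if N is large, lexcompare performs better.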
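
The trade-off described in the last comment can be checked with a small timing harness. This is only a sketch: the two functions are copied from the solutions above, while the helper name time_both, the shapes, and the random value range are assumptions, and the measured times will vary by machine.

```python
import timeit

import numpy as np


def lexcompare(a, b):
    diff = a - b
    rows, cols = np.where(diff != 0)
    idx = np.r_[True, np.diff(rows).astype(bool)]
    checks = diff[rows[idx], cols[idx]] > 0
    wins, loses = rows[idx][checks], rows[idx][~checks]
    tie = np.setdiff1d(np.arange(len(diff)), rows[idx])
    return {'win': wins, 'lose': loses, 'tie': tie}


def lex_comp(a, b):
    d = {'win': [], 'tie': [], 'lose': []}
    for i, (a_i, b_i) in enumerate(zip(a, b)):
        for a_ij, b_ij in zip(a_i, b_i):
            if a_ij > b_ij:
                d['win'].append(i)
                break
            if b_ij > a_ij:
                d['lose'].append(i)
                break
        else:
            d['tie'].append(i)
    return d


def time_both(n, k, repeats=3, seed=0):
    """Hypothetical helper: best wall time per function on random (n, k) inputs."""
    rng = np.random.default_rng(seed)
    # Assumed value range, chosen to resemble the sample data (-1 to 12).
    a = rng.integers(-1, 13, size=(n, k))
    b = rng.integers(-1, 13, size=(n, k))
    return {
        f.__name__: min(timeit.repeat(lambda: f(a, b), number=1, repeat=repeats))
        for f in (lexcompare, lex_comp)
    }


# Many short rows (large N) versus a few long rows (large K).
for shape in [(2000, 6), (6, 2000)]:
    print(shape, time_both(*shape))
```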