Enumerate all possible decision rules

【Question title】: Enumerate all possible decision rules
【Posted】: 2017-06-21 01:44:17
【Question description】:

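I have m input variables I_1, ..., I_m on which to base a decision. Each variable can take n possible values. The decision outcome D is binary.

A decision rule R is a mapping from the set D x I_1 x ... x I_m to the set {0, 1}, such that for any (i_1, ..., i_m) in I_1 x ... x I_m it holds that 1 = sum_(d in D) R(d, i_1, ..., i_m). That is: for any combination of input values, exactly one decision is possible.

For example, without any input variables you have two decision rules: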

D   R1   R2
a    0    1
b    1    0

That is, rule R1 chooses decision b and rule R2 chooses a.

With one binary input variable I you have four possible decision rules:

I   D   R1   R2   R3   R4
0   a    0    0    1    1
0   b    1    1    0    0
1   a    0    1    0    1
1   b    1    0    1    0

That is, decision rule R2 chooses b if the input is 0 and a if the input is 1.
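To make the reading of a rule column concrete, here is a minimal sketch that looks up the decision R2 selects for each input value. The helper `decide` and the variable `labels` are hypothetical names introduced only for this illustration; the row order is the one used in the table above.

```matlab
% Rule R2 from the table above, rows ordered (I=0,a), (I=0,b), (I=1,a), (I=1,b).
R2 = [0; 1; 1; 0];
labels = 'ab';   % decision labels, in the row order used by the table

% Hypothetical helper: pick the label whose entry is 1 for input value I.
decide = @(R, I) labels(logical(R(2*I + (1:2))));

d0 = decide(R2, 0);   % decision chosen for I = 0
d1 = decide(R2, 1);   % decision chosen for I = 1
fprintf('I = 0 -> %s, I = 1 -> %s\n', d0, d1);
```

Because exactly one of the two entries per input value is 1, the logical index always selects a single label.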

With two binary input variables I and K you have 16 possible decision rules:

I    K    D    R1   R2   R3   R4   R5   R6   R7   R8   R9   R10  R11  R12  R13  R14  R15  R16
0    0    a    0    0    0    0    0    0    0    0    1    1    1    1    1    1    1    1
0    0    b    1    1    1    1    1    1    1    1    0    0    0    0    0    0    0    0
1    0    a    0    0    0    0    1    1    1    1    0    0    0    0    1    1    1    1
1    0    b    1    1    1    1    0    0    0    0    1    1    1    1    0    0    0    0
0    1    a    0    0    1    1    0    0    1    1    0    0    1    1    0    0    1    1
0    1    b    1    1    0    0    1    1    0    0    1    1    0    0    1    1    0    0
1    1    a    0    1    0    1    0    1    0    1    0    1    0    1    0    1    0    1
1    1    b    1    0    1    0    1    0    1    0    1    0    1    0    1    0    1    0
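The counts 2, 4 and 16 in the three examples follow from a simple argument: there are n^m input combinations, and each one independently picks one of the two decisions. A small sketch of that count, where `numRules` is a hypothetical helper name and not part of the question:

```matlab
% Hypothetical helper: number of decision rules for m inputs with n values
% each and a binary decision. There are n^m input combinations, and each one
% independently picks one of the 2 decisions.
numRules = @(m, n) 2^(n^m);

% Binary inputs (n = 2) with m = 0, 1 and 2 input variables.
counts = arrayfun(@(m) numRules(m, 2), 0:2);
disp(counts);
```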

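My question is: how can I enumerate all possible decision rules for an arbitrary set of input variables?

Disclaimer: this is part of an assignment. However, the assignment was limited to the case of one binary input variable, so that all four cases could simply be listed. I have passed this part of the assignment (in fact, no enumeration was needed at all), but I am interested in a general solution in MATLAB.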

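【Question comments】:

Could you clarify what you mean by "enumerate all possible decision rules"? Do you mean getting the total number of possible combinations, such as "16 possible decision rules"?
@Cebri Yes, I would like to have all 16 instances D1 to D16, i.e. I need 16 vectors. Keep in mind that for any combination of input variables only one decision is possible. In the example above, for I == 0, Dx can be either a or b, but not both.
Understood. By the way, correct me if I'm wrong, but I think there are 8 possible decision rules for two binary input variables, not 16.
@Cebri I added all 16 decision rules D1 ... D16. You have 8 different combinations of [I1, I2, D] (2^3), but you have 16 decision rules (2^(2^2)).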

Tags: matlab matrix decision-tree markov-chains bayesian-networks

【Solution 1】:

How can I enumerate all possible decision rules for an arbitrary set of input variables?

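First, by analysing and understanding the repeating patterns that become visible when we write down the binary permutations of the decision rules (R) for a given number (n) of input variables (V). Then, by building a set of functions that automatically generate those permutations and display a table with the results, just as if you had written it by hand. Note that in this answer n is the number of binary input variables, i.e. what the question calls m with two values per variable.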

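In terms of code, there are many different valid ways to solve this problem, but from my point of view a logical matrix is a good approach. I will call this matrix (M). The matrix has three sections (like the tables in your description):

Left: n input variable columns (V)
Center: 1 decision column (D)
Right: 2^(2^n) decision rule columns (R)

Since your problem has two decisions (A and B), we can treat them as logical values too: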

A = 0
B = 1

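Note: I chose these values for A and B, rather than the reverse, because this lets us generate the binary permutations (which I will call "states") of the input variables (V) and the decision (D) using natural binary counting.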
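As a quick illustration of "natural binary counting", the following sketch builds the state columns (V and D) for n = 1 by counting from 0 to 2^(n + 1) - 1 in binary. The variable names are chosen for this example only.

```matlab
n = 1;                                  % number of binary input variables (example value)
d = n + 1;                              % columns: n inputs (V) plus the decision (D)
% One row per state, counting up in binary; subtracting '0' turns the
% '0'/'1' characters returned by dec2bin into numeric 0/1.
states = dec2bin(0:2^d - 1, d) - '0';
disp(states);
```

With A = 0 and B = 1, the last column alternates A, B, A, B, which is exactly the decision column of the matrices shown next.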

For n = 0, M looks like:

0   0   1
1   1   0

For n = 1, M looks like:

0   0   0   0   1   1
0   1   1   1   0   0
1   0   0   1   0   1
1   1   1   0   1   0

For n = 2, M looks like:

0   0   0   0   0   0   0   0   0   0   0   1   1   1   1   1   1   1   1
0   0   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
0   1   0   0   0   0   0   1   1   1   1   0   0   0   0   1   1   1   1
0   1   1   1   1   1   1   0   0   0   0   1   1   1   1   0   0   0   0
1   0   0   0   0   1   1   0   0   1   1   0   0   1   1   0   0   1   1
1   0   1   1   1   0   0   1   1   0   0   1   1   0   0   1   1   0   0
1   1   0   0   1   0   1   0   1   0   1   0   1   0   1   0   1   0   1
1   1   1   1   0   1   0   1   0   1   0   1   0   1   0   1   0   1   0

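As you said, the size of M grows very quickly:

The number of rows ("states") is 2^(n + 1).
The number of columns is (n + 1) + 2^(2^n): n input variable columns + 1 decision column (D) + 2^(2^n) decision rule columns (R).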
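To get a feel for these two formulas, here is a short sketch that evaluates them for n = 0 to 4. The variable names are illustrative only.

```matlab
nValues = 0:4;
rows = 2.^(nValues + 1);                 % number of states
ruleCols = 2.^(2.^nValues);              % number of decision rule columns (R)
totalCols = (nValues + 1) + ruleCols;    % V columns + D column + R columns
for k = 1:numel(nValues)
    fprintf('n = %d: %d rows, %d columns (%d of them decision rules)\n', ...
        nValues(k), rows(k), totalCols(k), ruleCols(k));
end
```

The row count only doubles with each extra input variable, while the number of rule columns is squared.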

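From the matrices above we can hardly make out any repeating pattern, but if we use colors we can clearly see some patterns in the decision rules (R) region. (The original answer showed one colored image each for n = 0, n = 1 and n = 2; the images are not reproduced here.)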

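We can see that each pair of rows consists of side-by-side copies of the same "unit pattern" (the boxed digits). Each "unit pattern" is 2 rows tall and 2^(2^n)/k columns wide, where k is the number of times the pattern is repeated in that pair of rows. The first pattern in M is always a single copy (k = 1), and k doubles every 2 rows.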
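The pattern described above can be sketched directly, without the helper functions introduced later. This is only an illustration for n = 2 under the same conventions (A = 0, B = 1); the variable names `half`, `top` and `unit` are made up for this example.

```matlab
n = 2;
b = 2^(2^n);                % number of decision rule columns (16 for n = 2)
R = zeros(2^(n + 1), b);    % decision rules region of M
k = 1;                      % number of copies of the unit pattern in this row pair
for r = 1:2:2^(n + 1)
    half = b / (2*k);                        % half the width of one unit pattern
    top = [zeros(1, half), ones(1, half)];   % row for decision A
    unit = [top; 1 - top];                   % row for decision B is the complement
    R(r:r + 1, :) = repmat(unit, 1, k);      % k side-by-side copies
    k = 2*k;                                 % copies double every two rows
end
disp(R);
```

The result should match the 16 rightmost columns of the n = 2 matrix shown earlier.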

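We will use all this information to create a set of functions that let us enumerate all possible decision rules in what I will call a table (T).

I wrote a function called CalcParams that computes all the necessary parameters of the problem (such as the number of rows and columns of M) from n: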

function[a, b, c, d, e] = CalcParams(n)
% Calculate necessary parameters.
% Inputs:
% n - number of input variables.

% Number of states (rows).
a = 2^(n + 1);
% Number of decision rules (R) (decision rules columns).
b = 2^(2^n);
% Column index of first decision rule (R1).
c = n + 2;
% Number of columns of input variables (V) and decision (D).
d = n + 1;
% Total number of columns.
e = d + b;
end

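Then I wrote a function called ValidDecRules which, given n and M, checks whether the decision rules meet the requirement:

For any combination of input variables only one decision is possible.

If the decision rules meet the requirement, the function returns 1 and displays the message VALID decision rules; otherwise it returns 0 and displays the message INVALID decision rules.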

function[val] = ValidDecRules(n, M)
% This function checks if the input decision rules meet the requirement:
% For any combination of input variables only one decision is possible.
% Inputs:
% n - number of input variables.
% M - binary matrix.

% Calculate necessary parameters.
[~, ~, c, ~, e] = CalcParams(n);

% Invalid decision rules by default.
val = 0;
% Extract odd rows from decision rules (R).
M_odd = M(1:2:end, c:e);
% Extract even rows from decision rules (R).
M_even = M(2:2:end, c:e);

% Check that all elements of the odd rows are different than the elements
% of the even rows.
if(all(all(M_odd ~= M_even, 1), 2))
    % Valid decision rules.
    val = 1;
    disp('VALID decision rules');
else
    % Invalid decision rules.
    disp('INVALID decision rules');
end

end
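The same check can be tried on a small hand-written example. The sketch below is not part of the answer's functions: `isValid` is a hypothetical one-line helper that applies the odd-row versus even-row comparison to a rules block, and `R_bad` is a deliberately broken copy.

```matlab
% Rule block for n = 1 (rows alternate between decision A and decision B).
R_ok = [0 0 1 1; 1 1 0 0; 0 1 0 1; 1 0 1 0];
% Same block with one entry changed, so that the first rule selects
% neither A nor B for input 0.
R_bad = R_ok;
R_bad(2, 1) = 0;

% Hypothetical helper: true when every A/B row pair is complementary.
isValid = @(R) all(all(xor(R(1:2:end, :), R(2:2:end, :))));

fprintf('R_ok valid: %d, R_bad valid: %d\n', isValid(R_ok), isValid(R_bad));
```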

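Then I wrote a function called GenM that generates the binary matrix M from n. If you pass the optional argument 'plot', it plots the decision rules of M using imagesc.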

function[M] = GenM(n, varargin)
% This function generates the binary matrix M.
% Inputs:
% n - number of input variables.
% Options:
% 'plot' - plot decision rules of M.

% Calculate necessary parameters.
[a, b, c, d, e] = CalcParams(n);

% Anonymous functions.
f1 = @(v, k) uint8(repmat(v, 1, k));
f2 = @(v, k) f1([v; ~v], k);
f3 = @(b, k) f2([false(1, b/(2*k)), ~false(1, b/(2*k))], k);

% Binary permutations of input variables (V) and decision (D).
Dec = 0:a-1; % Array: decimal representation of every state.
Bin = dec2bin(Dec, d); % Char array: d-digit binary representation of every state.

% Preallocate the full matrix M (all states x all columns).
M = zeros(a, e);

% Write binary states of input variables (V) and decision (D) in matrix M.
% Subtracting '0' converts the '0'/'1' characters to numeric 0/1
% (this avoids calling str2num once per column).
M(:, 1:d) = Bin - '0';

% Loop: decision rules.
% Writes binary permutations of decision rules (R) in matrix (M).
% Start with k = 1.
k = 1;
for i = 1:2:a
    M(i:(i + 1), c:e) = f3(b, k);
    k = k*2;
end

% Continue only if decision rules (R) are valid.
if(ValidDecRules(n, M))
    % Plot decision rules if 'plot' option is used.
    if(~isempty(varargin))
        if(any(strcmp(varargin, 'plot')))
            % Visualize decision rules as image.
            imagesc(M(:, c:e));
            title('Decision Rules (R)');
            colormap summer;
            axis off;
        end
    end
else
    % If decision rules are invalid, return empty output.
    M = [];
end

end

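Finally, I wrote a function called EnumDecRules that takes n and generates a table T very similar to the ones in your question. The function also returns the binary matrix M that was used to generate T. If you pass the optional argument 'plot', it plots the decision rules of M (like the GenM function).

EnumDecRules is the function that actually answers your question, since it has the behavior you asked for.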

function[T, M] = EnumDecRules(n, varargin)
% This function generates the table (T) with the results and also returns
% the binary matrix M that was used to generate T.
% Inputs:
% n - number of input variables.
% Options:
% 'plot' - plot decision rules of M.

% Calculate necessary parameters.
[a, ~, ~, d, e] = CalcParams(n);

% Generate the binary matrix M.
M = GenM(n, varargin{:});

if(~isempty(M))
    % Loop: variable names to display in table header.
    % Initialize indexes for numbering.
    Vi = 1; % Input variable numbering index.
    Ri = 1; % Decision rules numbering index.
    for i = 1:e
        if i <= n
            % Input variables.
            % Write V[Vi].
            Names{i} = ['V', sprintf('%d', Vi)];
            % Increase index.
            Vi = Vi + 1;
        elseif i == d
            % Decision.
            % Write D.
            Names{i} = 'D';
        elseif i > d
            % Decision rules.
            % Write R[Ri].
            Names{i} = ['R', sprintf('%d', Ri)];
            % Increase index.
            Ri = Ri + 1;
        end
    end

    % Generate table with results.
    T = array2table(M, ...
        'VariableNames', Names);

    % Modify decision column (D) of table.
    % Replace 0 with 'A'.
    % Replace 1 with 'B'.
    T.D = repmat({'A'; 'B'}, a/2, 1);
else
    % If M is empty, return empty output.
    T = [];
end

end

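Usage examples:

Make sure all the functions are saved in the same directory.

Example 1:

Call the EnumDecRules function to enumerate all possible decision rules for n = 1: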

[T, M] = EnumDecRules(1)

These are the outputs:

VALID decision rules
T = 
    V1     D     R1    R2    R3    R4
    __    ___    __    __    __    __
    0     'A'    0     0     1     1 
    0     'B'    1     1     0     0 
    1     'A'    0     1     0     1 
    1     'B'    1     0     1     0 
M =
     0     0     0     0     1     1
     0     1     1     1     0     0
     1     0     0     1     0     1
     1     1     1     0     1     0

Example 2:

Call the EnumDecRules function to enumerate all possible decision rules for n = 2 and plot the decision rules:

[T, M] = EnumDecRules(2, 'plot')

These are the outputs:

VALID decision rules
T = 
    V1    V2     D     R1    R2    R3    R4    R5    R6    R7    R8    R9    R10    R11    R12    R13    R14    R15    R16
    __    __    ___    __    __    __    __    __    __    __    __    __    ___    ___    ___    ___    ___    ___    ___
    0     0     'A'    0     0     0     0     0     0     0     0     1     1      1      1      1      1      1      1  
    0     0     'B'    1     1     1     1     1     1     1     1     0     0      0      0      0      0      0      0  
    0     1     'A'    0     0     0     0     1     1     1     1     0     0      0      0      1      1      1      1  
    0     1     'B'    1     1     1     1     0     0     0     0     1     1      1      1      0      0      0      0  
    1     0     'A'    0     0     1     1     0     0     1     1     0     0      1      1      0      0      1      1  
    1     0     'B'    1     1     0     0     1     1     0     0     1     1      0      0      1      1      0      0  
    1     1     'A'    0     1     0     1     0     1     0     1     0     1      0      1      0      1      0      1  
    1     1     'B'    1     0     1     0     1     0     1     0     1     0      1      0      1      0      1      0  
M =
  Columns 1 through 9
     0     0     0     0     0     0     0     0     0
     0     0     1     1     1     1     1     1     1
     0     1     0     0     0     0     0     1     1
     0     1     1     1     1     1     1     0     0
     1     0     0     0     0     1     1     0     0
     1     0     1     1     1     0     0     1     1
     1     1     0     0     1     0     1     0     1
     1     1     1     1     0     1     0     1     0
  Columns 10 through 18
     0     0     1     1     1     1     1     1     1
     1     1     0     0     0     0     0     0     0
     1     1     0     0     0     0     1     1     1
     0     0     1     1     1     1     0     0     0
     1     1     0     0     1     1     0     0     1
     0     0     1     1     0     0     1     1     0
     0     1     0     1     0     1     0     1     0
     1     0     1     0     1     0     1     0     1
  Column 19
     1
     0
     1
     0
     1
     0
     1
     0

And the plot (image not reproduced here).

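Because the output of this kind of algorithm grows so quickly, using EnumDecRules or GenM for n >= 5 may cause an out-of-memory error: for n = 5 the matrix would already need 2^(2^5) = 2^32 (about 4.3 billion) decision rule columns.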
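A rough estimate of the memory involved, assuming M is stored as a double array (8 bytes per element, MATLAB's default numeric type). `bytesM` is a hypothetical helper built from the row and column formulas given earlier in this answer.

```matlab
% Hypothetical helper: bytes needed for M stored as double (8 bytes per element).
% rows = 2^(n + 1), columns = (n + 1) + 2^(2^n).
bytesM = @(n) 2^(n + 1) * ((n + 1) + 2^(2^n)) * 8;
for n = 3:5
    fprintf('n = %d: about %.3g MB\n', n, bytesM(n) / 1e6);
end
```

Storing M as uint8 or logical would divide these figures by 8, which still leaves n = 5 far out of reach on ordinary hardware.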

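I really hope this helps. If you have any questions about specific parts of the code, please leave a comment and I will be happy to answer.

【Comments】:

Awesome! I was wondering whether this problem could be solved with one or two lines of vectorized code. As the core of GenM shows, that is not impossible. Thank you very much for your detailed analysis and insightful comments!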
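Regarding the vectorized variant mentioned in this comment, one possible sketch (not taken from the answer, and relying on the observation that the decision A rows are simply the numbers 0 to 2^(2^n) - 1 written in binary, one number per column) is:

```matlab
n = 2;
b = 2^(2^n);                            % number of decision rules
% Rows for decision A: column j holds j - 1 written as a 2^n-digit binary number.
RA = (dec2bin(0:b - 1, 2^n) - '0').';
R = zeros(2^(n + 1), b);
R(1:2:end, :) = RA;                     % decision A rows
R(2:2:end, :) = 1 - RA;                 % decision B rows are the complement
disp(R);
```

This reproduces only the decision rules region (the R columns); the V and D columns would still have to be prepended, and the same memory limits apply as for GenM.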