One-to-one integer mapping function: answers

【Question title】: One-to-one integer mapping function
【Posted】: 2022-04-20 01:36:56
【Question description】:

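We are using MySQL and developing an application in which we don't want the ID sequence to be publicly visible... The IDs are hardly top secret, and it is no big deal if someone does manage to decode them.

So a hash is of course the obvious solution, and we are currently using MD5: 32-bit integers go in, we trim the MD5 to 64 bits, and then store that. However, we have no idea how likely collisions are when you trim like this (especially since all the numbers come from auto-increment or the current time). We currently check for collisions, but since we may be inserting 100,000 rows at once, performance is terrible (we can't bulk insert).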
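
For a rough sense of the collision risk asked about here, the usual birthday approximation can be evaluated directly. This is only a sketch: it assumes the truncated MD5 behaves like a uniformly random 64-bit value, and the class and method names are hypothetical.

```java
// Birthday-bound estimate for collisions among n truncated hashes.
// Assumption: each truncated hash is a uniform random value of `bits` bits.
final class CollisionEstimate {
    /** Approximate probability of at least one collision among n values. */
    static double collisionProbability(double n, int bits) {
        double space = Math.pow(2.0, bits);               // number of possible hash values
        double expectedPairs = n * (n - 1) / 2.0 / space; // expected number of colliding pairs
        return -Math.expm1(-expectedPairs);               // 1 - e^(-x), stable for tiny x
    }

    public static void main(String[] args) {
        System.out.println(collisionProbability(1e5, 64));               // one batch of 100,000 rows
        System.out.println(collisionProbability(Math.pow(2.0, 32), 64)); // every 32-bit input
    }
}
```

Under that assumption a single batch of 100,000 rows collides with probability of roughly 2.7e-10, while hashing all 2^32 possible inputs gives roughly a 39% chance of at least one collision.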

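But in the end, we really don't need the security that hashes offer, and they consume unnecessary space and also require an additional index... So, is there a simple and good-enough function/algorithm that guarantees a one-to-one mapping for any number, without an obvious visual pattern for sequential numbers?

EDIT: I'm using PHP, which does not support integer arithmetic by default, but after looking around I found that it can be cheaply replicated with bitwise operators. Code for 32-bit integer multiplication can be found here: http://pastebin.com/np28xhQF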

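【Question comments】:

There are infinitely many functions that guarantee a 1:1 mapping.
@Wooble, then I guess you should be able to answer the question quite easily ;-)
The tricky part is that, now that you've asked the question on SO, any answer we provide needs to be resistant to us having given the answer here ;-)
@Wooble We're talking about 32-bit integers, so there are only (2^32)! 1:1 mappings.

Tags: math hash cryptography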

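【Solution 1】:

If it's good enough, you can simply XOR with 0xDEADBEEF.

Or multiply by an odd number mod 2^32. For the inverse mapping, just multiply by the multiplicative inverse.

Example: n = 2345678901; multiplicative inverse (mod 2^32): 2313902621. For the mapping, just multiply by 2345678901 (mod 2^32):

1 --> 2345678901
2 --> 396390506

For the inverse mapping, multiply by 2313902621.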
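
Both suggestions fit in a few lines of Java, where `int` multiplication already wraps modulo 2^32. The sketch below uses the constants from this answer; the class and method names are hypothetical, and the inverse is computed with a standard Newton iteration rather than hard-coded.

```java
// Sketch of the two mappings from this answer on 32-bit values.
// Java int arithmetic wraps modulo 2^32, so no explicit reduction is needed.
final class IdScrambler {
    static final int XOR_MASK = 0xDEADBEEF;
    static final int MULTIPLIER = (int) 2345678901L;          // odd constant from the example
    static final int INVERSE = inverseMod2Pow32(MULTIPLIER);  // 2313902621 as an unsigned value

    /** Multiplicative inverse of an odd number modulo 2^32 (Newton iteration). */
    static int inverseMod2Pow32(int a) {
        if ((a & 1) == 0) {
            throw new IllegalArgumentException("only odd numbers are invertible mod 2^32");
        }
        int x = a;                 // a * a = 1 (mod 8), so x is correct to 3 low bits
        for (int i = 0; i < 5; i++) {
            x *= 2 - a * x;        // each step doubles the number of correct low bits
        }
        return x;
    }

    static int xorMap(int id) { return id ^ XOR_MASK; }      // its own inverse
    static int encode(int id) { return id * MULTIPLIER; }
    static int decode(int code) { return code * INVERSE; }

    public static void main(String[] args) {
        for (int id = 1; id <= 3; id++) {
            int code = encode(id);
            System.out.println(id + " --> " + Integer.toUnsignedString(code)
                    + " --> " + decode(code));
        }
    }
}
```

The values are printed with `Integer.toUnsignedString` because Java's `int` is signed; the bit patterns are the same as in the unsigned example above.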

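【Comments】:

Don't both of these approaches give an obvious pattern just from looking at f(0), f(1), f(2)?
@aioobe True, but as I understand it security is not an issue here. Only the OP can decide whether this is "good enough".
@aioobe: Yes, but as the OP said, it doesn't have to be unbreakable.
If the number is large enough, the second approach does not give an obvious pattern.
XOR does give a fairly predictable pattern, I believe, although I guess it helps to "randomize" things. Anyway, regarding multiplying by an odd number... I guess it should work given a really large number... but that also requires handling really large numbers (and makes going backwards a bit more expensive? or?). Although, I was thinking, a simple solution might be to split the value into 4 x 1 byte... and then have 4 separate arrays, each of which scrambles its byte individually.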

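【Solution 2】:

If you want to ensure a 1:1 mapping, use an encryption (i.e. a permutation), not a hash. Encryption has to be 1:1 because it can be decrypted.

If you want 32-bit numbers, use the Hasty Pudding Cipher or just write a simple four-round Feistel cipher.

Here's one I prepared earlier: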

/**
 * IntegerPerm is a reversible keyed permutation of the integers.
 * This class is not cryptographically secure as the F function
 * is too simple and there are not enough rounds.
 *
 * @author Martin Ross
 */
public final class IntegerPerm {
    //////////////////
    // Private Data //
    //////////////////

    /** Non-zero default key, from www.random.org */
    private final static int DEFAULT_KEY = 0x6CFB18E2;

    private final static int LOW_16_MASK = 0xFFFF;
    private final static int HALF_SHIFT = 16;
    private final static int NUM_ROUNDS = 4;

    /** Permutation key */
    private int mKey;

    /** Round key schedule */
    private int[] mRoundKeys = new int[NUM_ROUNDS];

    //////////////////
    // Constructors //
    //////////////////

    public IntegerPerm() { this(DEFAULT_KEY); }

    public IntegerPerm(int key) { setKey(key); }

    ////////////////////
    // Public Methods //
    ////////////////////

    /** Sets a new value for the key and key schedule. */
    public void setKey(int newKey) {
        assert (NUM_ROUNDS == 4) : "NUM_ROUNDS is not 4";
        mKey = newKey;

        mRoundKeys[0] = mKey & LOW_16_MASK;
        mRoundKeys[1] = ~(mKey & LOW_16_MASK);
        mRoundKeys[2] = mKey >>> HALF_SHIFT;
        mRoundKeys[3] = ~(mKey >>> HALF_SHIFT);
    } // end setKey()

    /** Returns the current value of the key. */
    public int getKey() { return mKey; }

    /**
     * Calculates the enciphered (i.e. permuted) value of the given integer
     * under the current key.
     *
     * @param plain the integer to encipher.
     *
     * @return the enciphered (permuted) value.
     */
    public int encipher(int plain) {
        // 1 Split into two halves.
        int rhs = plain & LOW_16_MASK;
        int lhs = plain >>> HALF_SHIFT;

        // 2 Do NUM_ROUNDS simple Feistel rounds.
        for (int i = 0; i < NUM_ROUNDS; ++i) {
            if (i > 0) {
                // Swap lhs <-> rhs
                final int temp = lhs;
                lhs = rhs;
                rhs = temp;
            } // end if
            // Apply Feistel round function F().
            rhs ^= F(lhs, i);
        } // end for

        // 3 Recombine the two halves and return.
        return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
    } // end encipher()

    /**
     * Calculates the deciphered (i.e. inverse permuted) value of the given
     * integer under the current key.
     *
     * @param cypher the integer to decipher.
     *
     * @return the deciphered (inverse permuted) value.
     */
    public int decipher(int cypher) {
        // 1 Split into two halves.
        int rhs = cypher & LOW_16_MASK;
        int lhs = cypher >>> HALF_SHIFT;

        // 2 Do NUM_ROUNDS simple Feistel rounds.
        for (int i = 0; i < NUM_ROUNDS; ++i) {
            if (i > 0) {
                // Swap lhs <-> rhs
                final int temp = lhs;
                lhs = rhs;
                rhs = temp;
            } // end if
            // Apply Feistel round function F().
            rhs ^= F(lhs, NUM_ROUNDS - 1 - i);
        } // end for

        // 3 Recombine the two halves and return.
        return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
    } // end decipher()

    /////////////////////
    // Private Methods //
    /////////////////////

    // The F function for the Feistel rounds.
    private int F(int num, int round) {
        // XOR with round key.
        num ^= mRoundKeys[round];
        // Square, then XOR the high and low parts.
        num *= num;
        return (num >>> HALF_SHIFT) ^ (num & LOW_16_MASK);
    } // end F()

} // end class IntegerPerm

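【Comments】:

Very true. However, Henrik's solution is good enough for my needs and also quite fast in PHP, so that is what I'll go with. But indeed, encryption would have been the best solution.

【Solution 3】:

Do what Henrik said in his second suggestion. But since these values seem to be used by people (otherwise you wouldn't want to randomize them), take one additional step. Multiply the sequential number by a large prime and reduce mod N, where N is a power of 2. But choose N to be 2 bits smaller than what you can store. Next, multiply the result by 11 and use that. So we have:

hash = ((count * large_prime) % 536870912) * 11

Multiplying by 11 protects against most data entry errors: if any single digit is typed wrong, the result will not be a multiple of 11. If any two adjacent digits are transposed, the result will not be a multiple of 11. So, as a preliminary check of any value entered, check whether it is divisible by 11 before even looking in the database.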
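
The formula and the divisibility check can be sketched as follows. The prime 2^31 - 1 is just a hypothetical choice of `large_prime`, all names are illustrative, and `long` is used because the largest code, (2^29 - 1) * 11, does not fit in 32 bits.

```java
// Sketch of: hash = ((count * large_prime) % 536870912) * 11
// LARGE_PRIME and all names here are assumptions made for illustration.
final class CheckedCode {
    static final long MODULUS = 1L << 29;          // 536870912
    static final long LARGE_PRIME = 2147483647L;   // 2^31 - 1, odd so invertible mod 2^29
    static final long INVERSE = inverseModPow2(LARGE_PRIME);

    /** Inverse of an odd number modulo 2^29, via Newton iteration on wrapping longs. */
    static long inverseModPow2(long a) {
        long x = a;                    // correct to 3 low bits for odd a
        for (int i = 0; i < 5; i++) {
            x *= 2 - a * x;            // doubles the number of correct low bits
        }
        return x & (MODULUS - 1);      // keep the low 29 bits
    }

    /** Valid for 0 <= count < 2^29. */
    static long encode(long count) {
        return ((count * LARGE_PRIME) % MODULUS) * 11;
    }

    /** Cheap pre-check before touching the database. */
    static boolean isPlausible(long code) {
        return code >= 0 && code % 11 == 0;
    }

    static long decode(long code) {
        return ((code / 11) * INVERSE) % MODULUS;
    }

    public static void main(String[] args) {
        long code = encode(12345);
        System.out.println(code + " plausible=" + isPlausible(code) + " decoded=" + decode(code));
    }
}
```

A single wrong digit changes the value by d * 10^k with d between 1 and 9, which is never a multiple of 11, so `isPlausible` rejects it without a database lookup.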

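【Comments】:

This is all back-end stuff, so there is no possibility of data entry errors. In any case, PHP does not support integer math, so while the approach is great, unfortunately it is not usable in my case.

【Solution 4】:

You can use the mod operation with big prime numbers.

(your number * big prime 1) mod big prime 2

Prime 1 should be larger than prime 2. Prime 2 should be close to 2^32 but less than it. That makes the mapping harder to work out.

Prime 1 and prime 2 should be constants.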

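【Comments】:

Is there a way to get back to the original integer?
For inputs in 1..big prime 2 you will never get the same remainder, because otherwise N2 - N1 would have to be divisible by prime 2, and then (your number 2 - your number 1) would have to be divisible by it too, which is impossible within the set 1..big prime 2.
You can store both in your table.
I don't know of a function that recovers the original value, and I'm not sure it is possible.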
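
On the question of getting the original integer back: because prime 2 is prime and does not divide prime 1, prime 1 has a multiplicative inverse modulo prime 2, and multiplying by that inverse undoes the mapping. A minimal sketch, in which the two primes (2^61 - 1 and 2^32 - 5) and all names are hypothetical choices:

```java
import java.math.BigInteger;

// Sketch of (number * prime1) mod prime2 and its inverse.
// Valid for 0 <= number < PRIME_2; the primes are illustrative choices only.
final class PrimeModMap {
    static final BigInteger PRIME_1 = BigInteger.valueOf(2305843009213693951L); // 2^61 - 1
    static final BigInteger PRIME_2 = BigInteger.valueOf(4294967291L);          // 2^32 - 5
    static final BigInteger PRIME_1_INVERSE = PRIME_1.modInverse(PRIME_2);

    static long encode(long number) {
        return BigInteger.valueOf(number).multiply(PRIME_1).mod(PRIME_2).longValue();
    }

    static long decode(long code) {
        return BigInteger.valueOf(code).multiply(PRIME_1_INVERSE).mod(PRIME_2).longValue();
    }

    public static void main(String[] args) {
        long code = encode(42);
        System.out.println("42 --> " + code + " --> " + decode(code));
    }
}
```

Note that the five 32-bit values from 2^32 - 5 upward fall outside the range 0..prime 2 - 1 and would collide with small inputs, so this is one-to-one only below prime 2.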

【Solution 5】:

For our application, we use a bit shuffle to generate the IDs. It is easy to convert back to the original ID.

func (m Meeting) MeetingCode() uint {
    hashed := (m.ID + 10000000) & 0x00FFFFFF
    chunks := [24]uint{}
    for i := 0; i < 24; i++ {
        chunks[i] = hashed >> i & 0x1
    }
    shuffle := [24]uint{14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22, 2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17}
    result := uint(0)
    for i := 0; i < 24; i++ {
        result = result | (chunks[shuffle[i]] << i)
    }
    return result
}
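
The answer does not show the reverse direction. The sketch below is a Java port of the same 24-bit shuffle with a matching decode step; the class and method names are hypothetical, and the round trip only holds for IDs in the range 0 to 2^24 - 1.

```java
// Java port of the 24-bit shuffle above, plus the inverse.
// Names are illustrative; only IDs in 0 .. 2^24 - 1 round-trip.
final class MeetingCodes {
    static final int OFFSET = 10_000_000;
    static final int MASK_24 = 0x00FFFFFF;
    static final int[] SHUFFLE = {14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22,
                                  2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17};

    static int encode(int id) {
        int hashed = (id + OFFSET) & MASK_24;
        int result = 0;
        for (int i = 0; i < 24; i++) {
            // Output bit i is taken from input bit SHUFFLE[i].
            result |= ((hashed >>> SHUFFLE[i]) & 1) << i;
        }
        return result;
    }

    static int decode(int code) {
        int hashed = 0;
        for (int i = 0; i < 24; i++) {
            // Put output bit i back at its original position SHUFFLE[i].
            hashed |= ((code >>> i) & 1) << SHUFFLE[i];
        }
        return (hashed - OFFSET) & MASK_24; // undo the offset modulo 2^24
    }

    public static void main(String[] args) {
        int code = encode(12345);
        System.out.println("12345 --> " + code + " --> " + decode(code));
    }
}
```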

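【Comments】:

【Solution 6】:

There is an extremely simple solution that nobody has posted. Even though an answer has already been selected, I strongly suggest that anyone visiting this question consider the nature of binary representations and the application of modular arithmetic.

Given a finite range of integers, all of the values can be arranged in any order by simple addition over their indices, while being bound to the range of the index by a modulus. You can even exploit simple integer overflow, so that using the modulo operator is not even necessary.

Essentially, you would have a static variable in memory, and a function that, when called, increments the static variable by some constant, enforces the bounds, and then returns the value. This output can be an index into a collection of desired outputs, or the desired output itself.

The increment constant that defines the mapping may be several times the size in memory of the value being returned, but for any given mapping there exists some finite constant that achieves the mapping with simple modular arithmetic.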
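
The counter described here might look like the sketch below. It only demonstrates the narrow, easily checked case: when the step shares no factor with the modulus, repeated addition visits every value in the range exactly once before repeating. The class name and the small demo parameters (modulus 16, step 7) are assumptions for illustration.

```java
// Sketch of an additive counter: state = (state + step) mod modulus.
// Assumes 0 < step and gcd(step, modulus) == 1 so every value is visited once per cycle.
final class AdditiveSequence {
    private final long modulus;
    private final long step;
    private long state = 0;

    AdditiveSequence(long modulus, long step) {
        if (modulus <= 0 || step <= 0 || gcd(step, modulus) != 1) {
            throw new IllegalArgumentException("step must be positive and coprime to modulus");
        }
        this.modulus = modulus;
        this.step = step % modulus;
    }

    /** Advances the counter by the fixed step and returns the new value. */
    long next() {
        state = (state + step) % modulus;
        return state;
    }

    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        AdditiveSequence seq = new AdditiveSequence(16, 7);
        for (int i = 0; i < 16; i++) {
            System.out.print(seq.next() + " ");
        }
        System.out.println();
    }
}
```

The k-th value returned is (k * step) mod modulus, so for a modulus of 2^32 this is the same mapping as multiplying the sequence number by an odd constant.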

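【Comments】:

Note, however, that a given mapping will have infinitely many constants that achieve it, and there will be a lot of overlap, which is why primes are usually favored.
You could use multiplication instead of addition, and obtaining the required constant is trivial, but in practice applying it takes more time and more memory. Addition is also preferable because multiplication does not actually produce every potential mapping and is harder to invert, so abelian addition is perhaps cryptographically superior.
Addition in modular arithmetic, that is.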