How can I compute a * b / c when both a and b are smaller than c, but a * b overflows?

【Question title】: How can I compute a * b / c when both a and b are smaller than c, but a * b overflows?
【Posted】: 2020-10-28 07:51:49
【Question description】:

Assuming that uint is the largest integral type on my fixed-point platform, I have:

uint func(uint a, uint b, uint c);

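This needs to return a good approximation of a * b / c.

The value of c is greater than both the value of a and the value of b.

So we know for sure that the value of a * b / c fits in a uint.

However, the value of a * b itself overflows the size of a uint.

So one way to compute the value of a * b / c would be: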

return a / c * b;

Or even:

if (a > b)
    return a / c * b;
return b / c * a;

However, the value of c is greater than both the value of a and the value of b.

So the suggestions above would simply return zero.
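Both failure modes are easy to reproduce. The following is a minimal sketch, assuming uint32_t stands in for uint; the function names are made up for this illustration, and the 64-bit reference exists only to show the expected value (the question itself rules out a wider type).

```c
#include <assert.h>
#include <stdint.h>

/* Dividing first: a / c truncates to 0 whenever a < c. */
uint32_t divide_first(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    return a / c * b;
}

/* Multiplying first: a * b wraps modulo 2^32 before the division. */
uint32_t multiply_first(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    return a * b / c;
}

/* Reference value computed in 64 bits, for comparison only.
   The cast back to 32 bits assumes a < c and b < c, so the result fits. */
uint32_t exact64(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    return (uint32_t)((uint64_t)a * b / c);
}
```

With a = 3000000000, b = 2000000000 and c = 4000000000, the reference value is 1500000000, while divide_first returns 0 and multiply_first returns a wrapped, unrelated value.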

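I need to proportionally reduce a * b and c, but again - the problem is that a * b overflows.

Ideally, I would be able to:

Replace a * b with uint(-1)
Replace c with uint(-1) / a / b * c.

But no matter how I order the expression uint(-1) / a / b * c, I run into a problem:

uint(-1) / a / b * c is truncated to zero because of uint(-1) / a / b
uint(-1) / a * c / b overflows because of uint(-1) / a * c
uint(-1) * c / a / b overflows because of uint(-1) * c

How can I tackle this scenario in order to find a good approximation of a * b / c?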

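Edit 1

I don't have anything like _umul128 on my platform, which is what one would use when the largest integer type is uint64. My largest type is uint, and I have no support for anything larger than that (neither at the hardware level nor in some pre-existing standard library).

My largest type is uint.

Edit 2

In response to the numerous duplicate suggestions and comments:

I do not have some "larger type" at hand that I can use to solve this problem. That is why the opening statement of the question is:

Assuming that uint is the largest integral type on my fixed-point platform

I am assuming that no other type exists, neither at the SW layer (via some built-in standard library) nor at the HW layer.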

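【Question discussion】:

Well, either a or b must be larger than "half the size of uint", so perhaps I should replace the larger of the two with, for example, uint(-1) / a. Then I can fix c proportionally... just a thought...
@GSerg: No, it doesn't, thanks.
@4386427: Precision first, thanks.
@vmp: How is C++ different here? This is a pure arithmetic question. In any case, my question isn't even in C but in Solidity. The only reason I posted it under C is that it has a much larger audience than Solidity, and both languages share the same natively supported integer division (i.e., by definition).
Does this answer your question? Fast method to multiply integer by proper fraction without floats or overflow

Tags: c integer integer-overflow integer-arithmetic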

【Solution 1】:

needs to return a good approximation of a * b / c
My largest type is uint
both a and b are smaller than c

A variation on this 32-bit problem:

Algorithm: Scale a, b to not overflow

SQRT_MAX_P1 as a compile time constant of sqrt(uint_MAX + 1)
sh = 0;
if (c >= SQRT_MAX_P1) {
  while (|a| >= SQRT_MAX_P1) a/=2, sh++
  while (|b| >= SQRT_MAX_P1) b/=2, sh++
  while (|c| >= SQRT_MAX_P1) c/=2, sh--
}
result = a*b/c

shift result by sh.

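With an n-bit uint, I expect the result to be correct to at least about n/2 significant digits.

Things could be improved by taking advantage of the smaller of a, b being less than SQRT_MAX_P1. More on that later if there is interest.

Example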

#include <inttypes.h>

#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
// https://stackoverflow.com/a/4589384/2410359

#define UINTMAX_WIDTH (IMAX_BITS(UINTMAX_MAX))
#define SQRT_UINTMAX_P1 (((uintmax_t)1ull) << (UINTMAX_WIDTH/2))

uintmax_t muldiv_about(uintmax_t a, uintmax_t b, uintmax_t c) {
  int shift = 0;
  if (c > SQRT_UINTMAX_P1) {
    while (a >= SQRT_UINTMAX_P1) {
      a /= 2; shift++;
    }
    while (b >= SQRT_UINTMAX_P1) {
      b /= 2; shift++;
    }
    while (c >= SQRT_UINTMAX_P1) {
      c /= 2; shift--;
    }
  }
  uintmax_t r = a * b / c;
  if (shift > 0) r <<= shift;
  if (shift < 0) r >>= -shift;  // negate: shifting by a negative count is undefined
  return r;
}



#include <stdio.h>

int main() {
  uintmax_t a = 12345678;
  uintmax_t b = 4235266395;
  uintmax_t c = 4235266396;
  uintmax_t r = muldiv_about(a,b,c);
  printf("%ju\n", r);
}

Output with 32-bit math (the exact answer is 12345677)

12345600

Output with 64-bit math

12345677
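The 32-bit figure can be reproduced by pinning the width to 32 bits. This is a sketch under that assumption, not the answer's exact code: muldiv_about32 and SQRT_U32_P1 are names introduced here, and the right shift uses the negated count so that the shift amount is never negative.

```c
#include <assert.h>
#include <stdint.h>

#define SQRT_U32_P1 ((uint32_t)1 << 16)  /* sqrt(UINT32_MAX + 1) */

/* Approximates a * b / c for a < c and b < c using only 32-bit math. */
uint32_t muldiv_about32(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    int shift = 0;
    if (c > SQRT_U32_P1) {
        /* Halve each operand until it is below 2^16, tracking the scale. */
        while (a >= SQRT_U32_P1) { a /= 2; shift++; }
        while (b >= SQRT_U32_P1) { b /= 2; shift++; }
        while (c >= SQRT_U32_P1) { c /= 2; shift--; }
    }
    uint32_t r = a * b / c;      /* a, b < 2^16, so the product fits */
    if (shift > 0) r <<= shift;
    if (shift < 0) r >>= -shift;
    return r;
}
```

For a = 12345678, b = 4235266395, c = 4235266396, the operands shrink to 48225, 64625 and 64625 with a net shift of 8, so the result is 48225 << 8 = 12345600. The low 8 bits lost while halving a account for the gap from 12345677.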

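【Discussion】:

Thanks. I have established a solution which works under O(1) complexity (no loops). Please see my answer below (or above).
@4386427 I'll look into it further later.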

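【Solution 2】:

Here is another approach that uses recursion and minimal approximation to achieve high precision.

First the code, and below it an explanation.

Code: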

#include <stdint.h>

// Bit length of a: position of its highest set bit (0 when a == 0).
uint32_t bp(uint32_t a) {
  uint32_t b = 0;
  while (a!=0)
  {
    ++b;
    a >>= 1;
  }
  return b;
}

int mul_no_ovf(uint32_t a, uint32_t b)
{
  return ((bp(a) + bp(b)) <= 32);
}

uint32_t f(uint32_t a, uint32_t b, uint32_t c)
{
  if (mul_no_ovf(a, b))
  {
    return (a*b) / c;
  }

  uint32_t m = c / b;
  ++m;
  uint32_t x = m*b - c;  // m*b may wrap, but the unsigned difference is still exact
  // So m * b == c + x where 0 < x <= b and m >= 2

  uint32_t n = a/m;
  uint32_t r = a % m;
  // So a*b == n * (c + x) + r*b == n*c + n*x + r*b where r*b <= c

  // Approximation: get rid of the r*b part
  uint32_t res = n;
  if (r*b > c/2) ++res;

  return res + f(n, x, c);
}

Explanation:

The multiplication a * b can be written as a sum of b

a * b = b + b + .... + b

Since b < c we can take a number m of these b so that (m-1)*b <= c < m*b, like

(b + b + ... + b) + (b + b + ... + b) + .... + b + b + b
\---------------/   \---------------/ +        \-------/
       m*b        +        m*b        + .... +     r*b
     \-------------------------------------/
            n times m*b

so we have

a*b = n*m*b + r*b

where r*b <= c and m*b > c. Consequently, m*b is equal to c + x, so we have

a*b = n*(c + x) + r*b = n*c + n*x + r*b

Divide by c :

a*b/c = (n*c + n*x + r*b)/c = n + n*x/c + r*b/c

The values m, n, x, r can all be calculated from a, b and c without any loss of 
precision using integer division (/) and remainder (%).

The approximation is to look at r*b (which is at most c) and "add zero" when r*b<=c/2
and "add one" when r*b>c/2.

So now there are two possibilities:

1) a*b/c ≈ n + n*x/c

2) a*b/c ≈ (n + 1) + n*x/c

So the problem (i.e. calculating a*b/c) has been changed to the form

MULDIV(a1,b1,c) = NUMBER + MULDIV(a2,b2,c)

where a2,b2 are less than a1,b1. Consequently, recursion can be used until
a2*b2 no longer overflows (and the calculation can be done directly).
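A single reduction step can be checked in isolation. The sketch below is an illustration rather than part of the answer: muldiv_step, decompose and identity_holds are hypothetical names, and the 64-bit arithmetic is used only to verify the identity a*b = n*c + n*x + r*b, not to compute the result.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t m, x, n, r;
} muldiv_step;

/* One reduction step for 0 < b < c, using 32-bit operations only. */
muldiv_step decompose(uint32_t a, uint32_t b, uint32_t c) {
    assert(b != 0 && b < c);
    muldiv_step s;
    s.m = c / b + 1;    /* smallest m with m*b > c */
    s.x = s.m * b - c;  /* m*b may wrap; the difference (0 < x <= b) is exact */
    s.n = a / s.m;
    s.r = a % s.m;
    return s;
}

/* Verifies a*b == n*c + n*x + r*b in 64 bits (validation only). */
int identity_holds(uint32_t a, uint32_t b, uint32_t c) {
    muldiv_step s = decompose(a, b, c);
    uint64_t lhs = (uint64_t)a * b;
    uint64_t rhs = (uint64_t)s.n * c + (uint64_t)s.n * s.x + (uint64_t)s.r * b;
    return lhs == rhs;
}
```

For a = 12345678, b = 4235266395, c = 4292973296 this gives m = 2, x = 4177559494, n = 6172839 and r = 0, so the next recursive call works on (n, x, c) with n half the size of a.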

【Discussion】:

Thanks. I have established a solution which works under O(1) complexity (no loops). Please see my answer below (or above).

【Solution 3】:

I have established a solution which works under O(1) complexity (no loops):

typedef unsigned long long uint;

typedef struct
{
    uint n;
    uint d;
}
fraction;

uint func(uint a, uint b, uint c);
fraction reducedRatio(uint n, uint d, uint max);
fraction normalizedRatio(uint a, uint b, uint scale);
fraction accurateRatio(uint a, uint b, uint scale);
fraction toFraction(uint n, uint d);
uint roundDiv(uint n, uint d);

uint func(uint a, uint b, uint c)
{
    uint hi = a > b ? a : b;
    uint lo = a < b ? a : b;
    fraction f = reducedRatio(hi, c, (uint)(-1) / lo);
    return f.n * lo / f.d;
}

fraction reducedRatio(uint n, uint d, uint max)
{
    fraction f = toFraction(n, d);
    if (n > max || d > max)
        f = normalizedRatio(n, d, max);
    if (f.n != f.d)
        return f;
    return toFraction(1, 1);
}

fraction normalizedRatio(uint a, uint b, uint scale)
{
    if (a <= b)
        return accurateRatio(a, b, scale);
    fraction f = accurateRatio(b, a, scale);
    return toFraction(f.d, f.n);
}

fraction accurateRatio(uint a, uint b, uint scale)
{
    uint maxVal = (uint)(-1) / scale;
    if (a > maxVal)
    {
        uint c = a / (maxVal + 1) + 1;
        a /= c; // we can now safely compute `a * scale`
        b /= c;
    }
    if (a != b)
    {
        uint n = a * scale;
        uint d = a + b; // can overflow
        if (d >= a) // no overflow in `a + b`
        {
            uint x = roundDiv(n, d); // we can now safely compute `scale - x`
            uint y = scale - x;
            return toFraction(x, y);
        }
        if (n < b - (b - a) / 2)
        {
            return toFraction(0, scale); // `a * scale < (a + b) / 2 < MAXUINT < a + b`
        }
        return toFraction(1, scale - 1); // `(a + b) / 2 < a * scale < MAXUINT < a + b`
    }
    return toFraction(scale / 2, scale / 2); // allow reduction to `(1, 1)` in the calling function
}

fraction toFraction(uint n, uint d)
{
    fraction f = {n, d};
    return f;
}

uint roundDiv(uint n, uint d)
{
    return n / d + n % d / (d - d / 2);
}

Here is my test:

#include <stdio.h>

int main()
{
    uint a = (uint)(-1) / 3;            // 0x5555555555555555
    uint b = (uint)(-1) / 2;            // 0x7fffffffffffffff
    uint c = (uint)(-1) / 1;            // 0xffffffffffffffff
    printf("0x%llx", func(a, b, c));    // 0x2aaaaaaaaaaaaaaa
    return 0;
}
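The roundDiv helper is the least obvious piece of the code above, so here is a small standalone sketch of the same expression. The name round_div_nearest is introduced for this illustration only; it assumes d is non-zero.

```c
#include <assert.h>

/* n / d rounded to the nearest integer, with ties rounded up.
   d - d / 2 equals ceil(d / 2), so the second term is 1 exactly when
   the remainder is at least half of d, and 0 otherwise. */
unsigned long long round_div_nearest(unsigned long long n,
                                     unsigned long long d) {
    assert(d != 0);
    return n / d + n % d / (d - d / 2);
}
```

For instance, 9 / 4 rounds to 2, 10 / 4 rounds to 3, and 11 / 3 rounds to 4, all without computing n + d / 2, which could itself overflow.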

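【Discussion】:

I kind of expected this... Remember that I asked you about "precision versus performance" and your answer was "precision first, thanks". Now you have posted an O(1) solution with "poor" precision. So it seems you actually wanted performance rather than precision ;-)
@4386427: Yes, but this achieves precision as well, so...
Precision... well, "sufficient precision" depends on your application (which is why I asked), and this may be the precision you need. That's fine then :-) Here is just a "randomly" chosen example based on 32-bit unsigned: a: 12345678 b: 4235266395 c: 4292973296. Correct answer: 12179725 Recursive answer: 12179725 O(1) answer: 12134037. So the O(1) method is off by ~45000, or ~0.4%, which may be good enough for your application.
@4386427: Yes, you are right. When you initially asked about "performance versus precision", I assumed that any solution would still be O(1) in its worst-case performance (i.e., several operations for sure, but not depending on the length of the input). So I immediately responded with "precision over performance". But then I received multiple "while" answers, which unfortunately I cannot really afford in my system. Thanks!
Quote: "I assumed that any solution would still be O(1) in its worst-case performance". Well, if we want to be strict, we can't really talk about big-O complexity here, because nothing here can grow without bound. The upper limit is the number of bits in the unsigned type, so despite the loops/recursion there is an upper bound on the execution time, which makes the algorithms O(1). As an example: in my recursive algorithm, the value of a is divided by at least 2 between recursive calls. So you can't get more recursive calls than the number of bits. So in big-O terms, that would be O(1).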

【Solution 4】:

You can cancel the common factors as follows:

uint gcd(uint a, uint b) 
{
    uint c;
    while (b) 
    {
        a %= b;
        c = a;
        a = b;
        b = c;
    }
    return a;
}


uint func(uint a, uint b, uint c)
{
    uint temp = gcd(a, c);
    a = a/temp;
    c = c/temp;

    temp = gcd(b, c);
    b = b/temp;
    c = c/temp;

    // Since you are sure the result will fit in the variable, you can simply
    // return the expression you wanted after having those terms canceled.
    return a * b / c;
}
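How far cancellation gets depends on the inputs. The following sketch restates the same idea with uint32_t standing in for uint (an assumption made here so the behavior is reproducible; gcd32 and muldiv_gcd32 are illustrative names).

```c
#include <assert.h>
#include <stdint.h>

/* Euclid's algorithm. */
uint32_t gcd32(uint32_t a, uint32_t b) {
    while (b) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Cancels common factors of a and c, then of b and c, before multiplying. */
uint32_t muldiv_gcd32(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    uint32_t g = gcd32(a, c);
    a /= g; c /= g;
    g = gcd32(b, c);
    b /= g; c /= g;
    return a * b / c;  /* still wraps if the reduced product exceeds 2^32 - 1 */
}
```

With a = 3000000000, b = 2000000000, c = 4000000000 the factors cancel down to 3 * 500000000 / 1 and the result is the exact 1500000000. With a = 65537, b = 65539, c = 4294967291 both GCDs are 1, the product 4295229443 wraps to 262147, and the function returns 0 even though the exact quotient is 1.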

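【Discussion】:

I'm not looking for a GCD-based solution. First, because it yields very bad worst-case performance. Second, because it is not guaranteed to solve the problem (for example, whenever the GCD is 1).
If the GCD is 1, then the result won't fit in the variable...
@vmp I don't think your statement is true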