How do I find factorials of large numbers modulo 1000000007 in C++? Answers

【Question title】: How do I find factorials of large numbers modulo 1000000007 in C++?
【Posted】: 2015-03-20 13:59:04
【Question description】:

Find the factorial of a large number modulo 1000000007.

In Python or Java this is no problem, but in C++ there are overflow constraints.

This is the code I tried:

#include <iostream>
#define ull long long int
#define mod 1000000007
ull fact(ull n)
{
    if (n == 1 || n == 0) return 1;
    return ((n % mod) * (fact(n - 1) % mod) % mod);
}
int main()
{
    std::cout << fact(50000) << std::endl;
    return 0;
}

But the output is invalid.

【Question discussion】:

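"But the output is invalid." Care to elaborate?
My guess is that you are overflowing the call stack...
Factorials overflow long long very quickly. That is why you need an arbitrary-precision math library. There are plenty of them for C++.
@HansPassant: But these are numbers reduced modulo a value small enough to fit in 32 bits. There should not be any overflow.
Your code works for me, giving the same result as a more sensible iterative version. There should be no overflow: after reduction modulo mod, both factors fit in 32 bits, so the product fits in 64 (and long long has at least 64 bits). The recursion might overflow your stack. What exactly does "invalid" mean? What result do you get, and what did you expect (and why)?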
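
The no-overflow argument from the comments can be checked directly. The following is a minimal sketch rather than the asker's code: the names `MOD` and `fact_mod_iter` are chosen here for illustration, and it assumes a fixed-width 64-bit unsigned type is acceptable in place of `long long`.

```cpp
#include <cstdint>
#include <limits>

// Illustrative names; not taken from the original post.
constexpr std::uint64_t MOD = 1000000007ULL;

// Every multiplication below has both operands smaller than MOD, so the
// largest possible product is (MOD - 1)^2. Verify at compile time that
// this product fits in 64 bits.
static_assert((MOD - 1) <= std::numeric_limits<std::uint64_t>::max() / (MOD - 1),
              "product of two reduced values must fit in 64 bits");

// Iterative n! mod MOD: a loop instead of recursion, so the call stack
// does not grow with n.
std::uint64_t fact_mod_iter(std::uint64_t n) {
    if (n >= MOD) return 0;  // n! then contains MOD itself as a factor
    std::uint64_t result = 1;
    for (std::uint64_t i = 2; i <= n; ++i) {
        result = (result * i) % MOD;  // reduce after every step
    }
    return result;
}
```

For example, `fact_mod_iter(5)` returns 120, and `fact_mod_iter(50000)` runs 49999 loop iterations without any recursion.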

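Tags: c++ c math coding-style modulo

【Solution 1】:

Check this code. There should be no problem, because unsigned long long can easily hold any value reduced modulo 10^9+7, and the product of two such values (at most about 10^18) also fits. If you only ever work with the reduced values rather than the actual factorial, there is nothing to worry about.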

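#include <climits>   // INT_MAX

// ull and mod are defined as in the question.
ull fact(int n)
{
    // With n == INT_MAX the i++ below would overflow int, so bail out first.
    // Returning 0 is also the correct value when INT_MAX >= mod (true for
    // 32-bit int), because n! then contains mod as a factor.
    if (n >= INT_MAX) return 0;
    ull ans = 1;
    for (int i = 2; i <= n; i++)
        ans = (ans * i) % mod;
    return ans;
}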

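This computes only the factorial.

The n < INT_MAX condition is used here because without it, if n = INT_MAX, the loop increment (i++) would push i past INT_MAX. Signed integer overflow is undefined behavior in C++; in practice i typically wraps around to a negative value instead of exceeding n, so the loop condition i <= n would never become false and the loop would never terminate.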

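Note: if you want to compute the factorial exactly in C++, you can use an array of 1000 characters in which each character represents one digit. You then multiply step by step to get the result: n*(n-1)*...*2*1.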
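
The digit-array idea in the note above can be sketched as follows. This is an illustration under stated assumptions, not the answerer's code: `exact_factorial` is a hypothetical name, and a growable `std::vector` is used in place of a fixed 1000-character array so that the number of digits is not capped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Exact n! as a decimal string, using schoolbook multiplication on a
// vector of decimal digits stored least significant digit first.
// exact_factorial is an illustrative name, not from the original answer.
std::string exact_factorial(unsigned n) {
    std::vector<int> digits(1, 1);  // starts as the number 1
    for (unsigned factor = 2; factor <= n; ++factor) {
        unsigned long long carry = 0;
        // Multiply every digit by the factor, propagating the carry upward.
        for (std::size_t i = 0; i < digits.size(); ++i) {
            unsigned long long cur =
                static_cast<unsigned long long>(digits[i]) * factor + carry;
            digits[i] = static_cast<int>(cur % 10);
            carry = cur / 10;
        }
        // Whatever carry remains becomes new high-order digits.
        while (carry > 0) {
            digits.push_back(static_cast<int>(carry % 10));
            carry /= 10;
        }
    }
    // Emit the digits most significant first.
    std::string out;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it) {
        out.push_back(static_cast<char>('0' + *it));
    }
    return out;
}
```

For example, `exact_factorial(20)` returns "2432902008176640000". The running time grows roughly quadratically with n, so this is only practical when the exact value is really needed; for the result modulo 10^9+7 the reduced loop is far cheaper.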

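Note: making many nested recursive calls can overflow the stack, because every function call pushes a frame (containing its return address and so on).

【Discussion】: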

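@BenjaminTrent.: That is why the code was added. There is no need for recursive calls that push frames onto the stack and cause a stack overflow. Simple iteration does the job.
@BenjaminTrent.: I have revised my answer to make it clearer. I also mention the issue you raised. Hopefully that makes the answer complete.
@rcgldr.: I have checked. 10^9+7 is prime. As you may know, it is a well-known number that appears in many competitive programming problems; otherwise it would give wrong output in some cases.
Minor: use for(int i=n;i >=2; i--); then ull fact(INT_MAX) avoids the infinite loop.
@chux.: Could you elaborate? (The for loop part.) So are you saying the check if(n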
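
The claim in the comments that 10^9+7 is prime can be confirmed with plain trial division. This is a minimal sketch; `is_prime` is a hypothetical helper name, and the method is only sensible for values of this size (about 16,000 odd candidate divisors here).

```cpp
#include <cstdint>

// Trial division by 2 and then by every odd d with d * d <= n.
// is_prime is an illustrative helper name. Intended for n well below
// 2^64, so that d * d cannot overflow.
bool is_prime(std::uint64_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::uint64_t d = 3; d * d <= n; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}
```

Calling `is_prime(1000000007)` returns true.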

【Solution 2】:

If x!! = 1 * 3 * 5 * 7 * 9 * 11 * ... (the product of the odd numbers up to x), then (2x)! = (2x)!! * 2^x * x!.

This gives us a more efficient factorial algorithm.
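
The identity follows from splitting the factors of (2x)! into odd and even ones: the odd factors give the product 1 * 3 * ... * (2x-1), and the x even factors 2, 4, ..., 2x give 2^x * x!. A small check of this, as a sketch with illustrative helper names (`plain_factorial`, `odd_product`, `identity_holds`) and no modulus, so it is only valid while the values fit in 64 bits (2x <= 20):

```cpp
#include <cstdint>

// n! in plain 64-bit arithmetic; exact only for n <= 20.
std::uint64_t plain_factorial(std::uint64_t n) {
    std::uint64_t r = 1;
    for (std::uint64_t i = 2; i <= n; ++i) r *= i;
    return r;
}

// Product of the odd numbers 1, 3, 5, ... that do not exceed n
// (the answer's x!! notation).
std::uint64_t odd_product(std::uint64_t n) {
    std::uint64_t r = 1;
    for (std::uint64_t i = 3; i <= n; i += 2) r *= i;
    return r;
}

// Checks (2x)! == (2x)!! * 2^x * x! for one value of x, 0 <= x <= 10.
bool identity_holds(std::uint64_t x) {
    return plain_factorial(2 * x) ==
           odd_product(2 * x) * (std::uint64_t{1} << x) * plain_factorial(x);
}
```

For x = 3, this compares 6! = 720 with 15 * 8 * 6 = 720.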

typedef unsigned long long ull; // assumed definition of ull; the answer does not show it
template<ull mod>
struct fast_fact {
  ull m( ull a, ull b ) const {
    ull r = (a*b)%mod;
    return r;
  }
  template<class...Ts>
  ull m( ull a, ull b, Ts...ts ) const {
    return m( m( a, b ), ts... );
  }
  // calculates x!!, ie 1*3*5*7*...
  ull double_fact( ull x ) const {
    ull ret = 1;
    for (ull i = 3; i < x; i+=2) {
      ret = m(i,ret);
    }
    return ret;
  }
  // calculate 2^2^n for n=0...bits in ull
  // a pointer to this is stored statically to make calculating
  // 2^k faster:
  ull const* get_pows() const {
    static ull retval[ sizeof(ull)*8 ] = {2%mod};
    for (int i = 1; i < sizeof(ull)*8; ++i) {
      retval[i] = m(retval[i-1],retval[i-1]);
    }
    return retval;
  }
  // calculate 2^x.  We decompose x into bits
  // and multiply together the 2^2^i for each bit i
  // that is set in x:
  ull pow_2( ull x ) const {
    static ull const* pows = get_pows();
    ull retval = 1;
    for (int i = 0; x; ++i, (x=x/2)){
      if (x&1) retval = m(retval, pows[i]);
    }
    return retval;
  }
  // the actual calculation:
  ull operator()( ull x ) const {
    if (x >= mod) return 0; // x! then contains mod as a factor, so x! % mod == 0
    if (x==0) return 1;
    ull result = 1;
     // odd case:
    if (x&1) result = m( (*this)(x-1), x );
    else result = m( double_fact(x), pow_2(x/2), (*this)(x/2) );
    return result;
  }
};
template<ull mod>
ull factorial_mod( ull x ) {
  return fast_fact<mod>()(x);
}

live example

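A faster version can reuse the results of x!!, since those results recur often.

Caching live example: by caching x!! values in a reasonably clever way, it is about 2x as fast as the version above for large n. Each call to double_factorial(n) creates lg k cache entries, where k is the distance between n and the largest existing cache entry; k is bounded by n. In practice this seems to reduce the weighted "cache misses" to almost zero after the first call: the first computation of n!! injects enough cache entries that little time is spent on later !! computations.

This optimized version is about 41% faster than the simple iterative implementation (essentially all of the time is spent computing the first n!!).

Further improvements might include making the first x!! computation faster; optimizing the cache would probably bring only a minor improvement. Next question: how can x!! be made faster?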

【Discussion】: