【Question title】: Tricking numpy/python into representing very large and very small numbers
【Posted】: 2015-08-31 01:55:42
【Question】:

I need to compute the integral of the following function over ranges that start as low as -150:

import numpy as np
from scipy.special import ndtr

def my_func(x):
    return np.exp(x ** 2) * 2 * ndtr(x * np.sqrt(2))

The problem is that this part of the function

np.exp(x ** 2)

tends to infinity: I get inf for values of x less than approximately -26.

And this part of the function

2 * ndtr(x * np.sqrt(2))

which is equivalent to

from scipy.special import erf

1 + erf(x)

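tends to 0.

So a very, very large number times a very, very small number should give me a reasonably sized number, but instead Python gives me nan.

What can I do to circumvent this problem?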
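The failure described above can be reproduced in a few lines. This is only a minimal sketch of the symptom at x = -150; the variable names `big`, `small` and `product` are illustrative and not part of the original question.

```python
import numpy as np
from scipy.special import ndtr

x = -150.0
with np.errstate(over="ignore", invalid="ignore"):
    big = np.exp(x ** 2)               # exp(22500) overflows float64 -> inf
    small = 2 * ndtr(x * np.sqrt(2))   # underflows float64 -> 0.0
    product = big * small              # inf * 0.0 is undefined -> nan

print(big, small, product)
```

Both factors leave the float64 range before they are multiplied, so the information needed to form the finite product is already lost.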

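【Comments】:

Are you sure your integral doesn't have an analytical solution?
@ReblochonMasque No, I'm not. Do you know where I could find one? I certainly don't have the math skills to work it out myself.
Would something like this help? np.exp(x**2 + np.log(2) + np.log(ndtr(x*np.sqrt(2))))
Would this affect the value of the integral significantly? Alternatively, you could work out analytically what log(ndtr(x)) is and fold it into the exp term... you get the idea.
You can use scipy.special.log_ndtr to eliminate the log call in @askewchan's solution.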

Tags: python numpy floating-point numbers scipy

【Solution 1】:

I think a combination of @askewchan's solution and scipy.special.log_ndtr will do the trick:

import numpy as np
from scipy.special import log_ndtr, ndtr

_log2 = np.log(2)
_sqrt2 = np.sqrt(2)

def my_func(x):
    return np.exp(x ** 2) * 2 * ndtr(x * np.sqrt(2))

def my_func2(x):
    # Work in log space and exponentiate once at the end
    return np.exp(x * x + _log2 + log_ndtr(x * _sqrt2))

print(my_func(-150))
# nan

print(my_func2(-150))
# 0.0037611803122451198

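For x <= -20, log_ndtr(x) uses a Taylor series expansion of the error function to iteratively compute the log CDF directly, which is much more numerically stable than simply taking log(ndtr(x)).

Update

As you mentioned in the comments, exp will also overflow if x is sufficiently large. While you could work around this using mpmath.exp, a simpler and faster method is to cast to np.longdouble, which on my machine can represent values up to 1.189731495357231765e+4932: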
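To illustrate the stability claim, the sketch below compares the naive log(ndtr(z)) with log_ndtr(z) far out in the tail. The test point z = -40 and the leading-order asymptote used as a sanity check are my own choices, not taken from the answer.

```python
import numpy as np
from scipy.special import log_ndtr, ndtr

z = -40.0
with np.errstate(divide="ignore"):
    naive = np.log(ndtr(z))   # ndtr(z) underflows to 0.0, so the log is -inf
stable = log_ndtr(z)          # computed in log space, stays finite

# Leading-order tail asymptote: log(Phi(z)) ~ -z**2/2 - log(-z) - log(2*pi)/2
approx = -z ** 2 / 2 - np.log(-z) - 0.5 * np.log(2 * np.pi)
print(naive, stable, approx)
```

The finite value should sit close to the asymptote, while the naive version has already lost everything to underflow.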

import mpmath

def my_func3(x):
    return mpmath.exp(x * x + _log2 + log_ndtr(x * _sqrt2))

def my_func4(x):
    # np.longdouble is the portable name; np.float128 only exists on some platforms
    return np.exp(np.longdouble(x * x + _log2 + log_ndtr(x * _sqrt2)))

print(my_func2(50))
# inf

print(my_func3(50))
# mpf('1.0895188633566085e+1086')

print(my_func4(50))
# 1.0895188633566084842e+1086

%timeit my_func3(50)
# The slowest run took 8.01 times longer than the fastest. This could mean that
# an intermediate result is being cached.
# 100000 loops, best of 3: 15.5 µs per loop

%timeit my_func4(50)
# The slowest run took 11.11 times longer than the fastest. This could mean
# that an intermediate result is being cached.
# 100000 loops, best of 3: 2.9 µs per loop
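Whether the long double cast actually helps depends on the platform, since np.longdouble is the same as float64 on some systems. The following is a rough sketch for checking the available range; the name `has_extended_range` is illustrative.

```python
import numpy as np

f64 = np.finfo(np.float64)
ld = np.finfo(np.longdouble)

# Largest argument for which exp() stays finite in each type.
print("float64    max:", f64.max, " exp limit:", np.log(f64.max))
print("longdouble max:", ld.max, " exp limit:", np.log(ld.max))

# True only where longdouble really has more range than float64.
has_extended_range = ld.max > f64.max
print("extended range available:", has_extended_range)
```

The float64 limit of roughly 709.78 is the threshold at which np.exp and math.exp start to overflow.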

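【Discussion】:

Probably a minor note for this use case, but for scalars, math.log and math.sqrt are about ten times faster than np.log and np.sqrt. Faster still is computing log2 = math.log(2) outside the function definition. For me, this made the call to quad twice as fast.
@askewchan Good point - I wasn't really thinking about performance.
This is great. Thank you very much. I'm testing it now. my_func2(50) raises a RuntimeWarning: "overflow encountered in exp". It seems np.exp can't handle inputs greater than 709. (Same for math.exp.) Any idea how to get around this?
I'm going to look into an analytical solution to the integral, as the overflow seems to be a dead end.
@SalvadorDali I'm not sure what you're asking. All I did was my_func2(50). I didn't compute integral(log(my_func2(50))) or intend to. I guess my assumption was that if the function itself can't be evaluated at that point, quad won't work. I think I tried quad first, and it failed. (Or was your question directed at @ali_m?)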

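【Solution 2】:

There is already a function for this: erfcx. I think erfcx(-x) should give you the integrand you want (note that 1+erf(x)=erfc(-x)).

【Discussion】:

erf(x) is an odd function: erf(-x) = -erf(x). So erfc(-x) = 1 - erf(-x) = 1 + erf(x) (the first equality is the definition of erfc, the second uses the odd symmetry).
Note that by flipping the sign of the argument, you can replace quad(my_func, -14, -4) with quad(erfcx, 4, 14).
@WarrenWeckesser Thanks. One last question: erfcx is defined as exp(x ** 2) * erfc(x). If I do erfcx(-x), doesn't that give me exp(-x ** 2) * erfc(-x), whereas what I want is exp(x ** 2) * erfc(-x)? Or, wait, I guess it gives me exp((-x) ** 2) * erfc((-x)), which is what I want.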
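A quick way to convince yourself of this identity is to compare the two forms where the original function is still representable. This is a sketch only; `my_func_erfcx` is a hypothetical helper name introduced for the comparison.

```python
import numpy as np
from scipy.special import erfcx, ndtr

def my_func(x):
    # Original integrand from the question; only safe for moderate x.
    return np.exp(x ** 2) * 2 * ndtr(x * np.sqrt(2))

def my_func_erfcx(x):
    # Hypothetical helper: exp(x**2) * (1 + erf(x)) == erfcx(-x)
    return erfcx(-x)

print(my_func(-2.0), my_func_erfcx(-2.0))   # should agree closely
print(my_func_erfcx(-150.0))                # finite where my_func gives nan
```

For large positive x the integrand itself grows like 2 * exp(x ** 2), so erfcx(-x) will still overflow float64 there, just as exp does.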
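The sign-flip suggestion from the comments can be sketched as follows. The range [-14, -4] is the one mentioned above; the wider range down to -150 is my own extension to match the question, and the variable names are illustrative.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erfcx, ndtr

def my_func(x):
    return np.exp(x ** 2) * 2 * ndtr(x * np.sqrt(2))

# On [-14, -4] both forms are still representable, so they can be compared.
direct, _ = quad(my_func, -14, -4)
flipped, _ = quad(erfcx, 4, 14)
print(direct, flipped)

# The flipped form also covers the range from the question, down to -150.
wide, wide_err = quad(erfcx, 4, 150)
print(wide, wide_err)
```

Because erfcx(x) decays only like 1 / (x * sqrt(pi)) for large x, the integrand stays well scaled over the whole interval.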

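【Solution 3】:

Not sure how much this will help, but here are some thoughts that are too long for a comment.

The integral you need to compute, as you correctly identified, is that of exp(x ** 2) * (1 + erf(x)). Expanding the brackets, you can integrate the two parts of the sum separately.

For the first part, exp(x ** 2), SciPy has the imaginary error function implemented (scipy.special.erfi).

The second part, exp(x ** 2) * erf(x), is harder:

its antiderivative is a generalized hypergeometric function. Sadly, it looks like scipy does not have an implementation of it, but another package claims to have one.

Here I used indefinite integrals without the constants; once you know the from and to values, it is clear how to use them for the definite integral.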
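The formulas from this answer were not preserved in the text. The block below is a reconstruction under the assumption that the integrand is exp(x²)(1 + erf(x)) as in the question; it is not a quotation of the author's formulas, although the second identity can be checked by differentiating the series term by term.

```latex
\begin{align*}
\int e^{x^2}\bigl(1+\operatorname{erf}(x)\bigr)\,dx
  &= \int e^{x^2}\,dx + \int e^{x^2}\operatorname{erf}(x)\,dx \\
\int e^{x^2}\,dx
  &= \frac{\sqrt{\pi}}{2}\,\operatorname{erfi}(x) \\
\int e^{x^2}\operatorname{erf}(x)\,dx
  &= \frac{x^2}{\sqrt{\pi}}\;{}_2F_2\!\left(1,1;\tfrac{3}{2},2;x^2\right)
\end{align*}
```

Constants of integration are omitted, matching the convention stated above.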

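【Discussion】:

This strategy seems to break down for me: scipy.special.erfi(50) evaluates to inf.
A math question for you: how would I handle the case where my integral is "semi-definite", i.e. I have an upper limit but my lower limit is -inf? I'm not sure how to evaluate the functions in your answer at -inf.
@dbliss scipy.special.erfi(50) is inf. If b = super huge, then most likely b + a is also super huge, where a > 0 (because the second integral is also positive if x > 1). Not quite sure how you want to compute it.
(1) Right, but mpmath.erfi(50) returns 6 * 10 ** 1083. Given what I ultimately do with the integral, it would be nice to have an exact representation here. (2) What I need to do is evaluate the integral we've been discussing from -inf to x, where x varies from one function call to the next between -150 and 50. I don't know how to do that.
^ Never mind. My two most recent comments are really a separate question about computing a double integral. I'll try to work that out myself and post a new question if I can't.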
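The points raised in this thread can be checked numerically with mpmath. The sketch below assumes the closed form sqrt(pi)/2 * erfi(x) + x²/sqrt(pi) * 2F2(1, 1; 3/2, 2; x²), which is my reconstruction rather than something quoted from the answer; `antiderivative` and `integrand` are illustrative names, and the interval [0.5, 2] is an arbitrary test range.

```python
import mpmath
from scipy.special import erfi

def antiderivative(x):
    # Reconstructed closed form (an assumption, not quoted from the answer).
    x = mpmath.mpf(x)
    root_pi = mpmath.sqrt(mpmath.pi)
    first = root_pi / 2 * mpmath.erfi(x)
    second = x ** 2 / root_pi * mpmath.hyp2f2(1, 1, mpmath.mpf(3) / 2, 2, x ** 2)
    return first + second

def integrand(t):
    return mpmath.exp(t ** 2) * (1 + mpmath.erf(t))

a, b = 0.5, 2.0
closed_form = antiderivative(b) - antiderivative(a)
numeric = mpmath.quad(integrand, [a, b])
print(closed_form, numeric)

print(erfi(50.0))        # float64 result overflows to inf
print(mpmath.erfi(50))   # arbitrary-precision result stays finite
```

For large negative x the two terms of the closed form are both huge and of opposite sign, so I would expect severe cancellation there unless the working precision is raised.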