python中的整数平方根答案

【问题标题】：Integer square root in pythonpython中的整数平方根
【发布时间】：2013-03-01 17:05:24
【问题描述】：

python 或标准库中是否有整数平方根？我希望它是准确的（即返回一个整数），如果没有解决方案就吠叫。

此刻我推出了自己的幼稚：

def isqrt(n):
    i = int(math.sqrt(n) + 0.5)
    if i**2 == n:
        return i
    raise ValueError('input was not a perfect square')

但它很丑陋，我真的不相信它用于大整数。如果我超过了该值，我可以遍历方块并放弃，但我认为做这样的事情会有点慢。另外我想我可能会重新发明轮子，像这样的东西肯定已经存在于 python 中......

【问题讨论】：

这不是经常出现的要求，所以没有内置的。您的解决方案没有任何问题，但我会做一个风格上的改变 - 反转 if 的条件，所以 return 排在最后。
不能因为使用浮点数而溢出/搞砸大输入吗？
@wim：它可以而且将会。
当n 变得太大而无法放入没有截断的浮点数时，它将溢出，即 2**53。即便如此，由于您对结果进行了四舍五入，它可能仍然有效。你真的要处理这么大的数字吗？
是的，我将使用远大于 2**53 的数字。

标签： python math integer sqrt

【解决方案1】：

Python 默认的math 库有一个整数平方根函数：

math.isqrt(n)

返回非负整数n的整数平方根。这是 n 的精确平方根的下限，或等价于满足 a² ≤ n 的最大整数 a。

【讨论】：

这适用于 Python 3.8 及更高版本。对于这个版本，这是最好的解决方案

【解决方案2】：

更新：标准库中的Python 3.8 has a math.isqrt function！

我在小 (0…2²²) 和大 (2⁵⁰⁰⁰¹) 输入上对每个（正确的）函数进行了基准测试。在这两种情况下，最明显的赢家是gmpy2.isqrt suggested by mathmandan，排在第一，其次是 Python 3.8 的math.isqrt，排在第三位的是ActiveState recipe linked by NPE。 ActiveState 配方有一堆可以用班次代替的部门，这使它更快一点（但仍然落后于本机函数）：

def isqrt(n):
    if n > 0:
        x = 1 << (n.bit_length() + 1 >> 1)
        while True:
            y = (x + n // x) >> 1
            if y >= x:
                return x
            x = y
    elif n == 0:
        return 0
    else:
        raise ValueError("square root not defined for negative numbers")

基准测试结果：

gmpy2.isqrt() (mathmandan)：小 0.08 µs，大 0.07 ms
int(gmpy2.isqrt())*：小 0.3 µs，大 0.07 ms
Python 3.8 math.isqrt：小 0.13 µs，大 0.9 ms
ActiveState（如上优化）：小 0.6 µs，大 17.0 ms
ActiveState (NPE)：小 1.0 µs，大 17.3 ms
castlebravo long-hand：小 4 µs，大 80 ms
mathmandan improved：小 2.7 µs，大 120 ms
martineau（与this correction）：小2.3 µs，大140 ms
nibot：小 8 µs，大 1000 ms
mathmandan：小 1.8 µs，大 2200 ms
castlebravo Newton’s method：小 1.5 µs，大 19000 ms
user448810：小 1.4 µs，大 20000 ms

（* 由于gmpy2.isqrt 返回一个gmpy2.mpz 对象，它的行为与int 大部分但不完全相同，您可能需要将其转换回int 以用于某些用途。）

【讨论】：

我没有把它们都检查出来，但是gmpy2.isqrt是一个C扩展模块，所以在Python中并没有真正做到。
@martineau 这也是我努力优化纯 Python 答案的原因。有些读者会想要纯 Python，有些人会想要最快的东西，不管技术如何，有些人会想要一些小而清晰的东西可以复制和粘贴，有些人会想要一个他们可以直接导入的解决方案，谁知道——我只是在一种可以比较它们的方法。
我是gmpy2 的维护者，我在gmpy2 中有几个相关功能的cmets。 gmpy2.is_square() 通常是确定一个数字是否是完美平方的最快方法。它执行一些非常快速的测试，可以快速识别大多数不完美的平方，并且只在需要时计算平方根。 gmpy2.isqrt_rem() 将返回整数平方根和余数。
你应该归功于 Fredrik Johansson 而不是 ActiveState。这里的代码只是他的答案的 Python 实现：stackoverflow.com/a/1624602/4311651
见math theory。

【解决方案3】：

我用循环比较了这里给出的不同方法：

for i in range (1000000): # 700 msec
    r=int(123456781234567**0.5+0.5)
    if r**2==123456781234567:rr=r
    else:rr=-1

发现这是最快的并且不需要数学导入。很长可能会失败，但是看看这个

15241576832799734552675677489**0.5 = 123456781234567.0

【讨论】：

请看这个 1000000000000000000000053646891100000000000000011928482406771 (=nextprime(10^29+5342347)*nextprime(10^31+2232788)) from @

【解决方案4】：

浮点数无法在计算机上精确表示。您可以在 python 的浮点数精度范围内测试所需的接近度设置 epsilon。

def isqrt(n):
    epsilon = .00000000001
    i = int(n**.5 + 0.5)
    if abs(i**2 - n) < epsilon:
        return i
    raise ValueError('input was not a perfect square')

【讨论】：

对于较大的 n 值，这似乎也失败了。牛顿的方法看起来很有前途或十进制。十进制解决方案。

【解决方案5】：

试试这个条件（没有额外的计算）：

def isqrt(n):
  i = math.sqrt(n)
  if i != int(i):
    raise ValueError('input was not a perfect square')  
  return i

如果您需要它返回 int（不是带有尾随零的 float），则分配第二个变量或计算 int(i) 两次。

【讨论】：

另一个可以是if not i.is_integer()。无论如何，这个函数对于大输入失败，其中数字不能表示为浮点数（甚至可能在此之前）。
尝试用(10**10)**2-1调用这个函数，发现它错误地认为参数是一个完美的正方形。

【解决方案6】：

这是一个非常简单的实现：

def i_sqrt(n):
    i = n.bit_length() >> 1    # i = floor( (1 + floor(log_2(n))) / 2 )
    m = 1 << i    # m = 2^i
    #
    # Fact: (2^(i + 1))^2 > n, so m has at least as many bits 
    # as the floor of the square root of n.
    #
    # Proof: (2^(i+1))^2 = 2^(2i + 2) >= 2^(floor(log_2(n)) + 2)
    # >= 2^(ceil(log_2(n) + 1) >= 2^(log_2(n) + 1) > 2^(log_2(n)) = n. QED.
    #
    while m*m > n:
        m >>= 1
        i -= 1
    for k in xrange(i-1, -1, -1):
        x = m | (1 << k)
        if x*x <= n:
            m = x
    return m

这只是一个二分搜索。将值m初始化为不超过平方根的2的最大幂，然后检查是否可以设置每个较小的位，同时保持结果不大于平方根。（按降序逐个检查位。）

对于n 的相当大的值（例如，大约10**6000，或大约20000 位），这似乎是：

比牛顿法实现更快described by user448810。
比gmpy2 中的my other answer 内置方法慢得多。
与长手平方根 described by nibot 相当，但速度稍慢。

所有这些方法都在这种大小的输入上成功，但在我的机器上，这个函数大约需要 1.5 秒，而 @Nibot 大约需要 0.9 秒，@user448810 大约需要 19 秒，而 gmpy2 内置方法需要更少超过一毫秒（！）。示例：

>>> import random
>>> import timeit
>>> import gmpy2
>>> r = random.getrandbits
>>> t = timeit.timeit
>>> t('i_sqrt(r(20000))', 'from __main__ import *', number = 5)/5. # This function
1.5102493192883117
>>> t('exact_sqrt(r(20000))', 'from __main__ import *', number = 5)/5. # Nibot
0.8952787937686366
>>> t('isqrt(r(20000))', 'from __main__ import *', number = 5)/5. # user448810
19.326695976676184
>>> t('gmpy2.isqrt(r(20000))', 'from __main__ import *', number = 5)/5. # gmpy2
0.0003599147067689046
>>> all(i_sqrt(n)==isqrt(n)==exact_sqrt(n)[0]==int(gmpy2.isqrt(n)) for n in (r(1500) for i in xrange(1500)))
True

这个函数可以很容易地推广，虽然它不是很好，因为我对m 的初始猜测没有那么精确：

def i_root(num, root, report_exactness = True):
    i = num.bit_length() / root
    m = 1 << i
    while m ** root < num:
        m <<= 1
        i += 1
    while m ** root > num:
        m >>= 1
        i -= 1
    for k in xrange(i-1, -1, -1):
        x = m | (1 << k)
        if x ** root <= num:
            m = x
    if report_exactness:
        return m, m ** root == num
    return m

但是，请注意gmpy2 也有一个i_root 方法。

事实上，这种方法可以适用于任何（非负的、递增的）函数f，以确定“f 的整数逆”。但是，要选择一个有效的初始值m，您仍然需要了解f。

编辑：感谢@Greggo 指出i_sqrt 函数可以重写以避免使用任何乘法。这产生了令人印象深刻的性能提升！

def improved_i_sqrt(n):
    assert n >= 0
    if n == 0:
        return 0
    i = n.bit_length() >> 1    # i = floor( (1 + floor(log_2(n))) / 2 )
    m = 1 << i    # m = 2^i
    #
    # Fact: (2^(i + 1))^2 > n, so m has at least as many bits
    # as the floor of the square root of n.
    #
    # Proof: (2^(i+1))^2 = 2^(2i + 2) >= 2^(floor(log_2(n)) + 2)
    # >= 2^(ceil(log_2(n) + 1) >= 2^(log_2(n) + 1) > 2^(log_2(n)) = n. QED.
    #
    while (m << i) > n: # (m<<i) = m*(2^i) = m*m
        m >>= 1
        i -= 1
    d = n - (m << i) # d = n-m^2
    for k in xrange(i-1, -1, -1):
        j = 1 << k
        new_diff = d - (((m<<1) | j) << k) # n-(m+2^k)^2 = n-m^2-2*m*2^k-2^(2k)
        if new_diff >= 0:
            d = new_diff
            m |= j
    return m

注意，通过构造，m << 1 的第 kth 位没有设置，因此可以使用按位或来实现 (m<<1) + (1<<k) 的加法。最终我将(2*m*(2**k) + 2**(2*k))写成(((m<<1) | (1<<k)) << k)，所以它是三个移位和一个按位或（然后是减法得到new_diff）。也许还有更有效的方法来获得这个？无论如何，这比乘以m*m 好得多！与上述比较：

>>> t('improved_i_sqrt(r(20000))', 'from __main__ import *', number = 5)/5.
0.10908999762373242
>>> all(improved_i_sqrt(n) == i_sqrt(n) for n in xrange(10**6))
True

【讨论】：

有一种方法可以消除平方根运算中的所有乘法，基本上您可以跟踪 n 和 m**2 之间的差异，并在 m 增加时向下调整该差异。如果您查找 sqrt 软件版本的源代码，例如netlib.org/fdlibm/e_sqrt.c 你会发现这个方法正在使用中（并且那个在 cmets 中有很好的解释）。
@greggo 太好了！我已经发布了整数平方根函数的改进（无乘法）版本——如果您有进一步的建议，请告诉我。非常感谢您的帮助！

【解决方案7】：

很抱歉回复晚了；我只是偶然发现了这个页面。万一将来有人访问此页面，python 模块 gmpy2 旨在处理非常大的输入，其中包括整数平方根函数。

例子：

>>> import gmpy2
>>> gmpy2.isqrt((10**100+1)**2)
mpz(10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001L)
>>> gmpy2.isqrt((10**100+1)**2 - 1)
mpz(10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000L)

当然，一切都会有“mpz”标签，但 mpz 与 int 兼容：

>>> gmpy2.mpz(3)*4
mpz(12)

>>> int(gmpy2.mpz(12))
12

请参阅my other answer，了解此方法相对于该问题的其他一些答案的性能的讨论。

下载：https://code.google.com/p/gmpy/

【讨论】：

【解决方案8】：

牛顿法在整数上效果很好：

def isqrt(n):
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

这将返回 x * x 不超过 n 的最大整数 x。如果你想检查结果是否正好是平方根，只需执行乘法来检查 n 是否是一个完美的平方。

我在my blog 讨论了这个算法以及其他三种计算平方根的算法。

【讨论】：

您可以使用 y = 1 << (n.bit_length()>>1) 获得更好的初始近似值（感谢 mathmandan）。
@greggo 这是个好主意，但它与 user448810 的其余算法不兼容。它使函数返回 100 的平方根 8。
是的，上述算法需要 x,y 估计值开始 > 而不是真正的平方根，然后向下计算。所以我的公式需要调整；在我的脑海中，y = 2 << ((n.bit_length()>>1) 应该可以工作，但可能是一种将其切得更细的方法。
@greggo 这几乎可以工作，但对于 n = 2、3、4 或 8 则失败。y = (2 ** ((n.bit_length()+1) // 2)) - 1 将适用于所有非负整数，包括 0。
y = 1<<b-1-(b+1)%2，其中b = math.frexp(n)[1]，等于ceil(log2(n))。适用于所有非负整数，包括 0。4 的最大幂

【解决方案9】：

长手平方根算法

事实证明，有一种计算平方根的算法可以手动计算，例如长除法。该算法的每次迭代都会产生结果平方根的一个数字，同时消耗您寻求其平方根的数字的两个数字。虽然算法的“长手”版本以十进制指定，但它适用于任何基础，二进制最容易实现，可能执行速度最快（取决于底层的 bignum 表示）。

由于该算法对数字逐位进行运算，因此对于任意大的完全平方会产生精确结果，而对于非完全平方，它可以产生与小数点右侧一样多的精度位数想要的。

“数学博士”网站上有两篇很好的文章解释了算法：

这是一个 Python 实现：

def exact_sqrt(x):
    """Calculate the square root of an arbitrarily large integer. 

    The result of exact_sqrt(x) is a tuple (a, r) such that a**2 + r = x, where
    a is the largest integer such that a**2 <= x, and r is the "remainder".  If
    x is a perfect square, then r will be zero.

    The algorithm used is the "long-hand square root" algorithm, as described at
    http://mathforum.org/library/drmath/view/52656.html

    Tobin Fricke 2014-04-23
    Max Planck Institute for Gravitational Physics
    Hannover, Germany
    """

    N = 0   # Problem so far
    a = 0   # Solution so far

    # We'll process the number two bits at a time, starting at the MSB
    L = x.bit_length()
    L += (L % 2)          # Round up to the next even number

    for i in xrange(L, -1, -1):

        # Get the next group of two bits
        n = (x >> (2*i)) & 0b11

        # Check whether we can reduce the remainder
        if ((N - a*a) << 2) + n >= (a<<2) + 1:
            b = 1
        else:
            b = 0

        a = (a << 1) | b   # Concatenate the next bit of the solution
        N = (N << 2) | n   # Concatenate the next bit of the problem

    return (a, N-a*a)

您可以轻松修改此函数以进行额外的迭代来计算平方根的小数部分。我最感兴趣的是计算大完全平方的根。

我不确定这与“整数牛顿法”算法相比如何。我怀疑牛顿的方法更快，因为它原则上可以在一次迭代中生成多位解，而“长手”算法每次迭代只生成一位解。

源代码：https://gist.github.com/tobin/11233492

【讨论】：

我为 Python3 重写了你的解决方案，让它快了 2.4 倍！最大的优化是重写 range 函数以进行一半的迭代，但我按摩每一行以挤出我能做到的任何性能。请查看我的版本here，感谢您提供了一个很好的起点。

【解决方案10】：

好像你可以这样检查：

if int(math.sqrt(n))**2 == n:
    print n, 'is a perfect square'

更新：

正如您所指出的，对于较大的 n 值，上述方法会失败。对于那些看起来很有希望的人，这是对示例 C 代码的改编，Martin Guy @ UKC，1985 年 6 月，用于维基百科文章 Methods of computing square roots 中提到的相对简单的二进制数字逐位计算方法：

from math import ceil, log

def isqrt(n):
    res = 0
    bit = 4**int(ceil(log(n, 4))) if n else 0  # smallest power of 4 >= the argument
    while bit:
        if n >= res + bit:
            n -= res + bit
            res = (res >> 1) + bit
        else:
            res >>= 1
        bit >>= 2
    return res

if __name__ == '__main__':
    from math import sqrt  # for comparison purposes

    for i in range(17)+[2**53, (10**100+1)**2]:
        is_perfect_sq = isqrt(i)**2 == i
        print '{:21,d}:  math.sqrt={:12,.7G}, isqrt={:10,d} {}'.format(
            i, sqrt(i), isqrt(i), '(perfect square)' if is_perfect_sq else '')

输出：

                    0:  math.sqrt=           0, isqrt=         0 (perfect square)
                    1:  math.sqrt=           1, isqrt=         1 (perfect square)
                    2:  math.sqrt=    1.414214, isqrt=         1
                    3:  math.sqrt=    1.732051, isqrt=         1
                    4:  math.sqrt=           2, isqrt=         2 (perfect square)
                    5:  math.sqrt=    2.236068, isqrt=         2
                    6:  math.sqrt=     2.44949, isqrt=         2
                    7:  math.sqrt=    2.645751, isqrt=         2
                    8:  math.sqrt=    2.828427, isqrt=         2
                    9:  math.sqrt=           3, isqrt=         3 (perfect square)
                   10:  math.sqrt=    3.162278, isqrt=         3
                   11:  math.sqrt=    3.316625, isqrt=         3
                   12:  math.sqrt=    3.464102, isqrt=         3
                   13:  math.sqrt=    3.605551, isqrt=         3
                   14:  math.sqrt=    3.741657, isqrt=         3
                   15:  math.sqrt=    3.872983, isqrt=         3
                   16:  math.sqrt=           4, isqrt=         4 (perfect square)
9,007,199,254,740,992:  math.sqrt=9.490627E+07, isqrt=94,906,265
100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,020,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,001:  math.sqrt=      1E+100, isqrt=10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,001 (perfect square)

【讨论】：

@wim：是的......我相信我对答案所做的最后一次更新解决了这个缺点，因此是一个非常有用的解决方案。
4 ** int(ceil(log(n, 4))) 构造仍然依赖于浮点数学，因此对于大输入可能会失败。请改用4 ** ((n - 1).bit_length() + 1 >> 1)，或者更好的是1 << ((n - 1).bit_length() + 1 & -2)。
@AndersKaseorg：我会调查的。 FWIW，bit_length() 在 Python 2 中不存在，这是用于开发当前答案中的代码的版本。
存在于2.7中。（并不是说我不相信你当时没有使用 2.7，但我希望我只是稍微乐观地说现在没有人为 pre-2.7 进行开发。）
(n - 1).bit_length() ↦ len(bin(n - 1)) - 2 是那个时代的典型解决方法。

【解决方案11】：

一种选择是使用decimal 模块，并以足够精确的浮点数进行：

import decimal

def isqrt(n):
    nd = decimal.Decimal(n)
    with decimal.localcontext() as ctx:
        ctx.prec = n.bit_length()
        i = int(nd.sqrt())
    if i**2 != n:
        raise ValueError('input was not a perfect square')
    return i

我认为应该可行：

>>> isqrt(1)
1
>>> isqrt(7**14) == 7**7
True
>>> isqrt(11**1000) == 11**500
True
>>> isqrt(11**1000+1)
Traceback (most recent call last):
  File "<ipython-input-121-e80953fb4d8e>", line 1, in <module>
    isqrt(11**1000+1)
  File "<ipython-input-100-dd91f704e2bd>", line 10, in isqrt
    raise ValueError('input was not a perfect square')
ValueError: input was not a perfect square

【讨论】：

您可以稍微改进这一点，让decimal 完成引发异常的工作。只需添加ctx.traps = [decimal.Rounded, decimal.Inexact] 设置陷阱条件，如果发生任何一种情况都会引发异常；无需自己测试和raise。您还可以通过将ctx.prec = n.bit_length() 更改为ctx.prec = int(math.ceil(math.log10(n))) + 1 之类的东西来显着提高性能，这会显着降低精度，但应该仍然始终提供足够的精度。当然，这假设您没有 math.isqrt 并且无法更新到 Python 3.8+ 来获取它。
@ShadowRanger 正如在其他一些 cmets 中所指出的，数学模块在这里不是一个选项，因为它不能处理任意大整数
@miracle173: math.isqrt 可以。其他 cmets（早于math.isqrt 的存在）指的是math.sqrt（或等效地，只是做** 0.5，两者都产生float 输出。math.isqrt 是纯的整数到整数，处理任意大的整数就好了。
@ShadowRanger 是的，你是对的。 math.isqrt 能够处理任意大的整数

【解决方案12】：

您的函数因大量输入而失败：

In [26]: isqrt((10**100+1)**2)

ValueError: input was not a perfect square

有一个recipe on the ActiveState site，希望它更可靠，因为它只使用整数数学。它基于早期的 StackOverflow 问题：Writing your own square root function

【讨论】：