Why is strconv.ParseUint so slow compared to strconv.Atoi? Answers

【Question title】: Why is strconv.ParseUint so slow compared to strconv.Atoi?
【Posted】: 2023-06-29 07:20:01
【Question】:

I am benchmarking unmarshaling from string to int and to uint with the following code:

package main

import (
    "strconv"
    "testing"
)

func BenchmarkUnmarshalInt(b *testing.B) {
    for i := 0; i < b.N; i++ {
        UnmarshalInt("123456")
    }
}

func BenchmarkUnmarshalUint(b *testing.B) {
    for i := 0; i < b.N; i++ {
        UnmarshalUint("123456")
    }
}

func UnmarshalInt(v string) int {
    i, _ := strconv.Atoi(v)
    return i
}

func UnmarshalUint(v string) uint {
    i, _ := strconv.ParseUint(v, 10, 64)
    return uint(i)
}

Results:

Running tool: C:\Go\bin\go.exe test -benchmem -run=^$ myBench/main -bench .

goos: windows
goarch: amd64
pkg: myBench/main
BenchmarkUnmarshalInt-8     99994166            11.7 ns/op         0 B/op          0 allocs/op
BenchmarkUnmarshalUint-8    54550413            21.0 ns/op         0 B/op          0 allocs/op

Is it really possible that the second one (uint) is almost twice as slow as the first one (int)?

【Question comments】:

Tags: performance go type-conversion benchmarking microbenchmark

【Solution 1】:

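Yes, it is possible. strconv.Atoi has a fast path that is taken when the input string is shorter than 19 characters (or shorter than 10 if int is 32 bits). Any decimal number that short is guaranteed to fit in an int, so the fast path does not need to check for overflow, and that makes it faster.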

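If you change the test number to "1234567890123456789" (assuming a 64-bit int), your int benchmark becomes slightly slower than the uint benchmark, because the fast path can no longer be used. On my machine, the signed version takes 37.6 ns/op and the unsigned version takes 31.5 ns/op.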

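Here is the modified benchmark code (note that I added a variable that sums the parsed results, in case the compiler gets clever and optimizes the calls away).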

package main

import (
        "fmt"
        "strconv"
        "testing"
)

const X = "1234567890123456789"

func BenchmarkUnmarshalInt(b *testing.B) {
        var T int
        for i := 0; i < b.N; i++ {
                T += UnmarshalInt(X)
        }
        fmt.Println(T)
}

func BenchmarkUnmarshalUint(b *testing.B) {
        var T uint
        for i := 0; i < b.N; i++ {
                T += UnmarshalUint(X)
        }
        fmt.Println(T)
}

func UnmarshalInt(v string) int {
        i, _ := strconv.Atoi(v)
        return i
}

func UnmarshalUint(v string) uint {
        i, _ := strconv.ParseUint(v, 10, 64)
        return uint(i)
}

For reference, the code of strconv.Atoi in the standard library currently looks like this:

func Atoi(s string) (int, error) {
    const fnAtoi = "Atoi"

    sLen := len(s)
    if intSize == 32 && (0 < sLen && sLen < 10) ||
        intSize == 64 && (0 < sLen && sLen < 19) {
        // Fast path for small integers that fit int type.
        s0 := s
        if s[0] == '-' || s[0] == '+' {
            s = s[1:]
            if len(s) < 1 {
                return 0, &NumError{fnAtoi, s0, ErrSyntax}
            }
        }

        n := 0
        for _, ch := range []byte(s) {
            ch -= '0'
            if ch > 9 {
                return 0, &NumError{fnAtoi, s0, ErrSyntax}
            }
            n = n*10 + int(ch)
        }
        if s0[0] == '-' {
            n = -n
        }
        return n, nil
    }

    // Slow path for invalid, big, or underscored integers.
    i64, err := ParseInt(s, 10, 0)
    if nerr, ok := err.(*NumError); ok {
        nerr.Func = fnAtoi
    }
    return int(i64), err
}

【Discussion】:

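Oh. Thanks a lot. I am deciding whether to use int or uint for the model IDs in my project. The IDs will be marshaled and unmarshaled many times by the JSON/GraphQL adapter, so if I keep them under 19 characters it will definitely be much faster. Am I right? What do you suggest? The new approach in Gorm 2 is to use uints for IDs.
10 nanoseconds will not make any difference to your overall unmarshaling time.
What do you suggest for the IDs? int or uint, @Jimb?
It is better to benchmark and profile the whole application before micro-optimizing this detail. If marshaling IDs turns out to be a bottleneck, then consider switching between uint and int.
It literally does not matter. If these are opaque identifiers that are not used in any calculation, then uint is fine.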