Is this clang optimization a bug?

[Question title]: Is this clang optimization a bug?
[Posted]: 2018-09-20 12:35:36
[Question]:

I ran into an interesting problem when compiling some code with -O3 using clang on OSX High Sierra. The code is this:

#include <stdint.h>
#include <limits.h> /* for CHAR_BIT */
#include <stdio.h> /* for printf() */
#include <stddef.h> /* for size_t */

uint64_t get_morton_code(uint16_t x, uint16_t y, uint16_t z)
{
    /* Returns the number formed by interleaving the bits in x, y, and z, also
     * known as the morton code.
     *
     * See https://graphics.stanford.edu/~seander/bithacks.html#InterleaveTableObvious.
     */
    size_t i;
    uint64_t a = 0;

    for (i = 0; i < sizeof(x)*CHAR_BIT; i++) {
        a |= (x & 1U << i) << (2*i) | (y & 1U << i) << (2*i + 1) | (z & 1U << i) << (2*i + 2);
    }

    return a;
}

int main(int argc, char **argv)
{
    printf("get_morton_code(99,159,46) = %llu\n", get_morton_code(99,159,46));
    return 0;
}

When compiled with cc -O1 -o test_morton_code test_morton_code.c, I get the following output:

get_morton_code(99,159,46) = 4631995

which is correct. However, when compiled with cc -O3 -o test_morton_code test_morton_code.c, I get:

get_morton_code(99,159,46) = 4294967295

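which is wrong.

What is also odd is that in my real code the bug appears when switching from -O2 to -O3, whereas in the minimal working example above it appears when switching from -O1 to -O2.

Is this a bug in the compiler's optimizer, or am I doing something stupid that only shows up when the compiler optimizes more aggressively?

I'm using the following version of clang: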

snotdaqs-iMac:snoFitter snoperator$ cc --version
Apple LLVM version 9.1.0 (clang-902.0.39.1)
Target: x86_64-apple-darwin17.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

[Comments]:

Tags: c optimization clang compiler-optimization

[Answer 1]:

UndefinedBehaviorSanitizer is really helpful for catching this kind of mistake:

$ clang -fsanitize=undefined -O3 o3.c
$ ./a.out
o3.c:19:2: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int'
get_morton_code(99,159,46) = 4294967295

One possible fix is to replace each 1U with 1ULL; unsigned long long is at least 64 bits wide and can be shifted that far.
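As a minimal sketch of that fix, the function from the question could be rewritten as below. The name get_morton_code_ull is hypothetical and only chosen to keep it apart from the original; the only intended change is 1U becoming 1ULL, which makes every masked operand an unsigned long long before it is shifted.

```c
#include <limits.h>  /* for CHAR_BIT */
#include <stddef.h>  /* for size_t */
#include <stdint.h>

/* Hypothetical name; same loop as in the question, but with 1ULL masks. */
uint64_t get_morton_code_ull(uint16_t x, uint16_t y, uint16_t z)
{
    size_t i;
    uint64_t a = 0;

    for (i = 0; i < sizeof(x)*CHAR_BIT; i++) {
        /* x & (1ULL << i) has type unsigned long long (at least 64 bits),
         * so the largest shift count, 2*15 + 2 = 32, stays in range. */
        a |= (x & 1ULL << i) << (2*i)
           | (y & 1ULL << i) << (2*i + 1)
           | (z & 1ULL << i) << (2*i + 2);
    }

    return a;
}
```

Called with the arguments from the question, get_morton_code_ull(99, 159, 46) should return 4631995, the value the question reports as correct, regardless of optimization level.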

[Comments]:

Wow! That's really awesome!

[Answer 2]:

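When i is 15 in the loop, 2*i+2 is 32, so you are shifting an unsigned int by the number of bits in an unsigned int (32 on this platform), which is undefined behavior.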
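The arithmetic can be checked without performing any out-of-range shift. The sketch below only computes the shift counts; both function names are hypothetical helpers introduced for illustration, not part of the question's code.

```c
#include <limits.h>  /* for CHAR_BIT */
#include <stdint.h>

/* Largest shift count used on iteration i of the question's loop
 * (the z term, which is shifted by 2*i + 2). */
unsigned max_shift_count(unsigned i)
{
    return 2*i + 2;
}

/* Returns the first iteration whose largest shift count is greater than or
 * equal to the width of unsigned int, or -1 if every shift is in range. */
int first_out_of_range_iteration(void)
{
    const unsigned width = sizeof(unsigned int) * CHAR_BIT;
    unsigned i;

    for (i = 0; i < sizeof(uint16_t) * CHAR_BIT; i++) {
        if (max_shift_count(i) >= width)
            return (int)i;
    }

    return -1;
}
```

Assuming a 32-bit unsigned int, as on the x86_64 target in the question, first_out_of_range_iteration() returns 15: only the last iteration is affected, and only the z term.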

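You evidently intend to work in a 64-bit field, so cast the left-hand side of each shift to uint64_t.

The correct printf format for uint64_t is "get_morton_code(99,159,46) = %" PRIu64 "\n". PRIu64 is defined in the <inttypes.h> header.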
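Both suggestions can be sketched together as follows. The names get_morton_code_cast and format_morton_code are hypothetical and used only for illustration; the formatting helper writes into a caller-supplied buffer with snprintf so that the format string is easy to check, and it also shows the hexadecimal PRIx64 variant.

```c
#include <inttypes.h> /* for PRIu64 and PRIx64 */
#include <limits.h>   /* for CHAR_BIT */
#include <stddef.h>   /* for size_t */
#include <stdint.h>
#include <stdio.h>    /* for snprintf() */

/* Hypothetical name; same loop as in the question, but each masked value is
 * converted to uint64_t before the wide shift is applied. */
uint64_t get_morton_code_cast(uint16_t x, uint16_t y, uint16_t z)
{
    size_t i;
    uint64_t a = 0;

    for (i = 0; i < sizeof(x)*CHAR_BIT; i++) {
        /* The cast binds tighter than <<, so the shift is done in 64 bits. */
        a |= (uint64_t)(x & 1U << i) << (2*i)
           | (uint64_t)(y & 1U << i) << (2*i + 1)
           | (uint64_t)(z & 1U << i) << (2*i + 2);
    }

    return a;
}

/* Hypothetical helper: formats a code in decimal and in hexadecimal using
 * the <inttypes.h> macros. Returns what snprintf() returns. */
int format_morton_code(char *buf, size_t size, uint64_t code)
{
    return snprintf(buf, size, "%" PRIu64 " (0x%" PRIx64 ")", code, code);
}
```

For the arguments in the question, the formatted text should read 4631995 (0x46adbb).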

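[Comments]:

Thanks! I think 2*i is correct, though. You pick up the extra factor of i because the bit you are masking is already at the i-th position.
PRId64 is the counterpart of d. For uint64_t you should use PRIu64 instead.
@a3f: Thanks, fixed. For this purpose I would recommend PRIx64, since it makes the bits very easy to see.
Can you cite the standard where it says that overflowing a shift on an unsigned type is undefined behavior? I'm curious: since overflow on unsigned types is defined, I don't see why it wouldn't be for shifts.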