Answers to "What is a 16 byte signed integer data type?"

【Question title】: What is a 16 byte signed integer data type?
【Posted】: 2018-12-18 20:05:37
【Question】:

I wrote this program to test which data types arbitrary integer literals are evaluated as. It was inspired by reading some other questions on StackOverflow.

How do I define a constant equal to -2147483648?

Why do we define INT_MIN as -INT_MAX - 1?

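In these questions we have an issue: the programmer wants to write INT_MIN as -2^31, but 2^31 is actually a literal and - is the unary negation operator. Since INT_MAX is usually 2^31 - 1 with a 32-bit int, the literal 2^31 cannot be represented as an int, so it is given a larger data type. The second answer to the third question has a chart according to which the data type of an integer literal is determined. The compiler walks down the list from the top until it finds a data type that can hold the literal.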

Suffix       Decimal constants
none         int
             long int
             long long int
=============================================

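In my little program, I define a macro that returns the "name" of a variable, literal, or expression as a C string. Basically, it returns the text passed to the macro, exactly as you see it in the code editor. I use this to print the literal expression.

I want to determine the data type of the expression, that is, what it evaluates to. I have to be a little clever about how I do this. How can we determine the data type of a variable or expression in C? I concluded that only two "bits" of information are needed: the width of the data type in bytes, and the signedness of the data type.

I use the sizeof() operator to determine the width of the data type in bytes. I also use another macro to determine whether the data type is signed. typeof() is a GNU compiler extension that yields the data type of a variable or expression, but I cannot read that type directly. So I cast -1 to that data type: if it is a signed type, the value is still -1; if it is an unsigned type, the value becomes the maximum of that type (UINT_MAX in the case of unsigned int).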

#include <stdio.h>   /* C standard input/output - for printf()     */
#include <stdlib.h>  /* C standard library      - for EXIT_SUCCESS */

/**
 * Returns the name of the variable or expression passed in as a string.
 */
#define NAME(x) #x

/**
 * Returns 1 if the passed in expression is a signed type.
 * -1 is cast to the type of the expression.
 * If it is signed, -1 < 0 == 1 (TRUE)
 * If it is unsigned, UMax < 0 == 0 (FALSE)
 */
#define IS_SIGNED_TYPE(x) ((typeof(x))-1 < 0)

int main(void)
{

    /* What data type is the literal -9223372036854775808? */

    printf("The literal is %s\n", NAME(-9223372036854775808));
    printf("The literal takes up %u bytes\n", sizeof(-9223372036854775808));
    if (IS_SIGNED_TYPE(-9223372036854775808))
        printf("The literal is of a signed type.\n");
    else
        printf("The literal is of an unsigned type.\n");

    return EXIT_SUCCESS;
}

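As you can see, I am testing -2^63 to see what data type it has. The problem is that in ISO C90, the "largest" data type for integer literals appears to be long long int, if we can believe the chart. As we all know, long long int has a numeric range of -2^63 to 2^63 - 1 on a modern 64-bit system. However, the - above is the unary negation operator and not really part of the integer literal. I am trying to determine the data type of 2^63, which is too big for long long int. I am trying to provoke an error in C's type system. This is intentional and for educational purposes only.

I compile and run the program. I use -std=gnu99 instead of -std=c99 because I am using typeof(), a GNU compiler extension that is not actually part of the ISO C99 standard. I get the following output: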

$ gcc -m64 -std=gnu99 -pedantic experiment.c
$
$ ./a.out
The literal is -9223372036854775808
The literal takes up 16 bytes
The literal is of a signed type.

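I see that the integer literal equivalent to 2^63 evaluates to a 16 byte signed integer type! As far as I know, there is no such data type in the C programming language. I also do not know of any Intel x86_64 processor that has a 16 byte register to store such an rvalue. Please correct me if I am wrong. Can someone explain what is going on here? Why is there no overflow? Also, is it possible to define a 16 byte data type in C? How would you do it?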

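【Comments】:

Width and signedness are not enough to identify a type, even an integer type. For example, long is almost always the same size as either int or long long.
Well, clearly neither long nor long long is 16 bytes! What do you suggest for identifying integer data types?
I believe some versions of gcc implement a 128-bit type, although it is usually "emulated in software" since, as you say, most CPUs have neither registers nor ALU operations of that size.
If you want to make decisions based on the type of an expression, _Generic is the standard way.
To print the result of sizeof, you should use %zu.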

Tags: c++ c types integer

【Answer 1】:

Your platform likely has __int128, and 9223372036854775808 is acquiring that type.

A simple way to get a C compiler to print a type name is something like:

int main(void)
{

    #define LITERAL (-9223372036854775808)
    _Generic(LITERAL, struct {char x;}/*can't ever match*/: "");

}

On my x86_64 Linux, the above generates an error: ‘_Generic’ selector of type ‘__int128’ is not compatible with any association message, implying that __int128 is indeed the type of the literal.

(That makes the warning: integer constant is so large that it is unsigned wrong. Well, gcc isn't perfect.)

【Comments】:

@M.M Because the type is clearly signed.
@M.M The OP's question even includes a signedness test showing that the resulting type is signed.

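【Answer 2】:

After some digging, here is what I found. I converted the code to C++, assuming that C and C++ behave similarly in this case. I wanted to create a template function that accepts any data type. I use __PRETTY_FUNCTION__, a GNU compiler extension that yields a C string containing the "prototype" of the function, by which I mean the return type, the name, and the formal parameters. I am interested in the formal parameters. Using this technique, I was able to determine the data type of the passed-in expression exactly, without guessing!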

#include <iostream>  /* C++ standard streams - for std::cout */

/**
 * This is a templated function.
 * It accepts a value "object" of any data type, which is labeled as "T".
 *
 * __PRETTY_FUNCTION__ is a GNU compiler extension: a C-string holding the
 * "pretty" name of the enclosing function, including the function's
 * return type and the types of its formal parameters.
 *
 * I'm using __PRETTY_FUNCTION__ to find out the data type of the expression
 * passed to the function. T is deduced at compile time; the resulting
 * string is printed at run time.
 */
template<typename T>
void foo(T value)
{
    (void)value;  /* only the deduced type T matters here */
    std::cout << __PRETTY_FUNCTION__ << std::endl;
}

int main()
{
    foo(5);
    foo(-9223372036854775808);
    return 0;
}

Compiling and running it, I get this output:

$ g++ -m64 -std=c++11 experiment2.cpp
$
$ ./a.out
void foo(T) [with T = int]
void foo(T) [with T = __int128]

I see that the passed-in expression is of type __int128. Apparently, this is a GNU compiler-specific extension and not part of the C standard.

Why isn't there int128_t?

https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/_005f_005fint128.html

https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/C-Extensions.html#C-Extensions

How is a 16 byte data type stored on a 64 bit machine

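【Comments】:

For C++, Scott Meyers recommends printing the type via a compiler error on a declared but undefined template: template<class T> class TP; TP<decltype(LITERAL)> tp;. My answer basically uses a _Generic-based C version of that strategy. (Anyway, you may not want to answer C-tagged questions with C++ code. Everyone knows we C programmers often hate C++ :D).
@PSkocik No problem, added the c++ tag! In any case, I tend to use C++ features interchangeably with C features. If I want to use something from C++ that C lacks, I just do an #ifdef __cplusplus and drop it right into the .c source file, or I use extern "C" to link C and C++ code together. Well, just use whatever tools you have at hand.
To me, it is like using any GNU compiler extension or inline assembly code in C. So yes, it is not pure C.
And while C++ is a programming language, ++C is C code mixed with C++ code and various non-standard extensions! :D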

【Answer 3】:

With all warnings enabled (-Wall), gcc will issue the warning: integer constant is so large that it is unsigned. Gcc assigns this integer constant the type __int128, and sizeof(__int128) = 16.
You can check this with a _Generic macro:

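#include <stdio.h>  /* for printf() */

/* Maps an expression to the name of its type; only the listed types match. */
#define typestring(v) _Generic((v), \
    long long: "long long", \
    unsigned long long: "unsigned long long", \
    __int128: "__int128" \
    )

int main(void)
{
    printf("Type is %s\n", typestring(-9223372036854775808));
    return 0;
}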

Type is __int128

Or via a warning from printf:

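#include <stdio.h>  /* for printf() */

int main(void) {
    /* Deliberately wrong format: the -Wformat diagnostic names the argument's type. */
    printf("%s", -9223372036854775808);
    return 0;
}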

Compiling it produces the warning:

warning: format '%s' expects argument of type 'char *', but argument 2 has type '__int128' [-Wformat=]

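【Comments】:

A neat way to find out the data type of a passed-in expression or literal!
No cast is involved. I think you mean to say that gcc uses the type __int128 for this constant.
Right, it just has the type __int128.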