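Integer Overflow

Integer overflow is familiar to most of us; what may be less familiar is how gcc treats signed integer overflow.

Let's start with what the standard says. For unsigned integers, it states: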

A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned
integer type is reduced modulo the number that is one greater than the
largest value that can be represented by the resulting type.

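In other words, unsigned arithmetic always wraps around, and the result stays within the range the unsigned type can represent.

Signed integer overflow, on the other hand, is undefined behavior according to the standard: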

If an exceptional condition occurs during the evaluation of an expression (that is, if the
result is not mathematically defined or not in the range of representable values for its
type), the behavior is undefined.

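In practice (on two's complement hardware), when an addition overflows, the sum of two positive numbers comes out negative, because the true result exceeds the largest representable positive value. Wikipedia puts it this way: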

Most computers distinguish between two kinds of overflow conditions. A carry occurs when the result of an addition or subtraction, considering the operands and result as unsigned numbers, does not fit in the result. Therefore, it is useful to check the carry flag after adding or subtracting numbers that are interpreted as unsigned values. An overflow proper occurs when the result does not have the sign that one would predict from the signs of the operands (e.g. a negative result when adding two positive numbers). Therefore, it is useful to check the overflow flag after adding or subtracting numbers that are represented in two’s complement form (i.e. they are considered signed numbers).

Precisely because the behavior is undefined, gcc is free to perform optimizations that may surprise you. Consider the following example:

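[c]
#include <stdio.h>
#include <limits.h>

/* Intended as an overflow check: is a + 1 still greater than a? */
int wrap(int a) {
    return (a + 1 > a);
}

int main(void) {
    printf("%s\n", wrap(INT_MAX) ? "no wrap" : "wrapped");
    return 0;
}
[/c]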

The intent is easy to follow, yet gcc is entitled to generate the following code for wrap():

00000000004004f0 <wrap>:
4004f0: b8 01 00 00 00 mov $0x1,%eax
4004f5: c3 retq

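Because gcc assumes that signed integers never overflow, "a + 1 > a" is always true, so it simply returns 1! This fully conforms to the standard, but not to our expectations: we wanted wrap() to detect whether the addition overflows.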

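gcc has several command-line options that deal with this: -fwrapv, -fstrict-overflow/-fno-strict-overflow, and -Wstrict-overflow. Put simply, -fstrict-overflow tells the compiler that signed integer overflow is undefined, so it may assume overflow never happens and optimize on that basis. -fwrapv, by contrast, says that signed integer overflow is defined: it wraps, and the compiler must generate code according to that definition. The gcc documentation notes: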

Using -fwrapv means that integer signed overflow is fully defined: it wraps. When -fwrapv is used, there is no difference between -fstrict-overflow and -fno-strict-overflow for integers. With -fwrapv certain types of overflow are permitted. For example, if the compiler gets an overflow when doing arithmetic on constants, the overflowed value can still be used with fwrapv, but not otherwise.

If we add -fno-strict-overflow and compile the code above again, the result is clearly different:

00000000004004f0 <wrap>:
4004f0: 8d 47 01 lea 0x1(%rdi),%eax
4004f3: 39 f8 cmp %edi,%eax
4004f5: 0f 9f c0 setg %al
4004f8: 0f b6 c0 movzbl %al,%eax
4004fb: c3 retq

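On two's complement machines, -fwrapv and -fno-strict-overflow differ only subtly: -fno-strict-overflow merely tells the compiler not to optimize on the assumption that overflow cannot happen, whereas -fwrapv explicitly defines what overflow does.

The Linux kernel once hit a related bug, which Linus himself fixed. He wrote: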

It looks like ‘fwrapv’ generates more temporaries (possibly for the code
that treies to enforce the exact twos-complement behavior) that then all
get optimized back out again. The differences seem to be in the temporary
variable numbers etc, not in the actual code.

So fwrapv really is different from fno-strict-pverflow, and disturbs the
code generation more.

IOW, I’m convinced we should never use fwrapv. It’s clearly a buggy piece
of sh*t, as shown by our 4.1.x experiences. We should use
-fno-strict-overflow.

That is why the Linux kernel is built with -fno-strict-overflow.