May 25 2012

Floyd's cycle-finding algorithm

Floyd’s cycle-finding algorithm 是一个经典的算法，用来检测一个链表中是否有环这种问题。因为它使用了两个指针，快的指针被称为兔子，慢的指针被称为乌龟，所以这个算法又被称为“龟兔算法”。

Emacs 源代码中有它的一个完美实现，我们可以看 src/eval.c 中的 signal_error()。

[c]
/ Signal `error’ with message S, and additional arg ARG.
If ARG is not a genuine list, make it a one-element list. /

void
signal_error (s, arg)
char *s;
Lisp_Object arg;
{
Lisp_Object tortoise, hare;

hare = tortoise = arg;
while (CONSP (hare))
{
hare = XCDR (hare);
if (!CONSP (hare))
break;

  hare = XCDR (hare);
  tortoise = XCDR (tortoise);

  if (EQ (hare, tortoise))
    break;
}

if (!NILP (hare))
arg = Fcons (arg, Qnil); / Make it a list. /

xsignal (Qerror, Fcons (build_string (s), arg));
}
[/c]

上面有一些 lisp “词汇” 可能你看不懂，翻译成你看得懂的代码如下。

[c]

bool
list_has_cycle(NODE list)
{
NODE tortoise, *hare;

hare = tortoise = list;
while (hare)
{
hare = hare->next;
if (!hare)
break;

  hare = hare-&gt;next;
  tortoise = tortoise-&gt;next;

  if (hare == tortoise)
    break;
}

if (hare)
return true;

return false;
}
[/c]

May 23 2012

整数溢出

整数溢出大家都不陌生，可能陌生的是 gcc 对带符号整数溢出的处理。

先看标准怎么说，对于无符号整数，标准如此描述：

A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned
integer type is reduced modulo the number that is one greater than the
largest value that can be represented by the resulting type.

换句话说，无符号整数的溢出总是 wrap 回去的，但依旧在无符号整数的表示范围内。

对于带符号整数的溢出，标准规定是未定义行为：

If an exceptional condition occurs during the evaluation of an expression (that is, if the
result is not mathematically defined or not in the range of representable values for its
type), the behavior is undefined.

实际上（使用二进制补码），如果溢出的话，两个正数结果会变成负数，超出了正数本身的范围。wikipedia 中如此说：

Most computers distinguish between two kinds of overflow conditions. A carry occurs when the result of an addition or subtraction, considering the operands and result as unsigned numbers, does not fit in the result. Therefore, it is useful to check the carry flag after adding or subtracting numbers that are interpreted as unsigned values. An overflow proper occurs when the result does not have the sign that one would predict from the signs of the operands (e.g. a negative result when adding two positive numbers). Therefore, it is useful to check the overflow flag after adding or subtracting numbers that are represented in two’s complement form (i.e. they are considered signed numbers).

正是因为“未定义”，所以 gcc 会大胆地做出一些可能令你吃惊的优化。看下面的例子：

[c]

include

int wrap(int a) {
return (a + 1 > a);
}

int main(void) {
printf(“%sn”, wrap(INT_MAX) ? “no wrap” : “wrapped”);
}
[/c]

很好理解，可是 gcc 完全可以如此生成 wrap() 函数的代码：

00000000004004f0 :
4004f0: b8 01 00 00 00 mov $0x1,%eax
4004f5: c3 retq

因为 gcc 假设了带符号的整数从来不会溢出，所以”a + 1 > a” 总会是真，所以直接返回1！这么做完全符合标准，但不符合我们的期待。我们期望 wrap() 函数能够检测是否会溢出。

为此，gcc 引入了几个相关的命令行选项：-fwrapv，-fstrict-overflow/-fno-strict-overflow，-Wstrict-overflow。简单地说，-fstrict-overflow 就是告诉编译器，带符号整数溢出是未定义的，你可以假设它不会发生，从而继续做优化。而 -fwrapv 是说，带符号整数的溢出是定义好的，就是 wrap，你按照这个定义来编译。gcc 文档中提到：

Using -fwrapv means that integer signed overflow is fully defined: it wraps. When -fwrapv is used, there is no difference between -fstrict-overflow and -fno-strict-overflow for integers. With -fwrapv certain types of overflow are permitted. For example, if the compiler gets an overflow when doing arithmetic on constants, the overflowed value can still be used with fwrapv, but not otherwise.

我们加上 -fno-strict-overflow 之后再去编译上面的代码，结果明显不同：

00000000004004f0 :
4004f0: 8d 47 01 lea 0x1(%rdi),%eax
4004f3: 39 f8 cmp %edi,%eax
4004f5: 0f 9f c0 setg %al
4004f8: 0f b6 c0 movzbl %al,%eax
4004fb: c3 retq

而对于使用二进制补码的机器来说，-fwrapv 和 -fno-strict-overflow 只有细微的区别： -fno-strict-overflow 只是说不去优化，而-fwrapv 明确地定义了溢出的行为。

Linux 内核中曾经出现一个相关的 bug，是 Linus 大神出手搞定了，他说道：

It looks like ‘fwrapv’ generates more temporaries (possibly for the code
that treies to enforce the exact twos-complement behavior) that then all
get optimized back out again. The differences seem to be in the temporary
variable numbers etc, not in the actual code.

So fwrapv really is different from fno-strict-pverflow, and disturbs the
code generation more.

IOW, I’m convinced we should never use fwrapv. It’s clearly a buggy piece
of sh*t, as shown by our 4.1.x experiences. We should use
-fno-strict-overflow.

所以编译 Linux 内核时使用的是 -fno-strict-overflow。

May 15 2012

A poem about division

From: Hacker’s Delight Author: Henry S. Warren

I think that I shall never envision
An op unlovely as division.

An op whose answer must be guessed
And then, through multiply, assessed;

An op for which we dearly pay,
In cycles wasted every day.

Division code is often hairy;
Long division’s downright scary.

The proofs can overtax your brain,
The ceiling and floor may drive you insane.

Good code to divide takes a Knuthian hero,
But even God can’t divide by zero!

May 12 2012

为什么 bool 只是一个宏

可能你也会感到吃惊，在C99/C11标准中，bool 的定义居然是个宏，而不是 typedef 类型！

C11 第 7.18 节中明确提到： The macro “bool” expands to _Bool.

而且 false 和 true 也是宏……不过，它下面接着有解释：

Notwithstanding the provisions of 7.1.3, a program may undefine and perhaps then
redefine the macros bool, true, and false.

可见之所以把它们定义成宏是为了让用户可以重定义，想必在C99 之前应该有不少C程序都自己定义了 bool，false 和 true 吧！而且第 7.31.9 节中说这个以后会去掉：

The ability to undefine and perhaps then redefine the macros bool, true, and false is
an obsolescent feature.

你要是感到不爽，你可以这么做：
[c]

undef bool

define bool bool

typedef _Bool bool;
[/c]

May 4 2012

A Programmer's Poem

From: http://edweissman.com/53640595
Author: Ed Weissman

My friends all think
that I’m a neb
Cause I spend much time
here on the web

The good things here
I do not abuse
Except lots of time
on hacker news

I don’t read reddit
I will not digg
I’m not on facebook
My work’s too big

I do not text
I do not tweet
I just work on
Things that are neat

I check email
throughout the day
But there are no games
that I will play

My phone’s on vibrate
I do not chat
My work is really
Where it’s at

Knuth and Turing
are my big heroes
I love to move
Ones and zeros

My head is down
I’m in the mode
Don’t bother me
I have to code

Those who need me
leave voicemail
I’m much too busy
trying not to fail

I learn on-line
and from my schools
But I must avoid
all sorts of trolls

I can’t believe
I wrote this ode
When I have so much
I have to code

I’m not an addict
I have no drug
I’ve got to go
To fix a bug

Mar 20 2012

用宏实现 alignof

我们知道在C2011标准中引入了 alignof 操作符，但是在 gcc 和 glibc 完全支持 C2011 之前，我们仍然不能使用它。其实我们可以自己用宏实现一个（出自c.l.c）：

define AlignOf(type) (offsetof(struct { char c; type t; }, t))

它巧妙地把参数指定的类型放入一个结构体中，在此结构体的最前面镶入一个 char 类型，这样该成员在结构体中的偏移就是它对齐的位置。看例子：

[c]

include

define AlignOf(type) (offsetof(struct { char c; type t; }, t))

int main(void)
{
struct foo { char a; int b; long c; };
typedef struct foo bar[128];

printf("sizeof foo: %lun", sizeof(struct foo));
printf("AlignOf foo: %lun", AlignOf(struct foo));
printf("sizeof bar: %lun", sizeof(bar));
printf("AlignOf bar: %lun", AlignOf(bar));
return 0;

}
[/c]
输出：
sizeof foo: 16
AlignOf foo: 8
sizeof bar: 2048
AlignOf bar: 8

Mar 7 2012

关于 extern inline

（本文是《C语言编程艺术》的一部分，转载请注明出处，勿用于商业用途。）

大家一定对C语言 inline 关键字不陌生，甚至经常用到 static inline 的函数。可能感到陌生的是 extern inline。C11 标准在6.7.4 p7 中对此进行了描述，不过比较生涩难读。简单地讲，static line 和 extern inline 的区别，从名字上也能看得出来，就是编译单元之外可见性的问题。把一个普通函数给内联掉之后，一个显著的区别就是外部可见性没了，怎么能同时既能内联又能保留外部的可见性呢？为了解决这个问题就有了extern inline。

static inline 是说这个函数就这么一个，而且没有外部可见性，在所有的编译单元中大家必须共用这么一个定义，这也是为什么 static inline 通常要放到头文件中的原因；而 extern inline 就不一样了，它是说这个函数虽然只有一个定义，但是既有内联的版本，也有非内联的版本，如果你能看得到它的定义。即在同一个编译单元中，那么你就可以用它的内联版本；看不到的话，你就可以用非内联版本，即和其它普通函数一模一样。

而如果既不带 static 也不带 extern 的话，含义又不同了：前面的 extern inline 只需要定义一次，只是不同的地方看到的版本不同，有的地方看到的是内联的版本，而有的地方看到的只是个 external reference。而仅用 inline 的话，就变成了要定义两次，带 inline 关键字的这个定义就是内联版本，在这个编译单元中都用这个版本，而在外部，还是使用普通的非内联定义。换句话说，带 inline 的定义遮盖住了外部的定义。既然有两个不同的版本，那么也就需要保持这两个版本的定义相同，否则就会有问题。

以上是标准C的定义，而 GNU89 的定义就不同了，基本上是把”extern inline”和”inline”的含义给交换了，“static line” 的含义都是相同的。看下面的标准C的例子：

[c]
//a.c
//C99
extern inline int max(int a, int b)
{
return a > b ? a : b;
}

int a = 10;
int b = 20;
int foo(void)
{
return max(a, b);
}

extern int bar(void);

int main(void)
{
return foo() + bar();
}

//b.c
//C99
extern int max(int a, int b);

int bar(void)
{
int a = 10;
int b = 20;
return max(a, b);
}
[/c]

我们这么编译：
% gcc -std=c99 -O2 -Wall -W a.c b.c -o c

而我们如果把它编译成 GNU89 的话，就会报错：
% gcc -std=gnu89 -O2 -Wall -W a.c b.c -o c
/tmp/ccAJTzwY.o: In function bar': b.c:(.text+0xb): undefined reference tomax’
collect2: ld returned 1 exit status

很明显，因为 GNU89 中的 extern inline 需要两个定义，明显我们少了一个。修改后的代码如下：

[c]
//a.c
//GNU89
extern inline int max(int a, int b)
{
return a > b ? a : b;
}

int a = 10;
int b = 20;
int foo(void)
{
return max(a, b);
}

extern int bar(void);

int main(void)
{
return foo() + bar();
}

// b.c
//GNU89
int max(int a, int b)
{
return a > b ? a : b;
}

int bar(void)
{
int a = 10;
int b = 20;
return max(a, b);
}
[/c]

glibc 中就用到了 GNU89 的extern inline 特性，在 ctype.h 中，tolower()/toupper() 的定义是：
[c]

ifdef __USE_EXTERN_INLINES

extern_inline int NTH (tolower (int c))
{
return c >= -128 && c = -128 && c = -128 && c = -128 && c < 256 ? __ctype_toupper[c] : c;
}
[/c]

顺便说一句，gcc 提供了-fgnu89-inline 和 -fno-gnu89-inline 选项可在编译时控制上述 inline 的行为。

Stackoverflow 上有人做了一个很好的总结，我翻译了一下：

GNU89:

"inline"： 函数可能会被内联掉，非内联的版本总是会生成，而且外部可见，因此内联的定义在本编译单元中只能有一次，其它地方看到的是非内联版本。

"static inline"：不会生成外部可见的非内联版本，可能会生成一个 static 的非内联版本。它当然可以被定义多次，因为外部看不到它。

"extern inline"：不会生成一个非内联的版本，但是可以调用一个非内联版本（因此你必须在其它编译单元中定义它）。只能有一个定义的规则当然也适用，非内联版本和内联版本的代码必须是一样的。

C99 (or GNU99):

"inline"：和 GNU89 的 "extern inline" 一样，没有外部可见的函数生成，但是外部可见的函数必须存在，因为有可能会用到它。

"extern inline"：和 GNU89 的 "inline" 一样， 会生成外部可见的代码，最多一个编译单元可以使用它。

"static inline"：和 GNU89 的 "static inline" 一样，这是唯一一个在 GNU89 和 C99之间可移植的。

最后，我们可以看大神 Linus Torvalds 如何描述 extern inline：

“static inline” means “we have to have this function, if you use it, but don’t inline it, then make a static version of it in this compilation unit”. “extern inline” means “I actually have an extern for this function, but if you want to inline it, here’s the inline-version”.

参考资料：
http://www.greenend.org.uk/rjk/tech/inline.html
http://gcc.gnu.org/ml/gcc/2006-11/msg00006.html
http://stackoverflow.com/questions/6312597/is-inline-without-static-or-extern-ever-useful-in-c99
http://www.cnblogs.com/cnmaizi/archive/2011/01/19/1939686.html

Dec 21 2011

C 语言新标准——C11

12月8号，ISO 发布了新的 C 语言的新标准——C11，之前被称为C1X，官方名称 ISO/IEC 9899:2011。新的标准可以这里下载。这个标准是基于今年4月发布的名为 N1570 的草稿，但据说并未做任何改动。

根据 wikipedia 记载，相比 C99，C11 做了以下重要的更新：

1. 对齐处理操作符 alignof，函数 aligned_alloc()，以及头文件 <stdalign.h>。见 7.15 节。

2. Noreturn 函数标记，类似于 gcc 的 _attribute((noreturn))。例子：

    _Noreturn void thrd_exit(int res);

3. _Generic 关键词，有点儿类似于 gcc 的 typeof。例子：

#define cbrt(X) _Generic((X), long double: cbrtl,
                              default: cbrt,
                              float: cbrtf)(X)

4. 静态断言（ static assertions），_Static_assert()，在解释 #if 和 #error 之后被处理。例子：

    _Static_assert(FOO > 0, “FOO has a wrong value”);

5. 删除了 gets() 函数，C99中已经将此函数被标记为过时，推荐新的替代函数 gets_s()。

6. 新的 fopen() 模式，(“…x”)。类似 POSIX 中的 O_CREAT|O_EXCL，在文件锁中比较常用。

7. 匿名结构体/联合体，这个早已经在 gcc 中了，我们并不陌生，定义在 6.7.2.1 p13。

8. 多线程支持，包括：_Thread_local，头文件 <threads.h>，里面包含线程的创建和管理函数（比如 thrd_create()，thrd_exit()），mutex （比如 mtx_lock()，mtx_unlock()）等等，更多内容清参考 7.26 节。

9. _Atomic类型修饰符和头文件 <stdatomic.h>，见 7.17 节。

10. 带边界检查（Bounds-checking）的函数接口，定义了新的安全的函数，例如 fopen_s()，strcat_s() 等等。更多参考 Annex K。

11. 改进的 Unicode 支持，新的头文件 <uchar.h> 等。

12. 新增 quick_exit() 函数，作为第三种终止程序的方式，当 exit() 失败时可以做最少的清理工作（deinitializition），具体见 7.22.4.7。

13. 创建复数的宏， CMPLX()，见 7.3.9.3。

14. 更多浮点数处理的宏（More macros for querying the characteristics of floating point types, concerning subnormal floating point numbers and the number of decimal digits the type is able to store）。

15. struct timespec 成为 time.h 的一部分，以及宏 TIME_UTC，函数 timespec_get()。

gcc 4.6 中新增了新的选项 -std=c1x 来支持这一标准，更多支持参考这里。但是 glibc 相关的部分尚未实现，所以你还不能马上在 Linux 上体验最新的 C11 特性。

Dec 17 2011

Polyglot

可能你之前也见过这种程序：它是用两种以上的编程语言写成，可以不经修改作为两种语言编译/解释。今天在 wikipedia 上看到了它的定义：

In computing, a polyglot is a computer program or script written in a valid form of multiple programming languages, which performs the same operations or output independent of the programming language used to compile or interpret it.
之前见过的用2种语言写的 Polyglot 简直弱爆了，这里有用至少6种语言写成的 Polyglot：

http://ideology.com.au/polyglot/ （8种语言）
http://mauke.dyndns.org/stuff/poly.poly (16种语言！原链接不能用，我备份了一下。)

Dec 8 2011

再遇 bash 引号问题

最近在工作中再次遇到引号的问题，与以往不同，这次这个更加棘手。

问题是这样的：我要在某个 bash 脚本中调用某个命令，根据配置文件来决定应该传递哪些参数给调用的那个命令。而且，该命令的其中一个参数中带有空格，所以必须使用双引号把这一参数作为整体传递进去。你可以试一下，无论怎么使用引号都无法解决这个问题。

最后，我在 Bash FAQ 中找到了答案！解决方法是，把该命令的每一个参数作为 bash 数组的一个元素，最后用 “${args[@]}” 一起传递给该命令即可！

另，Bash FAQ 质量非常高，强烈推荐认真读一下。

A Geek's Page

Programming

Floyd's cycle-finding algorithm

整数溢出

include

include

A poem about division

为什么 bool 只是一个宏

undef bool

define bool bool

A Programmer's Poem

用宏实现 alignof

define AlignOf(type) (offsetof(struct { char c; type t; }, t))

include

include

include

define AlignOf(type) (offsetof(struct { char c; type t; }, t))

关于 extern inline

ifdef __USE_EXTERN_INLINES

C 语言新标准——C11

Polyglot

再遇 bash 引号问题