Imagine the assembly code that would be generated from:
if (__builtin_expect(x, 0)) {
    foo();
    ...
} else {
    bar();
    ...
}
I guess it should be something like:
    cmp $x, 0
    jne _foo
_bar:
    call bar
    ...
    jmp after_if
_foo:
    call foo
    ...
after_if:
You can see that the instructions are arranged so that the bar case precedes the foo case (the opposite of the order in the C code). This can utilise the CPU pipeline better, since a taken jump discards the instructions that have already been fetched.
Before the jump is executed, the instructions below it (the bar case) are pushed into the pipeline. Since the foo case is unlikely, the jump is unlikely to be taken, and so the pipeline is unlikely to be flushed.
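To try this out, here is a minimal sketch, assuming GCC or Clang (where __builtin_expect is available). The likely/unlikely macros and the count_negatives function are hypothetical names made up for illustration. They are not part of any library.

```c
#include <stddef.h>

/* Hypothetical convenience macros. The !! normalises any non-zero
 * value to 1 so it can be compared against the expected value. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: count negative entries, assuming (for the
 * sake of the hint) that negative values are rare in the input. */
int count_negatives(const int *values, size_t n)
{
    int negatives = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(values[i] < 0)) {
            negatives++;    /* hinted as the cold path */
        }
    }
    return negatives;
}
```

The hint does not change what the code computes, only how the compiler may lay out the branches. Whether the generated code actually looks like the listing above depends on the compiler, target and optimisation level, so it is worth compiling with something like gcc -O2 -S and reading the output.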