Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - Why does the clang 6.0 compiler optimize by starting indexes at -N and counting to zero, but clang 11.0 starts at 0 and counts to N?

For the code below, clang 6.0 and 11.0 have a subtle difference in their compiled assembly.

#include <stdint.h>

#define SIZE (1L << 16)
    
void test(uint8_t * restrict a,  uint8_t * restrict b) {
  uint64_t i;

  for (i = 0; i < SIZE; i++) {
    a[i] += b[i];
  } 
}

When I compile with -O1 using clang 6.0, I get the following output:

test:                                   # @test
        mov     rax, -65536
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movzx   ecx, byte ptr [rsi + rax + 65536]
        add     byte ptr [rdi + rax + 65536], cl
        add     rax, 1
        jne     .LBB0_1
        ret

Notice that the compiler changes the loop index from '0 to 65536' to '-65536 to 0'. I thought this was very clever, because it makes use of the fact that the add instruction sets the ZF flag when its result is zero, so no separate compare is needed and one instruction is saved per iteration. Unfortunately, when I compile the same code with the same arguments in clang 11.0, I get the following code:

test:                                   # @test
        xor     eax, eax
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movzx   ecx, byte ptr [rsi + rax]
        add     byte ptr [rdi + rax], cl
        add     rax, 1
        cmp     rax, 65536
        jne     .LBB0_1
        ret

Notice that this time it keeps the '0 to 65536' index and adds a cmp instruction at the end of each iteration. While this is a specific example, the behavior is not unique to the code I wrote: it persists with -O3 and with vectorization enabled as well.
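To make the two loop shapes concrete, here is a rough source-level sketch of each. It is a hand-written illustration only, not what clang does internally, and the function names add_count_up and add_count_to_zero are hypothetical. The second function biases both pointers by SIZE so that the index can run from -SIZE up to 0, which mirrors the `[rdi + rax + 65536]` addressing in the clang 6.0 output.

```c
#include <stdint.h>

#define SIZE (1L << 16)

/* Count-up form, as in the original source: the exit test compares i
   against SIZE, which corresponds to the explicit cmp in the clang 11.0
   output. */
void add_count_up(uint8_t *restrict a, const uint8_t *restrict b) {
  for (uint64_t i = 0; i < SIZE; i++) {
    a[i] += b[i];
  }
}

/* Count-to-zero form (illustrative assumption of the clang 6.0 shape):
   the index starts at -SIZE and the loop ends when it reaches 0, so the
   increment itself can provide the zero test. */
void add_count_to_zero(uint8_t *restrict a, const uint8_t *restrict b) {
  uint8_t *a_end = a + SIZE;        /* one past the last element of a */
  const uint8_t *b_end = b + SIZE;  /* one past the last element of b */
  for (int64_t i = -SIZE; i != 0; i++) {
    a_end[i] += b_end[i];           /* a_end[i] is a[i + SIZE] */
  }
}
```

Both functions compute the same result for buffers of exactly SIZE bytes. A compiler is free to rewrite either form into the other, so writing the second version by hand does not guarantee that the cmp disappears from the generated code.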

What gives? Was the original optimization not actually effective? Did processors change to obviate the trick?

question from:https://stackoverflow.com/questions/65891676/why-does-the-clang-6-0-compiler-optimize-by-starting-indexes-at-n-and-counting
