Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


x86 assembly 16 bit vs 8 bit immediate operand encoding

I'm writing my own assembler and trying to encode the ADC instruction. I have a question about immediate values, especially when adding an 8-bit value to the AX register.

When adding a 16-bit value, adc ax, 0xff33 gets encoded as 15 33 ff, which is correct. But would it matter if adc ax, 0x33 gets encoded as 15 33 00?

NASM encodes this as 83 d0 33, which is obviously correct, but is my approach correct as well?


1 Reply


It's common for x86 to have more than one valid way of encoding an instruction, so yes, 15 33 00 is a valid encoding. For example, most op reg, reg instructions have a choice of encoding via the op r/m, reg or the op reg, r/m opcode.
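As a minimal sketch of that choice, the following Python builds both encodings of a 16-bit register-to-register adc. The helper name and register table are hypothetical, made up for illustration; the opcodes 11 /r (adc r/m16, r16) and 13 /r (adc r16, r/m16) are the documented ones.

```python
# Hypothetical helper for illustration; only 16-bit registers are modeled.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def adc_reg_reg_encodings(dst, src):
    """Return both valid encodings of `adc dst, src` for 16-bit registers."""
    d, s = REG16[dst], REG16[src]
    # Opcode 11 /r is adc r/m16, r16: destination in ModRM.rm, source in ModRM.reg.
    form_rm_reg = bytes([0x11, 0xC0 | (s << 3) | d])
    # Opcode 13 /r is adc r16, r/m16: destination in ModRM.reg, source in ModRM.rm.
    form_reg_rm = bytes([0x13, 0xC0 | (d << 3) | s])
    return form_rm_reg, form_reg_rm

for encoding in adc_reg_reg_encodings("ax", "bx"):
    print(encoding.hex(" "))
```

Both byte sequences have the same length and the same architectural effect, so an assembler is free to emit either one.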

And yes, normally you want an assembler to always pick the shortest encoding for an instruction. NASM even optimizes mov rax, 1 (7 bytes for mov r64, sign_extended_imm32) into mov eax, 1 (5 bytes) for x86-64, changing the operand-size to use the zero-extension from writing a 32-bit register instead of explicit sign-extension of a 32-bit immediate.
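A rough sketch of that kind of size optimization, restricted to RAX as the destination. This is not NASM's actual implementation; the function name is an assumption, and only the three standard mov-immediate forms are considered.

```python
import struct

def encode_mov_rax_imm(value):
    """Pick the shortest encoding of `mov rax, value` (illustrative sketch)."""
    value &= 0xFFFFFFFFFFFFFFFF  # treat the constant as a 64-bit bit pattern
    if value <= 0xFFFFFFFF:
        # mov eax, imm32 (B8 id): writing EAX zero-extends into RAX. 5 bytes.
        return b"\xB8" + struct.pack("<I", value)
    if value >= 0xFFFFFFFF80000000:
        # mov r/m64, sign-extended imm32 (REX.W C7 /0 id). 7 bytes.
        return b"\x48\xC7\xC0" + struct.pack("<I", value & 0xFFFFFFFF)
    # mov r64, imm64 (REX.W B8 io) for everything else. 10 bytes.
    return b"\x48\xB8" + struct.pack("<Q", value)

print(encode_mov_rax_imm(1).hex(" "))
print(encode_mov_rax_imm(-1).hex(" "))
```

The 7-byte form is only needed when the upper 32 bits must be all ones, which is exactly the case that zero-extension from a 32-bit write cannot produce.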

Using the sign-extended-imm8 encoding when available is always good

It's equal length for 16-bit (3 bytes either way), but shorter for 32-bit operand-size, so it simplifies your code to always choose imm8 whenever the immediate fits in a sign-extended 8-bit value.

With operand-size of 32-bit, op eax, imm32 is 5 bytes, vs. op r/m32, imm8 still being 3 bytes. (Not counting any prefixes needed to set operand-size or other things; those will be the same for both.)
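For the asker's assembler, the selection rule could be sketched like this. The function name and the operand_size parameter are hypothetical, prefixes are left out (as in the byte counts above), and only 16-bit and 32-bit operand sizes are handled.

```python
import struct

def encode_adc_acc_imm(value, operand_size=16):
    """Encode `adc ax/eax, value` without prefixes, preferring sign-extended imm8."""
    if operand_size not in (16, 32):
        raise ValueError("this sketch only handles 16- and 32-bit operand sizes")
    value &= (1 << operand_size) - 1
    # Reinterpret the bit pattern as a signed number of operand_size bits.
    signed = value - (1 << operand_size) if value >> (operand_size - 1) else value
    if -128 <= signed <= 127:
        # 83 /2 ib: adc r/m16/32, sign-extended imm8.
        # ModRM D0 = mod 11, reg field /2, rm 000 (AX/EAX).
        return bytes([0x83, 0xD0, signed & 0xFF])
    # 15 iw/id: short form adc AX/EAX, imm16/32 with no ModRM byte.
    fmt = "<H" if operand_size == 16 else "<I"
    return b"\x15" + struct.pack(fmt, value)

print(encode_adc_acc_imm(0x33).hex(" "))    # imm8 form, as NASM emits
print(encode_adc_acc_imm(0xff33).hex(" "))  # does not fit in imm8, short form
```

Note that the range check is on the signed interpretation: 0xffff fits (it is -1), while 0x80 does not, because an imm8 of 0x80 would sign-extend to 0xff80.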

Performance advantages of the imm8 encoding

If an operand-size prefix is required (e.g. in 32-bit mode for adc ax, 0x33), using the adc ax/eax/rax, imm16/32/32 encoding with an operand-size prefix will create an LCP stall on Intel CPUs. (Length-Changing Prefix means the prefix changes the length of the rest of the instruction.) This doesn't happen for the imm8 encoding because it's still (prefix) + opcode + modrm + imm8 regardless of the operand-size.
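To make "length-changing" concrete, here is a small sketch that models only the two adc-immediate opcodes discussed here, in 32-bit mode with a register operand. The function names are hypothetical, and this says nothing about how a real decoder is built; it only compares byte lengths.

```python
def adc_imm_length(opcode, has_66_prefix):
    """Byte length of an adc-immediate instruction in 32-bit mode (sketch)."""
    prefix_len = 1 if has_66_prefix else 0
    if opcode == 0x15:
        # adc eax, imm32 by default; a 66 prefix turns it into adc ax, imm16.
        imm_len = 2 if has_66_prefix else 4
        return prefix_len + 1 + imm_len
    if opcode == 0x83:
        # opcode + ModRM + imm8 (register operand, so no SIB or displacement).
        return prefix_len + 1 + 1 + 1
    raise ValueError("only opcodes 15 and 83 are modeled")

def is_length_changing(opcode):
    """True if the 66 prefix changes the length of the bytes that follow it."""
    without_prefix = adc_imm_length(opcode, False)
    after_prefix = adc_imm_length(opcode, True) - 1  # exclude the prefix byte
    return without_prefix != after_prefix

print(is_length_changing(0x15), is_length_changing(0x83))
```

For opcode 15 the bytes after the prefix shrink from 5 to 3, while for opcode 83 they stay at 3, which is why only the short form is affected.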

See Agner Fog's microarch.pdf and other performance links in the x86 tag wiki. See also "x86 instruction encoding how to choose opcode", which is a duplicate of this, except for the fact that adc is a special case.


In the specific case of adc/sbb, there is another advantage to avoiding the ax, imm16 encoding; see "Which Intel microarchitecture introduced the ADC reg,0 single-uop special case?" On Sandybridge through Haswell, adc ax, 0 is special-cased as a single-uop instruction, instead of the normal 2 for a 3-input uop (ax, flags, immediate).

But this special casing doesn't work for the no-ModRM short form encodings, so the 3-byte adc ax, imm16 still decodes to 2 uops. Only the decoder for the imm8 form checks if the immediate is zero before decoding to a single uop. (And it still doesn't work for adc al, imm8.)

So always choosing the sign-extended-imm8 whenever possible is optimal for this, too, even in 16-bit mode where no operand-size prefix would be required for adc ax,0 and thus the LCP-stall issue wouldn't happen.


Most assemblers don't provide an override to avoid the no-ModRM short form. When they were designed, there wasn't a performance use-case other than intentionally lengthening instructions to get alignment without adding NOPs before the top of a loop or other branch target; see "What methods can be used to efficiently extend instruction length on modern x86?"

If you're designing a new flavour of asm syntax, you might consider allowing more control of the encoding with override keywords. For existing designs, check out NASM's strict and nosplit keywords, and GAS's {vex2}, {vex3}, {disp32} and so on "prefixes".

