Summary of Intel 80x86 instructions

Prepared by Lin Jensen, Bishop's University, for the CSc 116 Assembly Language class, April 2002; revised for AT&T syntax, November 2003. If you prefer nasm (masm, tasm) syntax, see the old summary.

This summary lists only the most useful general-purpose instructions. Where signed and unsigned numbers are treated differently, I have included only the forms for signed numbers (this applies to multiplication, division, and jumps for less or greater). There are a great many "complex" instructions that are not actually needed (e.g. loope, rep movsb), and many variant forms. Most instructions can operate on 8-, 16-, or 32-bit quantities, so long as all operands are the same size.

Registers

There are 4 general purpose registers, named EAX, EBX, ECX, EDX.

Each of these registers is 32 bits. The lower 16 bits and lowest 8 bits of EAX can be referred to using the names AX and AL respectively, and similarly for the others. The second-lowest 8 bits (bits 8 to 15) can be referred to as AH.

There are also 2 "index" registers, named ESI and EDI. In fact, all registers can be used for indirect addressing.

Two "saved" registers are associated with the stack. They must be preserved across function calls:

ESP is the stack pointer. It is automatically decremented by PUSH instructions, and incremented by POP instructions.
EBP is traditionally used as a frame pointer, or fixed reference point, by functions.

Linux C register and function calling conventions:

Saved registers (callee -- the function called -- must save values): EBX, ESI, EDI, EBP, and of course, the stack pointer ESP.
Temporary registers: EAX, ECX, EDX.

When a function is called, the return address is on the top of the stack, then the first, second, etc. arguments.
When a function returns, the stack pointer must be pointing to the return address, so that the ret instruction will work (without any operand). The caller of the function is required to "clean up" the arguments on the stack, under the principle: clean up your own mess.
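
The convention above can be sketched as a small program. This is only an illustration, assuming the GNU assembler on 32-bit Linux (built with something like gcc -m32 -no-pie, which is an assumption and not part of the original notes); the function name add2 is hypothetical.

```asm
        .text
        .globl  add2
# int add2(int a, int b)   -- hypothetical example function
add2:
        pushl   %ebp              # save the caller's frame pointer (EBP is a saved register)
        movl    %esp, %ebp        # EBP is now our fixed reference point
        movl    8(%ebp), %eax     # first argument: above the saved EBP and the return address
        addl    12(%ebp), %eax    # add the second argument; the result is returned in EAX
        popl    %ebp              # restore the caller's frame pointer
        ret                       # ESP points at the return address again

        .globl  main
main:
        pushl   $4                # second argument is pushed first
        pushl   $3                # first argument ends up next to the return address
        call    add2              # pushes the return address and jumps to add2
        addl    $8, %esp          # caller cleans up its own two 4-byte arguments
        ret                       # EAX = 7 becomes main's return value
```

On entry to add2 (before its own push), the return address is at (%esp), the first argument at 4(%esp), and the second at 8(%esp).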

Instructions, grouped by function
AT&T syntax, left-to-right operand order

Move

Data movement instructions do NOT affect the status flags. Thus merely moving a value into a register does not enable one to branch depending on its value; an arithmetic (or logic) instruction has to be used for the test.

mov   src, dest         # move src to dest, one (but not both) may be in memory
                        #  src may be immediate (constant), which is indicated with a $

xchg  op1, op2          # exchange 2 operands, at least one must be a register

lea   memory_reference, %reg  # "load effective address" calculates the address 
                              # of the memory reference, and stores it in the reg.

The last instruction can be used to calculate the address of some complex addressing mode. For loading the address of a variable in the data segment, a mov instruction can be used, as such an address is a constant:

mov	$myString, %edx		# loads the address of myString into the register
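
As a rough illustration of the difference between mov and lea, here is a minimal sketch, assuming the GNU assembler on 32-bit Linux and a non-position-independent build (for example gcc -m32 -no-pie, an assumption not stated in the notes). The string contents and the index value are made up for the example.

```asm
        .data
myString:
        .asciz  "hello"

        .text
        .globl  main
main:
        movl    $myString, %edx         # the address is a constant, so mov suffices
        movl    $2, %ecx                # an index, chosen arbitrarily
        leal    1(%edx,%ecx), %edx      # edx := edx + ecx + 1; nothing is read from memory
        xorl    %eax, %eax              # clear eax before loading a single byte
        movb    (%edx), %al             # al := myString[3], the second 'l' (0x6c)
        ret
```

Note that lea only performs the address arithmetic; the memory access happens in the following mov.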

Arithmetic

All arithmetic instructions set status flags according to their results; these can be used by following jump instructions to decide whether to branch or not. The most important are Carry, Zero, Sign, and Overflow.

add   src, dest         # dest := dest + src, same limitations as mov

sub   src, dest         # dest := dest - src

cmp   src, dest         # same as sub except result NOT stored. Sets status flags only

imul  src, dest         # dest := dest * src, just one variant; dest must be a register.
                        #  Only the Overflow (and Carry) flags have meaning (result too large)

cdq                     #   precedes idiv to sign-extend eax (into edx) to a 64-bit number
idiv  src               # eax := [edx:eax]/src, edx := remainder. The dividend is 64 bits.
                        #  src must be a register or memory operand, not an immediate

inc   op                # increment (add 1 to) operand, sets OSZ (not C)
dec   op                # decrement (subtract 1 from) operand, sets OSZ (not C)

neg   op                # negate operand (2's complement)
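
The cdq/idiv pairing is easiest to see with a negative dividend. The following is a minimal sketch, assuming the GNU assembler on 32-bit Linux; the operand values are arbitrary.

```asm
        .text
        .globl  main
main:
        movl    $-17, %eax      # dividend
        cdq                     # sign-extend: edx := 0xFFFFFFFF because eax is negative
        movl    $5, %ecx        # the divisor must be in a register or in memory
        idivl   %ecx            # eax := -3 (quotient), edx := -2 (remainder)
        negl    %eax            # eax := 3
        ret
```

The quotient is truncated toward zero and the remainder takes the sign of the dividend. Omitting cdq would leave whatever happened to be in edx as the upper half of the dividend.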

Jump and branch instructions

jmp   label             # unconditional jump to label:

Conditional jumps (known to us as branch instructions) test the status flags as necessary to decide whether to jump or not. Thus they usually follow a CMP instruction, less frequently another arithmetic or logic instruction. The short form of the machine language instruction stores an 8-bit displacement relative to the following instruction, which limits the branch to roughly 128 bytes in either direction. The 386 and later processors also have a form with a 32-bit displacement, and the assembler normally picks a suitable form for you. Where only the short form is available (as in the old 16-bit architecture), inserting one additional instruction can cause a "jump out of range" error, which you must work around using the opposite branch and a JMP. There are some "aliases": JE and JZ are the same instruction, as are JL and JNGE.

je   label    # jump if equal (or zero), tests the Z-flag

jne  label    # jump if not equal

jl   label    # jump if less (signed)   tests S- and O-flags

jg   label    # jump if greater         tests SZO flags

jle  label    # jump if less than or equal   opposite of jg

jge  label    # jump if greater than or equal
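
A typical use of CMP followed by a conditional jump is a counting loop. This is a minimal sketch, assuming the GNU assembler on 32-bit Linux; the label names are made up. Keep in mind that in AT&T syntax cmp src, dest sets the flags from dest - src, so the jump tests dest against src.

```asm
        .text
        .globl  main
main:
        xorl    %eax, %eax      # sum := 0
        movl    $1, %ecx        # i := 1
loop_top:
        cmpl    $10, %ecx       # flags are set from ecx - 10
        jg      loop_done       # leave the loop when i > 10 (signed)
        addl    %ecx, %eax      # sum := sum + i
        incl    %ecx            # i := i + 1
        jmp     loop_top        # unconditional jump back to the test
loop_done:
        ret                     # eax = 55, the sum 1 + 2 + ... + 10
```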

Function call and return

The CALL instruction pushes the return address on the stack and then jumps to the procedure (function).
RET pops the return address off the stack (into the instruction pointer, EIP), thus returning control to the instruction following the call. Naturally this depends upon proper use of the stack and stack pointer (ESP). There is no need to save the return address when a function calls another function.

By convention, function results are returned in EAX. ESP and EBP must be preserved by functions. The processor itself imposes no fixed convention regarding arguments; high-level languages, and the Windows operating system, pass arguments on the stack.

Stack operations

PUSH pushes its operand on the stack

POP pops a value off the stack into its operand. In both cases the stack pointer (ESP) is modified accordingly. You must bear in mind that CALL and RET also do a push and pop respectively. (Bytes cannot be pushed or popped.)
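
The most common use of PUSH and POP is to preserve a saved register around code that needs it as scratch space. A minimal sketch, assuming the GNU assembler on 32-bit Linux; the constants are arbitrary.

```asm
        .text
        .globl  main
main:
        pushl   %ebx            # ESP := ESP - 4, then ebx is stored at (%esp)
        movl    $40, %ebx       # ebx is now free to use as scratch
        pushl   $2              # push a 32-bit constant
        popl    %eax            # eax := 2, ESP := ESP + 4
        addl    %ebx, %eax      # eax := 42
        popl    %ebx            # restore the caller's ebx; pops must mirror the pushes
        ret                     # ESP points at the return address again
```

If the pushes and pops do not balance, ret will pop something other than the return address.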

Logic and shifting

AND, OR, and XOR all set the S and Z flags (and clear C and O). Their operands are just like those of ADD. NOT inverts (complements) each bit of its single operand and does not affect the flags.

A common trick for clearing a register, and also setting S=0 and Z=1, is

      xor %eax,%eax       #set %eax = 0

sal   count, op       # shift arithmetic left the op, by count bits. Last bit out goes in Carry
sar   count, op       # shift arithmetic right (sign-extended, leftmost bit is replicated)
shl   count, op       # shift left
shr   count, op       # shift right  (0-s enter from left)

#  count can be $1, an 8-bit immediate, or register %cl
#  op can be register or memory. Example:

shl	$4, %eax	# shift register 4 bits left (multiplying by 16)

The last bit shifted out is always stored in the Carry flag.

There are also rotate instructions, ROL and ROR. Each bit shifted out is brought in at the other end. Again, the last bit rotated is also copied to the Carry flag.
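
The difference between the logical and arithmetic right shifts shows up only for negative numbers. A minimal sketch, assuming the GNU assembler on 32-bit Linux; the values are arbitrary.

```asm
        .text
        .globl  main
main:
        movl    $0xA7, %eax
        shrl    $4, %eax        # eax := 0x0A; the last bit shifted out (0) goes to Carry
        andl    $0xF, %eax      # keep only the low 4 bits: eax is still 0x0A

        movl    $-8, %edx
        sarl    $1, %edx        # arithmetic shift: edx := -4 (sign bit replicated)

        movl    $-8, %ecx
        shrl    $1, %ecx        # logical shift: ecx := 0x7FFFFFFC (a 0 enters from the left)

        ret                     # returns 10 in eax
```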

Addressing modes

Source data can be immediate; this is always indicated by a '$'. Examples:

$257    # decimal number
$'A'    # ascii character (same as 65 or 0x41)
$0xDF   # hexadecimal number
$mynum  # the address of mynum in the data segment

Both source and destination can be a register, a direct memory address, or one of various forms of indirect memory addressing. However, in general, it is not possible for both destination and source to refer to memory. Examples:

	addl $47, mynum	  # direct memory reference, mynum := mynum + 47
	sub (%ebx), %eax  # the number in memory pointed to by ebx is subtracted from the 
			  # contents of register eax (which changes as a result)
	mov 1(%ebx,%edi), %bl	# memory address is the sum of 2 registers and a constant,
				# one byte is loaded

Instruction size

Most instructions can operate on operands of 8, 16, or 32 bits. Usually, at least one operand has a definite size, such as %al, %ax, or %eax (8, 16, or 32 bits respectively). The other operand must be the same size, either implicitly or explicitly. Once in a while, no operand has a definite size. In such a case, it becomes necessary to append a letter indicating size (b, w, or l for byte, word or long) to the operation code:

cmpb	$0x41, (%esi)	# compare the byte addressed by %esi with 'A'
movw	$12345, (%ebx)	# move 16-bit integer constant to where %ebx points
pushl	$5		# push 32-bit constant on the stack
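
Indirect addressing and an explicit size suffix often appear together when scanning a string. The following is a minimal sketch of a string-length loop, assuming the GNU assembler on 32-bit Linux and a non-position-independent build (for example gcc -m32 -no-pie, an assumption); the string and the label names are made up.

```asm
        .data
myString:
        .asciz  "hello"         # .asciz appends the terminating zero byte

        .text
        .globl  main
main:
        movl    $myString, %edx # edx points at the first character
        xorl    %eax, %eax      # length := 0
next_char:
        cmpb    $0, (%edx)      # neither operand has a size, so the b suffix is required
        je      end_of_string   # stop at the zero byte
        incl    %eax            # count this character
        incl    %edx            # advance the pointer by one byte
        jmp     next_char
end_of_string:
        ret                     # eax = 5
```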

Intel Status flags

The conditional jump instructions actually test one or more "status flags", so they respond to whichever instruction last set these flags.

Arithmetic instructions, including compare (cmp), set the status flags S=sign, Z=zero (1 means true!) and C=carry. Essentially, the S and Z flags are tested by branch instructions, and an unsigned carry (or borrow) beyond the length of the operand sets the C flag. (There is also an O=overflow flag, set when the result of a signed operation does not fit.) In ddd, you will find the "status register" in hexadecimal; this table will help you decode the bits, with 2 examples.

Flag	bit pos	value for 0x347	value for 0xe86
C	0	1	0
Z	6	1	0
S	7	0	1
O	11	0	1

In practice, every arithmetic instruction sets the status flags in such a way that the conditional jump instructions work correctly.

CMP (compare) does a subtraction, sets the status flags accordingly, but does not store the result.
INC and DEC set the O, S and Z flags but leave the C flag unchanged, so you can test for zero or for a signed result, but not for an unsigned carry.
MOV does not set status flags, so you will have to use CMP to test if what you just moved is zero.
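
The points above can be sketched in a few lines. This is only an illustration, assuming the GNU assembler on 32-bit Linux and a non-position-independent build (for example gcc -m32 -no-pie, an assumption); the variable count and the labels are hypothetical.

```asm
        .data
count:
        .long   0

        .text
        .globl  main
main:
        movl    count, %eax     # eax := 0, but the flags are NOT updated by mov
        cmpl    $0, %eax        # now the Z flag reflects the value in eax
        je      was_zero
        movl    $1, %eax        # not reached here, since count is 0
        ret
was_zero:
        movl    $3, %ecx        # loop counter
countdown:
        decl    %ecx            # dec sets Z itself, so no cmp is needed
        jne     countdown       # repeat until ecx reaches 0
        movl    %ecx, %eax      # eax := 0
        ret
```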

That just about covers the essential instructions, except for the floating-point coprocessor instructions, of course.

Note to outsiders:

This summary was prepared for an introductory class, which was taught using MIPS assembly language. Therefore, it deliberately ignores the complications and limitations of the old 16-bit architecture, in favor of programming in the "flat" memory model, ignoring segment registers.

Last updated 15 April 2009.