Prepared by Lin Jensen, Bishop's University, for CSc 116 Assembly Language class, April 2002, revised for AT&T syntax, November 2003. If you prefer nasm (masm, tasm) syntax, see the old summary.
This summary lists only the most useful general-purpose instructions. Where signed and unsigned numbers are treated differently, I have included only the forms for signed numbers (this applies to multiplication, division, and jumps for less or greater). There are a great many "complex" instructions that are not actually needed (e.g. loope, rep movsb), and many variant forms. Most instructions can operate on 8-, 16-, or 32-bit quantities, so long as all operands are the same size.
There are 4 general purpose registers, named EAX, EBX, ECX, EDX.
Each of these registers is 32 bits. The lower 16 bits and lowest 8 bits of EAX can be referred to using the names AX and AL respectively, and similarly for the others. The second-lowest 8 bits can be referred to as AH.
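To make the register naming concrete, here is a minimal sketch in C (not assembly) that models which bits of a 32-bit value the names AX, AL and AH select. The function names low16, low8 and high8 are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of a 32-bit register value (what AX is to EAX). */
uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* Lowest 8 bits (what AL is to EAX). */
uint8_t low8(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* Second-lowest 8 bits (what AH is to EAX). */
uint8_t high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }
```

For example, if EAX holds 0x12345678 then AX is 0x5678, AL is 0x78 and AH is 0x56.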
There are also 2 "index" registers, named ESI and EDI. In fact, all registers can be used for indirect addressing.
Two "saved" registers, ESP (the stack pointer) and EBP, are associated with the stack. They must be preserved across function calls.
Data movement instructions do NOT affect the status flags. Thus merely moving a value into a register does not enable one to branch depending on its value; an arithmetic (or logic) instruction has to be used for the test.
mov src, dest # move src to dest, one (but not both) may be in memory
# src may be immediate (constant), which is indicated with a $
xchg op1, op2 # exchange 2 operands, at least one must be a register
lea memory_reference, %reg # "load effective address" calculates the address
# of the memory reference, and stores it in the reg.
The last instruction can be used to calculate the address of some complex addressing mode. For loading the address of a variable in the data segment, a mov instruction can be used, as such an address is a constant:
mov $myString, %edx # loads the address of myString into the register
All arithmetic instructions set status flags according to their results; these can be used by following jump instructions to decide whether to branch or not. The most important are Carry, Zero, Sign, and Overflow.
add src, dest # dest := dest + src, same limitations as mov
sub src, dest # dest := dest - src
cmp src, dest # same as sub except result NOT stored. Sets status flags only
imul src, dest # just one variant. Only the Overflow flag has meaning (too large)
cdq # precedes idiv to sign-extend eax (into edx) to a 64-bit number; AT&T name is cltd
idiv src # eax := [edx:eax]/src, edx := remainder. The dividend is 64 bits.
inc op # increment (add 1 to) operand, sets OSZ (not C)
dec op # decrement (subtract 1 from) operand, sets OSZ (not C)
neg op # negate operand (2's complement)
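The cdq and idiv pair above can be modelled in C as a rough sketch. The struct and function names are hypothetical, and the model assumes a non-zero divisor and a quotient that fits in 32 bits (the real instruction raises a divide error otherwise).

```c
#include <assert.h>
#include <stdint.h>

/* Result of a modelled idiv: quotient goes to eax, remainder to edx. */
struct idiv_result {
    int32_t quotient;
    int32_t remainder;
};

/* Model of "cdq" followed by "idiv src" on a 32-bit dividend.
 * Assumes src != 0 and that the quotient fits in 32 bits. */
struct idiv_result cdq_idiv(int32_t eax, int32_t src)
{
    int64_t dividend = (int64_t)eax;          /* cdq: sign-extend eax into edx:eax */
    struct idiv_result r;
    r.quotient  = (int32_t)(dividend / src);  /* truncates toward zero */
    r.remainder = (int32_t)(dividend % src);  /* takes the sign of the dividend */
    return r;
}
```

With eax = -7 and src = 2 the model gives a quotient of -3 and a remainder of -1, which is why the sign extension before idiv matters for negative dividends.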
jmp label # unconditional jump to label:
Conditional jumps (known to us as branch instructions) test the status flags as necessary to decide whether to jump or not. Thus they usually follow a CMP instruction, less frequently another arithmetic or logic instruction. The short form of the machine language instruction stores an 8-bit displacement relative to the following instruction, which severely limits how far one can branch. On the 386 and later there is also a longer form with a full-size displacement, and assemblers such as GNU as normally select it automatically. Where only the short form is available, inserting one additional instruction can cause a "jump out of range" error, which you must work around using the opposite branch and a JMP. There are some "aliases": JE and JZ are the same instruction, as are JL and JNGE.
je label # jump if equal (or zero), tests the Z-flag
jne label # jump if not equal
jl label # jump if less (signed) tests S- and O-flags
jg label # jump if greater tests SZO flags
jle label # jump if less than or equal opposite of jg
jge label # jump if greater than or equal
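How the signed jumps use the flags can be sketched with a small C model of "cmp src, dest". This is an illustration, not assembler output; the names cmp_flags, takes_jl, takes_jg and takes_je are invented here. The conditions used are the usual ones: jl is taken when S differs from O, and jg when Z is clear and S equals O.

```c
#include <assert.h>
#include <stdint.h>

/* Flags as set by "cmp src, dest", which computes dest - src. */
struct flags { int z, s, o, c; };

struct flags cmp_flags(int32_t src, int32_t dest)
{
    uint32_t a = (uint32_t)dest, b = (uint32_t)src;
    uint32_t diff = a - b;            /* wraps modulo 2^32 */
    struct flags f;
    f.z = (diff == 0);
    f.s = (int)(diff >> 31);          /* top bit of the result */
    f.c = (a < b);                    /* unsigned borrow */
    /* Signed overflow: the operands differ in sign and the result's
       sign differs from dest's sign. */
    f.o = (int)(((a ^ b) & (a ^ diff)) >> 31);
    return f;
}

int takes_je(struct flags f) { return f.z; }
int takes_jl(struct flags f) { return f.s != f.o; }
int takes_jg(struct flags f) { return !f.z && f.s == f.o; }
```

Comparing the most negative 32-bit number with 1 shows why O is needed: the subtraction overflows and leaves S = 0, yet S differs from O, so jl is still taken.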
The CALL instruction pushes the return address on the stack
and then jumps to the procedure (function).
RET pops the return address off the stack (into the IP) thus returning
control to the instruction following the call. Naturally this depends
upon proper use of the stack and stack pointer (ESP). There is no need
to save the return address when a function calls another function.
By convention, function results are returned in EAX. ESP and EBP must be preserved by functions, but there are no fixed conventions regarding arguments. High level languages, and the Windows operating system, pass arguments on the stack.
PUSH pushes its operand on the stack
POP pops a value off the stack into its operand. In both cases the stack pointer (ESP) is modified accordingly. You must bear in mind that CALL and RET also do a push and pop respectively. (Bytes cannot be pushed or popped.)
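The effect of PUSH and POP on the stack pointer can be sketched in C with a toy stack. Everything here (struct machine, push32, pop32, the 16-word size) is a hypothetical model for illustration; it does no bounds checking and uses an array index in place of the real ESP.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

struct machine {
    uint32_t stack[STACK_WORDS];
    unsigned sp;   /* index of the top of stack; STACK_WORDS means empty */
};

void machine_init(struct machine *m) { m->sp = STACK_WORDS; }

/* push: move the stack pointer down one 32-bit word, then store. */
void push32(struct machine *m, uint32_t value) { m->stack[--m->sp] = value; }

/* pop: load the top word, then move the stack pointer back up. */
uint32_t pop32(struct machine *m) { return m->stack[m->sp++]; }
```

Pushing 5 and then 7 and popping twice returns 7 first and 5 second, leaving the stack pointer where it started. CALL and RET use the same mechanism for the return address.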
AND, OR, XOR all set the S and Z flags. Operands are just like ADD. NOT inverts each bit of its single operand.
A common trick for clearing a register, and also setting S=0 and Z=1, is
xor %eax,%eax #set %eax = 0
sal count, op # shift arithmetic left the op, by count bits. Last bit out goes in Carry
sar count, op # shift arithmetic right (sign-extended, leftmost bit is replicated)
shl count, op # shift left
shr count, op # shift right (0-s enter from left)
# count can be $1, an 8-bit immediate, or register %cl
# op can be register or memory.
The last bit shifted out is always stored in the Carry flag. Example:
shl $4, %eax # shift register 4 bits left (multiplying by 16)
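The difference between the arithmetic and logical right shifts can be sketched in C. The model_ names are hypothetical, the operand is treated as a raw 32-bit pattern, and the shift count is assumed to be between 0 and 31 (1 to 31 for the carry helper).

```c
#include <assert.h>
#include <stdint.h>

/* shl / sal: left shift, i.e. multiply by 2^count modulo 2^32. */
uint32_t model_shl(uint32_t op, unsigned count) { return op << count; }

/* shr: logical right shift, zeros enter from the left. */
uint32_t model_shr(uint32_t op, unsigned count) { return op >> count; }

/* sar: arithmetic right shift, the leftmost (sign) bit is replicated. */
uint32_t model_sar(uint32_t op, unsigned count)
{
    uint32_t shifted = op >> count;
    if ((op & 0x80000000u) && count > 0) {
        shifted |= ~(0xFFFFFFFFu >> count);   /* fill vacated bits with 1s */
    }
    return shifted;
}

/* Carry after shr or sar: the last bit shifted out (count from 1 to 31). */
int right_shift_carry(uint32_t op, unsigned count)
{
    return (int)((op >> (count - 1)) & 1u);
}
```

Shifting the pattern 0xFFFFFFF0 (that is, -16) right by 4 gives 0x0FFFFFFF with shr but 0xFFFFFFFF (-1) with sar, which is why sar is the one to use for dividing signed numbers by powers of two.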
$257 # decimal number
$'A' # ascii character (same as 65 or 0x41)
$0xDF # hexadecimal number
$mynum # the address of mynum in the data segment
Both source and destination can be register, direct memory addressing, or various forms of indirect memory addressing. However, in general, it is not possible for both destination and source to refer to memory. Examples:
addl $47, mynum # direct memory reference, mynum := mynum + 47
sub (%ebx), %eax # the number in memory pointed to by ebx is subtracted from the
# contents of register eax (which changes as a result)
mov 1(%ebx,%edi), %bl # memory address is the sum of 2 registers and a constant,
# one byte is loaded
cmpb $0x41, (%esi) # compare the byte addressed by %esi with 'A'
movw $12345, (%ebx) # move 16-bit integer constant to where %ebx points
pushl $5 # push 32-bit constant on the stack
The conditional jump instructions actually test one or more "status flags", so they respond to whichever instruction last set these flags.
Arithmetic instructions, including compare (cmp), set the status flags S=sign, Z=zero (1 means true!) and C=carry. Essentially, the S and Z flags are tested by branch instructions, and an unsigned carry (or borrow) beyond the length of the operand sets the C flag. (There is also an O=overflow flag, should something go wrong with a signed operation.) In ddd, you will find the "status register" in hexadecimal; this table will help you decode the bits, with 2 examples.
Flag | bit pos | value for 0x347 | value for 0xe86
C | 0 | 1 | 0
Z | 6 | 1 | 0
S | 7 | 0 | 1
O | 11 | 0 | 1
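The two example values in the table above can be decoded mechanically. The following C sketch extracts the four flags from a status register value using the bit positions given in the table; the names status_flags and decode_flags are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

struct status_flags { int c, z, s, o; };

/* Pick out the Carry, Zero, Sign and Overflow bits of the status register. */
struct status_flags decode_flags(uint32_t eflags)
{
    struct status_flags f;
    f.c = (int)((eflags >> 0)  & 1u);   /* bit 0  */
    f.z = (int)((eflags >> 6)  & 1u);   /* bit 6  */
    f.s = (int)((eflags >> 7)  & 1u);   /* bit 7  */
    f.o = (int)((eflags >> 11) & 1u);   /* bit 11 */
    return f;
}
```

Calling decode_flags(0x347) gives C=1, Z=1, S=0, O=0, and decode_flags(0xe86) gives C=0, Z=0, S=1, O=1, matching the two example columns.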
That just about covers the essential instructions, except for the floating-point coprocessor instructions, of course.