CSc 116 notes

11) Other processor architectures and assembly languages

Every processor has its own instruction set and assembly language. This language is specific to the architecture, especially the registers, of the processor. Nevertheless, there is a family resemblance between instruction sets and assembly languages. In this section we look at some examples, to appreciate the similarities and differences.

You will observe that in all instruction sets, there are groups of instructions for moving data, arithmetic, jump, compare, branch, logical, and shift operations. There may also be operations affecting the stack, and for input/output.

A) Processor size, the external view

The ability of a processor to access memory is directly related to speed and to two sets of wires, the "Data bus" and the "Address bus."

The size of the Data Bus determines how many bits may be transferred at one time. Common sizes are 8, 16, and 32 bits.
The size of the Address Bus determines how many memory locations may be accessed. If there are n bits (wires) in the Address Bus, there are 2ⁿ possible addresses. This is referred to as the Address Space. Although this need not be the case in general, all the processors listed below can access single bytes of memory, so the address space is given in bytes.

Size (bits)	Examples	Data Bus (bits)	Address Bus (bits)	Address Space (bytes)
8	6502, 6809, Z80	8	16	64 K
16	8086	16	20	1 M
16	68000	16	24	16 M
32	MIPS, SPARC, 68020	32	32	4 G
32	80386, Pentium	32	46 (logical, using segments)	64 T
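
The Address Space column follows from the 2ⁿ rule. Here is a minimal Python sketch that reproduces it from the Address Bus width. The function name address_space is only for illustration, and K, M, G, T are taken as powers of 1024.

```python
def address_space(address_bits):
    """Return the number of addressable bytes as a string such as '64 K'."""
    size = 2 ** address_bits          # 2^n distinct addresses
    units = ["", " K", " M", " G", " T"]
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size //= 1024                 # step up one binary unit
        index += 1
    return f"{size}{units[index]}"

for bits in (16, 20, 24, 32):
    print(bits, "address lines ->", address_space(bits), "bytes")
```

Each extra address line doubles the address space, which is why going from 16 to 20 lines moves from 64 K to 1 M.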

B) The internal view: Registers and instruction sets.

As programmers, we are interested in the registers that are available to store and manipulate data in the processor, and the instructions that are available to do so. We need to know the size and number of registers, and which operations are allowed on them. In an orthogonal instruction set, such as that of MIPS, all registers are treated equally. At the other extreme, each register has its own special purpose. In the middle is the 68000, with 8 data registers and 8 address registers.

All processors have a program counter, usually named PC. It needs to have the same effective size as the Address Bus. Then they have some "general purpose" registers that can hold addresses and data. In addition, the Intel 80x86 series (including the Pentium) has 4 or 6 "Segment" registers, which are used to extend the address space.

Processor	Register Size (bits)	Number of General Registers
6502	8	4
8086	16	8
68000, 68020	32	16
MIPS, SPARC*	32	32*
80386, Pentium	32	8

* accessible at one time. SPARC rotates a register window on function call and return, giving each function 8 local registers.

6502

The 6502 is an 8-bit processor that was widely used in the 1970s for the first personal computers, such as those from Apple and Commodore. It has 4 registers, 8 bits each.

A, the accumulator, used by all arithmetic and logical operations,
X and Y, used as counters and for indexing. They can be incremented and decremented.
S, the stack pointer. A fixed block of 256 bytes of memory is used for the stack.

Arithmetic operations mostly use A as one source operand and as destination. The other operand is either immediate or from memory, using direct, indexed, or indirect-indexed addressing. Subroutine calls push the 16-bit return address on the stack, and returns take it off. PHA and PLA push and pop the accumulator, respectively.

Here is some sample code for the well-known problem of printing 2 hex digits representing a byte initially passed as an argument in the accumulator, A:

BIN2HEX         ; Convert 0..15 in A to ASCII hex digit
        CMP     #$0A    ; immediate hex ($ prefix). Compare sets status flags
        BCC     DIGIT   ; Branch if A < 10, since 6502 leaves C=0 to indicate borrow
        ADC     #6      ; adds 7 since Carry is set
DIGIT   ADC     #$30    ; now C=0 on both paths, we have ASCII character
        RTS             ; return with it in A

OUTHEX          ; Output byte in A as 2 hex digits (ASCII chars)
        PHA             ; push a copy on stack
        LSR  A          ; Process left nibble
        LSR  A          ; by shifting right 4 bits
        LSR  A
        LSR  A
        JSR     BIN2HEX ; call the above
        JSR     OUTCHAR ; given output routine
        PLA             ; pop original byte from stack
        AND     #$0F    ; mask to get right nibble
        JSR     BIN2HEX
        JSR     OUTCHAR
        RTS             ; all done
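
For comparison, the same conversion can be sketched in Python. This is only an illustration of the arithmetic, not a translation of the 6502 flags. The function names mirror the labels above, and outhex returns the two characters instead of calling an output routine.

```python
def bin2hex(nibble):
    """Convert a value 0..15 to its ASCII hex digit, as BIN2HEX does."""
    code = nibble + 0x30              # '0'..'9' start at ASCII 0x30
    if nibble >= 0x0A:
        code += 7                     # skip the 7 characters between '9' and 'A'
    return chr(code)

def outhex(byte):
    """Return the two hex digits of a byte, left (high) nibble first."""
    high = (byte >> 4) & 0x0F         # shift right 4 bits to get the left nibble
    low = byte & 0x0F                 # mask to get the right nibble
    return bin2hex(high) + bin2hex(low)

print(outhex(0x3C))                   # prints 3C
```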

Here is another example, which writes the string "Hello World!" stored at label HELLO, using X as a counter and Y as an array index. This is called direct indexed addressing; it is equivalent to MIPS hello($t0) when $t0 indexes a string.

        LDX     #12     ; print 12 chars
        LDY     #0
LOOP
        LDA     HELLO,Y ; String starts at HELLO, indexed by Y
        JSR     OUTCHAR
        INY
        DEX             ; decrement counter 
        BNE     LOOP    ; and test for 0
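
The effective address used by LDA HELLO,Y is simply the address of HELLO plus the contents of Y. The following Python sketch models the loop under some assumptions: the address 0x0300 for HELLO is made up, memory is a plain byte array, and appending to a list stands in for OUTCHAR.

```python
HELLO = 0x0300                        # assumed address of label HELLO
memory = bytearray(0x10000)           # 64 K address space of the 6502
memory[HELLO:HELLO + 12] = b"Hello World!"

def print_string(base, count):
    """Model the LOOP above: X counts down, Y indexes from the base address."""
    out = []
    x, y = count, 0                   # LDX #12 / LDY #0
    while True:
        a = memory[base + y]          # LDA HELLO,Y : effective address = base + Y
        out.append(chr(a))            # stands in for JSR OUTCHAR
        y = (y + 1) & 0xFF            # INY, registers are 8 bits wide
        x = (x - 1) & 0xFF            # DEX
        if x == 0:                    # BNE LOOP falls through when X reaches 0
            break
    return "".join(out)

print(print_string(HELLO, 12))
```

Because the test comes at the bottom of the loop, the body always runs at least once, just as in the assembly version.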

80x86 instruction summary

The Intel 8086 family was chosen by IBM for their first PC (which used the 8088, an 8086 with an 8-bit external data bus) because it had just become available, and because it broke the 64K address space limitation by introducing Segment registers. The addresses stored with instructions are still 16 bits, but they are relative to the start of a segment. This works very well as long as a program can operate with data and code segments of at most 64K each. Over the years, as software became more complex, techniques to overcome the limitations of the 8086 proved cumbersome. The 80386 was the first 32-bit processor of the series. It increased the size of the general registers from 16 to 32 bits, added 2 segment registers, and introduced a new mode of using them to overcome the 1 M limitation of the 8086. It also retains the 8086 mode of operation, so older software can continue to run. It took about 15 years after its introduction for Microsoft to produce an operating system that made full use of the 32-bit design.
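
As a rough illustration of how a segment register extends a 16-bit address: in the 8086 mode of operation, the 16-bit segment value is multiplied by 16 (shifted left 4 bits) and added to the 16-bit offset, giving a 20-bit physical address. The function name and sample values below are hypothetical.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF   # 20 address lines: wraps at 1 M

print(hex(physical_address(0x1234, 0x0010)))     # prints 0x12350
print(hex(physical_address(0x1235, 0x0000)))     # same location, different pair
```

Note that many segment:offset pairs name the same byte, and that a single segment still spans only 64K of offsets.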

The four 16-bit "general purpose" registers are named AX, BX, CX, and DX. Each is divided into two 8-bit halves, named AH and AL, BH and BL, etc., so that they can be used as eight 8-bit registers if desired. AX is the accumulator, BX may be used as an indirect address, CX as a counter, and DX for I/O.
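
The relationship between a 16-bit register and its two halves is just shifting and masking. A small Python sketch follows; the helper names split16 and join16 are invented for this example.

```python
def split16(ax):
    """Return (AH, AL), the high and low bytes of a 16-bit value."""
    return (ax >> 8) & 0xFF, ax & 0xFF

def join16(ah, al):
    """Rebuild the 16-bit register from its two 8-bit halves."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)

ax = 0x12AB
ah, al = split16(ax)                  # AH = 0x12, AL = 0xAB
al = 0xFF                             # like MOV AL,0FFh: only the low half changes
ax = join16(ah, al)
print(hex(ax))                        # prints 0x12ff
```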

Four more 16-bit registers are used primarily to hold addresses. They are named SI, DI, BP, and SP. SP is the stack pointer.

Move, arithmetic, and logical instructions have 2 operands, destination and source. The destination is also the first operand of instructions like ADD. At most one operand may be in memory; the other must be a register or, for the source only, an immediate value. For example:

          ADD   AX, mynumber
          MOV   sum, AX

adds the 16 bits at memory location mynumber to the contents of AX, stores the result in AX, and then moves it into memory location sum. This is the right-to-left pattern we are familiar with. (68000 and SPARC use left-to-right order!)

Here is the example of printing a byte as 2 hex characters, in 8086 assembly language. Note: this assembler is not case sensitive.

BIN2HEX:                ; Convert 0..15 in AL to Hex digit
        CMP     AL, 0AH ; immediate hex. Compare sets status flags
        JB      numeric ; Branch "below" = less than, unsigned
        ADD     AL, 7   ; adds 7 to skip 7 chars between '9' and 'A'
numeric:
        ADD     al,30h  ; make it ASCII character
        RET             ; return with it in AL

OUTHEX:         ; Output byte in AL as 2 hex digits (ASCII chars)
        PUSH    AX      ; push a copy on stack  (must push 16 bits) 
        MOV     CL,4    ; Shift count of 4 bits (must use CL for this)
        SHR     AL,CL   ; Process left nibble
                        ; by shifting right 4 bits
        CALL    BIN2HEX ; call the above
        CALL    OUTCHAR ; given output routine
        POP     AX      ; pop original byte from stack
        AND     AL,0Fh  ; mask to get right nibble
        CALL    BIN2HEX
        CALL    OUTCHAR
        RET             ; all done

SPARC

Taurus uses a SPARC-4 processor. SPARC stands for "Scalable Processor ARChitecture." Sun published its "open architecture" based on the RISC-1 design created by a group of graduate students at UC Berkeley. Both the RISC-1 and the original MIPS (developed by graduate students at Stanford Univ.) outperformed commercially designed processors of the same era.

Some key features of SPARC (shared with MIPS and other RISC processors) are:

Load/store, register-to-register computation. Memory accesses are made only with load and store instructions.
Simple fixed format instructions with few addressing modes. Instructions are 32 bits.
Pipelining: several instructions are in various stages of processing at the same time.
At least 32 general-purpose registers visible at one time, and large cache memory.
Hardwired control with no microcode.

Some special characteristics of SPARC are:

Overlapping register windows: a function call "rotates" a larger set of registers, so that 8 registers overlap: the calling program's "output" registers (%o0-%o7) become the called function's "input" registers (%i0-%i7). The called function obtains a new set of "output" registers for communicating with deeper function calls. It also has 8 "local" registers (%l0-%l7), inaccessible both to the calling program and to any function it calls. Special save and restore instructions handle this rotation, and the stack is used, if necessary, to save registers that overflow when function calls nest too deeply.
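8 global registers are available at all times (%g0-%g7). The stack pointer %sp and the frame pointer %fp are alternate names for %o6 and %i6, so the caller's stack pointer becomes the called function's frame pointer.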
Some concurrency is visible. Because of pipelining, it is easier to complete the instruction following a jump before the jump takes effect, rather than conditionally undoing its effects. This is evident in compiler-produced code. Without optimization, the compiler inserts a nop (no-operation) instruction in the "delay slot." A programmer, or optimizing compiler, could move an appropriate instruction into this slot.
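
The overlap between register windows can be pictured with a small Python model. This is a sketch under stated assumptions, not a description of any particular chip: the window count of 8 is assumed, the global registers and window overflow handling are left out, and the function names are invented. Each window owns 8 "output" and 8 "local" slots, and its "input" registers are the "output" slots of the caller's window.

```python
NWINDOWS = 8                          # assumed number of windows
regfile = [0] * (NWINDOWS * 16)       # windowed registers only, globals omitted
cwp = 0                               # current window pointer

def index(name):
    """Map a name such as 'o0', 'l3' or 'i2' to a slot in the register file."""
    kind, n = name[0], int(name[1])
    offset = {"o": 0, "l": 8, "i": 16}[kind] + n
    return (cwp * 16 + offset) % len(regfile)

def read(name):
    return regfile[index(name)]

def write(name, value):
    regfile[index(name)] = value

def save():
    """Rotate to a new window: the caller's outputs become our inputs."""
    global cwp
    cwp = (cwp - 1) % NWINDOWS

def restore():
    """Rotate back to the caller's window."""
    global cwp
    cwp = (cwp + 1) % NWINDOWS

def add3():
    save()
    write("l0", read("i0") + read("i1"))   # temp = x + y
    write("i0", read("l0") + read("i2"))   # result goes in %i0
    restore()

write("o0", 3)
write("o1", 4)
write("o2", 6)
add3()
print(read("o0"))                     # prints 13: callee's %i0 is caller's %o0
```

No values are copied on call or return in this model; only the window pointer moves, which is the point of the design.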

Example SPARC code:

!calling function add3, with 3 arguments (3,4,6)
	mov  3,%o0    	! instructions are LEFT to RIGHT
	mov  4,%o1
	call  add3,3 	! declare that there are 3 args.
	mov  6,%o2    	! 3rd argument put in register before call takes effect
back:			! call returns to here
!    .....
! here is the function
add3:				!add3(x,y,z)
	mov	-64,%g1		!frame size is negative: the stack grows downward
	save	%sp,%g1,%sp	!sets up stack frame and rotates registers
	add	%i0,%i1,%l0	!temp = x+y
	add	%l0,%i2,%i0	!return temp+z
	ret
	restore			!undoes save, BEFORE the ret

Prepared by Lin Jensen, Bishop's University, 24 March 2000, updated 8 April 2002
