CSc 116 notes

7) Bit Fiddling operations

Shifting (and rotating)
Logical (Boolean) operations

So far we have been manipulating numbers, in sizes of either bytes or words. In this section, we will look at the operations that are usually provided for working with individual bits. There are several reasons for doing this. One can compress data, using only as many bits as necessary, so several fields can share the same word. We have seen that instruction layouts divide a word up like this. An assembler must put the elements of an instruction together, for example, lining up the 5-bit fields for registers and combining them.

Multiplying and dividing by powers of 2 is also easily done simply by shifting bits. Adding a zero on the right of a binary number multiplies it by two, just as adding a zero on the right of a decimal number multiplies it by 10. A shift left instruction is handier, and often faster, than setting up a divide operation. Let's look at the bit-fiddling operations, and how they are implemented in MIPS instructions.

Shifting operations

By shifting we mean moving all the bits in a word either right or left. Think of them as books filling a shelf. If we push them all over, some books will fall off the end, and be lost. There will be space available on the other end. Since a word is always 32 bits, we need to put something in that space. In a logical shift, the empty positions are filled with 0's. If you shift a word 32 or more bits, it will contain all 0's. In a rotate operation, the bits falling out one end are replaced at the other end. (Think of picking up each book as it falls off the shelf, and replacing it at the other end.) As a result, no bits are lost, and rotating a word by 32 bit positions restores its original value.

All processors, so far as I know, have shifting hardware, and instructions, capable of shifting at least 1 bit left or right. There doesn't appear to be any reason to shift a 32 bit word by more that 31 bits, any more than one would need to set an ordinary clock 50 hours ahead. You could just set it 2 hours ahead.

Here are some examples of shift and rotate operations, each starting from the same bit pattern. I hilighted 3 bits, so you can see their new locations. The new filler 0000's are in italics

Before:             11011011011011011011011011111111

Shift right 6       00000011011011011011011011011011

Shift left  5       01101101101101101101111111100000

Rotate left 4       10110110110110110110111111111101 <-- leftmost 4 came here

So, it is possible to get any bit into any position, bringing other bits along the same distance, and either losing bits of the end, or replacing them at the other end.

Arithmetic shifts (signed numbers)

Adding a zero on the right of a binary number multiplies it by two, as I mentioned. By shifting a number 1 bit left, each bit has twice the place value it had before, and we bring in a 0 to hold the units place. Multiplication by any power of 2 is possible in this way. For instance, to multiply by 16, shift left 4 bits. Example, 7 * 16

7 * 16  =  111 * 10000  =  1110000  = 0x70  =  112

Conversely, shifting right divides. Any remainder is thrown away with the discarded bits. 127 / 8, shifting right 3 bits:

01111111  --->  00001111   =  15

There are some special considerations for arithmetic shifts.

Shift right arithmetic: For a negative number, shifting 0 bits in from the left will change the sign bit to positive, this is clearly wrong. The solution is to "sign extend" the number, by filling in with the sign bit, either 0 or 1 as the case may be.
Shift left arithmetic: If any significant bits are shifted out, the answer will be wrong, and we would like to know about it. If the number is positive, we don't want to lose a 1 bit, or have a 1 bit end up in the sign bit, making the result negative. We would like to know if either of these overflow conditions occurred. The actual shift is identical to shift left logical. Some processors set a "flag" to indicate overflow. Programmers typically ignore such flags. MIPS simply does not have a special shift left arithmetic instruction

-38  =  11111111111111111111111111011010,  divide by 4, shift right aritnmetic 2

gives   11111111111111111111111111110110,  which is -10

Now, you may not quite agree with that answer. It makes mathematicians happy, because it agrees with the remainder of +2, that is, -10*4 +2 = -38. The answer has been truncated in the negative direction. Computer scientists are divided, and in fact some hardware gives this result for division, and some gives -9. In fact the MIPS architecture does not specificy how negative integers should be divided. As for the SPIM simulator, it simply uses the divide instruction of the "host" processor, so results of divide instructions can vary! Not so when you divide by shifting, you will always get the "mathmatical" result we have just seen.

MIPS shift instructions

The format of these instructions is: op code, Destination register, source register, shift amount. The shift amout is usually a constant, 1-31 bits make sense. It may also be contained in a register (variable shift). Examples, source is $t0:

sll  $t4, $t0, 5     #shift left logical 5 bits (multiply by 32)
sra  $t5, $t0, 2     #shift right arithmetic 2 bits (divide by 4), sign extended 
srl  $v0, $t0, 1     #shift right logical 1 bit. Sign bit is now 0
srlv $v0, $t0, $t1   #shift right logical, $t1 says how many bits

Type of shift	Left	Right
Logical	`sll` `sllv`	`srl` `srlv`
Arithmetic	(none, use sll)	`sra` `srav`
Rotate	`rol`	`ror`

Logical operations - Boolean Algebra

Today's digital computers are composed entirely of logical circuits, called gates. These gates use 1 or more transistors to implement the logical operatios AND, OR, and NOT. A variation on (inclusive) OR is eXclusive-OR, usually written XOR. These are also called boolean operations, after George Boole, originator of "The Algebra of Truth."

Boolean algebra deals with two values, FALSE and TRUE. Since boolean values can be encoded in 1 bit, we can do so, and by convention 0 represents FALSE and 1 represents TRUE. The operations can be defined by enumerating all the possible values of operands and results in a "Truth table"

operands		AND	OR	XOR
0	0	0	0	0
0	1	0	1	1
1	0	0	1	1
1	1	1	1	0

NOT simply reverses the value. NOT 0 = 1, and NOT 1 = 0

Some laws of boolean algebra, these identities can be verified using truth tables: (In these laws, + means OR, a bar over something means NOT, while AND is shown like multiplication in algebra, by writing two terms together.)

     A 1 = A         A + 0  =  A                   identity
     A 0 = 0         A + 1  =  1                   null
     A A = A         A + A  =  A                   idempotent
     A(A + B) = A    A + AB =  A                   absorption
       _                         _
     A A  = 0                A + A  =  1           inverse
     AB  =  BA               A + B = B + A         commutative
     (AB)C = A(BC)           (A+B)+C = A+(B+C)     associative

     A(B + C) = AB + AC      A + BC = (A + B)(A + C)    distributive
     ____   _   _             _______   _  _
     (AB) = A + B             (A + B) = A  B       de Morgan

NOT (A AND B) = (NOT A) OR (NOT B)    1st deMorgan in Pascal notation
           _         _
A XOR B = (A B) + (A B)               XOR Defined in terms of other operations

The deMorgan laws are of particular importance to programmers, they show an important relationship between AND, OR and NOT. When NOT is distributed over an expression, AND exchanges with OR. This is worth considering when constructing WHILE loops. Here is the proof of one by truth table:

`A`	`B`	`A B`	`_____` `(A B)`	`_` `A`	`_` `B`	`_ _` `A + B`
`0`	`0`	`0`	`1`	`1`	`1`	`1`
`0`	`1`	`0`	`1`	`1`	`0`	`1`
`1`	`0`	`0`	`1`	`0`	`1`	`1`
`1`	`1`	`1`	`0`	`0`	`0`	`0`

Compilers will store boolean values in a byte or word holding either a 0 or 1, this wastes some space. An array of boolean may be packed 32 values to a word, and shift instructions used to extract specific values. Usually any non-zero value will evaluate to TRUE because beqz, bnez instructions will be used.

The important point about logical instructions in assembly language is that they operate on all corresponding bits of the operands in parallel. There are no carries or other interaction between different bit positions.

MIPS logical instructions are all the 3 operand format, just like add and sub. Like them, the second source can be a constant. (Well, of course not only has destination and source registers. It is actually implemented by nor (not or) that follows the usual pattern.)

Masking

A common use of logical operations is isolating, or "masking" parts of words. Just like a mask covers your face and lets your eyes show through, a bit mask lets some bits show through, and hides others. Think of one operand as the "data" and the other as the "mask". This is a consequence of the identity and null laws above. For the different operations:

AND - for every 1 in the mask, the data bit shows. For every 0 in the mask, the result is 0, the data bit is hidden.
OR - for every 0 in the mask, the data bit shows. For every 1 in the mask, the result is 1, hiding the data bit.
XOR - for every 0 in the mask, the data bit is unchanged. For every 1 in the mask, the data bit is reversed. If you XOR twice, the original value is restored. This property is often used in graphic programming to show moving objects. If something is put into an image with XOR, another XOR will remove it. If you see something that changes colours to contrast with whatever background there is, XOR operations are being used.

Example: Find the Source Register in an instruction

0x214afff3 addi $10, $10, -13

Instruction in binary:  00100001010010101111111111110011
Mask:                   00000011111000000000000000000000
Result of AND:          00000001010000000000000000000000
Shift right 21 bits     00000000000000000000000000001010

And we have the soruce register number, suitable for printing with syscall service 1.

Example: Changing case of letters

In order to make case-insensitive comparisons, both items being compared need to be in the same case. Only the letters should be converted, of course. By looking at the ASCII code chart, we see the only difference between the upper case and lower case letters is bit 5. To make a letter:

lower case, force this bit to be 1, by ORing with 00100000 = 0x20
upper case, force this bit to be 0, by ANDing with 11011111 = 0xDF
the opposite case, change this bit, by XORing with 00100000 = 0x20

Example: Binary digit to ASCII character

Suppose we believe that register $t0 contains a binary number that can be represented by a single digit, 0..9. (This would be true for the remainder of division by 10.) From the ASCII table, we see that the numeric characters '0' .. '9' are the hex values 0x30 .. 0x39. All we have to do is OR the register with 0x30, and we have the required character, suitable for storing in a string.

                or  $t0, 0x30      #binary number to ASCII character

Example: Binary 4 bits to ASCII character for Hexadecimal - '0' .. '9' 'A' .. 'F'

If we want to represent 4 bits as a hex digit in a string, this isn't quite good enough, because the ASCII code 'A" doesnt follow immediately after '9', instead, there is a gap of 7 other characters. The solution is to test and add 7 if necessary. It is also a good idea to make sure only the lowest 4 bits are non-zero. In this coding example, we show what happens to the value 13, in binary 1101, in $t0:

     and   $t0, 0x000f      #00001101  only allow lowest 4 bits non-zero
     or    $t0, 0x30        #00111101
     ble   $t0,'9', numeric #00000111   to be added, since > '9'
        add  $t0, 7         #01000100   ASCII 'D'
numeric:

Example: Convert word to Hex string, 8 digits

This example uses the single digit conversion code above, and fills a string of 8 characters with the correct hex digits. To do this we need to shift the word being converted 4 bits right each time, apply the "binary to hex" conversion, and store the resulting character in the string right to left.

# function to convert word value to a hex string. Returns pointer to the string
#    arguments:
#        input:  a0, word to be converted
#        output: v0, pointer to the string
#    registers used:
#        t2 - working copy of word
#        t1 - point to string characters, right to left
#        t0 - single 4-bit value, converting to character, as above

word2hex:
    la     $v0, str8            #string workspace defined in data segment
    add    $t1, $v0, 8          #point to last character position + 1
    move   $t2, $a0             #copy input, will be shifting $t2
hexloop:    
      sub    $t1, 1             #point 1 character left
      and    $t0, $t2, 0x000f   #only allow lowest 4 bits non-zero, converting in $t0
      srl    $t2,$t2, 4         #shift copy of word so next 4 bits are on right
      or    $t0, 0x30           #make it numeric ASCII character, but ...
      ble    $t0,'9', numeric   #IF > '9', 
        add  $t0, 7             #must add 7 to get 'A' .. 'F'
numeric:
      sb     $t0, ($t1)         #store hex character in string
      bne    $t1,$v0, hexloop   #UNTIL all 8 characters stored
    jr     $ra                  #RETURN from function call, $v0 points to string

A fully running version using this code is hex1.a
Another way to do this, using rol, is given in the book, program hex.a.

Prepared by Lin Jensen, Bishop's University, 11 February 1999

Back Next Contents