The Times-4 Trick
First, let's pick up a detail of what the assembler does that we left out in the first page on branches:
- When the assembler does branch_offset = branch_target - incremented_PC, it has to make sure that
the number that it gets will fit into the 16-bit immediate field of the branch instruction
- If this is not the case, the program is trying to branch farther than is possible
- The assembler just compares the number that it gets against an upper and a lower bound
- It would be easy for us to make the same test by eye:
the top 17 bits of the answer must all be the same
- That is, it must look like a number that has come out the right side of the sign extender
- That way, we can throw away the top 16 bits, and the sign extender will put them back for us
- The resulting sign-extended number will be the same one we had before throwing away the top 16 bits
Let's do an example in 8 bits instead of 32
- This is a toy system: we have 8-bit words; addresses are also 8 bits
- The immediate field is 5 bits
- We will write 8-bit values this way:
000-00000 to separate the 5-bit immediate field from the other 3 bits
- With 5 bits we can store signed numbers in the range -16 .. 15
- The multiples of 4 in that range are -16, -12, -8, -4, 0, 4, 8, 12
- Those are the only branch offsets that are possible with 5 bits
- Suppose the assembler calculates the branch offset and gets 000-00100, i.e. 4
- This is in our set of possible values
- The immediate field will get these 5 bits: 00100
- Pushing those into the sign extender gets us 000-0010, the value we started with
- Now suppose that the calculation yields 000-10100 or 20,
which is outside our acceptable range of values
- If the assembler doesn't stop and throw an error,
the immediate field would get these 5 bits: 10100 or -12
- When these go through the sign-extender, we'd get 111-10100 (still -12) as its output,
not the 000-10100 we started with
- This would end up branching in the wrong direction, not good; very bad, actually
The times-4 trick will save the day in this case
- The idea is that the 5-bit immediate field is carrying two useless bits on the right end
- We don't need them there because we know that they are zeroes
- Let's go back to our example offset of 000-10100 and divide the bits up this way instead: 0-00101-00
- Now put THOSE 5 bits into the immediate field; don't forget it's the assembler doing this
- Later, when the CPU digs those 5 bits out of the immediate field and sends them through the sign extender,
it will get 000-00101
- Then it tacks 2 zero bits back onto the right end and gets 000-00101-00 or 00-000-10100
- Then it throws away the 2 extra bits on the left end and has 000-10100.
- Mirabile dictu! That's what we started with!
- Now, instead of giving us 8 places we can branch to, our 5 bits give us 32 places we can branch to
Let's go back to the real world of 32 bits
- What happens is that, instead of putting bits [15:0] into the immediate field of branch instructions,
the assembler puts bits [17:2] there
- Later the CPU puts the 2 zeroes back onto the right end,
and voila! Now it has 18 bits instead of 16
- So the actual range of branching is 218 instead of 216 or 4 times as big
- This is well worth the trouble, since 218 = 262144, or a quarter of a Mebibyte
- Before we employed this trick, the test for trying to branch too far was that bits [31:15]
must all be the same. With the trick it is only bits [31:17]
It uses shifts to do this
- After doing the range check, the assembler shifts the computed offset right by two places
- This pushes the 2 zeroes off the right side
- Then it puts bits [15:0] into the immediate field; remember these bits started out as bits [17:2]
- At execution time, the CPU will shift the output of the sign extender left by two places
- This pulls 2 zeroes in at the right end, thus restoring the offset to its original value,
which it then adds to the PC, thus effecting the branch
Thinking of shifts as arithmetic
- Shifting decimal numerals left and right multiplies and divides their values by powers of 10
- Similarly, shifting binary numerals left and right multiplies and divides their values by powers of 2
- Shifting left by 2 places is multiplying by 4; shifting right by 2 places is dividing by 4
- The branch-offset is a count of the number of bytes that we're changing the PC by
- Shifting it right by 2 places changes that to a count of WORDS
- Remember back to the page on branches; we had a shortcut for getting the branch offset:
- count the words from the incremented PC to the branch target
- multiply by 4, which changes the word count to a byte count
Isn't This Fun?