Score:______ Section:____________ Date:__________            Name:______________________________

 

ECE 3055b Laboratory Assignment 3

Due Date Part I: Tuesday March 1

Due Date Part II: Tuesday March 8

Part I: Forwarding

Once the MIPS is pipelined as in Lab 2, data hazards can occur between the five instructions present in the pipeline. As an example, consider the following program:

Sub     $2,$1,$3

Add     $4,$2,$5

The subtract instruction stores its result in register 2, and the following add instruction uses register 2 as a source operand. Without forwarding, SUB $2,$1,$3 writes the new value of register 2 into the register file in the write-back stage, after ADD $4,$2,$5 has already read the old value of register 2 in the decode stage, so the add computes with a stale operand.
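
The timing of this hazard can be checked with a short Python sketch. It assumes an ideal five-stage pipeline with no stalls, in which the first instruction is fetched in cycle 1. The name stage_cycle is illustrative and is not part of the lab's VHDL model.

```python
# Stage order in the classic five-stage MIPS pipeline.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_cycle(index, stage):
    """Clock cycle in which instruction number `index` (0-based) occupies
    `stage`, assuming one fetch per cycle and no stalls (an assumption)."""
    return index + 1 + STAGES.index(stage)

sub_writes = stage_cycle(0, "WB")  # SUB $2,$1,$3 writes register 2
add_reads = stage_cycle(1, "ID")   # ADD $4,$2,$5 reads register 2
print(f"SUB writes $2 in cycle {sub_writes}; ADD reads $2 in cycle {add_reads}")
```

Under these assumptions the read happens two cycles before the write, which is why the new value has to be forwarded rather than read from the register file.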

In the text, this problem is fixed by adding two forwarding muxes, one on each ALU input, in the execute stage. In addition to the existing values feeding the two ALU inputs, the forwarding multiplexers can also select the last ALU result or the last value in the data memory stage. These muxes are controlled by comparing the rd, rt, and rs register address fields of instructions in the decode, execute, or data memory stages. The instruction rd fields will need to be added to the pipeline registers in the execute, data memory, and write-back stages for the forwarding compare operations. Since register 0 is always zero, do not forward register 0 values. The forwarding muxes in the EX stage handle dependences on instructions in the EX and MEM stages.
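
As a minimal sketch of the compare logic described above, the following Python function models the select signal for one ALU input when the muxes sit in the execute stage. The function and parameter names are hypothetical, and the 2-bit encoding (00 = register file, 10 = last ALU result, 01 = value from the data memory stage) is an assumption; check it against the text and your own datapath before relying on it.

```python
def forward_select(src_reg, ex_mem_rd, ex_mem_reg_write,
                   mem_wb_rd, mem_wb_reg_write):
    """Mux select for one ALU input (names and encoding are assumptions).

    src_reg   : rs or rt of the instruction now in the execute stage
    ex_mem_*  : destination register / write enable one instruction ahead
    mem_wb_*  : destination register / write enable two instructions ahead
    """
    # The most recent producer has priority; register 0 is never forwarded.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return 0b10
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return 0b01
    return 0b00

# SUB $2,$1,$3 followed by ADD $4,$2,$5: rs of ADD matches rd of SUB.
print(format(forward_select(2, 2, True, 0, False), "02b"))
```

Checking the nearer instruction first means that when both earlier instructions write the same register, the newer value is the one forwarded.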

For Part II of the lab (branches), the ALU’s data forwarding muxes must be moved back into the decode stage, so if you intend to work both parts, you can save time by placing the forwarding muxes in the decode stage for Part I as well. Note that the forwarding mux select signals are then needed one clock cycle earlier, so you will need to rework the text’s forwarding equations to reflect this change before coding them in VHDL. It is a good idea to work through several examples by hand to double-check your new forwarding equations. If necessary, draw a picture like Fig. 6.34 to obtain the new forwarding equations. You can assume that there is enough time in a single clock cycle to take the current ALU output (before a pipeline register) and forward it to the decode stage for a branch decision.

A dependence from an instruction in the WB stage to one in the ID stage must also be considered. You can handle this dependence in one of two ways. The first method is to ensure that the assumption used in Patterson and Hennessy holds, i.e., that the register file write in a cycle occurs before the register file read in the same cycle. The current VHDL model does not function this way, but it can be made to do so by writing to the register file on the falling edge of each clock cycle (instead of the rising edge, as done currently). A read of the same register will then have its value clocked into the ID/EX pipeline register at the next rising edge, ensuring that the just-written value is passed to the EX stage. The second method is to add two forwarding multiplexers to the Idecode module so that the register file is bypassed in this situation: if the register file write address equals one of the two read addresses, the register file write data value is forwarded to the ID/EX pipeline register instead of the normal register file data value. Since you may already have the data forwarding muxes in the decode stage for Part II, you may find this option easier. Sections 6.4 and 6.5 of Computer Organization and Design: The Hardware/Software Interface contain additional background information on forwarding.
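
The second method can be sketched behaviorally in Python as follows. This is only a model of the bypass rule stated above, not the Idecode VHDL itself; the function name and arguments are hypothetical, and the register-0 check is carried over from the earlier rule that register 0 values are not forwarded.

```python
def read_with_bypass(regs, read_addr, wb_reg_write, wb_addr, wb_data):
    """Value sent to the ID/EX register for one read port (illustrative).

    If the instruction in write-back is writing the register being read,
    return the write data instead of the stale register file contents.
    """
    if wb_reg_write and wb_addr != 0 and wb_addr == read_addr:
        return wb_data
    return regs[read_addr]

# Hypothetical register file contents: register i initially holds i.
regs = list(range(32))
print(read_with_bypass(regs, 4, True, 4, 99))  # write-back value is bypassed
print(read_with_bypass(regs, 5, True, 4, 99))  # no match, normal read
```

One such mux is needed for each of the two read ports, with the same write address and write data feeding both.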

Add forwarding to your pipelined datapath from Lab 2 and test it with the following program:

And     $4,$3,$1

Or       $8,$1,$4

Sub     $3,$4,$4

Add     $5,$4,$3

And     $1,$6,$7
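
Before simulating, it can help to list which dependences this test program actually exercises. The Python sketch below is a hand-analysis aid, not part of the lab's VHDL; the tuple layout (op, rd, rs, rt) and the function name are assumptions made for illustration. It reports, for each source operand, the nearest earlier instruction that writes it and how many instructions back that producer is.

```python
# Part I test program as (op, rd, rs, rt) register numbers.
PROGRAM = [
    ("and", 4, 3, 1),
    ("or",  8, 1, 4),
    ("sub", 3, 4, 4),
    ("add", 5, 4, 3),
    ("and", 1, 6, 7),
]

def raw_dependences(program, window=3):
    """Return (consumer, operand, register, producer, distance) tuples for
    producers at most `window` instructions earlier."""
    deps = []
    for i, (_, _, rs, rt) in enumerate(program):
        for operand, src in (("rs", rs), ("rt", rt)):
            if src == 0:
                continue  # register 0 is never forwarded
            for dist in range(1, window + 1):
                j = i - dist
                if j < 0:
                    break
                if program[j][1] == src:  # nearest earlier writer wins
                    deps.append((i, operand, src, j, dist))
                    break
    return deps

for dep in raw_dependences(PROGRAM):
    print("instr %d %s=$%d depends on instr %d (distance %d)" % dep)
```

The program contains dependences at distances 1, 2, and 3, so it exercises forwarding of the last ALU result, forwarding from the data memory stage, and the WB-to-ID case.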

 

 

Part II: Branch hazards

Forwarding does not resolve all hazards. In this part of the lab, you will implement a hazard unit that flushes the pipeline to handle branch hazards. Assume branch not taken, and move the branch decision hardware forward to the decode stage as shown in Fig. 6.38. Be aware that when you implement the text’s Fig. 6.38 solution for early branch condition evaluation, your forwarding muxes must be placed in the ID stage, ahead of the comparison unit, rather than in the EX stage, so that the new branch comparison unit can use forwarded values for branch instructions that depend on preceding instructions. In addition to the new branch compare circuit that must be added to decode, the branch address adder must be moved from execute to decode. With these changes, only one instruction following a taken branch must be flushed. When a taken branch is detected, the IF/ID register should be reset to all 0's (i.e., a NOP) to flush the instruction that was fetched incorrectly.
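
The flush rule can be modeled in a few lines of Python. This is a behavioral sketch only; the names branch_taken and next_if_id are hypothetical, and the operand values passed to the comparison are assumed to be the already-forwarded values in the decode stage.

```python
NOP = 0x00000000  # an all-zero instruction word behaves as a NOP

def branch_taken(is_beq, rs_value, rt_value):
    """Decode-stage branch decision for beq, using forwarded operand values."""
    return is_beq and rs_value == rt_value

def next_if_id(fetched_word, taken):
    """Instruction word loaded into IF/ID at the next clock edge: the word
    fetched behind a taken branch is replaced by a NOP."""
    return NOP if taken else fetched_word

# Illustrative values: a beq whose operands are equal flushes the next fetch.
print(hex(next_if_id(0x00A42820, branch_taken(True, 6, 6))))
print(hex(next_if_id(0x00A42820, branch_taken(True, 0, 4))))
```

Because the decision is made in decode, only the single instruction sitting in the fetch stage needs to be replaced.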

Add this branch flush hardware to your pipelined datapath and test it with the following program:

 

            Beq     $0,$4,label1

            Add     $8,$2,$4

            Add     $5,$5,$1

label1:     Beq     $8,$5,label2

            Sub     $5,$5,$4

label2:     Beq     $5,$6,label2

label3:     Beq     $0,$0,label3

 

 

Grading Criteria:

Correct simulation for Part I – 7 pts.

Correct simulation for Part II – 3 pts.