Verilog Code for 16 Bit MIPS Pipelined Processor

Hello everyone,

Long time no see. I was very busy with my job schedule and also with working on the pipeline code. Well, I have successfully completed the pipelined version of the processor. I was working on a 32-bit version but sadly it got corrupted, so I was forced to work on 16 bit, which I don't like, though I don't know why.

What is a pipelined processor?

Below is the processor in action. Be careful which data lines you choose.

This is the datapath of the 5-stage processor. I might have missed some wiring. Do comment if a genius mind finds something in the RTL that differs from the datapath below. If I find an error, I will update it myself.

Pipelining is a methodology which lets us process instructions in parallel while passing along only the information that each instruction requires. If one remembers, without pipelining, once we had set the "regwrt" signal to 1 it remained 1 until the write operation was complete. In pipelining it's the opposite: each pipeline register carries the signals required by an instruction from one stage to the next.
The IFID pipe will carry the signals and data required for the ID stage. At the same moment, the IDEXE pipe will carry the signals and data required for the EXE stage. Similarly, the EXEMEM pipe will carry the signals and data required for the MEM stage.
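
As a rough illustration of what such a pipe looks like in Verilog, here is a minimal sketch of an ID-to-EXE pipeline register. It is not the code from this project: the module name, the port names, the 3-bit register number and the rising-edge clocking are all assumptions made for the example.

```verilog
// Illustrative ID -> EXE pipeline register (not the project's actual module).
// Assumed: 16-bit data, 3-bit register numbers, rising-edge clock.
module idexe_pipe_sketch (
    input  wire        clk,
    input  wire        rst,         // synchronous, active-high reset
    input  wire        id_regwrt,   // control signal decoded in ID
    input  wire [15:0] id_a,        // first operand read in ID
    input  wire [15:0] id_b,        // second operand read in ID
    input  wire [2:0]  id_w_reg,    // destination register number
    output reg         exe_regwrt,
    output reg  [15:0] exe_a,
    output reg  [15:0] exe_b,
    output reg  [2:0]  exe_w_reg
);
    // Everything is copied on the clock edge, so the EXE stage sees the
    // control and data values that belong to its own instruction.
    always @(posedge clk) begin
        if (rst) begin
            exe_regwrt <= 1'b0;
            exe_a      <= 16'd0;
            exe_b      <= 16'd0;
            exe_w_reg  <= 3'd0;
        end else begin
            exe_regwrt <= id_regwrt;
            exe_a      <= id_a;
            exe_b      <= id_b;
            exe_w_reg  <= id_w_reg;
        end
    end
endmodule
```

Because regwrt travels with its instruction, it is 1 only in the stage that currently holds that instruction, instead of staying 1 for the whole datapath.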

The pipelined processor helps to execute multiple instructions at a time. For instance, while one instruction is in the EXE stage, another will be in the ID stage, another in the MEM stage, and so on.

However, pipelining is not that simple. Hazards are associated with pipelining.

a. Data Hazards: For example,
    ADD $1 $2 $3  //It will add contents of $1 and $2 and store it in $3
    SUB $3 $5 $6  //It will subtract the contents of $3 and $5 and store it in $6

Now while the ADD instruction is in the EXE stage, SUB will be in the ID stage. Because ADD is still in EXE, its output has not yet been written into $3, which means SUB will read the wrong, or better say old, value of $3 in the ID stage. This is what we call a data hazard. We cannot move backward through a pipeline. All pipelines move in only one direction.

b. Control Hazards: Now we come to the branch instruction. Branch instructions depend mainly on the zero flag from the ALU. To get the value of the zero flag, the instruction must be in the EXE stage. Since we are in pipelined mode, another instruction will already be in the ID stage and yet another in the IF stage. This is NOT what we want. Suppose the branch turns out to be taken: then the two instructions already in the pipeline must not be allowed to continue. This is what we call a control hazard.

c. Structural Hazards: This hazard arises when the hardware cannot support what we want. You can't read and write to a register simultaneously.

So how do we resolve data hazards?
a. We can stall the pipeline. I love stalling for no reason, though it is not considered good for a good processor. Stalling the pipeline means we stop advancing instructions through the stages until a required condition is met. We need a data hazard unit so that the processor can detect the hazard and then stall the pipeline.

b. Forwarding: Yes, I adopted this technique even though I wanted to use the stall technique. In this technique, we route the result of an earlier instruction back to the instruction that needs it, so that it replaces the old value that was read in the ID stage. This might seem to contradict what I said earlier, that we can't move back in the pipeline. Well, that still holds true. Forwarded data does not travel through the pipeline registers. It travels through separate wiring. It will be discussed further. So relax.

c. Scheduling: We can schedule instructions either via the compiler or via hardware. We are not going into that right now.

If there is a dependency between instruction A+1 and instruction A, we would require 3 stalls, but only if ID/EX.WriteReg == IF/ID read register 1 or 2. However, this is difficult, as at that point we are not yet sure which register is the destination. If there is a dependency between instruction A+2 and A, we would require 2 stalls, but only if EX/MEM.WriteReg == IF/ID read register 1 or 2. If there is a dependency between instruction A+3 and A, we would require 1 bubble, but only if MEM/WB.WriteReg == IF/ID read register 1 or 2. The ID stage and the IFID pipeline register must be frozen at the same time. Note that stalls stop instructions in the ID stage, so we need control lines to send a NOP command, i.e. create bubbles. This can be done by setting all control lines that are passed on from ID to 0, hence creating a NOP, while the freeze prevents new instruction fetches.
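
The three comparisons above can be sketched as a small combinational unit. This is only an illustration for a pipeline without forwarding. The module name, port names and the 3-bit register width are assumptions, and only one control line (regwrt) is shown being zeroed to form the bubble.

```verilog
// Illustrative stall detector for a pipeline WITHOUT forwarding.
// Names and the 3-bit register width are assumptions for this sketch.
module stall_detect_sketch (
    input  wire [2:0] ifid_rs,        // read register 1 of the instruction in ID
    input  wire [2:0] ifid_rt,        // read register 2 of the instruction in ID
    input  wire       idexe_regwrt,   // instruction in EXE will write a register
    input  wire [2:0] idexe_w_reg,
    input  wire       exemem_regwrt,  // instruction in MEM will write a register
    input  wire [2:0] exemem_w_reg,
    input  wire       memwb_regwrt,   // instruction in WB will write a register
    input  wire [2:0] memwb_w_reg,
    input  wire       id_regwrt_in,   // control line decoded in ID
    output wire       stall,          // 1 = freeze PC and IF/ID this cycle
    output wire       id_regwrt_out   // control line actually sent to ID/EX
);
    // One match signal per pipeline register that may hold the producer.
    wire hit_exe = idexe_regwrt  &&
                   ((idexe_w_reg  == ifid_rs) || (idexe_w_reg  == ifid_rt));
    wire hit_mem = exemem_regwrt &&
                   ((exemem_w_reg == ifid_rs) || (exemem_w_reg == ifid_rt));
    wire hit_wb  = memwb_regwrt  &&
                   ((memwb_w_reg  == ifid_rs) || (memwb_w_reg  == ifid_rt));

    assign stall = hit_exe || hit_mem || hit_wb;

    // Bubble: while stalling, the control line sent onward is forced to 0.
    assign id_regwrt_out = stall ? 1'b0 : id_regwrt_in;
endmodule
```

While the consumer is held in ID, the producer keeps moving, so a match that starts in ID/EX is seen again in EX/MEM and MEM/WB, which gives the 3, 2 and 1 stall cycles described above. This assumes the register file cannot be read in the same cycle in which WB writes it.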

The code for every module is different when compared with the non-pipelined code. In the non-pipelined code, the data flowing out of a stage was sequential, i.e. clock dependent: it only flowed outwards through an always @(negedge clk) block. In the pipelined stages it is combinational: it flows outwards through an always @(*) block (exceptions exist). The pipelines themselves, however, are clocked. Never pass any signal through a pipeline without a clock signal.
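
To make the difference concrete, here is a small sketch that puts a combinational piece of stage logic next to a clocked pipe. Both modules are made up for this illustration: the 6-bit immediate width and the rising clock edge are assumptions, not details taken from the project.

```verilog
// Illustrative only: combinational stage logic next to a clocked pipe.
// The 6-bit immediate width is an assumption, not taken from the project.
module sign_extend_sketch (
    input  wire [5:0]  imm,
    output reg  [15:0] imm_ext
);
    // Combinational: the output follows the input, no clock involved.
    always @(*) begin
        imm_ext = {{10{imm[5]}}, imm};   // copy the sign bit into the upper 10 bits
    end
endmodule

module imm_pipe_sketch (
    input  wire        clk,
    input  wire [15:0] d,
    output reg  [15:0] q
);
    // Sequential: the value moves to the next stage only on a clock edge.
    always @(posedge clk) begin
        q <= d;
    end
endmodule
```

Note the blocking assignment (=) in the combinational block and the non-blocking assignment (<=) in the clocked one, which is the usual Verilog convention for the two kinds of logic.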

Working of Forwarding:

We all know that a dependency occurs when an earlier instruction wants to write to a register which is required by the next instruction. So when the first instruction is in the EXE stage, the next instruction will be in the ID stage. Since instruction 1 writes its result only in the WB stage, this poses a problem for us, or better say a hazard. So first we need to know whether the 1st instruction wants to write at all. It would be unnecessary to stall or forward when no write is going to happen. So the first condition is that its register-write signal is set (MEM_regwrt==1 once it has moved on to the MEM stage).

Secondly, if Write_Register is equal to any of the source registers (A_Reg or B_Reg) of the next instruction, then forwarding occurs. So our condition changes to:

  if(MEM_regwrt==1 && MEM_W_Reg==A_Reg)
         ForwardA = 1;
  else
         ForwardA = 0;

Similarly, for the second register, our code will be:

  if(MEM_regwrt==1 && MEM_W_Reg==B_Reg)
        ForwardB = 1;
  else
        ForwardB = 0;

ForwardA and ForwardB are the select signals for the multiplexors ForA and ForB. What if the dependency is one instruction further apart, i.e. instruction I + 2 depends on instruction I? Then instruction I will be in the WB stage while I + 2 is in the EXE stage. For that, we will have the following condition:

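// In the full unit these tests sit in the else branch of the MEM tests,
// so a ForwardA = 1 or ForwardB = 1 result is not overwritten.
if (WB_regwrt==1 && WB_W_Reg==A_Reg && (MEM_W_Reg != A_Reg || MEM_regwrt==0))
     ForwardA = 2;
else
     ForwardA = 0;

if (WB_regwrt==1 && WB_W_Reg==B_Reg && (MEM_W_Reg != B_Reg || MEM_regwrt==0))
     ForwardB = 2;
else
     ForwardB = 0;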
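
Putting the MEM and WB conditions together, a forwarding unit and one of its multiplexors might look roughly like the sketch below. The signal names follow the snippets above, but the module names, the 3-bit register width, and the mux input names (reg_value, mem_value, wb_value) are assumptions for this illustration rather than the project's actual code.

```verilog
// Illustrative forwarding unit built from the conditions above.
// The 3-bit register width is an assumption.
module forward_unit_sketch (
    input  wire       MEM_regwrt,
    input  wire       WB_regwrt,
    input  wire [2:0] MEM_W_Reg,
    input  wire [2:0] WB_W_Reg,
    input  wire [2:0] A_Reg,
    input  wire [2:0] B_Reg,
    output reg  [1:0] ForwardA,
    output reg  [1:0] ForwardB
);
    always @(*) begin
        // The MEM stage holds the newer result, so it is checked first.
        if (MEM_regwrt == 1 && MEM_W_Reg == A_Reg)
            ForwardA = 2'd1;
        else if (WB_regwrt == 1 && WB_W_Reg == A_Reg)
            ForwardA = 2'd2;
        else
            ForwardA = 2'd0;

        if (MEM_regwrt == 1 && MEM_W_Reg == B_Reg)
            ForwardB = 2'd1;
        else if (WB_regwrt == 1 && WB_W_Reg == B_Reg)
            ForwardB = 2'd2;
        else
            ForwardB = 2'd0;
    end
endmodule

// Illustrative ForA/ForB style mux: picks the ALU operand.
module for_mux_sketch (
    input  wire [1:0]  sel,        // ForwardA or ForwardB
    input  wire [15:0] reg_value,  // value read from the register file
    input  wire [15:0] mem_value,  // result currently in the MEM stage
    input  wire [15:0] wb_value,   // result currently in the WB stage
    output reg  [15:0] out
);
    always @(*) begin
        case (sel)
            2'd1:    out = mem_value;
            2'd2:    out = wb_value;
            default: out = reg_value;
        endcase
    end
endmodule
```

Because of the else-if ordering, the extra test (MEM_W_Reg != A_Reg || MEM_regwrt==0) from the WB condition is implied: the WB branch is reached only when the MEM branch did not match.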

Enough theory for now; let us visualize it in a step diagram.

The 5th cycle register writeback is required for other instructions. This is where we require forwarding.

Another example below shows how instructions depend on each other, which is mostly the case.
The AND instruction requires the value of $2 from the SUB instruction, which writes the result of $1 and $3 into $2. Similarly, the OR instruction depends on the SUB instruction. We do have the option to stall, but that would halt the entire pipeline, which we don't want. After all, most instructions in the real world have tons of dependencies.

Stalling can be easily achieved. All one has to do is freeze the PC and the IF/ID pipeline register. This holds the previous instruction for another clock cycle. During the stall condition, we provide the NOP opcode, which is 0000. I am currently doing the same for flush, where we erase a pipeline's data.
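
As a minimal sketch of the freeze described above, the PC and the IF/ID register can each take a stall input, and the IF/ID register can also take a flush input that loads the NOP. The module and port names are assumptions for this example, and it assumes that an all-zero instruction word (opcode 0000) is harmless in every stage.

```verilog
// Illustrative program counter with a stall (hold) input.
module pc_sketch (
    input  wire        clk,
    input  wire        rst,      // synchronous, active-high reset
    input  wire        stall,    // 1 = keep the current PC
    input  wire [15:0] next_pc,
    output reg  [15:0] pc
);
    always @(posedge clk) begin
        if (rst)
            pc <= 16'd0;
        else if (!stall)
            pc <= next_pc;
        // stall == 1: no assignment, so pc holds its value
    end
endmodule

// Illustrative IF/ID pipeline register with stall and flush inputs.
module ifid_pipe_sketch (
    input  wire        clk,
    input  wire        rst,
    input  wire        stall,    // 1 = hold the instruction for another cycle
    input  wire        flush,    // 1 = replace the instruction with a NOP
    input  wire [15:0] if_instr,
    output reg  [15:0] id_instr
);
    always @(posedge clk) begin
        if (rst || flush)
            id_instr <= 16'h0000;   // opcode 0000 = NOP (assumed all-zero word)
        else if (!stall)
            id_instr <= if_instr;
    end
endmodule
```

In this sketch flush takes priority over stall, which is one possible design choice, not something stated in the post.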


If one looks carefully at the datapath diagram, I have included a comparator. This comparator just compares the register values and sends the result to the Control Unit. The Control Unit looks at the opcode and then at this signal, and decides whether to flush the pipeline or not. With a BEQ instruction, the Control Unit would otherwise not know whether to flush until it reads the zero flag. To read that flag it would have to wait for 2 clock cycles, which is another pain for us. I tried that methodology a lot, but I was unable to work out the logic that would tell the Control Unit whether to stall or flush the pipeline. Do not confuse stall and flush. A stall is like stopping the flow: what was flowing before will continue to be held for another n clock cycles. It is like a clock enable or disable. Flushing is like erasing the pipeline contents. It doesn't halt the previous instruction. It just erases the unnecessary instruction. This is done by disabling writes to all components for the flushed instruction, since the data in that pipeline slot is treated as corrupt.
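
The flush decision described above can be sketched as a tiny piece of combinational logic. This is an illustration under assumptions: the module and port names are invented, and BEQ_OPCODE is a hypothetical placeholder value, since the real opcode is not given in this post.

```verilog
// Illustrative branch/flush decision made in the ID stage.
// BEQ_OPCODE is a hypothetical placeholder, not the project's real encoding.
module branch_flush_sketch #(
    parameter [3:0] BEQ_OPCODE = 4'b1000
) (
    input  wire [3:0] opcode,        // opcode of the instruction in ID
    input  wire       equal,         // equality result from the ID-stage comparator
    output wire       branch_taken,  // 1 = load the branch target into the PC
    output wire       ifid_flush     // 1 = erase the instruction just fetched
);
    // The branch is taken only for a BEQ whose two registers compare equal.
    assign branch_taken = (opcode == BEQ_OPCODE) && equal;

    // The instruction fetched right after a taken branch is the wrong one,
    // so it is flushed rather than stalled.
    assign ifid_flush = branch_taken;
endmodule
```

Since the decision is made in ID, only the instruction currently being fetched has to be erased in this sketch, instead of the two that would be in flight if the zero flag from the ALU were used.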

Remember that MIPS was designed to avoid stalls. We may not be clever enough to simulate and replicate a real-life MIPS, but who is stopping us from trying?

Comparator: The comparator in the ID stage helps us to reduce the two-cycle flush during a branch instruction. The comparator outputs 4 bits. The 1st bit checks for equality. The 2nd bit checks whether value A is less than value B. The 3rd bit checks whether value B is less than value A. The 4th bit checks whether value A is equal to value B. This 4-bit value is sent to the Control Unit, which decides whether to flush the instruction or not.
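
A comparator along these lines could be sketched as follows. It follows the description above literally, including the equality check appearing in both the 1st and the 4th bit. The module name, mapping the "1st bit" to bit 0, and treating the values as unsigned are assumptions for this example.

```verilog
// Illustrative 4-bit comparator for the ID stage.
// Assumed: "1st bit" is cmp[0], and A and B are compared as unsigned values.
module comparator_sketch (
    input  wire [15:0] a,     // value A read from the register file
    input  wire [15:0] b,     // value B read from the register file
    output wire [3:0]  cmp
);
    assign cmp[0] = (a == b);   // 1st bit: A equals B
    assign cmp[1] = (a <  b);   // 2nd bit: A is less than B
    assign cmp[2] = (b <  a);   // 3rd bit: B is less than A
    assign cmp[3] = (a == b);   // 4th bit: A equals B, as described in the text
endmodule
```

For a BEQ instruction the Control Unit would only need cmp[0]; the less-than bits would matter for other branch types, if the instruction set has them.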

Verilog Code For Fetch Stage 
Verilog Code For Program Counter 
Verilog Code For Instruction Memory 

Verilog Code for Decode Stage
Verilog Code for Control Unit
Verilog Code for Register File
Verilog Code for Adder
Verilog Code for Sign Extend
Verilog Code for Comparator
Verilog Code for Decode Pipeline

Verilog Code for ALU
Verilog Code for ForA and ForB Mux
Verilog Code for Execute Pipeline
Verilog Code for Forwarding Unit

Verilog Code for RAM
Verilog Code for Memory Stage
Verilog Code for Memory Pipeline

Verilog Code for Write Back Stage

Verilog Code for Datapath and TestBench/ Top Module

 Confused about something? Feel Free to comment.

To Be Continued...With Code as well


  1. Thanks Shashi, It is working perfectly. Could you upload the screenshots of the Flush output? I needed to verify.

  2. I would luv if u could upload 32 bit too. Making this processor dual core would be awesome. Will you try?

    1. Yeah even I wanted 32 bit.
      Making this dual core is tough i.e. I will have to make instruction scheduler too with correct scheduling
      I will research and study and then let you know

  3. Bro!!! You rock but you forgot something!!

    RTL !!
    I just want to cross verify the RTL

  4. NAND and EXOR is having same opcode

  5. Its Working...Thanks

  6. Hello sir,
    Code for comparator seems to be missing. Could you please check?

    1. Comparator code is simple. I guess others too did it by reverse engineering.

      Anyhow, I'll upload the code.

  7. excellent work mate..
    but can anyone help me find the top module??
    its not there

  8. Beautiful work sir... But sir i am curious about the branch hazards. When you will deal with branch hazards

    1. Thanks,
      Branch Hazards have also been converted in the code.

  9. can u plzz send me the code for 5 stage pipeline 64 bit risc processor with 32 instructions plzzz

  10. sir ID2EX pipe module is missing, can you please load it

