This project implements a Quantized Neural Network (QNN) inference engine entirely in Verilog as part of my Summer Internship at IIST (Dept. of Avionics).
The design executes inference for a 784-64-10 fully connected neural network (trained on MNIST) using an FSM-based datapath and memory-mapped weight/bias storage.
- FSM-based pipeline covering:
- MAC (Multiply-Accumulate) for input → hidden layer
- Bias addition + ReLU activation
- MAC for hidden → output layer
- Final argmax for classification
- Quantization: Inputs, weights, and biases stored as 8-bit signed integers
- Simulation-only flow (no FPGA needed)
- Modular testbench that drives memory from Python-generated `.mem` files
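The quantization and `.mem` generation step can be sketched in Python. This is a minimal illustration, not the project's actual script: the scale factor, file names, and helper names (`quantize_int8`, `write_mem`) are assumptions; the hex-per-line format matches what Verilog's `$readmemh` expects.

```python
import numpy as np

def quantize_int8(x, scale):
    """Quantize a float array to 8-bit signed integers (illustrative scale)."""
    q = np.round(x / scale).astype(np.int64)
    return np.clip(q, -128, 127).astype(np.int8)

def write_mem(path, values):
    """Write int8 values as two-digit hex words, one per line ($readmemh format)."""
    with open(path, "w") as f:
        for v in np.asarray(values, dtype=np.int8):
            f.write(f"{int(v) & 0xFF:02x}\n")

# Example: quantize a small weight matrix and dump it for the testbench.
w = np.array([[0.5, -0.25], [1.0, -1.0]])
write_mem("w1.mem", quantize_int8(w, scale=1 / 64).flatten())
```

Negative values are written in two's-complement hex (e.g. -16 becomes `f0`), so the Verilog side can read them directly into signed registers.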
- Input Layer: 784 neurons (28×28 MNIST image)
- Hidden Layer: 64 neurons with ReLU activation
- Output Layer: 10 neurons → argmax = predicted digit
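The full 784-64-10 forward pass can be expressed as a small software reference model, useful for checking the RTL's prediction against a golden result. This is a hedged sketch: the real design may requantize the hidden activations back to 8 bits between layers, which is omitted here; accumulation in `int32` mirrors a wide accumulator register in the hardware MAC.

```python
import numpy as np

def int8_forward(x, w1, b1, w2, b2):
    """Golden-model forward pass for the quantized network.

    x: int8 inputs; w1/b1 and w2/b2: int8 weights and biases per layer.
    """
    # Layer 1: MAC over all inputs, then bias add and ReLU.
    acc1 = w1.astype(np.int32) @ x.astype(np.int32) + b1.astype(np.int32)
    h = np.maximum(acc1, 0)          # ReLU
    # Layer 2: MAC over the hidden activations, then bias add.
    acc2 = w2.astype(np.int32) @ h + b2.astype(np.int32)
    return int(np.argmax(acc2))      # predicted digit
```

In a testbench flow, this model and the Verilog design read the same `.mem` contents, and the two predicted digits are compared per image.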
- FSM States:
- Load inputs + weights (first MAC)
- Add bias-1
- Apply ReLU
- Load weights (second MAC)
- Add bias-2
- Find max
- Output prediction
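The state sequence above is strictly linear, so the controller can be mirrored in a few lines of software for documentation or cosimulation. The state names below are illustrative, not taken from the RTL:

```python
from enum import Enum, auto

class State(Enum):
    MAC1 = auto()     # first MAC: input -> hidden
    BIAS1 = auto()    # add bias-1
    RELU = auto()     # apply ReLU
    MAC2 = auto()     # second MAC: hidden -> output
    BIAS2 = auto()    # add bias-2
    ARGMAX = auto()   # find max output neuron
    DONE = auto()     # output prediction, hold

ORDER = [State.MAC1, State.BIAS1, State.RELU,
         State.MAC2, State.BIAS2, State.ARGMAX, State.DONE]

def next_state(s):
    """Advance to the next state; DONE is terminal."""
    i = ORDER.index(s)
    return ORDER[min(i + 1, len(ORDER) - 1)]
```

Because there is no branching, the hardware FSM needs no backward transitions; each state simply asserts the enables for its datapath stage and hands off to the next.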