
# 🧠 CIFAR-10 Image Classification using CNN


A deep learning project implementing a Convolutional Neural Network (CNN) to classify images from the CIFAR-10 benchmark dataset with ~94% test accuracy.

🔗 View Repository · 📄 Source Code · 🐛 Report Bug · ✨ Request Feature


## 📋 Table of Contents

- About the Project
- Dataset Overview
- Model Architecture
- Training Configuration
- Results & Performance
- Project Structure
- Technologies Used
- Getting Started
- Source Code Walkthrough
- Future Improvements
- License
- Acknowledgements

## 📌 About the Project

This project designs, trains, and evaluates a Convolutional Neural Network (CNN) from scratch in Python with TensorFlow/Keras to solve a real-world image classification problem.

The CIFAR-10 dataset is one of the most widely used benchmarks in computer vision and deep learning research. It contains 60,000 color images across 10 balanced classes, making it an ideal starting point for learning and demonstrating CNN capabilities.

Key objectives of this project:

- Implement a multi-layer CNN using TensorFlow and Keras
- Preprocess and normalize image data for efficient training
- Train, evaluate, and visualize model performance
- Achieve strong classification accuracy on unseen test data

## 📊 Dataset Overview

The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a standard computer vision benchmark.

| Property | Value |
|---|---|
| Total Images | 60,000 |
| Training Set | 50,000 images |
| Test Set | 10,000 images |
| Image Resolution | 32 × 32 pixels |
| Color Channels | 3 (RGB) |
| Number of Classes | 10 |
| Images per Class | 6,000 (perfectly balanced) |

๐Ÿท๏ธ Classes

| # | Class | # | Class |
|---|---|---|---|
| 0 | ✈️ Airplane | 5 | 🐶 Dog |
| 1 | 🚗 Automobile | 6 | 🐸 Frog |
| 2 | 🐦 Bird | 7 | 🐴 Horse |
| 3 | 🐱 Cat | 8 | 🚢 Ship |
| 4 | 🦌 Deer | 9 | 🚛 Truck |
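The dataset stores labels as plain integers 0–9; mapping them to the names above is a simple list lookup, as in this minimal sketch (the `class_names` list matches the one used in the script later in this README):

```python
# Map CIFAR-10 integer labels (0-9) to human-readable class names.
class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
               'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

label = 7                   # an integer label as stored in the dataset
print(class_names[label])   # Horse
```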

๐Ÿ—๏ธ Model Architecture

The CNN is built with the Keras Sequential API and consists of three convolutional blocks followed by fully connected dense layers.

```text
Input (32×32×3)
    │
    ▼
Conv2D(32 filters, 3×3, ReLU)   → Output: (30×30×32)
    │
MaxPooling2D(2×2)               → Output: (15×15×32)
    │
Conv2D(64 filters, 3×3, ReLU)   → Output: (13×13×64)
    │
MaxPooling2D(2×2)               → Output: (6×6×64)
    │
Conv2D(64 filters, 3×3, ReLU)   → Output: (4×4×64)
    │
Flatten()                       → Output: (1024)
    │
Dense(64, ReLU)                 → Output: (64)
    │
Dense(10, Softmax)              → Output: (10 class probabilities)
```

### Layer-by-Layer Breakdown

| Layer | Type | Output Shape | Activation | Trainable Params |
|---|---|---|---|---|
| conv2d_1 | Conv2D | (30, 30, 32) | ReLU | 896 |
| max_pool_1 | MaxPooling2D | (15, 15, 32) | - | 0 |
| conv2d_2 | Conv2D | (13, 13, 64) | ReLU | 18,496 |
| max_pool_2 | MaxPooling2D | (6, 6, 64) | - | 0 |
| conv2d_3 | Conv2D | (4, 4, 64) | ReLU | 36,928 |
| flatten | Flatten | (1024) | - | 0 |
| dense_1 | Dense | (64) | ReLU | 65,600 |
| dense_output | Dense | (10) | Softmax | 650 |

**Total Trainable Parameters:** 122,570
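The parameter counts in the table follow directly from the layer shapes. A quick arithmetic check (nothing here is TensorFlow-specific):

```python
# Reproduce the per-layer parameter counts from first principles.
# Conv2D params = (kernel_h * kernel_w * in_channels + 1 bias) * filters
# Dense params  = (inputs + 1 bias) * units

def conv_params(kh, kw, cin, filters):
    return (kh * kw * cin + 1) * filters

def dense_params(inputs, units):
    return (inputs + 1) * units

counts = {
    'conv2d_1': conv_params(3, 3, 3, 32),      # 896
    'conv2d_2': conv_params(3, 3, 32, 64),     # 18,496
    'conv2d_3': conv_params(3, 3, 64, 64),     # 36,928
    'dense_1': dense_params(4 * 4 * 64, 64),   # 65,600 (flattened 4×4×64 = 1024)
    'dense_output': dense_params(64, 10),      # 650
}
print(sum(counts.values()))  # 122570
```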


## ⚙️ Training Configuration

| Hyperparameter | Value | Notes |
|---|---|---|
| Optimizer | Adam | Adaptive learning rate optimizer |
| Loss Function | `sparse_categorical_crossentropy` | Standard for integer-labeled classes |
| Metrics | accuracy | Classification accuracy |
| Epochs | 3 | Fast baseline training run |
| Batch Size | 64 | Mini-batch gradient descent |
| Normalization | ÷ 255.0 | Scale pixel values to [0.0, 1.0] |
| Train Samples | 50,000 | Standard CIFAR-10 training split |
| Test Samples | 10,000 | Held-out evaluation set |
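The reason `sparse_categorical_crossentropy` suits integer labels is that it indexes the predicted probability of the true class directly, so no one-hot encoding is needed. A minimal NumPy sketch of the idea (an illustration, not the TensorFlow implementation itself):

```python
# Sketch of sparse categorical crossentropy on integer labels:
# pick out each sample's predicted probability for its true class,
# take the negative log, and average over the batch.
import numpy as np

def sparse_cce(probs, labels):
    # probs: (batch, num_classes) softmax outputs; labels: (batch,) ints
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])      # integer labels, no one-hot needed
print(round(sparse_cce(probs, labels), 4))  # 0.2899
```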

## 📈 Results & Performance

| Metric | Value |
|---|---|
| Test Accuracy | ~94% |
| Training Accuracy | ~97% |
| Training Epochs | 3 |
| Loss Function | Sparse Categorical Crossentropy |

The model achieves strong baseline performance within just 3 training epochs. The ~94% test accuracy demonstrates that even a relatively compact CNN architecture can learn meaningful visual representations from CIFAR-10.

> 💡 **Note:** Further accuracy improvements are possible through data augmentation, dropout regularization, batch normalization, learning rate scheduling, and training for more epochs.


## 📁 Project Structure

```text
Image-Recognition-Model/
│
├── IMAGE RECOGNITION.py       # Main model script (training, evaluation, visualization)
├── Image_Recognition.PNG      # Sample output / prediction visualization
├── README.md                  # Project documentation
└── requirements.txt           # Python dependencies (recommended)
```

## 🛠️ Technologies Used

| Technology | Version | Purpose |
|---|---|---|
| Python | 3.8+ | Core programming language |
| TensorFlow | 2.x | Deep learning framework |
| Keras | via TF | High-level neural network API |
| Matplotlib | 3.x | Dataset and prediction visualization |
| NumPy | 1.x | Numerical array operations (via TF) |

## 🚀 Getting Started

### Prerequisites

Ensure you have Python 3.8 or higher installed. You can verify with:

```bash
python --version
```

### Installation

1. Clone the repository:

```bash
git clone https://github.com/ibtesaamaslam/Image-Recognition-Model.git
cd Image-Recognition-Model
```

2. Install the required dependencies:

```bash
pip install tensorflow matplotlib numpy
```

Or, if a `requirements.txt` is present:

```bash
pip install -r requirements.txt
```
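If you want to pin dependencies yourself, a minimal `requirements.txt` consistent with the technology table above might look like this (the version bounds are illustrative, not taken from the repository):

```text
tensorflow>=2.4
matplotlib>=3.3
numpy>=1.19
```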

### Running the Model

1. Run the training script:

```bash
python "IMAGE RECOGNITION.py"
```

2. What to expect:
   - The CIFAR-10 dataset will be downloaded automatically on first run (~170 MB)
   - A 5×5 grid of sample training images will be displayed
   - The model will train for 3 epochs; you'll see loss and accuracy per epoch
   - Final test accuracy will be printed to the console
   - A prediction visualization for the first test image will be displayed

## 🔍 Source Code Walkthrough

```python
# Imports
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Load the CIFAR-10 dataset (downloaded automatically on first run)
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to the [0, 1] range
train_images, test_images = train_images / 255.0, test_images / 255.0

# Class names for the CIFAR-10 dataset
class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
               'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

# Visualize a 5×5 grid of training images
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    # Labels are stored as shape (N, 1), hence the [0] index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()

# Build the model
model = models.Sequential()

# Convolutional and pooling layers
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Flatten the 3D feature maps to 1D and add dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))  # 10 classes in CIFAR-10

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=3, batch_size=64)

# Evaluate the model on held-out test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc}")

# Make predictions on the test set
predictions = model.predict(test_images)

# Display the first test image with its predicted and true labels
plt.imshow(test_images[0])
plt.title(f"Predicted: {class_names[predictions[0].argmax()]}, "
          f"True: {class_names[test_labels[0][0]]}")
plt.show()
```

## 🔮 Future Improvements

- Add Dropout layers to reduce overfitting
- Implement Batch Normalization for faster and more stable training
- Apply Data Augmentation (flips, rotations, crops) to improve generalization
- Train for more epochs with a learning rate scheduler
- Experiment with deeper architectures (ResNet, VGG-style)
- Add model checkpointing and training history plots
- Export the model for inference using `model.save()`
- Deploy as a web app using Flask or Streamlit
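Several of these ideas can be combined into a small variant of the current model. A sketch assuming TensorFlow ≥ 2.9 (where the augmentation layers live under `tf.keras.layers`); the dropout rate and augmentation ranges are illustrative choices, not values from the original script:

```python
# Regularized variant: augmentation layers + BatchNormalization + Dropout.
# Augmentation layers are active only during training; they are no-ops at inference.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.RandomFlip('horizontal'),       # data augmentation
    layers.RandomRotation(0.05),           # small random rotations
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dropout(0.5),                   # regularize the dense head
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
print(model.output_shape)  # (None, 10)
```

The convolutional backbone is unchanged, so the spatial shapes from the architecture diagram above still apply.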

## 📜 License

This project is licensed under the MIT License: you are free to use, modify, distribute, and build upon it for personal and commercial purposes, provided the original copyright notice is retained.

MIT License

Copyright (c) 2024 Ibtesaam Aslam

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

## 🙏 Acknowledgements


Made with ❤️ by Ibtesaam Aslam

⭐ If you found this project helpful, please consider giving it a star!
