User Guide

Complete step-by-step guide for using the Hugging Face Image Classification Project.

Getting Started
Quick Start
Detailed Step-by-Step Guide
Using the Scripts
Common Workflows
Tips and Best Practices
Troubleshooting

Getting Started

Prerequisites

Python 3.8+ (Python 3.10 or newer recommended)
pip (Python package installer)
Git (optional, for version control)

System Requirements

RAM: At least 8GB (16GB recommended)
Storage: ~2GB for model files and dependencies
GPU: Optional but recommended for faster training (CPU works too)

Quick Start

For experienced users, here's the fast track:

# 1. Install dependencies
pip install -r requirements.txt

# 2. Create custom model
python model_custom.py

# 3. Add images to data/ subdirectories
#    - data/my_cat/your_images.jpg
#    - data/my_dog/your_images.jpg
#    - etc.

# 4. Train the model
python train.py --data_dir ./data --epochs 5

# 5. Test your images
python test.py --image your_image.jpg

That's it! For detailed explanations, continue reading below.

Project Structure

This project is organized into a top-level src/ package (modular code) plus small wrappers at the root (for convenience/backward compatibility).

Key files:

app.py                    # Hugging Face Spaces entry (loads models/checkpoint-final/)
main.py                   # Entry point: launches the Gradio web UI (same checkpoint path)
models/checkpoint-final/  # Default save/load location for fine-tuned weights (HF format)
archive/                  # Intermediate Trainer checkpoints + archived metrics (see archive/README.md)
src/api/inference.py      # Shared inference + overlay helpers
src/models/model_custom.py
src/models/train.py
src/web/app.py            # Gradio UI implementation
src/utils/download_images_loremflickr.py

Detailed Step-by-Step Guide

Step 1: Installation

1.1 Clone or Download the Project

If you have the project in a Git repository:

git clone <repository-url>
cd huggingface-image-project

Or simply download and extract the project folder.

1.2 Install Python Dependencies

Open a terminal/command prompt in the project directory and run:

pip install -r requirements.txt

This installs the following packages:

PyTorch (may take a few minutes)
Transformers
Pillow (image processing)
Accelerate (training acceleration)

Troubleshooting:

If you get permission errors, use: pip install --user -r requirements.txt
On macOS/Linux, you might need: pip3 install -r requirements.txt
If installation fails, ensure you have Python 3.8+ installed

1.3 Verify Installation

Verify everything is installed correctly:

python --version  # Should show Python 3.8+
python -c "import torch; import transformers; print('✓ All packages installed')"
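
The same check can be written as a short Python script that also reports versions. This is a minimal sketch: the package list is an assumption based on the dependencies named above (requirements.txt remains the authoritative list), and `installed_versions` and `python_ok` are hypothetical helper names, not part of the project.

```python
import sys
from importlib import metadata

# Assumed package names, taken from the dependencies listed above;
# requirements.txt is the authoritative list.
REQUIRED = ["torch", "transformers", "Pillow", "accelerate"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print(f"Python {sys.version.split()[0]}: {'ok' if python_ok() else 'too old'}")
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED points to a failed or partial `pip install`.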

Step 2: Prepare Your Custom Model

2.1 Understand What You're Creating

The model_custom.py script:

Downloads the base google/vit-base-patch16-224 model (first time only)
Modifies it from 1000 ImageNet classes to your 5 custom classes
Saves the custom model to ./custom_vit_model/

Your 5 classes are:

my_cat
my_dog
my_car
my_house
my_phone
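
For orientation, the three steps above could look roughly like the following with the Transformers library. This is a sketch under stated assumptions, not the contents of model_custom.py: the function names `build_label_maps` and `create_custom_model` are hypothetical, and the real script may load or save the model differently.

```python
MY_CLASSES = ["my_cat", "my_dog", "my_car", "my_house", "my_phone"]


def build_label_maps(classes):
    """Return the (id2label, label2id) dictionaries stored in the model config."""
    id2label = {i: name for i, name in enumerate(classes)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def create_custom_model(classes=MY_CLASSES, save_dir="./custom_vit_model"):
    """Load the base ViT and replace its 1000-class head with len(classes) outputs."""
    # Imported here so build_label_maps works without transformers installed.
    from transformers import ViTForImageClassification, ViTImageProcessor

    base = "google/vit-base-patch16-224"
    id2label, label2id = build_label_maps(classes)
    model = ViTForImageClassification.from_pretrained(
        base,
        num_labels=len(classes),
        id2label=id2label,
        label2id=label2id,
        ignore_mismatched_sizes=True,  # discard the old 1000-class head
    )
    processor = ViTImageProcessor.from_pretrained(base)
    model.save_pretrained(save_dir)      # writes config and weights
    processor.save_pretrained(save_dir)  # writes preprocessing settings
    return model
```

The new classification head starts with random weights, which is why the model must be trained before its predictions mean anything.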

2.2 Run the Model Customization Script

python model_custom.py

On the first run, the script will:

Download the base model (~330MB) - this may take a few minutes
Modify the classification head
Save to ./custom_vit_model/

Expected output:

Loading base model: google/vit-base-patch16-224
Modifying classification head from 1000 to 5 classes...
Saving custom model to ./custom_vit_model...
Custom model created successfully!
Classes: ['my_cat', 'my_dog', 'my_car', 'my_house', 'my_phone']

Note: This only needs to be run once. The custom model is saved and can be reused.

Step 3: Prepare Your Training Data

3.1 Understand the Required Structure

Your images must be organized in folders matching your class names exactly:

data/
  my_cat/
    image1.jpg
    image2.jpg
    ...
  my_dog/
    image1.jpg
    ...
  my_car/
    ...
  my_house/
    ...
  my_phone/
    ...

3.2 Organize Your Images

Create class folders (if not already created):

mkdir -p data/my_cat data/my_dog data/my_car data/my_house data/my_phone

Copy your images to the appropriate folders:

- Cat images → data/my_cat/
- Dog images → data/my_dog/
- Car images → data/my_car/
- House images → data/my_house/
- Phone images → data/my_phone/

Image Requirements:

- Formats: .jpg, .jpeg, .png, .bmp, .gif
- Recommended: At least 50-100 images per class
- Quality: Clear, well-lit images
- Variety: Different angles, backgrounds, lighting

3.3 Verify Your Data Structure

Check that your images are properly organized:

# On Linux/macOS (counts all supported formats, case-insensitively)
for dir in data/*/; do echo "$(basename "$dir"): $(find "$dir" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.bmp" -o -iname "*.gif" \) | wc -l) images"; done

Or manually check each folder contains images.
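
On Windows, or if you prefer not to use the shell, a small Python script can do the same count. This is a minimal sketch: `count_images` is a hypothetical helper, and it counts only files directly inside each class folder, using the extensions listed under Image Requirements.

```python
from pathlib import Path

# Extensions listed under "Image Requirements" above.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images(data_dir):
    """Return {class_folder_name: image_count} for each subfolder of data_dir."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling `count_images("./data")` from the project directory returns one entry per class folder; a count of 0 usually means the images are in the wrong place or have an unsupported extension.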

Step 4: Train the Model

4.1 Basic Training Command

python train.py --data_dir ./data --epochs 5 --batch_size 8

What this does:

Loads your custom model
Loads images from data/ directory
Splits data: 80% training, 20% validation
Trains for 5 epochs
Saves the trained model to ./models/checkpoint-final/
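
To make the 80/20 split concrete, here is one way such a split can be implemented. This is an illustrative sketch only: `split_dataset` is a hypothetical function, and the fixed seed and plain (non-stratified) shuffle are assumptions, since train.py may split the data differently.

```python
import random


def split_dataset(items, val_fraction=0.2, seed=42):
    """Shuffle items reproducibly and return (train, val) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


train, val = split_dataset(range(150))
print(len(train), len(val))  # 120 30
```

The validation images are never used to update the model, so accuracy on them is an estimate of how the model behaves on unseen photos.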

4.2 Training Parameters Explained

| Parameter | Default | Description | When to Change |
| --- | --- | --- | --- |
| `--data_dir` | `./data` | Directory with your images | If images are elsewhere |
| `--model_path` | `./custom_vit_model` | Path to custom model | If using a different model |
| `--output_dir` | `./models/checkpoint-final` | Where to save the trained model | To save to a different location |
| `--epochs` | `5` | Number of full passes over the training data | Increase for more training (10-20) |
| `--batch_size` | `8` | Images per batch | Reduce if out of memory (2-4) |
| `--learning_rate` | `2e-5` | Step size for weight updates | Usually fine as-is |
4.3 Example Training Commands

Quick training (small dataset):

python train.py --data_dir ./data --epochs 3 --batch_size 4

Thorough training (more data):

python train.py --data_dir ./data --epochs 10 --batch_size 16 --learning_rate 2e-5

Training with custom paths:

python train.py --data_dir ./my_images --output_dir ./my_trained_model --epochs 5

4.4 What to Expect During Training

Training output:

============================================================
Training Custom Vision Transformer
============================================================

Loading model from ./custom_vit_model...
Classes: ['my_cat', 'my_dog', 'my_car', 'my_house', 'my_phone']
Device: cpu

Loading dataset from ./data...
Dataset: 150 images
  my_cat: 30 images
  my_dog: 30 images
  ...
Train: 120, Val: 30

Starting training...
Epochs: 5, Batch size: 8, LR: 2e-05
------------------------------------------------------------
[Training progress bars...]
Evaluating...
Validation accuracy: 85.23%

Saving model to ./models/checkpoint-final...
============================================================
Training completed!
============================================================

Training time:

Small dataset (50 images): 1-2 minutes
Medium dataset (200 images): 5-10 minutes
Large dataset (1000+ images): 30+ minutes
With GPU: Much faster (3-5x speedup)
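
Training time grows roughly with the number of batches the model has to process. The arithmetic below shows how dataset size, batch size, and epochs combine; `training_steps` is a hypothetical helper, and the calculation assumes one optimizer step per batch with the final partial batch kept.

```python
import math


def training_steps(num_train_images, batch_size, epochs):
    """Return (steps_per_epoch, total_steps), counting a final partial batch."""
    steps_per_epoch = math.ceil(num_train_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# The sample run above: 120 training images, batch size 8, 5 epochs.
print(training_steps(120, 8, 5))  # (15, 75)
```

Halving the batch size doubles the number of steps but lowers memory use per step, which is why a smaller batch is the usual fix for out-of-memory errors.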

Step 5: Test Your Model

5.1 Test a Single Image

python test.py --image path/to/your/image.jpg

Example:

python test.py --image my_photo.jpg
python test.py --image data/my_cat/cat_test.jpg
python test.py --image /Users/yourname/Pictures/test_photo.jpg

Output:

============================================================
Testing Trained Model
============================================================

Loading model from ./models/checkpoint-final...
Classes: ['my_cat', 'my_dog', 'my_car', 'my_house', 'my_phone']

Loading image: my_photo.jpg
Running inference...

------------------------------------------------------------
PREDICTION RESULTS
------------------------------------------------------------
Image: my_photo.jpg

Predicted: my_cat
Confidence: 94.61%

All predictions:
  1. my_cat          94.61% ████████████████████████████
  2. my_phone         1.55% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
  3. my_house         1.34% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
  4. my_dog           1.31% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
  5. my_car           1.19% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
------------------------------------------------------------
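
The confidence percentages come from converting the model's raw output scores (logits) into probabilities that sum to 100%. The sketch below shows the standard softmax calculation in plain Python. The logits are made up for illustration, and `rank_predictions` is a hypothetical helper; test.py may compute and format its results differently.

```python
import math

CLASSES = ["my_cat", "my_dog", "my_car", "my_house", "my_phone"]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_predictions(logits, classes=CLASSES):
    """Return (label, probability) pairs sorted from most to least likely."""
    return sorted(zip(classes, softmax(logits)), key=lambda p: p[1], reverse=True)


# Made-up logits for illustration only, not output from a real model.
for label, prob in rank_predictions([4.0, 0.5, 0.0, -0.5, 1.0]):
    print(f"{label:<15} {prob:6.2%}")
```

Because the probabilities always sum to 100%, the model assigns a class even to images that belong to none of the five categories, so a high confidence score is not proof that the image is in scope.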

5.2 Test Multiple Images

Test all images in a directory:

python test.py --directory ./my_test_photos

Output:

Found 10 images
============================================================
1. photo1.jpg                    → my_cat          (94.61%)
2. photo2.jpg                    → my_dog          (87.23%)
3. photo3.jpg                    → my_car          (92.15%)
...

5.3 Use a Different Trained Model

If you've trained multiple models:

python test.py --image photo.jpg --model_path ./my_custom_trained_model

Step 6: Test the Web UI (Gradio)

Launch the Gradio app (loads the model from ./models/checkpoint-final):

python main.py

Open the URL shown in the terminal (typically http://127.0.0.1:7860) and upload an image to see:

The predicted label with confidence score
The image with a prediction overlay

Using the Scripts

model_custom.py

Purpose: Create a custom model with 5 classes instead of 1000.

Usage:

python model_custom.py

What it does:

Loads base model (downloads if first time)
Modifies classification head
Saves to ./custom_vit_model/

When to run: Once before training

Customization: Edit the my_classes list in the script to change class names.

train.py

Purpose: Train the custom model on your images.

Usage:

python train.py [options]

Options:

--data_dir DIR          Directory with class subdirectories (default: ./data)
--model_path PATH       Path to custom model (default: ./custom_vit_model)
--output_dir DIR        Output directory (default: ./models/checkpoint-final)
--epochs N              Number of epochs (default: 5)
--batch_size N          Batch size (default: 8)
--learning_rate FLOAT   Learning rate (default: 2e-5)

Examples:

# Basic
python train.py

# With custom data directory
python train.py --data_dir ./my_images

# More training
python train.py --epochs 10 --batch_size 16

# All options
python train.py --data_dir ./data --epochs 5 --batch_size 8 --learning_rate 2e-5

test.py

Purpose: Test the trained model on new images.

Usage:

python test.py --image IMAGE_PATH
python test.py --directory DIR_PATH

Options:

--image PATH            Path to single image file
--directory PATH        Directory containing images
--model_path PATH       Path to trained model (default: ./models/checkpoint-final)

Examples:

# Single image
python test.py --image photo.jpg

# Directory of images
python test.py --directory ./test_photos

# Different model
python test.py --image photo.jpg --model_path ./my_model

main.py (Web UI entry point)

Purpose: Launch the Gradio web interface for image classification.

Usage:

python main.py

Notes:

The app loads the trained model from ./models/checkpoint-final.
You can also run python app.py if you prefer the backward-compatible wrapper, but main.py is the recommended entry point.

Common Workflows

Workflow 1: First Time Setup

# 1. Install dependencies
pip install -r requirements.txt

# 2. Create custom model (downloads base model)
python model_custom.py

# 3. Add your images to data/ folders

# 4. Train
python train.py --data_dir ./data --epochs 5

# 5. Test
python test.py --image test_image.jpg

Workflow 2: Retrain with More Data

# 1. Add more images to data/ folders

# 2. Retrain (overwrites previous model)
python train.py --data_dir ./data --epochs 5

# 3. Test again
python test.py --image test_image.jpg

Workflow 3: Train Multiple Models

# Train model version 1
python train.py --output_dir ./model_v1 --epochs 5

# Train model version 2 (different epochs)
python train.py --output_dir ./model_v2 --epochs 10

# Test both
python test.py --image photo.jpg --model_path ./model_v1
python test.py --image photo.jpg --model_path ./model_v2
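
When comparing more than two models, the repeated commands can be scripted. This is a minimal sketch that simply calls test.py with the `--image` and `--model_path` flags documented in this guide; `build_test_command` and `compare_models` are hypothetical helpers, not part of the project.

```python
import subprocess
import sys


def build_test_command(image, model_path):
    """Build the test.py command line for one model, using the documented flags."""
    return [sys.executable, "test.py", "--image", image, "--model_path", model_path]


def compare_models(image, model_paths):
    """Run test.py once per model and stop on the first failure."""
    for path in model_paths:
        print(f"=== {path} ===")
        subprocess.run(build_test_command(image, path), check=True)
```

Run it from the project directory, for example `compare_models("photo.jpg", ["./model_v1", "./model_v2"])`.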

Workflow 4: Quick Testing Loop

# Test, train, test again
python test.py --image test.jpg
python train.py --epochs 3
python test.py --image test.jpg

Tips and Best Practices

Data Collection

✅ Do:

Collect 50-100+ images per class
Include variety (angles, lighting, backgrounds)
Use clear, high-quality images
Keep each class consistent (the same kind of object in every image)

❌ Don't:

Use too few images (<10 per class)
Use only near-identical images (no variety)
Include blurry or dark images
Mix different objects in same class

Training

✅ Do:

Start with 5 epochs, increase if needed
Monitor validation accuracy
Use batch_size 8-16 (reduce if out of memory)
Balance number of images per class

❌ Don't:

Train too many epochs (may overfit)
Use very large batch_size (out of memory)
Train with unbalanced classes
Skip validation
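
A quick way to check class balance is to compare the largest and smallest class sizes. The sketch below is illustrative: `imbalance_ratio` is a hypothetical helper, the counts are example values, and this guide does not define a specific ratio at which a dataset counts as unbalanced.

```python
def imbalance_ratio(counts):
    """Return largest class size divided by smallest; 1.0 means perfectly balanced."""
    sizes = list(counts.values())
    if not sizes or min(sizes) == 0:
        raise ValueError("every class needs at least one image")
    return max(sizes) / min(sizes)


# Example counts only: my_car has twice as many images as the other classes.
counts = {"my_cat": 30, "my_dog": 30, "my_car": 60}
print(imbalance_ratio(counts))  # 2.0
```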

Testing

✅ Do:

Test on images similar to training data
Test multiple images
Check confidence scores
Verify predictions make sense

❌ Don't:

Test only on images unlike anything in the training data
Trust single test result
Ignore low confidence scores
Expect perfect results with little data

Troubleshooting

Installation Issues

Problem: pip: command not found

Solution: Use pip3 instead, or run pip through Python: python -m pip install -r requirements.txt

Problem: Permission denied

Solution: Use pip install --user -r requirements.txt

Problem: Package installation fails

Solution: Update pip: pip install --upgrade pip

Model Creation Issues

Problem: model_custom.py fails

Check: Internet connection (needs to download model first time)
Check: Disk space (needs ~500MB)
Solution: Run the script again; the model download may take a few minutes

Problem: Model files not created

Check: Script completed without errors
Check: ./custom_vit_model/ directory exists
Solution: Run script again

Training Issues

Problem: No images found

Check: Images are in correct folders (data/my_cat/, etc.)
Check: Folder names match class names exactly
Check: Images have correct extensions (.jpg, .png, etc.)
Solution: Verify directory structure matches requirements
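
Folder-name mismatches (a typo, or different capitalization) are easy to miss by eye. The sketch below compares the folders under data/ with the five default class names; `check_class_folders` is a hypothetical helper, and the expected set must be updated if you changed the class list in model_custom.py.

```python
from pathlib import Path

# Default class names from this guide; adjust if you edited the class list.
EXPECTED_CLASSES = {"my_cat", "my_dog", "my_car", "my_house", "my_phone"}


def check_class_folders(data_dir, expected=EXPECTED_CLASSES):
    """Return (missing, unexpected) folder names under data_dir."""
    found = {p.name for p in Path(data_dir).iterdir() if p.is_dir()}
    return sorted(expected - found), sorted(found - expected)
```

Calling `check_class_folders("./data")` returns two lists: class folders that are missing, and folders whose names match no class. Both should be empty before training.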

Problem: Out of memory

Solution: Reduce batch_size: --batch_size 4 or --batch_size 2
Solution: Use smaller images or fewer images
Solution: Close other applications

Problem: Training is very slow

Solution: This is expected on CPU; allow 5-30 minutes depending on dataset size
Solution: Use GPU if available (automatic if CUDA available)
Solution: Reduce number of images or epochs

Problem: Low validation accuracy

Solution: Add more training images
Solution: Train for more epochs
Solution: Check image quality
Solution: Ensure balanced dataset

Testing Issues

Problem: Model not found

Check: You've run train.py successfully
Check: ./models/checkpoint-final/ directory exists
Solution: Train model first: python train.py

Problem: Image not found

Check: Image path is correct
Check: Image file exists
Solution: Use full path or relative path from project directory

Problem: Low confidence predictions

Solution: Add more training data
Solution: Test on images similar to training data
Solution: Retrain with more epochs

Problem: Wrong predictions

Solution: Add more training images
Solution: Check image quality
Solution: Ensure test images are similar to training images
Solution: Train for more epochs

General Issues

Problem: Python version error

Check: Python version: python --version
Solution: Use Python 3.8+ (install if needed)

Problem: Import errors

Solution: Reinstall dependencies: pip install -r requirements.txt
Solution: Check virtual environment is activated (if using)

Problem: Script not found

Check: You're in the project directory
Solution: Use full path or cd to project directory

Getting Help

Check Documentation

README.md - Main project documentation
COMPREHENSIVE_RESULTS.md - Detailed results and analysis

Common Solutions

Read error messages carefully - They often tell you what's wrong
Check file paths - Ensure paths are correct
Verify installation - Make sure all packages are installed
Check data structure - Ensure images are organized correctly

Next Steps

Add more training data for better accuracy
Experiment with different hyperparameters
Try different training epochs
Test on various images

Happy training! 🚀
