Complete step-by-step guide for using the Hugging Face Image Classification Project.
- Getting Started
- Quick Start
- Detailed Step-by-Step Guide
- Using the Scripts
- Common Workflows
- Tips and Best Practices
- Troubleshooting
- Python 3.8+ (Python 3.10 or newer recommended)
- pip (Python package installer)
- Git (optional, for version control)
- RAM: At least 8GB (16GB recommended)
- Storage: ~2GB for model files and dependencies
- GPU: Optional but recommended for faster training (CPU works too)
For experienced users, here's the fast track:
# 1. Install dependencies
pip install -r requirements.txt
# 2. Create custom model
python model_custom.py
# 3. Add images to data/ subdirectories
# - data/my_cat/your_images.jpg
# - data/my_dog/your_images.jpg
# - etc.
# 4. Train the model
python train.py --data_dir ./data --epochs 5
# 5. Test your images
python test.py --image your_image.jpg
That's it! For detailed explanations, continue reading below.
This project is organized into a top-level src/ package (modular code) plus small wrappers at the root (for convenience/backward compatibility).
Key files:
app.py # Hugging Face Spaces entry (loads models/checkpoint-final/)
main.py # Entry point: launches the Gradio web UI (same checkpoint path)
models/checkpoint-final/ # Default save/load location for fine-tuned weights (HF format)
archive/ # Intermediate Trainer checkpoints + archived metrics (see archive/README.md)
src/api/inference.py # Shared inference + overlay helpers
src/models/model_custom.py
src/models/train.py
src/web/app.py # Gradio UI implementation
src/utils/download_images_loremflickr.py
If you have the project in a Git repository:
git clone <repository-url>
cd huggingface-image-project
Or simply download and extract the project folder.
Open a terminal/command prompt in the project directory and run:
pip install -r requirements.txt
Expected output:
- PyTorch will be installed (may take a few minutes)
- Transformers library
- Pillow (image processing)
- Accelerate (training acceleration)
Troubleshooting:
- If you get permission errors, use:
  pip install --user -r requirements.txt
- On macOS/Linux, you might need:
  pip3 install -r requirements.txt
- If installation fails, ensure you have Python 3.8+ installed
Verify everything is installed correctly:
python --version # Should show Python 3.8+
python -c "import torch; import transformers; print('✓ All packages installed')"
The model_custom.py script:
- Downloads the base google/vit-base-patch16-224 model (first time only)
- Modifies it from 1000 ImageNet classes to your 5 custom classes
- Saves the custom model to ./custom_vit_model/
Your 5 classes are:
- my_cat
- my_dog
- my_car
- my_house
- my_phone
python model_custom.py
First run will:
- Download the base model (~330MB) - this may take a few minutes
- Modify the classification head
- Save to ./custom_vit_model/
Expected output:
Loading base model: google/vit-base-patch16-224
Modifying classification head from 1000 to 5 classes...
Saving custom model to ./custom_vit_model...
Custom model created successfully!
Classes: ['my_cat', 'my_dog', 'my_car', 'my_house', 'my_phone']
Note: This only needs to be run once. The custom model is saved and can be reused.
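For orientation, the head swap described above might look roughly like the sketch below. It assumes the script uses the transformers ViTForImageClassification class with its ignore_mismatched_sizes option. The helper names build_label_maps and create_custom_model are illustrative, and the real model_custom.py may be organized differently.

```python
from typing import Dict, List, Tuple

BASE_MODEL = "google/vit-base-patch16-224"
MY_CLASSES = ["my_cat", "my_dog", "my_car", "my_house", "my_phone"]


def build_label_maps(classes: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Return (id2label, label2id) in the order the classes are listed."""
    id2label = {i: name for i, name in enumerate(classes)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def create_custom_model(classes: List[str], save_dir: str = "./custom_vit_model"):
    """Load the base ViT, replace its 1000-class head, and save the result.

    Hypothetical helper: the project's own script may differ in detail.
    """
    # Imported lazily so the label helpers work without transformers installed.
    from transformers import ViTForImageClassification

    id2label, label2id = build_label_maps(classes)
    model = ViTForImageClassification.from_pretrained(
        BASE_MODEL,
        num_labels=len(classes),
        id2label=id2label,
        label2id=label2id,
        # The 1000-class head no longer matches, so it is re-created with
        # random weights; training is what makes it useful.
        ignore_mismatched_sizes=True,
    )
    model.save_pretrained(save_dir)
    return model
```

Calling create_custom_model(MY_CLASSES) downloads the base weights on first use. The new head starts untrained, which is why the training step that follows is required before predictions mean anything.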
Your images must be organized in folders matching your class names exactly:
data/
  my_cat/
    image1.jpg
    image2.jpg
    ...
  my_dog/
    image1.jpg
    ...
  my_car/
    ...
  my_house/
    ...
  my_phone/
    ...
- Create class folders (if not already created):
  mkdir -p data/my_cat data/my_dog data/my_car data/my_house data/my_phone
- Copy your images to the appropriate folders:
  - Cat images → data/my_cat/
  - Dog images → data/my_dog/
  - Car images → data/my_car/
  - House images → data/my_house/
  - Phone images → data/my_phone/
- Image Requirements:
  - Formats: .jpg, .jpeg, .png, .bmp, .gif
  - Recommended: At least 50-100 images per class
  - Quality: Clear, well-lit images
  - Variety: Different angles, backgrounds, lighting
Check that your images are properly organized:
# On Linux/macOS
for dir in data/*/; do echo "$(basename "$dir"): $(find "$dir" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.bmp" -o -iname "*.gif" \) | wc -l) images"; done
Or manually check that each folder contains images.
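If you are on Windows, or prefer Python over a shell loop, the same check can be sketched with pathlib. The function name count_images is illustrative and not part of the project. It counts every format listed in the image requirements, case-insensitively, and returns an empty result if the data directory does not exist.

```python
from pathlib import Path
from typing import Dict

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images(data_dir: str) -> Dict[str, int]:
    """Map each class folder name to the number of image files inside it."""
    root = Path(data_dir)
    if not root.is_dir():
        return {}
    counts = {}
    for class_dir in sorted(root.iterdir()):
        if not class_dir.is_dir():
            continue
        # Search recursively and ignore case, so "photo.JPG" is counted too.
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

For example, printing the items of count_images("./data") gives one line per class folder, which makes empty or misnamed folders easy to spot.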
python train.py --data_dir ./data --epochs 5 --batch_size 8
What this does:
- Loads your custom model
- Loads images from the data/ directory
- Splits data: 80% training, 20% validation
- Trains for 5 epochs
- Saves the trained model to ./models/checkpoint-final/
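To make the 80/20 split concrete, here is a minimal sketch of a reproducible shuffle-and-split. It is an assumption about how such a split can be done, not a copy of train.py: the function name, the fixed seed, and the plain (non-stratified) shuffle are all illustrative.

```python
import random
from typing import List, Sequence, Tuple


def split_train_val(
    items: Sequence[str], val_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[str], List[str]]:
    """Shuffle items reproducibly and split them into (train, val) lists."""
    shuffled = list(items)
    # A dedicated Random instance keeps the split stable across runs
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

With 150 images this yields 120 training and 30 validation items. A plain shuffle does not guarantee that every class appears in the validation set in proportion, which is one more reason to keep the classes balanced.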
| Parameter | Default | Description | When to Change |
|---|---|---|---|
| --data_dir | ./data | Directory with your images | If images are elsewhere |
| --model_path | ./custom_vit_model | Path to custom model | If using a different model |
| --output_dir | ./models/checkpoint-final | Where to save the trained model | To save to a different location |
| --epochs | 5 | Number of full passes over the training data | Increase for more training (10-20) |
| --batch_size | 8 | Images per batch | Reduce if out of memory (4 or 2) |
| --learning_rate | 2e-5 | Step size for weight updates | Usually fine as-is |
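The options above map naturally onto argparse. The sketch below mirrors the documented flags and defaults. It is an illustration of the interface, and the actual argument handling in train.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented train.py options and defaults."""
    parser = argparse.ArgumentParser(description="Train the custom ViT model")
    parser.add_argument("--data_dir", default="./data",
                        help="Directory with class subdirectories")
    parser.add_argument("--model_path", default="./custom_vit_model",
                        help="Path to the custom model")
    parser.add_argument("--output_dir", default="./models/checkpoint-final",
                        help="Where to save the trained model")
    parser.add_argument("--epochs", type=int, default=5,
                        help="Number of passes over the training data")
    parser.add_argument("--batch_size", type=int, default=8,
                        help="Images per batch")
    parser.add_argument("--learning_rate", type=float, default=2e-5,
                        help="Learning rate")
    return parser
```

Parsing an empty argument list returns the defaults from the table, and any flag given on the command line overrides only its own value.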
Quick training (small dataset):
python train.py --data_dir ./data --epochs 3 --batch_size 4
Thorough training (more data):
python train.py --data_dir ./data --epochs 10 --batch_size 16 --learning_rate 2e-5
Training with custom paths:
python train.py --data_dir ./my_images --output_dir ./my_trained_model --epochs 5
Training output:
============================================================
Training Custom Vision Transformer
============================================================
Loading model from ./custom_vit_model...
Classes: ['my_cat', 'my_dog', 'my_car', 'my_house', 'my_phone']
Device: cpu
Loading dataset from ./data...
Dataset: 150 images
my_cat: 30 images
my_dog: 30 images
...
Train: 120, Val: 30
Starting training...
Epochs: 5, Batch size: 8, LR: 2e-05
------------------------------------------------------------
[Training progress bars...]
Evaluating...
Validation accuracy: 85.23%
Saving model to ./models/checkpoint-final...
============================================================
Training completed!
============================================================
Training time:
- Small dataset (50 images): 1-2 minutes
- Medium dataset (200 images): 5-10 minutes
- Large dataset (1000+ images): 30+ minutes
- With GPU: Much faster (3-5x speedup)
python test.py --image path/to/your/image.jpg
Example:
python test.py --image my_photo.jpg
python test.py --image data/my_cat/cat_test.jpg
python test.py --image /Users/yourname/Pictures/test_photo.jpg
Output:
============================================================
Testing Trained Model
============================================================
Loading model from ./models/checkpoint-final...
Classes: ['my_cat', 'my_dog', 'my_car', 'my_house', 'my_phone']
Loading image: my_photo.jpg
Running inference...
------------------------------------------------------------
PREDICTION RESULTS
------------------------------------------------------------
Image: my_photo.jpg
Predicted: my_cat
Confidence: 94.61%
All predictions:
1. my_cat 94.61% ████████████████████████████
2. my_phone 1.55% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
3. my_house 1.34% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
4. my_dog 1.31% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
5. my_car 1.19% ░░░░░░░░░░░░░░░░░░░░░░░░░░░░
------------------------------------------------------------
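The confidence percentages in a listing like this are typically a softmax over the model's raw output scores (logits), sorted from highest to lowest. The sketch below shows that calculation in plain Python. The function names are illustrative, and test.py may compute the values with PyTorch instead.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_predictions(
    logits: Sequence[float], classes: Sequence[str]
) -> List[Tuple[str, float]]:
    """Return (class name, probability) pairs, most likely first."""
    probs = softmax(logits)
    return sorted(zip(classes, probs), key=lambda pair: pair[1], reverse=True)
```

The first entry of the ranked list is the predicted class. Multiplying each probability by 100 gives the percentage shown in the output.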
Test all images in a directory:
python test.py --directory ./my_test_photos
Output:
Found 10 images
============================================================
1. photo1.jpg → my_cat (94.61%)
2. photo2.jpg → my_dog (87.23%)
3. photo3.jpg → my_car (92.15%)
...
If you've trained multiple models:
python test.py --image photo.jpg --model_path ./my_custom_trained_model
Launch the Gradio app (loads the model from ./models/checkpoint-final):
python main.py
Open the URL shown in the terminal (typically http://127.0.0.1:7860) and upload an image to see:
- The predicted label with confidence score
- The image with a prediction overlay
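As a rough picture of how such a UI can be wired up, the sketch below wraps a classifier function in a minimal Gradio interface. It is a simplified assumption, not the project's src/web/app.py: the helper names are hypothetical, the classify callable stands in for the project's inference code, and the prediction overlay is left out.

```python
from typing import Callable, Dict, Sequence


def to_label_scores(classes: Sequence[str], probs: Sequence[float]) -> Dict[str, float]:
    """Build the {label: confidence} mapping that a gr.Label output displays."""
    return {name: float(p) for name, p in zip(classes, probs)}


def build_demo(classify: Callable):
    """Wrap classify(pil_image) -> {label: confidence} in a minimal Gradio UI.

    Hypothetical helper: classify is a placeholder for the real inference code.
    """
    import gradio as gr  # imported lazily; only needed when launching the UI

    return gr.Interface(
        fn=classify,
        inputs=gr.Image(type="pil"),
        outputs=gr.Label(num_top_classes=5),
        title="Image classification",
    )
```

Calling build_demo(my_classifier).launch() would start a local server and print its URL in the terminal, where my_classifier is any function returning the mapping produced by to_label_scores.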
Purpose: Create a custom model with 5 classes instead of 1000.
Usage:
python model_custom.py
What it does:
- Loads base model (downloads if first time)
- Modifies classification head
- Saves to ./custom_vit_model/
When to run: Once before training
Customization: Edit the my_classes list in the script to change class names.
Purpose: Train the custom model on your images.
Usage:
python train.py [options]
Options:
--data_dir DIR Directory with class subdirectories (default: ./data)
--model_path PATH Path to custom model (default: ./custom_vit_model)
--output_dir DIR Output directory (default: ./models/checkpoint-final)
--epochs N Number of epochs (default: 5)
--batch_size N Batch size (default: 8)
--learning_rate FLOAT Learning rate (default: 2e-5)
Examples:
# Basic
python train.py
# With custom data directory
python train.py --data_dir ./my_images
# More training
python train.py --epochs 10 --batch_size 16
# All options
python train.py --data_dir ./data --epochs 5 --batch_size 8 --learning_rate 2e-5
Purpose: Test the trained model on new images.
Usage:
python test.py --image IMAGE_PATH
python test.py --directory DIR_PATH
Options:
--image PATH Path to single image file
--directory PATH Directory containing images
--model_path PATH Path to trained model (default: ./models/checkpoint-final)
Examples:
# Single image
python test.py --image photo.jpg
# Directory of images
python test.py --directory ./test_photos
# Different model
python test.py --image photo.jpg --model_path ./my_model
Purpose: Launch the Gradio web interface for image classification.
Usage:
python main.py
Notes:
- The app loads the trained model from ./models/checkpoint-final.
- You can also run python app.py if you prefer the backward-compatible wrapper, but main.py is the recommended entry point.
# 1. Install dependencies
pip install -r requirements.txt
# 2. Create custom model (downloads base model)
python model_custom.py
# 3. Add your images to data/ folders
# 4. Train
python train.py --data_dir ./data --epochs 5
# 5. Test
python test.py --image test_image.jpg
# 1. Add more images to data/ folders
# 2. Retrain (overwrites previous model)
python train.py --data_dir ./data --epochs 5
# 3. Test again
python test.py --image test_image.jpg
# Train model version 1
python train.py --output_dir ./model_v1 --epochs 5
# Train model version 2 (different epochs)
python train.py --output_dir ./model_v2 --epochs 10
# Test both
python test.py --image photo.jpg --model_path ./model_v1
python test.py --image photo.jpg --model_path ./model_v2
# Test, train, test again
python test.py --image test.jpg
python train.py --epochs 3
python test.py --image test.jpg
✅ Do:
- Collect 50-100+ images per class
- Include variety (angles, lighting, backgrounds)
- Use clear, high-quality images
- Keep each class consistent (the same kind of object in every image)
❌ Don't:
- Use too few images (<10 per class)
- Use only near-identical images (no variety)
- Include blurry or dark images
- Mix different objects in same class
✅ Do:
- Start with 5 epochs, increase if needed
- Monitor validation accuracy
- Use batch_size 8-16 (reduce if out of memory)
- Balance number of images per class
❌ Don't:
- Train for too many epochs (may overfit)
- Use very large batch_size (out of memory)
- Train with unbalanced classes
- Skip validation
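The data and training tips above can be turned into a quick pre-training check. The sketch below takes per-class image counts and reports classes with too few images, plus an overall imbalance warning. The function name and the 2x imbalance ratio are assumptions chosen for illustration. The 10-image minimum comes from the tips.

```python
from typing import Dict, List


def dataset_warnings(
    counts: Dict[str, int], min_images: int = 10, max_ratio: float = 2.0
) -> List[str]:
    """Return human-readable warnings about small or unbalanced classes."""
    warnings = []
    for name, n in counts.items():
        if n < min_images:
            warnings.append(f"{name}: only {n} images (fewer than {min_images})")
    nonzero = [n for n in counts.values() if n > 0]
    # Flag the dataset when the largest class dwarfs the smallest one.
    if nonzero and max(nonzero) > max_ratio * min(nonzero):
        warnings.append(
            f"unbalanced classes: largest has {max(nonzero)} images, "
            f"smallest has {min(nonzero)}"
        )
    return warnings
```

An empty list means the counts pass both checks. It says nothing about image quality or variety, which still need a visual review.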
✅ Do:
- Test on images similar to training data
- Test multiple images
- Check confidence scores
- Verify predictions make sense
❌ Don't:
- Test on completely different images
- Trust a single test result
- Ignore low confidence scores
- Expect perfect results with little data
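When testing many images, it helps to pull out the low-confidence predictions for a closer look. The sketch below filters a list of results by a confidence threshold. The tuple layout and the 0.6 threshold are assumptions for illustration, so pick a value that suits your data.

```python
from typing import List, Tuple

# (file name, predicted label, confidence between 0 and 1)
Prediction = Tuple[str, str, float]


def low_confidence(
    predictions: List[Prediction], threshold: float = 0.6
) -> List[Prediction]:
    """Return the predictions whose confidence falls below the threshold."""
    return [p for p in predictions if p[2] < threshold]
```

Images that land in this list are good candidates for manual review, or for adding similar examples to the training data.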
Problem: pip: command not found
- Solution: Use pip3 instead, or install pip
Problem: Permission denied
- Solution: Use pip install --user -r requirements.txt
Problem: Package installation fails
- Solution: Update pip: pip install --upgrade pip
Problem: model_custom.py fails
- Check: Internet connection (needs to download model first time)
- Check: Disk space (needs ~500MB)
- Solution: Run again, model download may take time
Problem: Model files not created
- Check: Script completed without errors
- Check: ./custom_vit_model/ directory exists
- Solution: Run script again
Problem: No images found
- Check: Images are in correct folders (data/my_cat/, etc.)
- Check: Folder names match class names exactly
- Check: Images have correct extensions (.jpg, .png, etc.)
- Solution: Verify directory structure matches requirements
Problem: Out of memory
- Solution: Reduce batch_size: --batch_size 4 or --batch_size 2
- Solution: Use smaller images or fewer images
- Solution: Close other applications
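The "reduce batch_size" advice can also be automated. The sketch below retries a training function with half the batch size each time it runs out of memory. It is a hypothetical wrapper, not something train.py provides: train stands for any callable that takes a batch size, and the check relies on the error message containing "out of memory", as PyTorch's CUDA errors do.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_smaller_batches(
    train: Callable[[int], T], batch_size: int = 8, min_batch_size: int = 1
) -> T:
    """Call train(batch_size), halving the batch size after each OOM error."""
    while True:
        try:
            return train(batch_size)
        except (MemoryError, RuntimeError) as err:
            is_oom = isinstance(err, MemoryError) or "out of memory" in str(err).lower()
            # Re-raise unrelated errors, and give up once the minimum is reached.
            if not is_oom or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)
            print(f"Out of memory; retrying with batch_size={batch_size}")
```

Starting from 8, the wrapper tries 8, 4, 2, and 1 before giving up. Errors unrelated to memory are passed through unchanged.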
Problem: Training is very slow
- Solution: Normal on CPU, expect 5-30 minutes
- Solution: Use GPU if available (automatic if CUDA available)
- Solution: Reduce number of images or epochs
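To see whether training will run on a GPU, you can ask PyTorch directly. The helper below is a small sketch with an illustrative name. It falls back to "cpu" when PyTorch is not installed or no CUDA device is visible.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

If this returns "cpu" on a machine with an NVIDIA GPU, the installed PyTorch build probably lacks CUDA support or the driver is not set up.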
Problem: Low validation accuracy
- Solution: Add more training images
- Solution: Train for more epochs
- Solution: Check image quality
- Solution: Ensure balanced dataset
Problem: Model not found
- Check: You've run train.py successfully
- Check: ./models/checkpoint-final/ directory exists
- Solution: Train model first: python train.py
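A quick way to confirm the checkpoint is usable is to look for the files a Hugging Face save normally writes. The sketch below assumes a single-file (unsharded) checkpoint with config.json plus either model.safetensors or pytorch_model.bin. The function name is illustrative.

```python
from pathlib import Path

WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def checkpoint_ready(model_dir: str = "./models/checkpoint-final") -> bool:
    """Return True if model_dir holds a config file and a weights file."""
    root = Path(model_dir)
    has_config = (root / "config.json").is_file()
    has_weights = any((root / name).is_file() for name in WEIGHT_FILES)
    return has_config and has_weights
```

If this returns False after training, check the training output for errors and confirm the path given to --output_dir.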
Problem: Image not found
- Check: Image path is correct
- Check: Image file exists
- Solution: Use full path or relative path from project directory
Problem: Low confidence predictions
- Solution: Add more training data
- Solution: Test on images similar to training data
- Solution: Retrain with more epochs
Problem: Wrong predictions
- Solution: Add more training images
- Solution: Check image quality
- Solution: Ensure test images are similar to training images
- Solution: Train for more epochs
Problem: Python version error
- Check: Python version: python --version
- Solution: Use Python 3.8+ (install if needed)
Problem: Import errors
- Solution: Reinstall dependencies: pip install -r requirements.txt
- Solution: Check virtual environment is activated (if using)
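Both environment checks can be combined into one short script. The sketch below verifies the Python version and reports which packages cannot be found, without importing them. The package list is an assumption based on the dependencies named in this guide (Pillow is imported as PIL), and the function names are illustrative.

```python
import importlib.util
import sys
from typing import Iterable, List, Tuple

REQUIRED = ("torch", "transformers", "PIL", "accelerate")


def python_ok(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the top-level packages that are not importable here."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Run it with the same interpreter you use for training. If missing_packages() lists anything inside an activated virtual environment, reinstalling the requirements there is the usual fix.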
Problem: Script not found
- Check: You're in the project directory
- Solution: Use full path or cd to project directory
- README.md - Main project documentation
- COMPREHENSIVE_RESULTS.md - Detailed results and analysis
- Read error messages carefully - They often tell you what's wrong
- Check file paths - Ensure paths are correct
- Verify installation - Make sure all packages are installed
- Check data structure - Ensure images are organized correctly
- Add more training data for better accuracy
- Experiment with different hyperparameters
- Try different training epochs
- Test on various images
Happy training! 🚀