Simple Smart OCR

A powerful and easy-to-use OCR and QR code recognition desktop application built with Python and PySide6.

📸 Application Interface

💡 Project Background

In daily work, text recognition from screenshots is frequently needed. Initially, the following local solutions were attempted:

PaddleOCR Local Deployment: Baidu PaddlePaddle's OCR model
Hugging Face Models: Open-source OCR models downloaded from Hugging Face and run locally

However, practical application revealed:

⚠️ High Resource Consumption: Model execution requires significant memory and computational resources
⚠️ Large Package Size: The packaged application reaches hundreds of MB, or even several GB, once model files are bundled
⚠️ High Usage Barrier: Regular users need to configure CUDA, download models, and perform other complex operations

To make the tool more accessible and lightweight, an API-driven approach was ultimately chosen:

✅ Lightweight: Small application size, no need to download large model files
✅ Zero Configuration: Users only need to provide an API key to get started
✅ High Performance: Leveraging cloud computing power for fast and accurate recognition
✅ Free Quota: Baidu Cloud offers 1000 free calls per month, and Google Gemini offers 60 free calls per minute

This project focuses on providing an easy-to-use OCR tool, not a local model inference solution. If you need completely offline OCR capabilities, PaddleOCR is recommended.

🎯 Quick Understanding

graph LR
    A[Image Input] --> B{Select Recognition Method}
    B -->|OCR Text Recognition| C[Choose Engine]
    B -->|QR Code Recognition| D[Offline Decoder]

    C --> E[Baidu Cloud OCR API]
    C --> F[Google Gemini LLM]
    C --> G[Other LLM]

    E --> H[Recognition Result]
    F --> H
    G --> H
    D --> H

    H --> I[Auto Copy to Clipboard]

    style A fill:#e1f5ff
    style H fill:#c8e6c9
    style I fill:#fff9c4
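
The routing in the diagram above can be sketched in a few lines of Python. This is only an illustration, not the application's actual code: the function name `recognize`, the mode strings `"ocr"` and `"qr"`, and the callables passed in for the engines, the QR decoder, and the clipboard are all hypothetical placeholders.

```python
from typing import Callable, Mapping, Optional

# A recognizer takes raw image bytes and returns the recognized text.
Recognizer = Callable[[bytes], str]


def recognize(
    image: bytes,
    mode: str,
    *,
    ocr_engines: Mapping[str, Recognizer],
    qr_decoder: Recognizer,
    copy: Callable[[str], None],
    engine: Optional[str] = None,
) -> str:
    """Route an image to an OCR engine or the offline QR decoder (hypothetical API)."""
    if mode == "qr":
        # QR codes are decoded locally, so no engine or API key is involved.
        result = qr_decoder(image)
    elif mode == "ocr":
        if engine not in ocr_engines:
            raise ValueError(f"unknown OCR engine: {engine!r}")
        result = ocr_engines[engine](image)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Every path ends in the same step: the result is copied to the clipboard.
    copy(result)
    return result
```

Keeping the engines behind a plain mapping means another LLM backend could be added without touching the routing itself.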

✨ How to Use

Download and Run

Visit the Releases page
Download the latest version package (simple-smart-ocr-vX.X.X.zip)
Extract and run SimpleSmartOCR.exe

💡 Tip: QR code recognition works out of the box, no configuration needed!

Configure API Key (Required for OCR)

flowchart LR
    A[Start] --> B{Choose Engine}

    B -->|Baidu Cloud OCR| C[fa:fa-cloud Apply for Baidu API]
    B -->|Google Gemini| D[fa:fa-brain Apply for Gemini API]

    C --> E[1000 free calls/month]
    D --> F[60 free calls/minute]

    E --> G[Enter in Settings]
    F --> G

    G --> H[fa:fa-check Start Recognition]

    style A fill:#e3f2fd
    style E fill:#c8e6c9
    style F fill:#c8e6c9
    style H fill:#fff9c4
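
For readers curious what the Baidu Cloud credentials are used for, the sketch below builds the two HTTP requests involved in a basic OCR call and parses the response. It is an assumption about how such a call can be made, not a description of this application's internals: the helper names are hypothetical, the `general_basic` endpoint is one of several Baidu OCR endpoints and may not be the one the application uses, and Baidu issues an API Key together with a Secret Key. The requests are only constructed here, not sent.

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic"


def build_token_request(api_key: str, secret_key: str) -> urllib.request.Request:
    """Build the request that exchanges the key pair for an access token."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return urllib.request.Request(f"{TOKEN_URL}?{query}", method="POST")


def build_ocr_request(access_token: str, image_bytes: bytes) -> urllib.request.Request:
    """Build the OCR request: the image travels as a base64 form field."""
    body = urllib.parse.urlencode(
        {"image": base64.b64encode(image_bytes).decode("ascii")}
    ).encode("ascii")
    query = urllib.parse.urlencode({"access_token": access_token})
    return urllib.request.Request(
        f"{OCR_URL}?{query}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def extract_text(response_body: bytes) -> str:
    """Join the recognized lines from a JSON response, or raise on an API error."""
    payload = json.loads(response_body)
    if "error_code" in payload:
        raise RuntimeError(
            f"Baidu OCR error {payload['error_code']}: {payload.get('error_msg', '')}"
        )
    return "\n".join(item["words"] for item in payload.get("words_result", []))
```

Each request can be sent with `urllib.request.urlopen(request)`; the token response is JSON containing an `access_token` field, which is then passed to `build_ocr_request`.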

📂 Directory Monitoring

Features

After clicking Select Image Directory, the application automatically monitors file changes in that directory:

File List: Image files are sorted by creation date in descending order (newest first)
Image Preview: Single or double-click an image to view it in the preview area; click the preview to enlarge
Execute Recognition: After double-clicking to select an image, click the corresponding recognition button
- Text Recognition: Extract text content from images
- QR Code Recognition: Decode QR code information from images
Auto Monitoring: New images added to the directory automatically appear in the file list
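
The file list and auto-monitoring behavior described above can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions, not the application's implementation: the set of image suffixes, the function names, and the polling approach are all hypothetical. Note that `st_ctime` is the creation time on Windows but the metadata-change time on Linux, so the sketch prefers `st_birthtime` where the platform provides it.

```python
from pathlib import Path
from typing import Callable, List, Set

# Assumed set of image extensions; the application's real filter may differ.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".gif", ".webp"}


def creation_time(path: Path) -> float:
    """Best-effort creation timestamp for a file."""
    stat = path.stat()
    return getattr(stat, "st_birthtime", stat.st_ctime)


def list_images(
    directory: Path, timestamp: Callable[[Path], float] = creation_time
) -> List[Path]:
    """Return the image files in `directory`, newest first."""
    images = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return sorted(images, key=timestamp, reverse=True)


def new_images(directory: Path, seen: Set[Path]) -> List[Path]:
    """Return images not seen before and record them in `seen`."""
    added = [p for p in list_images(directory) if p not in seen]
    seen.update(added)
    return added
```

Calling `new_images` on a timer is the simplest way to pick up new files. In a PySide6 application, connecting a `QFileSystemWatcher`'s `directoryChanged` signal to a refresh function is a likely alternative that avoids polling.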

🤝 Contributing

Community contributions are welcome! You can participate by:

🐛 Submitting bug reports
💡 Proposing new features
🔧 Submitting code improvements
📖 Improving documentation
🌍 Adding new language translations
⭐ Starring the project to show support

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
