🛡️ RoK Vision API

Next-Gen Cognitive OCR for Rise of Kingdoms

Key Features • Architecture • Getting Started • API Usage • Roadmap • Contributing

📖 Overview

RoK Vision is a high-performance Cognitive OCR API that turns Rise of Kingdoms screenshots into structured data. By combining deep learning (PaddleOCR) with a topological C# orchestrator, RoK Vision interprets the context of each screen, which makes it resolution-independent and resilient to UI variations.

🚀 Key Features

👤 Governor Profiles: Extracts ID, Name, Power, Kill Points, and Civilization from the profile screen with sub-second latency.
⚔️ Battle Intelligence: Full analysis of PvP and PvE reports, including troop metrics, casualty rates, and boss identification.
🎒 Inventory Intelligence: Reads complex inventory screens (Action Points & XP Books). Supports Multi-Screenshot Merging and uses Color Detection to distinguish items.
🗺️ Kingdom Map Intelligence (Beta): Extracts all visible cities from a map screenshot using a Hybrid AI Engine (YOLO + OCR), resilient to screen resolution and UI variations.
🛡️ Alliance Rally Intelligence: Analyzes war screens to extract Rally Leader, Target (Forts/Passes), and a detailed list of participants. Includes a Logical Inference Engine to deduce troop types based on global rally statistics.
✅ Standardized Output: All endpoints now return a unified RokResponse structure with a complete Audit Log and detailed Extraction Evidence for every field.
🔍 The Magnifier (Auto-Healing): Automatic regional re-scanning with specialized digital filters (White Isolation, Inverted Binary) for low-confidence areas.
🩺 Debug Mode: Add Debug: true to any request to receive granular Timings per step, Raw OCR Text, and Magnifier Attempt Logs in the response.
🌐 Multicultural Core: Optimized for Latin alphabets (EN, PT, ES, FR, DE) with smart detection of unsupported characters.
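
The Magnifier's filter names suggest simple pixel-level transforms. As a rough illustration only, the Python sketch below shows what an inverted binary threshold and a white isolation pass could look like on a tiny image held as nested lists. The function names, the threshold values, and the image representation are assumptions made for this example, not the project's actual implementation.

```python
from typing import List, Tuple

Gray = List[List[int]]                   # rows of 0-255 intensities
RGB = List[List[Tuple[int, int, int]]]   # rows of (r, g, b) pixels


def inverted_binary(gray: Gray, threshold: int = 128) -> Gray:
    """Map dark pixels (<= threshold) to 255 and brighter pixels to 0."""
    return [[255 if px <= threshold else 0 for px in row] for row in gray]


def white_isolation(rgb: RGB, min_channel: int = 200) -> Gray:
    """Keep near-white pixels (every channel >= min_channel) as 255, others as 0."""
    return [[255 if min(px) >= min_channel else 0 for px in row] for row in rgb]


if __name__ == "__main__":
    print(inverted_binary([[10, 200], [128, 129]]))
    print(white_isolation([[(255, 255, 255), (255, 40, 40)]]))
```

Both helpers return a single-channel mask, which is the kind of high-contrast input an OCR engine usually reads more reliably than the raw crop.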

🏁 Getting Started

The easiest way to run RoK Vision is using Docker. It sets up the Neural Network environment and the API Gateway automatically.

👉 Read the Installation Guide to get up and running in 5 minutes.

🏗️ Architecture

The solution follows a distributed architecture: the Eye (Python) handles the heavy AI computer vision, while the Brain (C#) manages the logical orchestration.

graph LR
    User["Client / Bot"] -->|"POST"| API["API Gateway (.NET 9)"]
    subgraph "The Brain (.NET 9)"
        API --> Orchestrator[Cognitive Orchestrator]
        Orchestrator --> Neurons[Specialized Neurons]
        Neurons --> Magnifier[The Magnifier]
    end
    subgraph "The Eye (Python)"
        Orchestrator -->|"gRPC/HTTP"| OCR[PaddleOCR Engine]
    end
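
To make the Orchestrator → Magnifier hand-off concrete, here is a minimal Python sketch of a low-confidence re-scan loop. It is written in Python for brevity even though the Brain is C#. The `OcrResult` type, the `ocr` callable signature, the filter names, and the 0.85 confidence cutoff are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class OcrResult:
    text: str
    confidence: float  # 0.0 to 1.0


# Hypothetical signature: (region name, optional filter name) -> OcrResult
OcrFn = Callable[[str, Optional[str]], OcrResult]


def read_with_magnifier(
    ocr: OcrFn,
    region: str,
    filters: Iterable[str] = ("white_isolation", "inverted_binary"),
    min_confidence: float = 0.85,
) -> OcrResult:
    """Read a region once, then retry with each filter while confidence stays low."""
    best = ocr(region, None)
    for name in filters:
        if best.confidence >= min_confidence:
            break  # good enough, stop re-scanning
        attempt = ocr(region, name)
        if attempt.confidence > best.confidence:
            best = attempt
    return best
```

The loop stops as soon as one attempt clears the cutoff, so a clean screenshot costs a single OCR call and only doubtful regions pay for extra passes.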

🔌 API Usage

RoK Vision exposes a set of RESTful endpoints to analyze different game screens. Every response is wrapped in a standardized RokResponse<T> envelope that includes a summary with clean data, fields with extraction evidence, and an auditLog.

👉 View the Full API Reference for detailed request/response models and JSON examples.
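
As a client-side illustration, the sketch below unpacks the three parts of the envelope named above (`summary`, `fields`, `auditLog`) in Python. The inner shape of each part is an assumption here, and the sample payload is invented for the example; the API Reference remains the authoritative source for the real models.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class RokResponse:
    summary: Dict[str, Any]   # clean data
    fields: Any               # per-field extraction evidence (shape assumed)
    audit_log: List[Any]      # audit entries (shape assumed)


def parse_rok_response(raw: str) -> RokResponse:
    """Split a JSON envelope into its three top-level parts."""
    body = json.loads(raw)
    return RokResponse(
        summary=body.get("summary", {}),
        fields=body.get("fields", {}),
        audit_log=body.get("auditLog", []),
    )


if __name__ == "__main__":
    # Hypothetical payload, for illustration only.
    sample = '{"summary": {"name": "Example"}, "fields": {}, "auditLog": []}'
    print(parse_rok_response(sample).summary["name"])
```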

Endpoints Overview

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/api/governor/analyze` | Extracts all stats from a governor profile screen. |
| `POST` | `/api/reports/analyze` | Analyzes a PvP or PvE battle report. |
| `POST` | `/api/ap/analyze` | Reads Action Point items from the inventory. Supports multi-image. |
| `POST` | `/api/xp/analyze` | Reads Tomes of Knowledge from the inventory. Supports multi-image. |
| `POST` | `/api/map/analyze` | (Beta) Extracts all visible cities from a kingdom map view. |
| `POST` | `/api/rally/analyze` | Extracts details from Alliance Rally screens (Header, Target, Participants). |
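
The endpoints above can be wrapped in a small client helper. The Python sketch below only builds the request and does not send it. It assumes a JSON request body, a caller-supplied payload whose image field follows the API Reference, and a base URL such as `http://localhost:8080`; none of these are confirmed here. The `Debug` flag is the one described under Key Features.

```python
import json
import urllib.request
from typing import Any, Dict

# Paths come from the endpoint table; the keys are labels chosen for this sketch.
ENDPOINTS = {
    "governor": "/api/governor/analyze",
    "reports": "/api/reports/analyze",
    "ap": "/api/ap/analyze",
    "xp": "/api/xp/analyze",
    "map": "/api/map/analyze",
    "rally": "/api/rally/analyze",
}


def build_request(
    base_url: str, screen: str, payload: Dict[str, Any], debug: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the given screen type."""
    body = dict(payload)  # copy so the caller's dict is not mutated
    if debug:
        body["Debug"] = True
    return urllib.request.Request(
        url=base_url.rstrip("/") + ENDPOINTS[screen],
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; keeping construction separate makes the helper easy to test without a running server.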

📸 Best Practices

To ensure >95% accuracy, follow the "Golden Screenshot" rules:

Full Screen: Send original screenshots. Do not crop the image manually.
No Overlays: Close the chat, notification bubbles, side menus, and any other overlays before capturing.
Brightness: Use standard in-game brightness for optimal contrast.

Support the Project

If RoK Vision helps your alliance, consider buying me a coffee! ☕

Pix: 031c9e65-66a3-4611-822b-796e227e200a
Ko-fi: [link]

🤝 Contributing

See our CONTRIBUTING.md for details on how to help the project.

Pull requests are welcome! For major changes, please open an issue first.

📝 License

Distributed under the MIT License. See LICENSE for more information.
