feels-dev/RokVision

🛡️ RoK Vision API

Next-Gen Cognitive OCR for Rise of Kingdoms

Key Features · Architecture · Getting Started · API Usage · Roadmap · Contributing


📖 Overview

RoK Vision is a high-performance Cognitive OCR API designed to transform Rise of Kingdoms screenshots into structured data. By combining Deep Learning (PaddleOCR) with a Topological C# Orchestrator, Vision understands the context of the screen, making it resolution-independent and extremely resilient to UI variations.


🚀 Key Features

  • 👤 Governor Profiles: Extracts ID, Name, Power, Kill Points, and Civilization from the profile screen with sub-second latency.
  • ⚔️ Battle Intelligence: Full analysis of PvP and PvE reports, including troop metrics, casualty rates, and boss identification.
  • 🎒 Inventory Intelligence: Reads complex inventory screens (Action Points & XP Books). Supports Multi-Screenshot Merging and uses Color Detection to distinguish items.
  • 🗺️ Kingdom Map Intelligence (Beta): Extracts all visible cities from a map screenshot using a Hybrid AI Engine (YOLO + OCR), resilient to screen resolution and UI variations.
  • 🛡️ Alliance Rally Intelligence: Analyzes war screens to extract the Rally Leader, the Target (Forts/Passes), and a detailed list of participants. Includes a Logical Inference Engine that deduces troop types from global rally statistics.
  • ✅ Standardized Output: All endpoints return a unified RokResponse structure with a complete Audit Log and detailed Extraction Evidence for every field.
  • 🔍 The Magnifier (Auto-Healing): Automatically re-scans low-confidence regions with specialized digital filters (White Isolation, Inverted Binary).
  • 🩺 Debug Mode: Add Debug: true to any request to receive granular Timings per step, Raw OCR Text, and Magnifier Attempt Logs in the response.
  • 🌐 Multicultural Core: Optimized for Latin alphabets (EN, PT, ES, FR, DE) with smart detection of unsupported characters.
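
To give a feel for the kind of filters The Magnifier applies, here is a minimal pure-Python sketch. It is not the project's implementation: the function names, the thresholds, and the reading of "White Isolation" as "keep only near-white pixels" are assumptions made for illustration.

```python
from typing import List, Tuple

Gray = List[List[int]]                  # 2D grid of 0..255 intensities
Rgb = List[List[Tuple[int, int, int]]]  # 2D grid of (r, g, b) pixels


def inverted_binary(gray: Gray, threshold: int = 128) -> Gray:
    """Hypothetical helper: pixels brighter than the threshold become
    black (0), everything else becomes white (255)."""
    return [[0 if px > threshold else 255 for px in row] for row in gray]


def white_isolation(rgb: Rgb, min_channel: int = 200) -> Gray:
    """Hypothetical helper: keep only near-white pixels, i.e. those whose
    darkest channel is still at least min_channel; all others go black."""
    return [[255 if min(px) >= min_channel else 0 for px in row] for row in rgb]
```

Filters like these raise the contrast between text and background before a region is sent back through OCR, which is one plausible way a low-confidence area could be recovered.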

🏁 Getting Started

The easiest way to run RoK Vision is using Docker. It sets up the Neural Network environment and the API Gateway automatically.

👉 Read the Installation Guide to get up and running in 5 minutes.


🏗️ Architecture

The solution follows a distributed architecture: the Eye (Python) handles the heavy AI computer vision, while the Brain (C#) manages the logical orchestration.

graph LR
    User["Client / Bot"] -->|"POST"| API["API Gateway (.NET 9)"]
    subgraph "The Brain (.NET 9)"
        API --> Orchestrator[Cognitive Orchestrator]
        Orchestrator --> Neurons[Specialized Neurons]
        Neurons --> Magnifier[The Magnifier]
    end
    subgraph "The Eye (Python)"
        Orchestrator -->|"gRPC/HTTP"| OCR[PaddleOCR Engine]
    end

🔌 API Usage

RoK Vision exposes a set of RESTful endpoints to analyze different game screens. Every response is wrapped in a standardized RokResponse<T> envelope that includes a summary with clean data, fields with extraction evidence, and an auditLog.

👉 View the Full API Reference for detailed request/response models and JSON examples.
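
As a rough illustration of consuming the envelope on the client side, the sketch below parses the three top-level keys named above (summary, fields, auditLog). The RokEnvelope class, the parse_envelope helper, and the sample payload are hypothetical; in particular, the inner shape of each field entry (value, confidence) is an assumption, so check the API Reference for the real models.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class RokEnvelope:
    """Client-side view of the response envelope (hypothetical type)."""
    summary: Dict[str, Any]   # clean, ready-to-use data
    fields: Dict[str, Any]    # per-field extraction evidence
    audit_log: List[Any]      # processing trail


def parse_envelope(raw: str) -> RokEnvelope:
    """Parse a JSON response body, tolerating missing top-level keys."""
    doc = json.loads(raw)
    return RokEnvelope(
        summary=doc.get("summary", {}),
        fields=doc.get("fields", {}),
        audit_log=doc.get("auditLog", []),
    )


# Invented sample payload; the inner field shape is an assumption.
SAMPLE = """
{
  "summary": {"power": 12345678},
  "fields": {"power": {"value": 12345678, "confidence": 0.98}},
  "auditLog": ["hypothetical entry"]
}
"""

envelope = parse_envelope(SAMPLE)
print(envelope.summary["power"])
```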

Endpoints Overview

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/governor/analyze | Extracts all stats from a governor profile screen. |
| POST | /api/reports/analyze | Analyzes a PvP or PvE battle report. |
| POST | /api/ap/analyze | Reads Action Point items from the inventory. Supports multi-image. |
| POST | /api/xp/analyze | Reads Tomes of Knowledge from the inventory. Supports multi-image. |
| POST | /api/map/analyze | (Beta) Extracts all visible cities from a kingdom map view. |
| POST | /api/rally/analyze | Extracts details from Alliance Rally screens (Header, Target, Participants). |
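
The routes above can be wrapped in a small client helper. The sketch below only builds the request and does not send it. The paths come from the table; the base URL, the screen keys, the helper names, and the choice of content type are assumptions, since the request body format is defined in the API Reference rather than here.

```python
import urllib.request

# Paths taken from the endpoint table; the dictionary keys are hypothetical.
ENDPOINTS = {
    "governor": "/api/governor/analyze",
    "reports": "/api/reports/analyze",
    "ap": "/api/ap/analyze",
    "xp": "/api/xp/analyze",
    "map": "/api/map/analyze",
    "rally": "/api/rally/analyze",
}


def analyze_url(base_url: str, screen: str) -> str:
    """Return the full analyze URL for a screen type; raises KeyError if unknown."""
    return base_url.rstrip("/") + ENDPOINTS[screen]


def build_request(base_url: str, screen: str, body: bytes,
                  content_type: str) -> urllib.request.Request:
    """Build (but do not send) a POST request. The caller supplies the body
    and content type in whatever encoding the API Reference specifies."""
    return urllib.request.Request(
        analyze_url(base_url, screen),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Assumed local deployment address, for illustration only.
req = build_request("http://localhost:5000", "governor", b"...", "application/octet-stream")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; the returned body can then be decoded as JSON.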

📸 Best Practices

To ensure >95% accuracy, follow the "Golden Screenshot" rules:

  1. Full Screen: Send original screenshots. Do not crop the image manually.
  2. No Overlays: Close the chat, notification bubbles, side menus, and any other overlays before capturing.
  3. Brightness: Use standard in-game brightness for optimal contrast.

Support the Project

If RoK Vision helps your alliance, consider buying me a coffee! ☕

  • Pix: 031c9e65-66a3-4611-822b-796e227e200a
  • Ko-fi: [link]

🤝 Contributing

See our CONTRIBUTING.md for details on how to help the project.

Pull requests are welcome! For major changes, please open an issue first.

📝 License

Distributed under the MIT License. See LICENSE for more information.
