mudmini009/FRA461-Thesis-AI-DM

🏰 Agentic-DualPath-Core

A Multi-Agent Hybrid TTRPG Engine | FIBO Senior Thesis (2026)


FIBO PROGRESS 2 SHOWCASE > Next-Gen Agentic Orchestration for Tabletop RPGs

This project features a Multi-Agent Hybrid Architecture designed to solve the "AI Hallucination" problem in TTRPGs. By implementing a Two-Path Orchestrator, the system autonomously routes player intent between a Deterministic Python Rules Engine (for mechanical precision) and an LLM-based Creative Arbiter (for improvisational logic). Through a rigorous Multi-Agent Handshake, the engine ensures that all game state mutations are grounded in hard-coded logic while maintaining the narrative flexibility of Large Language Models.


📊 Core Features

  • ⚔️ Two-Path Architecture: Automatically routes player intent to either the Rules Engine (for standard actions) or the LLM Arbiter (for creative improv).
  • 🎲 Stateless Rules Engine: Deterministic Python logic for initiative, dice rolling, range checks, and damage calculation.
  • 🤖 LLM Arbiter: A "Referee" AI that judges creative actions, assigns Difficulty Classes (DC), and applies symbolic status effects (e.g., STUNNED, RESTRAINED).
  • 🏃 Tactical Zone Combat: Grid-less tactical movement using NEAR, MID, and FAR zones with range-based disadvantage.
  • 🏃 Initiative Queue: A dynamic turn-order system where every character (Player & Enemy) rolls initiative at the start of combat.
  • 🧠 Contextual Short-Term Memory: Utilizes an efficient $\mathcal{O}(1)$ collections.deque sliding window to inject recent gameplay events (max 10) directly into the Arbiter and Narrator LLM prompts, ensuring contextual continuity without wasting API tokens on the stateless routing layer.
  • 📜 Continuous Campaign Record: Background process that permanently logs an irreversible, real-time transcript of player inputs, DM generations, and hidden internal Python math [SYSTEM] checkpoints to a .txt file for future RAG summarization models.
  • ⚡ Dynamic Sequence Actions: Supports complex commands like "I shoot then run away" or "I run in then attack", executing them in the order specified by the user.
  • 🎒 Inventory Engine: Auto-looting, dynamic disposable items, and rigorous LLM-categorized consumable mechanics.
  • 🗣️ Narrative State Transitions: Talk your way out of fights with Diplomacy (PACIFIED state), or use tactical math-based fleeing mechanics.
  • 💡 QoL Features: "Idiot-proof" automated API key wizard and developer debug toggles to expose underlying AI processing.
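The short-term memory window mentioned in the feature list can be illustrated with a minimal sketch. The helper names (`remember`, `build_context_block`) and the prompt layout are assumptions made for illustration, not the engine's actual API; only `collections.deque` with a maximum of 10 events comes from the description above.

```python
from collections import deque

# Hypothetical constant; mirrors the "max 10" window described in the feature list.
MAX_HISTORY_EVENTS = 10

# A bounded deque drops its oldest entry automatically when a new one is appended.
recent_events = deque(maxlen=MAX_HISTORY_EVENTS)


def remember(event: str) -> None:
    """Append an event; eviction of the oldest entry is O(1) once the window is full."""
    recent_events.append(event)


def build_context_block() -> str:
    """Render the window as a text block that could be prepended to an LLM prompt."""
    return "\n".join(f"- {event}" for event in recent_events)


# Simulate 12 turns: only the last 10 survive in the window.
for turn in range(1, 13):
    remember(f"Turn {turn}: something happened")

print(build_context_block())
```

Because the window is bounded, prompt size stays constant no matter how long the session runs.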

📜 Game Rules

For the complete mechanical breakdown of the FIBO Lite 5th Edition system, see the official rulebook, LITE_5E_RULES.md, in the repository root.


🛠️ How to Run

Prerequisites:

  • Python 3.10+
  • Gemini API Key (set in .env as GEMINI_API_KEY)

Launch the Demo:

# Initialize environment (Example using Conda)
conda activate ai_dm_core

# Install dependencies
pip install -r requirements.txt

# Run the dedicated launcher
python demo_day.py

🚀 Quick Start / Installation

  1. Clone the Repository

    git clone [repository_url]
    cd AI_Dungeon_Master
  2. Install Dependencies

    pip install -r requirements.txt
  3. Get a Free API Key

  4. Setup the Environment (Pick ONE method)

    • The Automatic Way: Just run the game! (python main.py). The system will detect that you are missing a key, pause the game, and prompt you to paste it in the terminal. It will then automatically create the .env file for you.
    • The Manual Way: Create a new file named .env in the root folder of this project and paste your key inside like this:
      GEMINI_API_KEY=your_key_here_xyz123
  5. Configure Settings (Optional)

    • You can tweak the engine's behavior without touching Python code!
    • Open data/config/settings.json to configure:
      • Memory Queue Size (max_history_events)
      • Default Difficulty Classes (default_dc)
      • AI Creativity (arbiter_temperature, narrator_temperature)
      • Target Fuzzy Matching strictness (fuzzy_match_cutoff).
    • Note: If you delete this file, the engine will safely regenerate it with default values.
  6. Run the Game

    python main.py
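The "safely regenerate" behavior from step 5 could look roughly like the sketch below. This is an illustration under assumptions: `load_settings` is a hypothetical helper, and the default values are placeholders, not the engine's real defaults. Only the key names are taken from the list in step 5.

```python
import json
from pathlib import Path

# Placeholder defaults for illustration; the values shipped with the engine may differ.
DEFAULT_SETTINGS = {
    "max_history_events": 10,
    "default_dc": 12,
    "arbiter_temperature": 0.2,
    "narrator_temperature": 0.8,
    "fuzzy_match_cutoff": 0.6,
}


def load_settings(path: Path) -> dict:
    """Load settings, rewriting the file with defaults if it is missing or unreadable."""
    try:
        user_settings = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(user_settings, dict):
            raise ValueError("settings root must be a JSON object")
    except (FileNotFoundError, ValueError):
        # json.JSONDecodeError is a ValueError subclass, so corrupt files land here too.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(DEFAULT_SETTINGS, indent=2), encoding="utf-8")
        return dict(DEFAULT_SETTINGS)
    # Keys the user omitted fall back to defaults; user-provided keys win.
    return {**DEFAULT_SETTINGS, **user_settings}
```

Merging user values over the defaults means a partially edited file still yields a complete configuration.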

📂 Project Structure

AI_Dungeon_Master/
├── docs/                    # [THESIS] Final 2026 graduation thesis reports (WIP)
├── archive/                 # [HISTORY] Past iterations and research
│   ├── phase2_demo/         # Old FIBO lab scripts and demo JSONs
│   ├── references/          # Academic research papers and references
│   └── old_docs/            # Past presentations and progress reports
├── main.py                  # [ENTRY] Full game entry point
├── LITE_5E_RULES.md         # [RULES] The formal "Lite 5e" rulebook for the AI and Player
├── ARCHITECTURE.md          # [DOCS] High-level system design overview
├── requirements.txt         # [DEPS] Project dependencies
├── src/
│   ├── agents/              # [AGENTS] Specialized LLM agents (post-refactor)
│   │   ├── base.py          # BaseLLMProvider – shared API setup & model init
│   │   ├── arbiter_agent.py # ArbiterAgent – action validation & item categorization
│   │   ├── narrator_agent.py # NarratorAgent – combat narration & outcome description
│   │   ├── campaign_agent.py # CampaignAgent – recap & cold-open prologue generation
│   │   └── character_agent.py # CharacterAgent – Zero-Hallucination character & world lore
│   ├── engine/              # [ORCHESTRATOR] Pre-game flow & main combat loop
│   │   ├── startup.py       # Pre-game flow: character creation, lore, prologue
│   │   └── game_loop.py     # Handles turn queue and execution flow
│   ├── ui/                  # [CLI] Presentation layer
│   │   ├── character_sheet.py # Character Sheet & World Lore TUI renderers
│   │   ├── dashboard.py     # Renders HP, ASCII targets, and zones
│   │   └── menu.py          # Main menu, recap menu
│   ├── router/              # [THE BRAIN] Intent Classification & Action Logic
│   │   ├── intent_router.py # Two-path router (FIXED vs CREATIVE)
│   │   └── intents.py       # Action execution handlers (MOVE/ATTACK/USE)
│   ├── logic/               # [CALCULATOR] Pure Python Mechanics
│   │   ├── rules_engine.py  # Dice, DC checks, damage math
│   │   ├── combat_manager.py # Initiative queue
│   │   ├── enemy_ai.py      # Enemy turn logic
│   │   ├── dice_roller.py   # Dice rolling utilities
│   │   └── abilities.py     # Ability definitions
│   ├── models/              # [STATE] Single Source of Truth
│   │   ├── character.py     # Character dataclass (stats, lore, conditions)
│   │   ├── game_state.py    # Global state container
│   │   └── toon_converter.py # TOON serializer for minimal token usage
│   └── services/            # [IO] External APIs & Persistence
│       ├── llm_service.py   # Backward-compatible façade over src/agents/
│       ├── data_manager.py  # JSON save/load system
│       └── rag_service.py   # RAG/Context preparation
├── data/
│   ├── active/              # Live session data (written during gameplay)
│   │   ├── campaign_active.json  # Current save state
│   │   ├── campaign_log.txt      # Continuous transcript
│   │   └── world_lore.txt        # Active world context for the Narrator
│   ├── config/              # Engine configuration (edited by user)
│   │   ├── settings.json         # Editable engine parameters
│   │   ├── settings_backup.json  # Safe default settings fallback
│   │   └── bestiary.json         # Enemy stat templates
│   └── premade/             # Hand-crafted selection templates
│       ├── characters/      # Premade class JSON files (fighter, mage, rogue…)
│       └── lore/            # Premade world lore .txt files
├── archive/progress_2/      # Deprecated files from pre-Phase-3 (kept for history)
├── evaluation/              # [QA] Evaluation & regression suite
│   └── combat/
│       ├── evaluation_runner.py  # 50-scenario regression runner
│       ├── scenario_suite.json   # Structured test scenarios
│       └── results/              # Auto-generated trace logs and metrics CSV
└── tests/                   # [QA] Unit tests
    ├── test_rules.py        # RulesEngine pytest coverage
    ├── test_persistence.py  # DataManager save/load parity
    └── test_*.py            # Other scenario and module tests

🧩 Architectural Philosophy

The system is built as a "Stateless Symbolic Machine" to ensure 100% mechanical consistency.

  1. Rule-First Decisioning: If an action matches a standard game mechanic (Attack, Move, Item), the AI is bypassed for the calculation. The Python engine handles the math.
  2. Symbolic Grounding: When the AI allows a creative action (e.g., "I pull the rug"), it must return a Symbolic Side Effect (e.g., target_condition: PRONE). The Python engine then applies this to the live model.
  3. TOON Serialization: Uses a custom compact format for game state to reduce LLM token usage by up to 50%, ensuring faster response times and lower costs.
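The Symbolic Grounding principle can be sketched as follows. The `Condition` members, the `Character` fields, and the `apply_side_effect` helper are simplified stand-ins assumed for illustration; the real models in src/models/ are likely richer. The point shown is that the LLM only names a symbol, and Python decides whether and how state changes.

```python
from dataclasses import dataclass
from enum import Enum


class Condition(Enum):
    # Illustrative subset of conditions named in this README.
    NORMAL = "NORMAL"
    STUNNED = "STUNNED"
    BLINDED = "BLINDED"
    PRONE = "PRONE"
    RESTRAINED = "RESTRAINED"


@dataclass
class Character:
    name: str
    condition: Condition = Condition.NORMAL


def apply_side_effect(target: Character, verdict: dict) -> bool:
    """Apply the arbiter's symbolic side effect only if it names a known condition."""
    raw = str(verdict.get("target_condition", "")).upper()
    if raw not in Condition.__members__:
        return False  # unknown symbol: the live model is left untouched
    target.condition = Condition[raw]
    return True


goblin = Character("Goblin")
# Hypothetical arbiter verdict for "I pull the rug".
apply_side_effect(goblin, {"allowed": True, "dc": 13, "target_condition": "PRONE"})
print(goblin.condition)
```

Validating against a closed enum is what keeps an invented status from ever reaching the game state.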

☑️ Development Progress

💾 Data & State Management

  • External JSON State Persistence: Game state (Party & Enemies) is loaded and saved dynamically via data/campaign.json using the DataManager, avoiding hardcoded stats.
  • Bidirectional TOON Serialization (Token-Oriented Object Notation): A custom serialization pipeline (TOONConverter) drastically reduces API overhead. State is compressed into TOON before sending to the Arbiter. Furthermore, all LLM API outputs (including Intent Routers and Arbiter logic) are explicitly forced via system prompts to return 1-line TOON (key:value|key:value), completely eliminating verbose JSON outputs. This achieves ~50% total token reduction and lower latency.
  • Decoupled Health vs. Tactical Conditions: The Character model strictly separates Health Status (e.g., Unscathed, Bloodied, Critical) from Tactical Conditions (Enum: NORMAL, STUNNED, BLINDED), ensuring narrative damage doesn't overwrite mechanical penalties.
  • Dual-Stream Memory & Context Collapse (State-Dependent Pruning): Upgraded the sliding window into a two-tier system: combat_memory (maxlen=10) and story_memory (maxlen=5), both fully configurable via settings.json. Implemented an automatic interception hook on VICTORY or FLEE states that feeds the raw combat logs to an LLM Summarizer. The AI compresses the mathematical battle into a single narrative sentence, pushes it to story_memory, and flushes the combat queue, preventing token bloat.
  • Static Lore Injection (Static RAG): Implemented a fail-safe retrieval system that loads world-building context from data/world_lore.txt. This allows for instant "World Flavor" shifts without code changes. The engine handles missing lore files with a hardcoded fallback.
  • Robust Backend Configuration (settings.json): Abstracted hardcoded variables into an open-source-friendly JSON config file. Exposes core engine parameters and LLM API settings. Built a bulletproof boot sequence in DataManager with Python fallbacks to repair missing config files.
  • Persistent Campaign Journal (System Log): Implemented an append-only logging system that permanently records every player input and AI output in real-time to a local file. Includes an auto-reset mechanism triggered during new game boots, creating a parseable script for future save/load summarization.
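To make the one-line TOON format (key:value|key:value) concrete, here is a minimal round-trip sketch. The functions `to_toon` and `from_toon` are hypothetical and handle only flat dictionaries of scalars; the project's TOONConverter may deal with nesting, escaping, and typing differently.

```python
import json


def to_toon(fields: dict) -> str:
    """Flatten a dict of scalars into one 'key:value|key:value' line."""
    return "|".join(f"{key}:{value}" for key, value in fields.items())


def from_toon(line: str) -> dict:
    """Parse a one-line TOON string back into a dict of strings."""
    result = {}
    for pair in line.strip().split("|"):
        if not pair:
            continue  # tolerate stray separators in LLM output
        key, _, value = pair.partition(":")
        result[key.strip()] = value.strip()
    return result


state = {"id": "gob1", "hp": 7, "zone": "NEAR", "cond": "NORMAL"}
line = to_toon(state)
print(line)  # id:gob1|hp:7|zone:NEAR|cond:NORMAL
print(len(line), "characters as TOON vs", len(json.dumps(state)), "as JSON")
```

Note that this sketch returns every value as a string, so a caller would still need to convert numeric fields such as hp.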

🧠 The Intent Router (Two-Path Architecture)

  • Path A (Fixed Rules Routing): Standard RPG mechanics (Attack, Move) bypass the LLM for calculation, sending the action directly to the Python RulesEngine to mathematically guarantee zero AI hallucinations.
  • Path B (Creative Improv Routing): Complex user prompts are intelligently routed to the LLMService (Arbiter), which judges logical feasibility, assigns a DC (Difficulty Class), and automatically outputs a Symbolic Side Effect (e.g., BLINDED).
  • Dynamic Action Sequencing (FIXED_COMBO): The LLM parses multi-step user intents and extracts an action_order array. The game engine dynamically executes the sequence exactly as the user typed it.
  • Action Fairness & Multi-Agent Economy Guard: The engine strictly enforces a 5e-style action economy by tracking has_acted and has_moved flags on the Character model, refreshed via reset_turn(). If an Arbiter denies a creative request, the turn is refunded. However, invalid mechanical requests (e.g., double-attacks, out-of-range moves) are deterministically denied and the turn consumed, ensuring the AI cannot be exploited or cheat.
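The action-economy guard can be sketched as below. The has_acted and has_moved flags and reset_turn() come from the description above; the `try_attack` and `try_creative` functions and their return strings are assumptions for illustration, not the engine's handlers.

```python
from dataclasses import dataclass


@dataclass
class Character:
    name: str
    has_acted: bool = False
    has_moved: bool = False

    def reset_turn(self) -> None:
        """Refresh the per-turn flags at the start of this character's turn."""
        self.has_acted = False
        self.has_moved = False


def try_attack(actor: Character) -> str:
    """Fixed-path action: a second attack in one turn is denied deterministically."""
    if actor.has_acted:
        return "DENIED: action already used this turn"
    actor.has_acted = True
    return "OK: attack resolved by the rules engine"


def try_creative(actor: Character, arbiter_allows: bool) -> str:
    """Creative-path action: a denial by the arbiter does not spend the action."""
    if actor.has_acted:
        return "DENIED: action already used this turn"
    if not arbiter_allows:
        return "REFUNDED: arbiter rejected the idea, action not spent"
    actor.has_acted = True
    return "OK: creative action resolved"
```

In this sketch the refund is implicit: the flag is simply never set when the arbiter says no, so the player can try something else.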

⚔️ Combat & Rules Engine

  • Stateless Rules Engine (RulesEngine): Pure Python math logic handles all 1d20 dice rolls, AC (Armor Class) checks, Critical Hit doubling logic, and Stat modifiers (PHYS/MENT/SOC). [Newly Expanded] Integrated resolve_spell which utilizes the MENT stat and a native 1d10 damage system, and resolve_item which wraps consumable logic into standardized dictionary outputs for perfect evaluation tracing.
  • Individual Rolling Initiative Queue: Upgraded from legacy "Side vs. Side" turns to a granular, individual turn order. Every combatant rolls 1d20 + PHYS at the start of combat. The loop acts seamlessly, prompting players, triggering EnemyAI, bypassing DEAD characters, and incrementing rounds.
  • 3-Tier Intelligent Target Selection: LLM ID Extraction identifies the exact hidden ID (Tier 1). Fuzzy Spell Matching utilizes difflib.get_close_matches to catch typos (Tier 2). Auto-Fallback defaults to the first active enemy to prevent wasted inputs (Tier 3).
  • Enemy AI Tactics (EnemyAI): A lightweight AI that targets the nearest valid opponent and executes a single turn, narrating the sequence automatically on its turn.
  • Mechanical Status Effects: Conditions have actual engine consequences. STUNNED characters forfeit their turn, while BLINDED triggers disadvantage mechanics inside the RulesEngine.
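The three-tier target selection could be approximated as in the sketch below. `select_target` and the shape of the `enemies` mapping are hypothetical; `difflib.get_close_matches` is the standard-library call named above, and the 0.6 cutoff is its default, standing in here for the configurable fuzzy_match_cutoff.

```python
from difflib import get_close_matches


def select_target(llm_target_id, typed_name, enemies, cutoff=0.6):
    """Pick a target ID. `enemies` maps hidden ID -> display name, in turn order."""
    # Tier 1: trust an exact ID extracted by the LLM.
    if llm_target_id in enemies:
        return llm_target_id
    # Tier 2: fuzzy-match the typed name against display names to absorb typos.
    names = {name.lower(): enemy_id for enemy_id, name in enemies.items()}
    matches = get_close_matches(typed_name.lower(), list(names), n=1, cutoff=cutoff)
    if matches:
        return names[matches[0]]
    # Tier 3: fall back to the first active enemy (None if there are no enemies).
    return next(iter(enemies), None)


enemies = {"e1": "Goblin Archer", "e2": "Orc Brute"}
print(select_target("e2", "", enemies))        # Tier 1 hit
print(select_target(None, "orc brut", enemies))  # Tier 2 catches the typo
print(select_target(None, "xyzzy", enemies))   # Tier 3 fallback
```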

πŸƒ Spatial & Movement Mechanics (Zones)

  • Tactical Zone Tracking: Grid-less combat utilizing distinct range zones (NEAR, MID, FAR).
  • Range Penalties: Using melee weapons outside NEAR range automatically triggers "Out of Range" failures, forcing tactical positioning.
  • Movement Enforcement (1-Zone Rule): The engine prevents teleportation, restricting movement to exactly 1 adjacent zone per turn and resolving incorrect distance requests.
  • AI Gap-Closing: Melee-equipped enemies are programmed to automatically spend their turn moving one zone closer if they are out of range of the player.
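A minimal sketch of the 1-Zone Rule follows. It assumes the zones form an ordered line NEAR, MID, FAR and that an over-long request is resolved by moving one step toward the requested zone; the engine may resolve such requests differently. `resolve_move` is a hypothetical name.

```python
# Ordered from closest to farthest; the ordering itself is an assumption of this sketch.
ZONES = ["NEAR", "MID", "FAR"]


def resolve_move(current: str, requested: str) -> str:
    """Move at most one zone toward the requested zone; never skip a zone."""
    here, there = ZONES.index(current), ZONES.index(requested)
    if there > here:
        return ZONES[here + 1]
    if there < here:
        return ZONES[here - 1]
    return current  # already in the requested zone


print(resolve_move("FAR", "NEAR"))  # MID: a two-zone jump is cut down to one step
```

The same function would serve the enemy gap-closing behavior, since a melee enemy that is out of range only ever needs one step toward the player's zone per turn.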

🖥️ UI & Narrative Generation

  • LLM Generative Narration: The system translates raw, calculated Python logs into immersive, D&D-style second-person narration.
  • Immersive CLI Dashboard: A cleanly formatted terminal UI that hides raw enemy HP numbers to prevent metagaming, displaying visual health descriptors and exact player stats instead.
  • Developer Debug Mode: A toggle command that exposes the raw LLM outputs, parsed intents, and true Python math logs to prove the system works.
  • Demo Day Launcher (demo_day.py): A dedicated, crash-resistant script with ASCII art, an interactive command loop, and an auto-reset function (restart) to cleanly restore state.
  • "Idiot-Proof" Onboarding (QoL): An automated boot sequence that intercepts missing GEMINI_API_KEY errors, prompts the user via CLI, and generates the .env file to prevent crashes.

🎒 Item & Inventory Mechanics

  • Symbolic Disposable Items (Path B): Complex narrative item usage is routed to the Arbiter. If the AI determines the item is destroyed, it returns a consumed_item key, prompting the engine to .pop() it from the inventory.
  • Automated Victory Looting (Auto-Loot): Upon triggering the VICTORY state, the engine extracts item strings from defeated enemies, transfers them to the player, empties enemy pockets, saves the game, and prints a formatted UI summary.
  • Hardcoded Consumable Logic (Path A - Mechanics): Standard items bypass the LLM entirely to guarantee zero hallucinations via an ITEM_EFFECTS dictionary. HEAL items restore HP, CURE items revert status conditions, and DAMAGE items apply fixed mechanical damage. [Enhanced Security] The USE command performs a secure inventory verification, asserting the item's existence in player.inventory before executing .remove(), neutralizing hallucinated item usage.
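The Path A consumable flow can be illustrated with the sketch below. The ITEM_EFFECTS name and the HEAL, CURE, and DAMAGE categories come from the description above; the item names, amounts, dictionary-based characters, and the `use_item` helper are placeholders invented for the example.

```python
# Placeholder entries; the real ITEM_EFFECTS table and its values may differ.
ITEM_EFFECTS = {
    "healing potion": ("HEAL", 5),
    "antidote": ("CURE", 0),
    "fire bomb": ("DAMAGE", 4),
}


def use_item(player: dict, item: str, target: dict) -> dict:
    """Verify ownership first, then apply a fixed effect and consume the item."""
    if item not in player["inventory"]:
        return {"ok": False, "reason": "item not in inventory"}
    if item not in ITEM_EFFECTS:
        return {"ok": False, "reason": "no fixed effect; route to arbiter"}
    kind, amount = ITEM_EFFECTS[item]
    if kind == "HEAL":
        target["hp"] = min(target["max_hp"], target["hp"] + amount)
    elif kind == "CURE":
        target["condition"] = "NORMAL"
    elif kind == "DAMAGE":
        target["hp"] = max(0, target["hp"] - amount)
    player["inventory"].remove(item)  # safe: membership was checked above
    return {"ok": True, "effect": kind, "amount": amount}
```

Checking the inventory before anything else is what stops an item the player never owned from having any effect.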

🗣️ Narrative State Transitions

  • Diplomacy & Pacification (Path B): Players can dynamically talk their way out of fights. The LLM Arbiter can assign a PACIFIED status. The EnemyAI recognizes this, forfeits its turn, and the game loop correctly counts them as "defeated" to trigger a VICTORY.
  • Tactical Fleeing Mechanics (Path A): The Intent Router parses "flee" commands as FIXED actions. The RulesEngine resolves a contested 1d20 + PHYS check, with enemies adding a proximity bonus to their roll based on zone distance (+5 if in the same zone, +2 if adjacent). A successful player roll exits combat; a failed roll consumes the turn.
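The contested flee check can be sketched as follows. The 1d20 + PHYS rolls and the +5 / +2 proximity values come from the rule above; the `resolve_flee` signature, the result dictionary, and the tie-breaking rule (ties go to the enemy) are assumptions of this sketch.

```python
import random

# Zone distance -> enemy bonus: 0 = same zone, 1 = adjacent, anything farther = no bonus.
PROXIMITY_BONUS = {0: 5, 1: 2}


def resolve_flee(player_phys: int, enemy_phys: int, zone_distance: int, rng=random) -> dict:
    """Contested 1d20 + PHYS check; `rng` only needs a randint(a, b) method."""
    player_total = rng.randint(1, 20) + player_phys
    enemy_total = rng.randint(1, 20) + enemy_phys + PROXIMITY_BONUS.get(zone_distance, 0)
    return {
        "player_total": player_total,
        "enemy_total": enemy_total,
        # Assumption: the player must strictly beat the enemy to escape.
        "escaped": player_total > enemy_total,
    }


print(resolve_flee(player_phys=2, enemy_phys=1, zone_distance=0))
```

Passing the random source as a parameter keeps the function deterministic under test, which suits a rules engine whose results are meant to be traced.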

🧪 Automated Evaluation & Verification

  • 50-Scenario Functional Stress Test: Developed a comprehensive evaluation_runner.py that utilizes unittest.mock to patch internal engine methods and capture a deep-dive trace_log.json.
  • Grounded Result Metrics: Following the implementation of Permission Guardrails and expanded resolvers, the system achieved 100% Grounding Precision ($P_{ground}$) and 76% State Synchronization ($S_{sync}$) on this suite, supporting the reliability of the multi-agent "Handshake".
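One way to capture a call trace with unittest.mock is sketched below. The `RulesEngine` class here is a stand-in with a made-up `resolve_attack` method and fixed result, and `run_scenario` is hypothetical; the actual evaluation_runner.py may patch different methods and record different fields.

```python
from unittest.mock import patch


class RulesEngine:
    """Stand-in for the real engine; returns a fixed result for illustration."""

    def resolve_attack(self, attacker: str, target: str) -> dict:
        return {"hit": True, "damage": 4}


def run_scenario(engine: RulesEngine, trace: list) -> int:
    """Wrap an engine method in a spy, call it, and record the call in `trace`."""
    # wraps= keeps the original behavior while the mock records every call.
    with patch.object(engine, "resolve_attack", wraps=engine.resolve_attack) as spy:
        result = engine.resolve_attack("hero", "goblin")
        trace.append({
            "call": "resolve_attack",
            "args": list(spy.call_args.args),
            "result": result,
        })
        return spy.call_count


trace_log = []
calls = run_scenario(RulesEngine(), trace_log)
print(calls, trace_log)
```

A list built this way can be written out with `json.dump` to produce a trace file of the kind described above.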

🚧 Future Work / Missing Features

  • Narrative State Transitions:
    • World Exploration Mode: Disabling Initiative and transitioning to a free-form RAG exploration state.
    • Lore Expansion: Populating world_lore.txt with more complex situational data to further ground the AI's creative narration.
    • Optional Story Summarizer (Load Feature): Utilizing the newly built Campaign Log to let the LLM generate a "Previously on..." summary when players load a saved game.

🎓 Acknowledgments & References

This project is built upon the foundational research of AI-assisted narrative generation and LLM-based agent architecture. We would like to acknowledge the following papers for their inspiration on our hybrid system:

  • 📄 Jørgensen et al. (2024) – ChatRPG: A Multi-Agent "ReAct" Game Master
  • 📄 Sakellaridis (2024) – LLM-Based Agent as Dungeon Master
  • 📄 Song et al. (2024) – Tool-Assisted AI DM: Function Calling & External Tools

Note: Full academic PDFs can be found in the archive/references/ directory.

About

Multi-agent TTRPG engine using Two-Path Architecture to solve AI hallucinations. Combines a deterministic Python Rules Engine with LLM-based creative judgment. Verified with 100% Routing and 98% Grounding accuracy. Senior Thesis @ KMUTT FIBO.
