RimBridgeServer Architecture and Implementation Strategy

Purpose

RimBridgeServer should become a reliable automation layer that lets an AI develop, validate, and debug RimWorld mods against the real game, not a partial simulation. The bridge needs to support unit, UX, and integration testing while keeping the implementation surface intentionally selective: only build features that are broadly useful, stable, and worth maintaining.

The immediate design goal is to turn the current proof-of-useful-tools into a platform with:

low-latency execution for common operations
explicit async handling for long-running or frame-bound work
clean separation between bridge contracts, game adapters, and optional extensions
strong discoverability for capabilities and events
a TDD workflow that can scale with the project

Current State

As of 2026-03-19, the repository has moved beyond the first vertical slice. It now has:

a working GABP host and internal capability registry
bridge diagnostics, journaling, and event/log surfaces
core and live-smoke test projects
lifecycle, debug-action, mod configuration, mod settings, architect, selection, notification, screenshot, and UI workbench capabilities
a growing scripting layer (run_script and the Lua front-end) built on the shared registry

That progress is useful, but the architectural risks this document was written to address still matter:

the tool surface is monolithic and mixes transport, orchestration, game access, UI state, and serialization
async behavior is mostly implicit, with ad hoc waiting in specific tools like screenshots
there is still substantial coupling between the host assembly and first-party capability modules
capability discovery now exists internally, and the tool surface is now annotation-driven, but longer-form design docs like this one are still hand-maintained and can drift
third-party extension discovery now exists through shared annotations plus a one-sweep reflected mod scan, but some optional mod integrations still rely on reflection where no compile-time dependency is desired

Non-Negotiable Constraints

1. Publicized RimWorld first

RimBridgeServer already references Krafs.Rimworld.Ref and swaps the normal game assembly for Assembly-CSharp_publicised.dll during build. This must become a hard rule:

direct RimWorld access should use the publicized API first
reflection should be treated as an exception path, not the default
reflection is only acceptable for extension discovery and for third-party mod adapters where there is no stable compile-time dependency

The current project already follows the build pattern used in Achtung2. That is the correct baseline and should stay.

2. Main-thread ownership

All reads and writes that touch RimWorld state, map state, selection, windows, designators, debug menus, screenshots, save/load, or input must flow through a single execution abstraction that understands:

main-thread affinity
frame-bounded work
long events
synchronous fast paths
async wait conditions
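
A minimal sketch of such an execution abstraction, written in Python for brevity even though the bridge itself targets .NET. The names GameThreadDispatcher, invoke, and pump are illustrative assumptions, not the project's actual API. It shows a synchronous fast path when the caller is already on the owning thread, and frame-bounded draining of queued work otherwise.

```python
import queue
import threading
from concurrent.futures import Future


class GameThreadDispatcher:
    """Hypothetical sketch: funnels all work onto one owning thread."""

    def __init__(self):
        # The constructing thread stands in for RimWorld's main thread.
        self._owner = threading.get_ident()
        self._pending = queue.Queue()

    def invoke(self, work):
        """Schedule work; the returned Future completes on the owner thread."""
        future = Future()
        if threading.get_ident() == self._owner:
            self._run(work, future)  # synchronous fast path
        else:
            self._pending.put((work, future))
        return future

    def pump(self, max_items=16):
        """Called once per frame by the owner; bounds the work done per frame."""
        done = 0
        while done < max_items:
            try:
                work, future = self._pending.get_nowait()
            except queue.Empty:
                break
            self._run(work, future)
            done += 1
        return done

    @staticmethod
    def _run(work, future):
        try:
            future.set_result(work())
        except Exception as exc:  # report failures to the caller, not the frame loop
            future.set_exception(exc)


dispatcher = GameThreadDispatcher()
fast = dispatcher.invoke(lambda: "fast path")

# A transport thread enqueues work; nothing runs until the next "frame".
queued = []
worker = threading.Thread(
    target=lambda: queued.append(dispatcher.invoke(lambda: "queued"))
)
worker.start()
worker.join()
pumped = dispatcher.pump()
```

Long events and wait conditions are deliberately left out of this sketch; they would sit on top of the same queue rather than bypass it.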

3. Compatibility first

We should preserve existing tool names while the internal architecture changes. External automation should not break just because internals are cleaned up.

4. Testability by extraction

Full coverage is only realistic if the bulk of new logic lives outside the game-specific adapter layer. The architecture therefore needs a thin RimWorld-facing shell and thick pure logic modules around:

capability discovery
request normalization
script planning
result envelopes
operation journaling
event correlation
retry and wait policies

Verified RimWorld Seams

Using the decompiler against the publicized RimWorld reference, the following seams are already available and should anchor the architecture:

Verse.GameDataSaveLoader.SaveGame and LoadGame
Verse.LongEventHandler.QueueLongEvent, ExecuteWhenFinished, and AnyEventNowOrWaiting
Verse.ScreenshotTaker.TakeNonSteamShot and QueueSilentScreenshot
Verse.Messages.Message
Verse.LetterStack.ReceiveLetter
Verse.PlayLog.Add
Verse.Log.Notify_MessageReceivedThreadedInternal
Verse.DebugWindowsOpener.ToggleDebugActionsMenu
LudeonTK.Dialog_Debug.TrySetupNodeGraph and GetNode
Verse.Command.ProcessInput
Verse.Designator.ProcessInput
Verse.DesignatorManager.Select and Deselect
RimWorld.MapInterface.HandleMapClicks and HandleLowPriorityInput
Verse.LoadedModManager.ReadModSettings and WriteModSettings

These seams are enough to implement most of the requested core feature set without inventing large amounts of game logic ourselves.

Architectural Principles

Capability-first, not tool-first

The bridge should not be designed as a growing bag of GABP methods. Internally it should expose a capability registry where each capability declares:

a stable id
category
description
argument schema
result schema
execution kind
whether sync execution is supported
whether it emits events

GABP tools then become one transport projection of that internal registry.

First-party and third-party capabilities use the same model

If RimBridgeServer supports extension packages for other mods, the same principle should govern our own features. That means:

core RimBridgeServer features register through the same provider contract as external extensions
optional first-party feature groups should live in separate packages or projects when their scope justifies it
the host should not special-case first-party capabilities beyond bootstrapping trusted default providers
every capability package, whether shipped by us or another mod, should be discoverable through the same registry

This prevents the architecture from drifting into two incompatible systems: one for built-in features and one for extensions.

Explicit async instead of hidden waiting

Every operation should clearly declare whether it is:

immediate
frame-bound
long-event-bound
background-observed

The caller can then choose:

immediate: fail fast if the game is not ready
wait: block until completion or timeout
queue: receive an operation id, then poll or subscribe for completion
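
The three caller modes can be sketched as follows. This is an illustrative Python model, not the bridge's real runner: OperationRunner, GameNotReady, tick, and the "timed_out" status value are assumptions, and the wait loop counts polls instead of using a real deadline so the example stays deterministic.

```python
import itertools


class GameNotReady(Exception):
    pass


class OperationRunner:
    """Hypothetical sketch of the immediate / wait / queue caller modes."""

    def __init__(self, is_ready):
        self._is_ready = is_ready      # callable: can the game run work now?
        self._ids = itertools.count(1)
        self._queued = {}              # operation id -> pending work
        self._finished = {}            # operation id -> result

    def run(self, work, mode, max_polls=100):
        if mode not in ("immediate", "wait", "queue"):
            raise ValueError(f"unknown mode: {mode}")
        op_id = f"op_{next(self._ids)}"
        if mode == "queue":
            self._queued[op_id] = work
            return {"operationId": op_id, "status": "queued"}
        if mode == "immediate" and not self._is_ready():
            raise GameNotReady(op_id)  # fail fast instead of waiting
        polls = 0
        while not self._is_ready():    # "wait": block until ready or give up
            polls += 1
            if polls >= max_polls:
                return {"operationId": op_id, "status": "timed_out"}
        return {"operationId": op_id, "status": "completed", "result": work()}

    def tick(self):
        """Run queued work once the game is ready; called from the frame loop."""
        if self._is_ready():
            for op_id in list(self._queued):
                self._finished[op_id] = self._queued.pop(op_id)()

    def poll(self, op_id):
        if op_id in self._finished:
            return {"operationId": op_id, "status": "completed",
                    "result": self._finished[op_id]}
        if op_id in self._queued:
            return {"operationId": op_id, "status": "queued"}
        raise KeyError(op_id)


state = {"ready": False}
runner = OperationRunner(lambda: state["ready"])
ticket = runner.run(lambda: "saved", "queue")
timed_out = runner.run(lambda: "never runs", "wait", max_polls=3)
before = runner.poll(ticket["operationId"])
state["ready"] = True
runner.tick()
after = runner.poll(ticket["operationId"])
```

The point of the split is that the caller, not the capability, decides how much blocking is acceptable.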

Stable references over fuzzy strings

Human-friendly resolution like pawn name matching is useful, but the internal API should prefer stable handles once something is resolved:

map id
thing id
pawn id
window id
menu id
operation id
screenshot id

This reduces ambiguity and repeated lookup cost.

Thin adapters, thick core

Everything not inherently tied to RimWorld runtime objects should live in shared, testable libraries.

Possible Future Project Layout

This is an aspirational split, not the current repository layout. The repo still uses a flatter Source/ structure plus focused shared libraries, but this is the direction a larger refactor could move toward:

Source/
  RimBridgeServer.Host/               // net472 RimWorld entry assembly and GABP host
  RimBridgeServer.Contracts/          // schemas, ids, result envelopes, script AST
  RimBridgeServer.Core/               // registry, execution policies, journaling, script engine
  RimBridgeServer.Game/               // shared RimWorld adapter helpers
  RimBridgeServer.Capabilities.Core.Diagnostics/
  RimBridgeServer.Capabilities.Core.Lifecycle/
  RimBridgeServer.Capabilities.Core.Selection/
  RimBridgeServer.Capabilities.Core.View/
  RimBridgeServer.Capabilities.Core.DebugActions/
  RimBridgeServer.Capabilities.Optional.Pawns/
  RimBridgeServer.Capabilities.Optional.UI/
  RimBridgeServer.Extensions.Abstractions/   // provider contract used by all packages
  RimBridgeServer.Extensions.Example/        // example third-party adapter

Tests/
  RimBridgeServer.Contracts.Tests/
  RimBridgeServer.Core.Tests/
  RimBridgeServer.Game.Integration/
  RimBridgeServer.E2E/

docs/
  architecture.md
  lua-frontend-design.md

This can be introduced incrementally; there is no need for a single risky big-bang move. The important part is that feature packages, including first-party ones, plug into the same registry and execution model.

Target Runtime Architecture

1. Host layer

Responsible for:

bootstrapping the mod
starting and stopping the GABP server
exposing legacy tool aliases
exposing internal capability discovery
managing subscriptions for events and operation updates

Suggested types:

BridgeHost
ToolFacade
CapabilityRegistry
LegacyToolMapper

2. Execution kernel

Responsible for:

game-thread dispatch
safe synchronous execution
wait conditions
long-event coordination
timeout and cancellation handling
operation journaling

Suggested types:

IGameThreadDispatcher
GameThreadDispatcher
IOperationRunner
OperationRunner
OperationJournal
WaitCondition
ExecutionMode
OperationStatus

3. Capability modules

Capabilities should be grouped by domain, with each domain owning request normalization and response shaping for its own area.

Each domain should preferably ship as a provider package that registers one or more capabilities into the shared registry. For first-party code, that means the host composes a set of provider packages rather than directly owning all behavior.

Core domains:

diagnostics and logging
game lifecycle and state
pause and time control
save and load
screenshot and view targeting
input and cursor control
selection
settings and mod configuration
debug actions
scripting and batch execution

Optional domains:

pawn state and commands
faction state
context menus
widget row and gizmo access
inspect pane
designators

Extension domains:

capabilities registered by other mods
adapters for known mods

4. Observability layer

Responsible for turning game activity into structured events.

Core event sources should include:

bridge operation lifecycle
long event start and completion
warnings and errors from Verse.Log
message feed from Verse.Messages
letter feed from Verse.LetterStack
play log additions
selection changes
map changes
active window changes
screenshot completion

Suggested types:

IEventSource
EventBus
EventEnvelope
ObservationSnapshot
StateProbe

5. Script engine

Responsible for executing batches of capability calls with low overhead, consistent waiting semantics, and a detailed report.

Suggested types:

ScriptDefinition
ScriptStep
ScriptExecutionContext
ScriptRunner
ScriptReport
StepReport

Internal Contracts

The internal contract model should be established early and then reused everywhere.

Capability descriptor

Each capability descriptor should contain:

Id
Category
Summary
ArgumentsSchema
ResultSchema
ExecutionKind
SupportsImmediate
SupportsWait
SupportsQueue
EmitsEvents
Source (core, optional, or extension)
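
As a rough illustration, the descriptor fields above map naturally onto an immutable record held by a registry. The Python below is a sketch under assumptions: the snake_case field names, the CapabilityRegistry methods, and the projected tool shape (name, description, inputSchema) are invented for the example and are not taken from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityDescriptor:
    id: str
    category: str
    summary: str
    arguments_schema: dict
    result_schema: dict
    execution_kind: str            # e.g. "immediate" or "frame-bound"
    supports_immediate: bool = True
    supports_wait: bool = True
    supports_queue: bool = False
    emits_events: bool = False
    source: str = "core"           # "core", "optional", or "extension"


class CapabilityRegistry:
    """Hypothetical registry: one table for core and extension capabilities."""

    def __init__(self):
        self._entries = {}

    def register(self, descriptor, handler):
        if descriptor.id in self._entries:
            raise ValueError(f"duplicate capability id: {descriptor.id}")
        self._entries[descriptor.id] = (descriptor, handler)

    def describe(self):
        """Descriptors sorted by id, suitable for a discovery endpoint."""
        return sorted((d for d, _ in self._entries.values()), key=lambda d: d.id)

    def invoke(self, capability_id, args):
        _, handler = self._entries[capability_id]
        return handler(**args)


registry = CapabilityRegistry()
registry.register(
    CapabilityDescriptor(
        id="rimworld.take_screenshot",
        category="view",
        summary="Capture a screenshot",
        arguments_schema={"type": "object", "properties": {"fileName": {"type": "string"}}},
        result_schema={"type": "object"},
        execution_kind="frame-bound",
        supports_queue=True,
    ),
    lambda fileName: {"path": fileName + ".png"},  # stand-in handler
)

# A transport such as GABP would project descriptors into its own tool shape.
tools = [
    {"name": d.id, "description": d.summary, "inputSchema": d.arguments_schema}
    for d in registry.describe()
]
shot = registry.invoke("rimworld.take_screenshot", {"fileName": "after_action"})
```

Rejecting duplicate ids at registration time keeps first-party and extension providers from silently shadowing each other.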

Operation envelope

Every operation should return a consistent envelope:

{
  "success": true,
  "operationId": "op_123",
  "status": "completed",
  "startedAtUtc": "2026-03-16T12:00:00Z",
  "completedAtUtc": "2026-03-16T12:00:00Z",
  "durationMs": 12,
  "result": {},
  "warnings": [],
  "events": []
}

Even immediate operations should have an operationId so logs, events, and reports can correlate cleanly.
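
One way to keep the envelope consistent is to build it in a single helper. The Python sketch below reproduces the example above; make_envelope and the "failed" status value are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def _utc(ts):
    """Format an aware datetime as second-resolution UTC, e.g. 2026-03-16T12:00:00Z."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def make_envelope(operation_id, started, completed, success=True,
                  result=None, warnings=(), events=()):
    return {
        "success": success,
        "operationId": operation_id,
        # "failed" is an assumed status value; the document only shows "completed".
        "status": "completed" if success else "failed",
        "startedAtUtc": _utc(started),
        "completedAtUtc": _utc(completed),
        # Integer milliseconds via timedelta floor division (no float rounding).
        "durationMs": (completed - started) // timedelta(milliseconds=1),
        "result": {} if result is None else result,
        "warnings": list(warnings),
        "events": list(events),
    }


started = datetime(2026, 3, 16, 12, 0, 0, tzinfo=timezone.utc)
envelope = make_envelope("op_123", started, started + timedelta(milliseconds=12))
```

Because every capability goes through the same helper, immediate and queued operations produce envelopes that differ only in their values, never in their shape.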

Target references

Introduce small, reusable reference types:

MapRef
PawnRef
ThingRef
WindowRef
MenuRef
CellRef
ScreenRectRef

Wait conditions

Avoid ad hoc sleeping loops inside capabilities. Centralize wait behavior around named conditions such as:

game.idle
long_event.none
window.open
window.closed
selection.matches
screenshot.exists
log.contains
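
Centralized waiting could look roughly like the following Python sketch: named predicates, one polling loop, one timeout policy. WaitConditions and its parameters are hypothetical, and the clock and sleep functions are injectable so the behavior can be unit tested without real delays.

```python
import time


class WaitConditions:
    """Hypothetical sketch: one polling loop shared by all named conditions."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._predicates = {}
        self._clock = clock
        self._sleep = sleep

    def register(self, name, predicate):
        self._predicates[name] = predicate

    def wait(self, name, timeout_s, poll_interval_s=0.05, **params):
        predicate = self._predicates[name]  # KeyError for unknown condition names
        deadline = self._clock() + timeout_s
        while True:
            if predicate(**params):
                return {"condition": name, "satisfied": True}
            if self._clock() >= deadline:
                return {"condition": name, "satisfied": False}
            self._sleep(poll_interval_s)


# Fake time source: sleeping advances the clock, and a window "opens" at t >= 0.1.
fake = {"now": 0.0, "open_windows": set()}


def fake_sleep(seconds):
    fake["now"] += seconds
    if fake["now"] >= 0.1:
        fake["open_windows"].add("Dialog_Debug")


conditions = WaitConditions(clock=lambda: fake["now"], sleep=fake_sleep)
conditions.register("window.open", lambda window: window in fake["open_windows"])
opened = conditions.wait("window.open", timeout_s=1.0, window="Dialog_Debug")
missing = conditions.wait("window.open", timeout_s=0.2, window="Dialog_Missing")
```

In the real bridge the loop would presumably advance once per frame on the game thread rather than sleep, but the contract is the same: a condition name, parameters, and a timeout.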

Capability Taxonomy

Core capabilities

These belong in RimBridgeServer itself because they are broadly useful across mod development.

Diagnostics

process and game running status
active program state
current map and loaded game summary
Player.log tail and structured in-game log stream
warnings and errors subscription

Lifecycle and time

pause and unpause
speed control
explicit wait_until_idle
wait_frames
wait_for_long_event

Save and load

list saves
save game
load game
quick snapshot save for test fixtures
fixture restore helpers

View and screenshot

camera state
jump and frame
screenshot capture
clipped screenshot capture
semantic screenshot targeting
screenshot metadata including map, selection, and camera context

Input and selection

mouse position
mouse click
keyboard input
selection read and mutate
click by screen rect or semantic target
input must work even when RimWorld is not the foreground application, which rules out a foreground-only desktop automation design

Settings and mod config

RimWorld prefs
mod settings persistence through LoadedModManager
safe discovery of loaded mod settings surfaces

Debug actions

discover debug action tree
resolve action by path
execute action directly or through a UI-visible mode when needed
support pinning and toggles where exposed by DebugActionNode

Script execution

run a structured batch
choose sync or async execution mode per step
capture detailed report and intermediate artifacts

Optional capabilities

These are useful, but should sit behind separate modules so they can evolve independently.

pawn state and commands
faction state
context menu inspection and execution
widget row and bottom-left gizmo access
inspect pane extraction
designator discovery and application

Extension capabilities

The extension model should allow external mods to register additional capabilities with descriptors that look identical to core ones.

Debug Actions Strategy

Debug action access should not devolve into custom tool-per-action code. The correct design is:

use the internal debug node graph as the discovery surface
expose nodes by stable path
let callers query children before execution
support direct execution where the node semantics allow it
preserve a UI-backed fallback path for actions that require actual dialog interaction

This is one of the biggest opportunities to cut turnaround time: a generic discovery-and-execute path makes every debug action reachable without implementing a one-off wrapper for each item.
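
The discovery-then-execute flow can be modeled on a plain tree. The Python sketch below is not RimWorld's debug node API; DebugNode, resolve, list_children, and execute are stand-in names, the slash-separated path format is an assumption, and the node labels are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DebugNode:
    label: str
    action: Optional[Callable[[], object]] = None  # None for pure menu nodes
    children: list = field(default_factory=list)


def resolve(root, path):
    """Walk a slash-separated path; every segment must match exactly one child."""
    node = root
    for segment in filter(None, path.split("/")):
        matches = [child for child in node.children if child.label == segment]
        if len(matches) != 1:
            raise LookupError(f"{len(matches)} matches for {segment!r} in {path!r}")
        node = matches[0]
    return node


def list_children(root, path=""):
    """Let callers query children before executing anything."""
    return [child.label for child in resolve(root, path).children]


def execute(root, path):
    node = resolve(root, path)
    if node.action is None:
        raise ValueError(f"{path!r} is a menu, not an executable action")
    return {"path": path, "result": node.action()}


# Made-up tree for illustration only.
root = DebugNode("root", children=[
    DebugNode("Pawns", children=[
        DebugNode("Example action", action=lambda: "done"),
    ]),
])
top_level = list_children(root)
pawn_actions = list_children(root, "Pawns")
outcome = execute(root, "Pawns/Example action")
```

A single resolver like this serves every node, which is what removes the need for per-action tool code.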

Input and UI Strategy

The input stack should support two modes.

Semantic mode

The preferred mode for automation:

resolve a semantic target such as a selected pawn, menu option, designator, or gizmo
execute through the underlying command object when possible
return structured evidence of what was targeted

Physical mode

Required for cases where the UI itself is under test:

screen coordinate targeting
rect clipping
mouse move and click
key press simulation
screenshot before and after action

Physical mode should still be implemented inside RimWorld's process or window event path wherever possible. Foreground-dependent OS desktop input is not a sufficient design because automated test runs may keep RimWorld in the background.

This split matters because functional automation and UX automation are different jobs. We should not pay the cost of physical UI simulation when a direct command path is available.

Event Model

The event system should be treated as a first-class API, not a future add-on.

Each event should include:

sequence
timestampUtc
category
type
source
operationId when relevant
payload

Core categories:

bridge
operation
game
long_event
log
message
letter
selection
window
capability
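
A small sketch of how the event fields and categories above might fit together, with a monotonically increasing sequence that doubles as a pull cursor. This is illustrative Python: EventBus, publish, and pull are assumed names, and the event types used in the demo are invented.

```python
import itertools
from datetime import datetime, timezone

CATEGORIES = {
    "bridge", "operation", "game", "long_event", "log",
    "message", "letter", "selection", "window", "capability",
}


class EventBus:
    """Hypothetical in-memory event log with cursor-based pull."""

    def __init__(self):
        self._sequence = itertools.count(1)
        self._events = []

    def publish(self, category, type_, source, payload, operation_id=None):
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        event = {
            "sequence": next(self._sequence),
            "timestampUtc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "category": category,
            "type": type_,
            "source": source,
            "payload": payload,
        }
        if operation_id is not None:  # only present when relevant
            event["operationId"] = operation_id
        self._events.append(event)
        return event

    def pull(self, after_sequence=0, limit=100):
        """Return events newer than the cursor, plus the cursor for the next call."""
        batch = [e for e in self._events if e["sequence"] > after_sequence][:limit]
        cursor = batch[-1]["sequence"] if batch else after_sequence
        return {"events": batch, "cursor": cursor}


bus = EventBus()
bus.publish("operation", "started", "bridge", {}, operation_id="op_1")
bus.publish("log", "warning", "Verse.Log", {"text": "example warning"})
bus.publish("operation", "completed", "bridge", {}, operation_id="op_1")

first = bus.pull(limit=2)
second = bus.pull(after_sequence=first["cursor"])
idle = bus.pull(after_sequence=second["cursor"])
```

A client that remembers its cursor never misses or double-reads an event, which is why pull makes a good correctness baseline even if push delivery is added later.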

For real-time debugging, the log pipeline should combine:

tailing Player.log
patched Verse.Log.Notify_MessageReceivedThreadedInternal
structured warnings and errors raised by bridge code itself

Script Language Strategy

Do not start with a custom textual parser. That is unnecessary risk. The low-risk first version should be a structured JSON script format that every client can generate easily.

Suggested first shape:

{
  "name": "load-fixture-and-run-debug-action",
  "defaults": {
    "mode": "wait",
    "timeoutMs": 10000
  },
  "steps": [
    { "call": "rimworld.load_game", "args": { "saveName": "fixture_a" } },
    { "wait": { "condition": "game.idle", "timeoutMs": 60000 } },
    { "call": "rimworld.debug_actions.execute", "args": { "path": "Pawns/..." } },
    { "call": "rimworld.take_screenshot", "args": { "fileName": "after_action" } }
  ]
}

The important property is not syntax. The important property is that every step can target any registered capability, including extension capabilities, and produces a uniform report.
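
To make the "uniform report" property concrete, here is a minimal Python sketch of a runner for the script shape above. It is not the project's script engine: run_script, the report fields, and the stop-on-first-failure policy are assumptions, and the capability handlers are stand-ins.

```python
def run_script(script, capabilities, wait_for):
    """Run steps in order against a name -> callable table; stop on first failure."""
    defaults = script.get("defaults", {})
    report = {"name": script.get("name"), "success": True, "steps": []}
    for index, step in enumerate(script["steps"]):
        entry = {"index": index}
        try:
            if "call" in step:
                entry["call"] = step["call"]
                handler = capabilities[step["call"]]
                entry["result"] = handler(**step.get("args", {}))
            elif "wait" in step:
                wait = step["wait"]
                entry["wait"] = wait["condition"]
                timeout_ms = wait.get("timeoutMs", defaults.get("timeoutMs"))
                if not wait_for(wait["condition"], timeout_ms):
                    raise TimeoutError(wait["condition"])
            else:
                raise ValueError("step needs 'call' or 'wait'")
            entry["status"] = "completed"
        except Exception as exc:
            entry["status"] = "failed"
            entry["error"] = f"{type(exc).__name__}: {exc}"
            report["success"] = False
        report["steps"].append(entry)
        if not report["success"]:
            break
    return report


# Stand-in handlers; a real runner would dispatch through the capability registry.
capabilities = {
    "rimworld.load_game": lambda saveName: {"loaded": saveName},
    "rimworld.take_screenshot": lambda fileName: {"path": fileName + ".png"},
}
script = {
    "name": "load-fixture-and-screenshot",
    "defaults": {"mode": "wait", "timeoutMs": 10000},
    "steps": [
        {"call": "rimworld.load_game", "args": {"saveName": "fixture_a"}},
        {"wait": {"condition": "game.idle", "timeoutMs": 60000}},
        {"call": "rimworld.take_screenshot", "args": {"fileName": "after_action"}},
    ],
}
report = run_script(script, capabilities, wait_for=lambda condition, timeout_ms: True)
failed = run_script(
    {"steps": [{"call": "rimworld.missing"}, {"call": "rimworld.take_screenshot",
                                               "args": {"fileName": "unused"}}]},
    capabilities,
    wait_for=lambda condition, timeout_ms: True,
)
```

Because steps are looked up by capability id, an extension capability becomes scriptable the moment it is registered, with no script-specific plumbing.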

Later, a human-friendly DSL can be layered on top if it is still worth it.

The current concrete recommendation for that next layer is documented in docs/lua-frontend-design.md: use Lua syntax via MoonSharp, but keep the existing script runner as the shared execution backend instead of inventing a second direct automation runtime.

Extension Strategy

The bridge now uses annotation-based discovery for third-party tools instead of a second manual registration model.

Current model:

third-party mods reference RimBridgeServer.Annotations
RimBridgeServer delays GAB startup until LoadedModManager.InitializeMods has completed
once all mods are initialized, the host scans each loaded mod assembly exactly once for annotated public tool methods
discovered tools are registered once into both the capability registry and the live GAB tool surface
each mod is isolated by try/catch, so one failing scan does not block other mods

Supported authoring shapes:

public static methods on any loaded mod assembly type
public instance methods on a loaded Verse.Mod handle type
public instance methods on a public parameterless tool class

Design constraints:

keep the shared package annotation-only so participating mods do not need a heavy runtime dependency
keep discovery one-shot and startup-bound rather than re-scanning during live execution
continue to prefer publicized RimWorld APIs for actual game access; reflection is for extension discovery and optional third-party adapters, not routine game logic
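
The one-sweep, per-mod-isolated scan can be illustrated with a decorator standing in for the annotation attribute. This Python sketch only models the static-method authoring shape; bridge_tool, discover, and the "mod/tool" naming scheme are assumptions made for the example, not the behavior of RimBridgeServer.Annotations.

```python
def bridge_tool(name):
    """Stand-in for an annotation: marks a function as an exposed tool."""
    def mark(func):
        func.__bridge_tool__ = name
        return func
    return mark


def discover(mods):
    """Scan each mod once; a failing scan is recorded and does not block others."""
    tools, failures = {}, {}
    for mod_name, load_types in mods.items():
        try:
            found = {}
            for holder in load_types():
                for member in vars(holder).values():
                    func = getattr(member, "__func__", member)  # unwrap staticmethod
                    tool_name = getattr(func, "__bridge_tool__", None)
                    if tool_name is not None:
                        found[f"{mod_name}/{tool_name}"] = func
            tools.update(found)  # register only if the whole mod scanned cleanly
        except Exception as exc:
            failures[mod_name] = repr(exc)
    return tools, failures


class ExampleTools:
    @staticmethod
    @bridge_tool("ping")
    def ping():
        return "pong"

    @staticmethod
    def helper():
        return "not exposed"


def broken_mod():
    raise RuntimeError("type load failed")


tools, failures = discover({
    "example.mod": lambda: [ExampleTools],
    "broken.mod": broken_mod,
})
```

Collecting a mod's tools into a temporary table before merging means a scan that fails halfway leaves no partial registrations behind.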

Testing Strategy

Testing objective

Target full coverage by pushing almost all branching logic into Contracts and Core, then keeping Game adapters thin and validated with deterministic in-game scenarios.

Test pyramid

Unit tests

Scope:

capability registry
request validation
result envelopes
wait condition evaluation
operation journal
script planning and report generation
event correlation
sync vs queue policy logic

These should aim for near-100 percent line and branch coverage.

Contract tests

Scope:

legacy tool alias mapping
capability descriptor completeness
schema compatibility
serialization stability

In-game integration tests

Scope:

real save and load
pause and time control
selection behavior
screenshot capture
debug action discovery and execution
log and message capture
designator and command invocation

These should run against stable fixture saves and quick-test colonies.

End-to-end tests through GABS

Scope:

launch RimWorld
connect through GABP
execute scripts end to end
verify screenshots, logs, and operation reports

Async test matrix

Every async-capable feature should be exercised across:

immediate success
queued completion
long-event overlap
timeout
cancellation
missing target after map or state change
event correlation correctness

CI and Workflow Strategy

Each incremental step should follow the same discipline:

add or update tests first
implement the smallest coherent vertical slice
run focused tests locally
run a build of the mod
if the step touches game integration, run the relevant GABS-driven smoke case
update the relevant design or usage docs when the public surface changes
commit
push

The rule is to keep the tree in a releasable state after every step.

Low-Risk Incremental Roadmap

Step A0. Architecture baseline

Deliverables:

architecture document
progress log
explicit constraints around publicized RimWorld usage

Exit criteria:

agreed target structure exists in-repo

Step A1. Extract contracts and result envelope

Deliverables:

Contracts project
Extensions.Abstractions project with the provider contract used by both first-party and third-party packages
operation envelope types
capability descriptor types
shared ids and references
adapter layer that keeps current tools working

Tests:

serialization
validation
compatibility

Step A2. Extract execution kernel

Deliverables:

GameThreadDispatcher
OperationRunner
timeout and wait condition support
operation journal

Tests:

main-thread dispatch behavior with fakes
timeout behavior
queued vs immediate execution

Step A3. Refactor existing tools into capability modules

Deliverables:

current features moved out of RimBridgeTools
first-party provider packages for the existing feature groups
legacy tool aliases preserved
new registry-backed dispatch

Tests:

capability discovery
legacy alias contract tests
integration smoke for existing feature set

Step A4. Observability and diagnostics

Deliverables:

event bus
structured in-game log capture
Player.log tail reader
long-event observation

Tests:

event envelopes
log capture filtering
long-event state transitions

Step A5. Lifecycle, save/load, and time service

Deliverables:

consolidated lifecycle service
wait helpers
faster sync control paths for common flows

Tests:

save/load lifecycle
pause and speed behavior
wait conditions around long events

Step A6. View, targeting, input, and screenshot service

Deliverables:

screen and map target references
clipped screenshot support
semantic target resolution
input service abstraction

Tests:

unit tests for target resolution
integration tests for screenshot and selection flows
UX smoke tests with fixture screenshots

Step A7. Generic debug action service

Deliverables:

debug action discovery
debug action path execution
reportable debug-action results

Tests:

path resolution
discovery stability
integration execution cases

Step A8. Optional UI adapters and god-mode designators

Deliverables:

gizmo access
context menu improvements
designator discovery
god-mode designator selection and application
inspect pane and widget row extraction

Tests:

per-adapter integration tests
designator execution cases in god mode
UI regression scenarios around structure placement

Step A9. Script runner

Deliverables:

JSON script format
script execution engine
step-level report
mixed sync and async steps

Tests:

report correctness
rollback and failure reporting semantics
end-to-end script execution

Notes:

v1 should stay deliberately small: ordered capability calls, explicit per-step reports, and no separate DSL runtime
every registered capability, including future extension capabilities, should be scriptable automatically through the shared registry rather than a second plugin model
the next increment after the first runnable slice should add controlled step-output references so scripts can create something in one step and consume its id in a later step

Step A10. Extension system

Deliverables:

extension abstraction package
annotation package and startup-bound discovery lifecycle
extension discovery endpoint
first sample extension

Tests:

discovery
namespacing
script access through extension capabilities

Step A11. Harden for autonomous development

Deliverables:

fixture management
canned repro scripts
coverage gates for Contracts and Core
nightly end-to-end runs

Tests:

complete automated smoke matrix

Decisions To Keep Us Fast

prefer internal direct execution over UI simulation when the goal is functionality testing
prefer UI simulation when the goal is UX validation
prefer JSON scripts over a custom DSL in v1
prefer annotation-based one-sweep extension discovery after all mods initialize over ad hoc per-tool reflection
prefer stable ids over repeated fuzzy name lookup
prefer publicized RimWorld APIs over reflection
prefer reflection only in isolated third-party adapters

Immediate Next Step

The next concrete slice is no longer fixed in this document. The current scripting track is functional through JSON, inline Lua, and file-backed Lua fixtures. The near-term choice is therefore between broadening the reusable Lua examples and reference material around planning patterns, and switching tracks to the structured pawn-event journal backlog. If we switch tracks, the lowest-risk event slice is still job_changed, draft_changed, and mental_state_changed, with cursor-based pull as the correctness path and host-level push as an optional acceleration path.
