Add domain-specific evaluation rubrics for planning #42

@kai-linux

Description

Goal

Add domain-specific evaluation rubrics to planning so Agent OS knows what “good” looks like beyond README, CODEBASE.md, and STRATEGY.md. Repos should be able to declare domain expectations such as what makes a strong landing page, website, book workflow, bot, or automation harness, and planners should use those rubrics when shaping backlog and sprint work.

Success Criteria

  • Define a repo-level mechanism for declaring domain-specific evaluation criteria or skills.
  • Feed those criteria into backlog grooming and strategic planning in a structured way.
  • Make the system reusable across different repo types without collapsing into a generic checklist.
  • Document how this supports the roadmap shift from prompt-based planning to evidence-driven planning.
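The success criteria above call for a declarative, inspectable mechanism rather than hidden prompt heuristics. As a minimal sketch of what that could look like, the following assumes a hypothetical repo-level rubric file (the `.agent-os/rubric.json` path, the `DomainRubric` schema, and all field names are illustrative, not an existing Agent OS format) and shows how declared criteria could be parsed and rendered as structured planning context:

```python
import json
from dataclasses import dataclass, field

# Hypothetical repo-level rubric file; path and schema are illustrative only.
RUBRIC_PATH = ".agent-os/rubric.json"

@dataclass
class DomainRubric:
    """A domain-specific evaluation rubric declared by one repo."""
    domain: str                                    # e.g. "landing-page", "book-workflow"
    criteria: list = field(default_factory=list)   # what "good" looks like for this domain

def load_rubric(raw: str) -> DomainRubric:
    """Parse a rubric declaration from its JSON text."""
    data = json.loads(raw)
    return DomainRubric(domain=data["domain"], criteria=data.get("criteria", []))

def planning_context(rubric: DomainRubric) -> str:
    """Render rubric criteria as structured context for backlog grooming."""
    lines = [f"Domain: {rubric.domain}"] + [f"- {c}" for c in rubric.criteria]
    return "\n".join(lines)

# Example declaration for a landing-page repo (illustrative content).
raw = json.dumps({
    "domain": "landing-page",
    "criteria": [
        "clear above-the-fold value proposition",
        "single primary call to action",
    ],
})
rubric = load_rubric(raw)
print(planning_context(rubric))
```

Because each repo ships its own rubric file, repos keep their autonomy (per the constraints below) while the planner consumes every rubric through the same structured interface.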

Task Type

architecture

Constraints

  • Prefer minimal diffs.
  • Preserve existing repo autonomy; not every repo should share the same rubric.
  • Favor explicit rubrics over hidden prompt heuristics.
  • Keep the first implementation easy to inspect and revise in-repo.

Context

One of the current Level 2 gaps is weak domain knowledge injection. The planner needs stronger repo-specific evaluation criteria so it can judge what matters for a website, book pipeline, content harness, or automation system instead of relying only on generic repo context.

Metadata

Assignees

No one assigned

Projects

Status: Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests