Quality Gate Reference
- Checklist Item(s): TDD-001 (Test Coverage Baseline), TDD-003 (Test Pyramid Balance)
- Current Score: TDD domain 11/100 — Tier 1 blocker
- Impact: High — leverages existing eval investment
- Effort: High
Description
The project has 27 JSON eval files across `evals/` and `skills/*/evals/` that define structured behavioral expectations (prompts, expected tool calls, assertions). These are currently declarative specifications evaluated manually or by AI, not connected to any automated test runner.
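For orientation, a minimal eval entry might look like the following. The key names used here (`evals`, `name`, `prompt`, `expectations`, `type`) are assumptions for illustration; consult the actual files for the real schema:

```json
{
  "evals": [
    {
      "name": "refactor-triggers-on-rename-request",
      "prompt": "Rename this function and update its callers",
      "expectations": [
        { "type": "skill_trigger", "skill": "refactor" },
        { "type": "tool_call", "tool": "edit_file" }
      ]
    }
  ]
}
```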
Implementation Plan
- Write a pytest plugin or parametrized test loader that reads eval JSON files
- For each eval entry, extract the `expectations` or `assertions` arrays
- Map deterministic expectations to pytest assertions:
  - Skill trigger detection → verify the correct skill matches the prompt
  - Expected tool calls → validate against known tool schemas
  - Expected state changes → check output structure
- Use `@pytest.mark.parametrize` with ids from eval `name` fields
- Non-deterministic expectations (LLM output quality) → mark as `pytest.mark.skip` with a reason, or use similarity thresholds
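The plan above can be sketched as a single parametrized test module. The eval schema assumed here (a top-level `evals` list with `name`, `prompt`, and `expectations`/`assertions` entries, each check carrying `type`, `tool`, or `skill` keys) and the `KNOWN_TOOL_SCHEMAS` registry are hypothetical; adapt the key names and the per-check assertions to the project's actual format:

```python
# Sketch of a parametrized eval loader; key names are assumed, not verified.
import json
from pathlib import Path

import pytest

EVAL_GLOBS = ["evals/*.json", "skills/*/evals/*.json"]

# Hypothetical registry of known tool schemas to validate expected calls against.
KNOWN_TOOL_SCHEMAS = {"search", "edit_file", "run_tests"}


def load_eval_cases(root="."):
    """Collect (test_id, entry, checks) triples from every eval JSON file."""
    cases = []
    for pattern in EVAL_GLOBS:
        for path in sorted(Path(root).glob(pattern)):
            data = json.loads(path.read_text())
            for entry in data.get("evals", []):
                # Some files may use "expectations", others "assertions".
                checks = entry.get("expectations") or entry.get("assertions") or []
                test_id = f"{path.stem}::{entry.get('name', 'unnamed')}"
                cases.append((test_id, entry, checks))
    return cases


def is_deterministic(check):
    """Only LLM output-quality checks are treated as non-deterministic here."""
    return check.get("type") != "llm_quality"


_CASES = load_eval_cases()


@pytest.mark.parametrize(
    "entry,checks",
    [(entry, checks) for _, entry, checks in _CASES],
    ids=[test_id for test_id, _, _ in _CASES],
)
def test_eval(entry, checks):
    for check in checks:
        if not is_deterministic(check):
            pytest.skip("non-deterministic expectation; needs similarity scoring")
        kind = check.get("type")
        if kind == "tool_call":
            # Expected tool calls must reference a known tool schema.
            assert check["tool"] in KNOWN_TOOL_SCHEMAS, check
        elif kind == "skill_trigger":
            # Placeholder matcher: real code would run skill trigger detection.
            assert check["skill"] in entry.get("prompt", ""), check
```

Keeping the loader as a plain function (rather than a pytest plugin) means the test ids show up as `file::eval-name` in `pytest -v` output with no extra configuration.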
Benefits
Acceptance Criteria
References
- Promptfoo — eval framework patterns
- DeepEval — pytest-integrated eval framework
- Existing eval files: `evals/test-architect-evals.json`, `skills/refactor/evals/refactor-evals.json`
Generated by Cogitations /cog-discover