[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report — Mar 2026

## 📊 Executive Summary

`gh-aw-firewall` has reached an impressive **Level 4/5** agentic workflow maturity with ~27 active workflows spanning security, testing, documentation, CI/CD, and code quality automation. The most impactful near-term improvements are: fixing 3 uncompiled workflows that are currently broken, adding a workflow health meta-agent (Audit Workflows) to observe the growing agent ecosystem, and adding an Issue Triage agent to handle the evident backlog of unlabeled issues. Given the repository's security-critical domain, there is also a strong case for a Breaking Change Checker to protect CLI users.

---

## 🎓 Patterns Learned from Pelis Agent Factory

### Key Patterns from the Documentation Site

The Pelis Agent Factory documents several workflow archetypes directly applicable here:

1. **Continuous Quality** – Agents like Code Simplifier and Duplicate Code Detector run daily against recent commits, proposing PRs that preserve function while improving clarity. This is distinct from periodic cleanup sprints.

2. **Meta-Agents (Audit Workflows)** – When running 20+ agents, a dedicated meta-agent that monitors agent runs, tracks costs, and surfaces failures becomes essential. The factory's Audit Workflows and Metrics Collector fill this role with 93 audit discussions created.

3. **Fault Investigation with Proposals** – CI Doctor doesn't just open issues; it analyzes logs, proposes fixes, and opens PRs directly (69% merge rate in the factory).

4. **Skip-if-match Guard Rails** – Workflows use `skip-if-match` to prevent flooding when a similar PR/issue already exists, avoiding duplication from repeated scheduling.

5. **Chained Workflows** – Issue Monster feeds issues to Copilot coding agent; the coding agent creates PRs; Sub Issue Closer and Mergefest handle the aftermath. Coordination multiplies individual workflow value.

6. **Cache-Memory for Persistence** – The Issue Duplication Detector uses cache-memory to store issue signatures across runs, avoiding expensive full re-scans each time.

### Patterns from the `githubnext/agentics` Repository

The agentics repo provides reference implementations for:
- **daily-test-improver** – identifies coverage gaps and implements new tests incrementally (directly comparable to this repo's test-coverage-improver)
- **link-checker** – finds and fixes broken links in documentation sites
- **import-workflow** – slash-command workflow for importing reference workflows into new repos
- **daily-workflow-sync** – keeps workflows synchronized with upstream templates

### How This Repo Compares

This repository matches or exceeds the factory's implementation in:
- **Security automation** (red-team scanning, daily reviews, PR guard, dependency monitoring)
- **Multi-engine testing** (smoke tests for Claude, Codex, Copilot, and Chroot)
- **Documentation maintenance** (doc-maintainer, CLI flag checker)

This repository lags in:
- **Workflow observability** (no meta-agent monitoring other agents)
- **Continuous code quality** (no code simplifier or duplicate detector)
- **Release automation** (no changeset/version bump agent)
- **Issue organization** (no issue triage labels, no sub-issue arborist)

---

## 📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
|----------|---------|---------|------------|
| `build-test-{bun,cpp,deno,go,java,node,rust}` | Test AWF works as firewall for 8 ecosystems | PR | ✅ Strong — unique domain-specific coverage |
| `build-test-dotnet` | .NET ecosystem smoke test | PR | ⚠️ **NOT COMPILED** — broken |
| `ci-cd-gaps-assessment` | Analyze CI/CD pipeline gaps | Daily | ✅ Good — feeds continuous improvement |
| `ci-doctor` | Investigate CI failures, propose fixes | `workflow_run` | ⚠️ **NOT COMPILED** — broken; also misses some workflows in list |
| `cli-flag-consistency-checker` | Sync CLI flags vs docs | Weekly | ✅ Good — relevant for CLI tool |
| `dependency-security-monitor` | Dependency vulnerability monitoring | Daily | ✅ Strong — security-appropriate |
| `doc-maintainer` | Sync docs with code changes | Daily | ✅ Good — essential for this complex tool |
| `issue-duplication-detector` | Flag duplicate issues | On issue open | ✅ Good — uses cache-memory pattern correctly |
| `issue-monster` | Dispatch issues to Copilot coding agent | Hourly / on issue open | ⚠️ **NOT COMPILED** — broken |
| `pelis-agent-factory-advisor` | Agentic maturity analysis | Daily | ✅ This report |
| `plan` | `/plan` slash command for issue breakdown | Slash command | ✅ Good — useful ChatOps |
| `secret-digger-{claude,codex,copilot}` | Hourly red-team credential scanning | Hourly (3×) | ✅ Excellent — domain-critical, multi-engine |
| `security-guard` | PR-level security review (Claude) | PR | ✅ Strong — essential for security tool |
| `security-review` | Daily comprehensive threat modeling | Daily | ✅ Strong — deep analysis |
| `smoke-{chroot,claude,codex,copilot}` | End-to-end firewall smoke tests | PR + Schedule | ✅ Strong — multi-engine validation |
| `test-coverage-improver` | Improve test coverage for security-critical paths | Weekly | ✅ Good — security-focused coverage |
| `update-release-notes` | Generate release notes on publish | On release | ✅ Good — automates release ceremony |

---

## 🚀 Actionable Recommendations

---

### P0 — Implement Immediately

#### P0.1 — Fix 3 Uncompiled Workflows

**What**: Three workflows (`ci-doctor`, `issue-monster`, `build-test-dotnet`) show `compiled: No` in the workflow status. They are effectively disabled.

**Why**: CI Doctor is one of the most valuable workflows in the factory with a 69% PR merge rate. With 15+ open failure issues in the tracker, it should be working. Issue Monster is the dispatcher that feeds the entire Copilot coding agent pipeline.

**How**:
1. Run `gh aw compile .github/workflows/ci-doctor.md`
2. Run `gh aw compile .github/workflows/issue-monster.md`
3. Run `gh aw compile .github/workflows/build-test-dotnet.md`
4. Run `npx tsx scripts/ci/postprocess-smoke-workflows.ts` after any smoke-related changes.

**Effort**: Low (configuration/tooling fix, no new code)

---

#### P0.2 — Issue Triage Agent

**What**: Automatically label incoming issues (bug, feature, enhancement, documentation, question, help-wanted) and leave a friendly comment explaining the label.

**Why**: The open issues list shows many issues with no labels: `[agentics] Secret Digger failed`, `firewall process takes 10sec to shutdown`, CI failure issues. Manual triage is a bottleneck. This is the "hello world" of agentic automation per the factory and takes minutes to configure.

**How**: Add a new workflow triggered on `issues: [opened, reopened]`:

````markdown
---
name: Issue Triage Agent
on:
  issues:
    types: [opened, reopened]
  workflow_dispatch:
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, documentation, question, security, ci-failure, help-wanted, good-first-issue]
  add-comment:
    max: 1
timeout-minutes: 5
---

Triage new issues in $\{\{ github.repository }}. For issue #$\{\{ github.event.issue.number }},
analyze the title and body and add exactly one label from the allowed set.
Skip if the issue already has labels. After labeling, comment explaining why.
````

**Effort**: Low — copy from factory template, customize allowed labels

---

### P1 — Plan for Near-Term

#### P1.1 — Audit Workflows (Meta-Agent)

**What**: A daily workflow that scans all agentic workflow runs, tracks costs, identifies failures, and surfaces patterns across the 27-workflow ecosystem.

**Why**: The factory's Audit Workflows created 93 discussion reports and opened 9 issues, 4 of which led to downstream PRs. With 27 workflows this repo has passed the complexity threshold where a meta-observer becomes essential. Currently there's no way to know if the agents collectively are healthy, efficient, or regressing.

**How**:
```markdown
---
name: Audit Workflows
description: Daily meta-agent that audits all workflow runs for errors, costs, and quality patterns
on:
  schedule: daily
  workflow_dispatch:
permissions:
  contents: read
  actions: read
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory:
    key: audit-workflows
safe-outputs:
  create-discussion:
    title-prefix: "[Audit] "
    category: "general"
  create-issue:
    labels: [workflow-health]
    max: 3
timeout-minutes: 20
---
Analyze the last 24h of agentic workflow runs. For each workflow:
- Report success/failure rate
- Flag workflows with 0 successful runs in 48h
- Identify unusually long runs (potential hangs)
- Track turn counts and estimate costs
- Create issues for workflows with consistent failures
```

**Effort**: Medium — needs the `agentic-workflows` MCP tool (already available) and cache-memory for trend tracking

---

#### P1.2 — Breaking Change Checker

**What**: A workflow that runs on PRs and/or daily to detect backward-incompatible changes to the CLI interface, API contracts, or configuration format.

**Why**: This is a distributed CLI tool installed in CI pipelines across organizations. Breaking changes to `--allow-domains` behavior, exit codes, or Docker Compose configuration can silently break hundreds of pipelines. The factory's Breaking Change Checker created alert issues like flagging CLI version incompatibilities. Especially critical before releases.

**How**:
```markdown
---
name: Breaking Change Checker
on:
  pull_request:
    paths: ["src/cli.ts", "src/types.ts", "src/docker-manager.ts", "src/squid-config.ts"]
    types: [opened, synchronize]
  workflow_dispatch:
permissions:
  contents: read
  pull-requests: read
tools:
  github:
    toolsets: [default]
  bash: ["git log:*", "cat:*", "grep:*"]
safe-outputs:
  add-comment:
    max: 1
  create-issue:
    labels: [breaking-change]
    max: 1
timeout-minutes: 10
---
Analyze PR #$\{\{ github.event.pull_request.number }} for breaking changes to:
- CLI flags (removals, renames, behavior changes in src/cli.ts)
- Exit code semantics
- Docker Compose configuration format
- Domain whitelist pattern matching behavior
- Environment variable names/semantics
Alert with a comment if breaking changes are detected. Create an issue if critical.
```

**Effort**: Medium — needs good domain knowledge baked into the prompt

---

#### P1.3 — Workflow Health Manager (ci-doctor Enhancement)

**What**: Add the missing workflows to ci-doctor's watch list and ensure ci-doctor is triggered for all new workflows added. Consider creating a dedicated Workflow Health Manager that monitors the health and activity of all agentic workflows.

**Why**: Currently ci-doctor watches ~26 workflows but `build-test-dotnet` (currently in the list) and other recently added workflows may be missing. Each time a new workflow is added, someone must manually update the ci-doctor watch list. A Workflow Health Manager automatically discovers all workflows.

**How**: 
1. Immediately: Verify ci-doctor's workflow list against actual workflows (add any missing)
2. Near-term: Create a separate `workflow-health-manager.md` that uses `agenticworkflows-status` to detect unhealthy workflows without needing an explicit list

**Effort**: Low-Medium

---

### P2 — Consider for Roadmap

#### P2.1 — Code Simplifier

**What**: A daily agent that analyzes recent commits for complexity and creates PRs proposing simplifications (early returns, extracted helpers, shorter expressions).

**Why**: The factory's Code Simplifier achieved 83% PR merge rate. TypeScript/Node.js is directly supported. `src/docker-manager.ts` (1500+ lines) and `containers/agent/entrypoint.sh` are prime candidates for simplification.

**How**: `gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/code-simplifier.md` then customize for TypeScript.

**Effort**: Low (add wizard, customize)

---

#### P2.2 — Changeset Generator (Release Automation)

**What**: A workflow that analyzes commits since the last release, proposes a version bump (major/minor/patch based on conventional commits), and opens a PR with updated CHANGELOG.

**Why**: The factory's Changeset workflow achieved 78% merge rate (22/28 PRs merged). This repo already uses conventional commits via commitlint, making version bump analysis straightforward.

**How**: `gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/changeset.md` — customize to trigger after a set of PRs merge to main.

**Effort**: Low-Medium

---

#### P2.3 — Documentation Link Checker

**What**: Weekly scan of `docs-site/` for broken internal and external links, creating issues or PRs to fix them.

**Why**: The docs site at `docs-site/src/content/docs/` contains ~20+ markdown pages with many external links. The agentics repo has a `link-checker.md` reference implementation. Broken links hurt user trust for a security-sensitive tool.

**How**: `gh aw add-wizard githubnext/agentics/link-checker` then configure for the docs-site directory.

**Effort**: Low (add wizard, configure paths)

---

#### P2.4 — PR Auto-Review Requester

**What**: When a PR is opened that touches security-critical files, automatically request review from the appropriate team/person.

**Why**: Security Guard and the smoke tests provide automated validation, but there's no automatic human review request for high-risk changes. Pairing automated security guard with human review assignment would close that gap.

**Effort**: Low — simple `pull_request` workflow with `safe-outputs: request-review`

---

### P3 — Future Ideas

#### P3.1 — Container Base Image Security Monitor

**What**: Weekly check for new CVEs in the base images used (`ubuntu/squid:latest`, `ubuntu:22.04`). Create issues when critical CVEs are published against pinned base images.

**Why**: This is a firewall tool where container security is critical. Docker Hub and security advisories publish CVE data for base images.

**Effort**: Medium — needs container registry API access or GHSA database queries

---

#### P3.2 — Performance Benchmark Tracker

**What**: Monthly or per-release tracking of AWF startup time and container launch latency as a discussion report.

**Why**: Users have opened issue #1103 about shutdown taking 10sec. A performance tracker would catch regressions before they reach users.

**Effort**: Medium — needs integration test infra to measure timing

---

#### P3.3 — Duplicate Code Detector

**What**: Daily semantic analysis of recently modified TypeScript files for duplicate patterns that could be extracted as shared utilities.

**Why**: `src/docker-manager.ts` (1500+ lines) likely has extraction opportunities. The factory's version achieved 79% merge rate.

**Effort**: High — requires Serena integration or language server setup

---

## 📈 Maturity Assessment

| Dimension | Current | Notes |
|-----------|---------|-------|
| **Security Automation** | 5/5 | Red-team scanning, PR guard, daily reviews, dependency monitoring — best-in-class |
| **Testing Automation** | 4/5 | Multi-engine smoke tests, coverage improver; missing: perf benchmarks |
| **Documentation Automation** | 4/5 | Daily doc sync, CLI flag checker; missing: link checker |
| **Code Quality Automation** | 3/5 | Missing: code simplifier, duplicate detector |
| **Issue/PR Management** | 3/5 | Have: monster, duplication; missing: triage, arborist, mergefest |
| **Workflow Observability** | 2/5 | No meta-agent monitoring the 27 workflows |
| **Release Automation** | 3/5 | Have: release notes; missing: changeset/version bump |
| **Overall** | **4/5** | Advanced — top 5% of repositories for agentic workflow adoption |

**Current Level**: 4 — Advanced adopter with comprehensive automation across most dimensions.

**Target Level**: 4.5 — Close remaining gaps in observability and code quality; fix broken compilations.

**Gap Analysis**:
1. **Quick wins** (< 1 day each): Fix 3 uncompiled workflows; add issue triage agent
2. **Medium effort** (1-3 days each): Audit Workflows meta-agent; Breaking Change Checker
3. **Longer term**: Code Simplifier; Changeset Generator; Performance Benchmarks

---

## 🔄 Comparison with Best Practices

### What This Repository Does Exceptionally Well

- **Domain-specific smoke testing**: Running 4 different AI engines (Claude, Codex, Copilot, Chroot) as smoke tests is beyond anything in the standard factory — genuinely innovative
- **Red-team security**: 3 concurrent hourly secret-digging agents across different engines is a unique security posture appropriate for a tool that handles AI agent credentials
- **Security-domain specialization**: Combining automated Security Guard (PR-level) + daily security review + dependency monitoring is a model other security-focused repos should emulate
- **Import/shared patterns**: The `shared/` directory with imported fragments (mcp-pagination, version-reporting, secret-audit) demonstrates good workflow modularity

### What Could Improve

- **Workflow observability gap**: Running 27 workflows without an Audit Workflows meta-agent means failures can go unnoticed unless someone checks manually. The factory considers this essential at this scale.
- **Code quality automation absent**: No code simplifier or duplicate detector — somewhat surprising given the size of `docker-manager.ts` and other complex files
- **Compilation hygiene**: 3 workflows in `compiled: No` state suggests the compile step is sometimes forgotten after edits. Consider adding a CI check that fails if any `.md` workflow lacks a current `.lock.yml`

### Unique Opportunities Given the Security Domain

1. **AWF could validate itself**: A workflow that runs AWF against itself (meta-firewall test) would be a powerful integration test
2. **Domain allowlist regression testing**: An agent that monitors whether recent PRs accidentally weakened domain filtering semantics
3. **Container hardening monitor**: Track drift in container security configuration (capabilities, seccomp, network mode) across releases

---

## 📝 Notes for Future Runs

*Tracked in cache-memory at `/tmp/gh-aw/cache-memory/advisor-notes.md`*

**Items to track over time**:
- [ ] Were the 3 uncompiled workflows fixed? (ci-doctor, issue-monster, build-test-dotnet)
- [ ] Was Issue Triage Agent added?
- [ ] Was Audit Workflows meta-agent added?
- [ ] Were Breaking Change Checker or CI Coach added?
- [ ] Did issue backlog size decrease (currently 15+ open issues)?

**Observed since last report** (Feb 2026 → Mar 2026):
- `test-coverage-improver` was added (new since last cycle)
- `security-review` was added (new since last cycle)
- Open issues grew: multiple `[aw]` build failures, smoke test failures visible in tracker
- PRs #1066, #1064 suggest active API proxy feature development

---

> **Note:** This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
>
> **Tip:** Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.




> Generated by [Pelis Agent Factory Advisor](https://github.com/github/gh-aw-firewall/actions/runs/22535032238)
> - [x] expires  on Mar 8, 2026, 3:34 AM UTC

Workflow	Purpose	Trigger	Assessment
`build-test-{bun,cpp,deno,go,java,node,rust}`	Test AWF works as firewall for 8 ecosystems	PR	✅ Strong — unique domain-specific coverage
`build-test-dotnet`	.NET ecosystem smoke test	PR	⚠️ NOT COMPILED — broken
`ci-cd-gaps-assessment`	Analyze CI/CD pipeline gaps	Daily	✅ Good — feeds continuous improvement
`ci-doctor`	Investigate CI failures, propose fixes	`workflow_run`	⚠️ NOT COMPILED — broken; also misses some workflows in list
`cli-flag-consistency-checker`	Sync CLI flags vs docs	Weekly	✅ Good — relevant for CLI tool
`dependency-security-monitor`	Dependency vulnerability monitoring	Daily	✅ Strong — security-appropriate
`doc-maintainer`	Sync docs with code changes	Daily	✅ Good — essential for this complex tool
`issue-duplication-detector`	Flag duplicate issues	On issue open	✅ Good — uses cache-memory pattern correctly
`issue-monster`	Dispatch issues to Copilot coding agent	Hourly / on issue open	⚠️ NOT COMPILED — broken
`pelis-agent-factory-advisor`	Agentic maturity analysis	Daily	✅ This report
`plan`	`/plan` slash command for issue breakdown	Slash command	✅ Good — useful ChatOps
`secret-digger-{claude,codex,copilot}`	Hourly red-team credential scanning	Hourly (3×)	✅ Excellent — domain-critical, multi-engine
`security-guard`	PR-level security review (Claude)	PR	✅ Strong — essential for security tool
`security-review`	Daily comprehensive threat modeling	Daily	✅ Strong — deep analysis
`smoke-{chroot,claude,codex,copilot}`	End-to-end firewall smoke tests	PR + Schedule	✅ Strong — multi-engine validation
`test-coverage-improver`	Improve test coverage for security-critical paths	Weekly	✅ Good — security-focused coverage
`update-release-notes`	Generate release notes on publish	On release	✅ Good — automates release ceremony

Dimension	Current	Notes
Security Automation	5/5	Red-team scanning, PR guard, daily reviews, dependency monitoring — best-in-class
Testing Automation	4/5	Multi-engine smoke tests, coverage improver; missing: perf benchmarks
Documentation Automation	4/5	Daily doc sync, CLI flag checker; missing: link checker
Code Quality Automation	3/5	Missing: code simplifier, duplicate detector
Issue/PR Management	3/5	Have: monster, duplication; missing: triage, arborist, mergefest
Workflow Observability	2/5	No meta-agent monitoring the 27 workflows
Release Automation	3/5	Have: release notes; missing: changeset/version bump
Overall	4/5	Advanced — top 5% of repositories for agentic workflow adoption

[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report — Mar 2026 #1111

Description

📊 Executive Summary

🎓 Patterns Learned from Pelis Agent Factory

Key Patterns from the Documentation Site

Patterns from the githubnext/agentics Repository

How This Repo Compares

📋 Current Agentic Workflow Inventory

🚀 Actionable Recommendations

P0 — Implement Immediately

P0.1 — Fix 3 Uncompiled Workflows

P0.2 — Issue Triage Agent

P1 — Plan for Near-Term

P1.1 — Audit Workflows (Meta-Agent)

P1.2 — Breaking Change Checker

P1.3 — Workflow Health Manager (ci-doctor Enhancement)

P2 — Consider for Roadmap

P2.1 — Code Simplifier

P2.2 — Changeset Generator (Release Automation)

P2.3 — Documentation Link Checker

P2.4 — PR Auto-Review Requester

P3 — Future Ideas

P3.1 — Container Base Image Security Monitor

P3.2 — Performance Benchmark Tracker

P3.3 — Duplicate Code Detector

📈 Maturity Assessment

🔄 Comparison with Best Practices

What This Repository Does Exceptionally Well

What Could Improve

Unique Opportunities Given the Security Domain

📝 Notes for Future Runs

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Patterns from the `githubnext/agentics` Repository