📊 Current CI/CD Pipeline Status
The repository has a well-structured, multi-layered CI/CD pipeline with 44 workflow files (both traditional YAML and agentic Markdown workflows). The pipeline covers build verification, linting, type checking, unit/integration testing, security scanning, and agentic smoke tests. However, several recent runs are failing and key coverage gaps exist.
Recent run health (2026-02-24):
| Workflow | Status |
|---|---|
| PR Title Check | ✅ Success |
| Dependency Vulnerability Audit | ✅ Success |
| Secret Digger (Copilot) | ✅ Success |
| Issue Monster | ✅ Success |
| Smoke Chroot | ❌ Failure |
| Examples Test | ❌ Failure |
| Build Test .NET | ❌ Failure |
| Build Test Rust | ❌ Failure |
| Build Test Deno | ❌ Failure |
| Chroot Integration Tests | ❌ Failure |
The spike in integration test failures is a signal that flakiness or environment drift may be affecting reliability.
✅ Existing Quality Gates
Static Analysis & Build
- ESLint (`lint.yml`): TypeScript source linting, runs on all PRs
- TypeScript type check (`test-integration.yml`): `tsc --noEmit`, runs on all PRs
- Build Verification (`build.yml`): Lint + build across Node 20 & 22, verifies `dist/cli.js` exists
- PR Title Check (`pr-title.yml`): Semantic commit format, allowed scopes enforced
Testing
- Unit test coverage (`test-coverage.yml`): Runs Jest with coverage, posts diff-comparison comment on PRs, fails on regression
- Examples Test (`test-examples.yml`): Runs all `examples/*.sh` scripts end-to-end
- Test Setup Action (`test-action.yml`): Tests `action.yml` install across latest/specific/invalid versions
- Chroot Integration Tests (`test-chroot.yml`): Multi-job (languages, package managers, procfs, edge cases)
- Agentic Build Tests (`build-test-*.md`): Live ecosystem tests (Node, Go, Rust, Java, .NET, Deno, Bun, C++) via GitHub Copilot agent
Security
- CodeQL (`codeql.yml`): `javascript-typescript` + `actions` analysis, security-extended queries, runs on PRs and weekly
- Container Security Scan (`container-scan.yml`): Trivy scanning agent + squid images for CRITICAL/HIGH CVEs
- Dependency Vulnerability Audit (`dependency-audit.yml`): `npm audit --audit-level=high` for main + docs-site packages
- Security Guard (`security-guard.md`): Claude-based agentic review of PR security impact
Operational
- CI Doctor (`ci-doctor.md`): Post-run analysis of failed workflow runs
- Dependency Security Monitor (`dependency-security-monitor.md`): Daily automated monitoring
- Secret Digger (3 engines): Hourly agentic scans for leaked secrets
🔍 Identified Gaps
🔴 High Priority
1. Critically Low Unit Test Coverage with Low Thresholds
File-by-file breakdown reveals alarming gaps:
| File | Statement Coverage | Notes |
|---|---|---|
| `cli.ts` | 0% | Entry point, signal handlers, error paths |
| `docker-manager.ts` | 18% | Most complex file, 250 statements |
| `host-iptables.ts` | 83% | Good but some branches uncovered |
The coverage thresholds in `jest.config.js` are set far below acceptable production standards:
```js
coverageThreshold: { global: { branches: 30, functions: 35, lines: 38, statements: 38 } }
```
With the primary entry point at 0% and the core Docker orchestration layer at 18% statement coverage, PRs can introduce regressions in these paths undetected.
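To illustrate why a global-only threshold lets individual files sit at 0%, here is a minimal sketch of the comparison a threshold gate performs. It is not Jest's implementation: `checkThresholds` and `hypotheticalTotals` are made-up names, and only the ~38% statement figure comes from this assessment.

```typescript
// Coverage categories that Jest reports and enforces.
type Metric = "branches" | "functions" | "lines" | "statements";
type Percentages = Record<Metric, number>;

// Global thresholds as quoted from jest.config.js above.
const thresholds: Percentages = { branches: 30, functions: 35, lines: 38, statements: 38 };

// Returns one message per metric that falls below its minimum.
// An empty list means the gate passes.
function checkThresholds(actual: Percentages, minimum: Percentages): string[] {
  const metrics: Metric[] = ["branches", "functions", "lines", "statements"];
  return metrics
    .filter((m) => actual[m] < minimum[m])
    .map((m) => `${m}: ${actual[m]}% is below the ${minimum[m]}% threshold`);
}

// Hypothetical aggregate totals (assumption): only the statement figure
// (~38%) is taken from the metrics in this assessment.
const hypotheticalTotals: Percentages = { branches: 31, functions: 36, lines: 38, statements: 38 };
```

Because the comparison only sees aggregate totals, a file at 0% does not fail the gate as long as better-covered files keep the totals at or above the minimum.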
2. Container Scan Doesn't Run on All PRs
container-scan.yml only triggers on changes to containers/** or the workflow file itself. A PR that bumps a FROM base image reference indirectly (e.g., via ubuntu/squid:latest) or changes security-sensitive container configuration won't trigger a scan if the path filter is not matched.
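The gap can be sketched as a simplified model of path-filter evaluation. This is an illustration only: `matchesPattern` and `workflowTriggers` are hypothetical helpers, the matcher handles just exact paths and `dir/**` prefixes (GitHub's real filter syntax is richer), and the workflow file location under `.github/workflows/` is an assumption.

```typescript
// Simplified matcher: supports only exact paths and "dir/**" prefixes.
function matchesPattern(file: string, pattern: string): boolean {
  if (pattern.endsWith("/**")) {
    // "containers/**" -> prefix "containers/"
    return file.startsWith(pattern.slice(0, -2));
  }
  return file === pattern;
}

// A path-filtered workflow runs when at least one changed file
// matches at least one pattern.
function workflowTriggers(changedFiles: string[], patterns: string[]): boolean {
  return changedFiles.some((f) => patterns.some((p) => matchesPattern(f, p)));
}

// Assumed filter for container-scan.yml, based on the description above.
const scanPaths = ["containers/**", ".github/workflows/container-scan.yml"];

// A PR touching only TypeScript sources never reaches the scanner.
const srcOnlyPr = workflowTriggers(["src/docker-manager.ts"], scanPaths);
// A PR touching a container script does.
const containerPr = workflowTriggers(["containers/agent/entrypoint.sh"], scanPaths);
```

Under this model `srcOnlyPr` is `false` and `containerPr` is `true`, which is the behavior described above: changes outside the filtered paths are merged without a scan.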
3. No Shell Script Linting (ShellCheck)
The repository contains numerous critical shell scripts in containers/agent/ (setup-iptables.sh, entrypoint.sh) and scripts/ci/. These scripts implement security-critical iptables rules and cleanup logic. There is no ShellCheck linting in CI to catch issues like unquoted variables, command injection risks, or portability problems.
4. Integration Test Reliability — Multiple Active Failures
Six integration-level workflows are currently failing. Without reliable green integration tests, PRs cannot confidently use these as quality gates. The failing workflows represent the most critical end-to-end verification of the firewall's core functionality.
5. No Required Status Checks Documented/Enforced
There is no documented set of required passing checks that block PR merge. The PR title check runs, but it's not clear which of the many workflows are enforced as merge blockers in branch protection rules.
🟡 Medium Priority
6. No Coverage Trend Tracking (Codecov/Coveralls)
The test-coverage.yml generates LCOV reports and uploads them as artifacts, but does not integrate with a coverage trend service (Codecov, Coveralls, etc.). This means:
- No coverage badge in README
- No historical trend visibility
- No per-file coverage diff in PR checks as a dedicated check (only bot comment)
The COVERAGE_SUMMARY.md in the repo is a static snapshot — not a living metric.
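For context on what a trend service would consume, the following sketch derives per-file line coverage from LCOV text using the standard `SF:`, `LF:`, `LH:` and `end_of_record` records. `parseLcov` is a hypothetical helper and the sample file names and counts are invented for illustration. Note that LCOV tracks lines, not the statement metric quoted elsewhere in this assessment.

```typescript
interface FileCoverage {
  file: string;
  linesFound: number;
  linesHit: number;
  percent: number; // line coverage, rounded to one decimal place
}

// Parses the subset of LCOV needed for per-file line coverage.
function parseLcov(lcov: string): FileCoverage[] {
  const results: FileCoverage[] = [];
  let file = "";
  let found = 0;
  let hit = 0;
  for (const raw of lcov.split("\n")) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      // Start of a new source file record.
      file = line.slice(3);
      found = 0;
      hit = 0;
    } else if (line.startsWith("LF:")) {
      found = Number(line.slice(3)); // lines found (instrumented)
    } else if (line.startsWith("LH:")) {
      hit = Number(line.slice(3)); // lines hit (executed)
    } else if (line === "end_of_record") {
      const percent = found === 0 ? 100 : Math.round((hit / found) * 1000) / 10;
      results.push({ file, linesFound: found, linesHit: hit, percent });
    }
  }
  return results;
}

// Invented sample data, not taken from the repository's lcov.info.
const sampleLcov = [
  "SF:src/a.ts", "LF:10", "LH:4", "end_of_record",
  "SF:src/b.ts", "LF:8", "LH:8", "end_of_record",
].join("\n");
const sampleCoverage = parseLcov(sampleLcov);
```

Storing this output per commit is roughly what gives a hosted service its trend line; the static `COVERAGE_SUMMARY.md` captures only one such point.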
7. Smoke Tests Are Reaction-Gated, Not Auto-Run on All PRs
The agentic smoke tests (smoke-claude.md, smoke-copilot.md, smoke-codex.md, smoke-gemini.md) require emoji reactions to trigger on PRs (e.g., 👍, ❤️, 🎉). While they do run on a 12h schedule, changes in a PR aren't automatically smoke-tested against all engines unless a maintainer adds the reaction. This creates a window where breaking changes can merge without live firewall validation.
The exception is smoke-chroot.md which has path filters on src/** and containers/**, but it's currently failing.
8. No ARM64 Runner Tests Despite ARM64 Documentation
The docs/compatibility.md references ARM64 support, but no workflow runs tests on ubuntu-24.04-arm or equivalent. All CI runs use ubuntu-latest (x86-64). Binary builds and container tests should be validated on ARM64.
9. No Artifact Size Monitoring
There's no check on the size of build artifacts (dist/, binary files via pkg). Over time, dependency additions can silently inflate the CLI binary size. A size threshold check would catch this early.
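A size gate could be as small as the sketch below. The function name, the budget values, and the binary file name are hypothetical; a real check would read sizes from disk in CI, which is left out here so the logic stays self-contained.

```typescript
interface SizeResult {
  artifact: string;
  bytes: number | null; // null when the artifact was not produced
  budgetBytes: number;
  ok: boolean;
}

// Compares measured artifact sizes against per-artifact budgets.
// A missing artifact counts as a failure so it cannot pass silently.
function checkArtifactSizes(
  sizes: Record<string, number>,
  budgets: Record<string, number>
): SizeResult[] {
  return Object.entries(budgets).map(([artifact, budgetBytes]) => {
    const measured = sizes[artifact];
    const bytes = measured === undefined ? null : measured;
    return { artifact, bytes, budgetBytes, ok: bytes !== null && bytes <= budgetBytes };
  });
}

// Hypothetical budgets and measurements (assumptions, not repository data).
const sizeBudgets = { "dist/cli.js": 500000, "release/cli-binary": 90000000 };
const measuredSizes = { "dist/cli.js": 600000 };
const sizeResults = checkArtifactSizes(measuredSizes, sizeBudgets);
```

Here `dist/cli.js` exceeds its budget and the binary is missing, so both entries come back with `ok: false` and a CI step could fail on either.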
10. Missing Mutation Testing
The unit test suite meets its coverage threshold by executing code paths, but the tests may not verify the correctness of every branch. Mutation testing (e.g., Stryker.js) would reveal whether tests are genuinely detecting bugs or merely providing coverage numbers.
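The difference between covering code and verifying it can be shown with a hand-written mutant. Everything below is hypothetical and not taken from the repository: `isPrivilegedPort` stands in for any boundary check, and the mutant mimics the kind of relational-operator change a mutation tool generates.

```typescript
// Function under test (hypothetical example).
function isPrivilegedPort(port: number): boolean {
  return port < 1024;
}

// Mutant: "<" replaced with "<=".
function isPrivilegedPortMutant(port: number): boolean {
  return port <= 1024;
}

type PortCheck = (port: number) => boolean;

// Weak test: executes every line of the function (full coverage)
// but only checks a value far from the boundary.
function weakTest(impl: PortCheck): boolean {
  return impl(80) === true;
}

// Strong test: also pins down both sides of the boundary.
function strongTest(impl: PortCheck): boolean {
  return impl(80) === true && impl(1023) === true && impl(1024) === false;
}

const mutantSurvivesWeakTest = weakTest(isPrivilegedPortMutant);   // mutant goes unnoticed
const mutantKilledByStrongTest = !strongTest(isPrivilegedPortMutant); // mutant is detected
```

Both tests report identical coverage for `isPrivilegedPort`, yet only the strong one fails when the operator changes. A surviving mutant is the signal that mutation testing adds on top of coverage numbers.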
🟢 Low Priority
11. No Broken Link Checking for Documentation
docs-site/ is deployed to GitHub Pages but there's no link validation in CI. Broken internal/external links in documentation degrade developer experience silently.
12. No Changelog/Commit Validation Beyond Title
While PR titles are validated for Conventional Commits format, there's no validation that CHANGELOG or release notes are updated for feature PRs, and no conventional-changelog generation in the release workflow.
13. No Code Complexity Enforcement
No cyclomatic complexity or cognitive complexity limits are enforced. docker-manager.ts (250 statements, 81 branches) is already a complexity hotspot with 18% coverage.
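Since the project already lints with ESLint, one low-effort option is the core `complexity`, `max-depth` and `max-lines-per-function` rules. The sketch below shows only a rule map; the limits are illustrative assumptions, and how the map is wired into the project's existing ESLint configuration is not shown. Cognitive complexity is not a core ESLint rule and would need a plugin.

```typescript
// Rule settings for ESLint core rules. The numeric limits are
// assumptions chosen for illustration, not recommendations from
// the repository. "warn" avoids blocking PRs until hotspots such
// as docker-manager.ts have been refactored.
const complexityRules = {
  // Cyclomatic complexity per function.
  complexity: ["warn", { max: 15 }],
  // Maximum nesting depth of blocks.
  "max-depth": ["warn", { max: 4 }],
  // Function length, ignoring blank lines and comments.
  "max-lines-per-function": ["warn", { max: 120, skipBlankLines: true, skipComments: true }],
} as const;
```

Starting at `warn` and tightening to `error` later mirrors the incremental approach suggested for coverage thresholds.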
14. pelis-agent-factory-advisor Workflow Not Compiled
The agenticworkflows-status output shows pelis-agent-factory-advisor has compiled: No. This workflow will not execute as intended since the lock file is missing or stale.
📋 Actionable Recommendations
1. Raise Coverage Thresholds Incrementally
Issue: Thresholds are too low (30-38%), so CI accepts near-zero coverage for critical files.
Solution: Increase thresholds by 5 percentage points per quarter and add per-file thresholds for `cli.ts` and `docker-manager.ts`:
```js
coverageThreshold: {
  global: { branches: 45, functions: 50, lines: 50, statements: 50 },
  './src/cli.ts': { lines: 20 },
  './src/docker-manager.ts': { lines: 30 }
}
```
Complexity: Low | Impact: High
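The global targets in the snippet sit 12 to 15 points above the current thresholds, so a 5-point quarterly step reaches them over several quarters. A minimal sketch of such a ratchet schedule follows; `ratchetSchedule` is a hypothetical helper, and the step size is the one proposed above.

```typescript
// Returns the threshold to set at each step, ending exactly at the target.
function ratchetSchedule(current: number, target: number, step = 5): number[] {
  if (step <= 0) throw new Error("step must be positive");
  const schedule: number[] = [];
  for (let next = current + step; next < target; next += step) {
    schedule.push(next);
  }
  if (target > current) schedule.push(target);
  return schedule;
}

// Current and target values taken from the two coverageThreshold snippets.
const statementSchedule = ratchetSchedule(38, 50); // [43, 48, 50]
const branchSchedule = ratchetSchedule(30, 45);    // [35, 40, 45]
```

One caveat when adding the per-file entries: Jest applies path thresholds independently and subtracts coverage data for matching paths from the global figure. The global number may therefore shift once `cli.ts` and `docker-manager.ts` get their own thresholds.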
2. Add ShellCheck to CI
Issue: No linting for security-critical shell scripts.
Solution: Add a job to lint.yml:
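```yaml
# scandir takes a single directory and additional_files matches file
# names rather than paths, so each directory gets its own step.
- name: Run ShellCheck on container scripts
  uses: ludeeus/action-shellcheck@master
  with:
    scandir: './containers'
- name: Run ShellCheck on CI scripts
  uses: ludeeus/action-shellcheck@master
  with:
    scandir: './scripts/ci'
```
Complexity: Low | Impact: High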
3. Run Container Scan on All PRs (Remove Path Filter)
Issue: Security scan only runs when containers/** changes.
Solution: Remove the paths: filter from container-scan.yml PR trigger, or add a separate lightweight scan job that runs on every PR.
Complexity: Low | Impact: High
4. Compile the pelis-agent-factory-advisor Workflow
Issue: Workflow shows compiled: No and won't execute.
Solution: Run gh aw compile .github/workflows/pelis-agent-factory-advisor.md and commit the generated lock file.
Complexity: Low | Impact: Medium
5. Integrate Coverage with Codecov
Issue: No trend tracking or PR coverage diff as a dedicated check.
Solution: Add to test-coverage.yml:
```yaml
- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v5
  with:
    files: ./coverage/lcov.info
    fail_ci_if_error: false
```
Complexity: Low | Impact: Medium
6. Document and Enforce Required Status Checks
Issue: No clear documentation of which checks must pass before merge.
Solution: Update branch protection rules for main to require: Build Verification, ESLint, TypeScript Type Check, Test Coverage Report, PR Title Check, and CodeQL. Document these in CONTRIBUTING.md.
Complexity: Low | Impact: High
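The intended merge gate can be modeled as below. This is a sketch of the policy, not of GitHub's branch protection API: `blockingChecks` is a hypothetical helper, and the check names are the ones proposed above.

```typescript
type CheckState = "success" | "failure" | "pending";

// Required checks as proposed in this recommendation.
const requiredChecks = [
  "Build Verification",
  "ESLint",
  "TypeScript Type Check",
  "Test Coverage Report",
  "PR Title Check",
  "CodeQL",
];

// Returns the required checks that currently block a merge.
// A check with no reported result is treated as pending, which also blocks.
function blockingChecks(required: string[], results: Record<string, CheckState>): string[] {
  return required.filter((name) => {
    const state: CheckState | undefined = results[name];
    return state !== "success";
  });
}

// Hypothetical PR state: ESLint failed and CodeQL has not reported yet.
const exampleBlockers = blockingChecks(requiredChecks, {
  "Build Verification": "success",
  "ESLint": "failure",
  "TypeScript Type Check": "success",
  "Test Coverage Report": "success",
  "PR Title Check": "success",
});
```

Treating a missing result as blocking matters when choosing the list. If a required workflow is skipped by a path filter, its check stays pending and the PR cannot merge, so required checks should come from workflows that run on every PR.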
7. Add ARM64 Runner Job
Issue: No ARM64 CI validation despite documented support.
Solution: Add an ARM64 matrix entry to build.yml:
```yaml
matrix:
  include:
    - os: ubuntu-latest
      arch: x86-64
    - os: ubuntu-24.04-arm
      arch: arm64
```
Complexity: Medium | Impact: Medium
8. Auto-trigger Smoke Tests on src/** Changes
Issue: Smoke tests require manual reactions for PR-level validation.
Solution: Add path filters to smoke workflow triggers so that PRs touching src/** or containers/** automatically run at least one smoke test without requiring a reaction. Keep reactions as an additional trigger.
Complexity: Low | Impact: Medium
9. Fix Active Integration Test Failures
Issue: 6 workflows currently failing, undermining test reliability.
Solution: Investigate and fix the failing Smoke Chroot, Examples Test, Chroot Integration Tests, and Build Test (Rust, Deno, .NET) failures before adding new quality gates.
Complexity: Medium | Impact: High (prerequisite for reliable gates)
10. Add Documentation Link Checker
Issue: No broken link validation for docs-site/.
Solution: Add a workflow step using lychee or markdown-link-check to validate links in docs as part of the deploy-docs.yml build step.
Complexity: Low | Impact: Low
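To show what such a check verifies, here is a deliberately simplified sketch that extracts inline Markdown links and reports internal targets missing from a known set of paths. It is not how lychee or markdown-link-check work internally; the helper names and sample paths are hypothetical, and external URLs are skipped rather than fetched.

```typescript
// Extracts the target of every inline Markdown link: [text](target)
function extractLinks(markdown: string): string[] {
  const pattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  const links: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const target = match[1];
    if (target !== undefined) links.push(target);
  }
  return links;
}

// Reports internal link targets that are not in the set of known paths.
// Links with a URL scheme (https:, mailto:, ...) and pure "#anchor"
// links are ignored; fragments are stripped before the lookup.
function findBrokenInternalLinks(markdown: string, knownPaths: Set<string>): string[] {
  return extractLinks(markdown)
    .filter((link) => !/^[a-z][a-z0-9+.-]*:/i.test(link) && !link.startsWith("#"))
    .map((link) => link.replace(/#.*$/, ""))
    .filter((path) => !knownPaths.has(path));
}

// Invented sample content and paths for illustration.
const sampleDoc = "See [setup](docs/setup.md#install), [old](docs/missing.md) and [site](https://example.com).";
const brokenLinks = findBrokenInternalLinks(sampleDoc, new Set(["docs/setup.md"]));
```

A dedicated tool additionally resolves relative paths, validates anchors, and checks external URLs, which is why the recommendation points to an existing link checker rather than custom code.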
📈 Metrics Summary
| Metric | Value |
|---|---|
| Total workflow files | 44 (29 agentic .md + 15 YAML) |
| PR-triggered workflows | ~15 |
| Scheduled workflows | ~12 |
| Current overall statement coverage | ~38% |
| Coverage threshold (statements) | 38% |
| `cli.ts` coverage | 0% |
| `docker-manager.ts` coverage | 18% |
| Unit tests passing | 135/135 |
| Recent integration test failures | 6 |
| Security workflows active | 6 (CodeQL, Trivy, npm audit, security-guard, secret-diggers, dependency-monitor) |
Assessment generated on 2026-02-24. Coverage data from `COVERAGE_SUMMARY.md`. Workflow run data from GitHub Actions API.
Note: This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
Generated by CI/CD Pipelines and Integration Tests Gap Assessment (expires on Mar 3, 2026, 10:22 PM UTC)