[copilot-cli-research] Copilot CLI Deep Research - April 12, 2026 #25936

2026-04-12T21:11:52Z

github-actions[bot]
bot Apr 12, 2026

Executive Summary

Analysis Date: 2026-04-12 | Repository: github/gh-aw | Triggered by: @pelikhan

This is the third consecutive daily analysis comparing Copilot CLI capabilities against actual usage across 186 workflows (89 explicit engine: copilot, 26 defaulting to Copilot, 44 Claude, 8 Codex). The analysis reveals a pattern of feature under-adoption: while the Copilot engine exposes rich configuration options (env vars, custom agents, autopilot, sandbox tuning), the vast majority of workflows use only the basic engine: copilot string with default settings.

Key Finding: Zero workflows use engine.env, engine.args, or engine.api-target in their frontmatter — powerful features documented and supported, but completely ignored in practice.

Most Positive Trend: playwright adoption grew from 13% → 17% and web-fetch from 19% → 22%, showing growing use of browsing capabilities.

🔴 Critical Issues (Immediate Attention)

Security gap: 73% of web-browsing workflows have no network firewall
Out of 20 workflows using tools.web-fetch, only ~5 pair it with sandbox.agent: awf. Fetching external URLs without a network firewall exposes the agent to SSRF risks and uncontrolled network access.

8/10 custom agent files are completely unused
The .github/agents/ directory has 10 specialized agent files, but only technical-doc-writer and ci-cleaner are referenced from production workflows. grumpy-reviewer, adr-writer, contribution-checker, w3c-specification-writer, and others sit unused.

🟡 Medium Priority Opportunities

max-continuations barely used (2/89 = 2%)
Copilot CLI's autopilot mode — where the agent runs multiple continuation passes — is Copilot-exclusive. Only smoke-copilot.md and test-quality-sentinel.md use it. Long-running analysis workflows (deep-report.md, daily-architecture-diagram.md, delight.md) would benefit significantly from max-continuations: 5+.

bare: true underused (2/89 = 2%)
Suppresses AGENTS.md loading via --no-custom-instructions. Useful for focused, context-clean workflows like daily-fact.md, poem-bot.md, and release.md where the repo's general instructions are irrelevant noise.

Feature Usage Matrix

Feature	Available	Used	Not Used	Adoption
`engine.model`	✅	6	83	7%
`engine.agent` (custom file)	✅	3	86	3%
`sandbox.agent: awf` (sandbox)	✅	14	75	16%
`engine.bare`	✅	2	87	2%
`engine.max-continuations`	✅	2	87	2%
`engine.env`	✅	0	89	0%
`engine.args`	✅	0	89	0%
`engine.api-target`	✅	0	89	0%
`engine.version` (pinning)	✅	0	89	0%
`features.copilot-requests`	✅	43	46	48%
`features.mcp-gateway`	✅	0	89	0%
`features.cli-proxy`	✅	0	89	0%
`tools.web-fetch`	✅	20	69	22%
`tools.playwright`	✅	15	74	17%
`tools.repo-memory`	✅	19	167	10%
`mcp-scripts`	✅	1	88	1%
`sandbox.agent.memory`	✅	0	14	0%
`sandbox.agent.mounts`	✅	0	14	0%

Note: most rows are counted against the 89 explicit Copilot workflows. The `tools.repo-memory` row sums to 186 (all workflows), and the `sandbox.agent.*` rows sum to 14 (apparently the workflows that use AWF), so their percentages are not directly comparable with the others.

🔴 High Priority Opportunities

Opportunity 1: `engine.env` – 0% Adoption

What: Custom environment variables injected into the Copilot CLI step.

Why It Matters: Enables runtime config without modifying workflow files. Useful for debug logging, custom API endpoints, integration-specific tuning, and A/B testing different configurations.

Affected Workflows: Any workflow needing debug visibility or custom runtime behavior.

Implementation:

engine:
  id: copilot
  env:
    DEBUG_MODE: "true"
    CUSTOM_API_BASE: "(api.staging.example.com/redacted)"
    MY_FEATURE_FLAG: "enabled"

Why likely unused: The extended engine object syntax (id: copilot) is less commonly known than the simple string form. Most workflow authors may not know env vars can be injected this way.

Opportunity 2: Web-Fetch Workflows Without Sandbox

What: 15-18 workflows use tools.web-fetch without sandbox.agent: awf.

Why It Matters: Without AWF network firewall, the agent can reach arbitrary URLs. This creates SSRF risk and uncontrolled data exfiltration paths.

Affected Workflows: brave.md, plus workflows importing shared/mcp/tavily.md or shared/mcp/brave.md. research.md already has AWF ✅ and needs no change.

Check: brave.md uses Brave MCP but has no sandbox: block. Add:

sandbox:
  agent: awf
network:
  allowed:
    - defaults
    - api.search.brave.com
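
The pairing check behind this opportunity can be sketched in Python. This is a minimal illustration, not the analysis script actually used: the function names `uses_awf` and `needs_firewall` are hypothetical, and the frontmatter is assumed to be already parsed into a dictionary. It accepts both the string form (`agent: awf`) and the object form (`agent: { id: awf }`) that appear in this report, and it only inspects `tools.web-fetch`, so MCP-based search tools such as Brave would need a separate rule.

```python
from typing import Any, Mapping


def uses_awf(frontmatter: Mapping[str, Any]) -> bool:
    """Return True if sandbox.agent selects AWF (string or object form)."""
    sandbox = frontmatter.get("sandbox")
    if not isinstance(sandbox, Mapping):
        return False
    agent = sandbox.get("agent")
    if isinstance(agent, Mapping):  # object form: agent: { id: awf, ... }
        return agent.get("id") == "awf"
    return agent == "awf"  # string form: agent: awf


def needs_firewall(frontmatter: Mapping[str, Any]) -> bool:
    """Flag workflows that enable tools.web-fetch without an AWF sandbox."""
    tools = frontmatter.get("tools")
    has_web_fetch = isinstance(tools, Mapping) and "web-fetch" in tools
    return has_web_fetch and not uses_awf(frontmatter)


# Hypothetical parsed frontmatter for two workflows.
unprotected = {"tools": {"web-fetch": None}}
protected = {"tools": {"web-fetch": None}, "sandbox": {"agent": "awf"}}
print(needs_firewall(unprotected), needs_firewall(protected))
```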

🟡 Medium Priority Opportunities

Opportunity 3: Custom Agent Files (8/10 Unused)

What: .github/agents/ contains specialized agent personas. A persona takes effect only when a workflow references it through engine.agent.

Unused agent files with clear candidates:

Agent File	Best Match Workflow
`grumpy-reviewer.agent.md`	`pr-nitpick-reviewer.md`
`adr-writer.agent.md`	`daily-architecture-diagram.md`
`contribution-checker.agent.md`	`contribution-check.md`
`w3c-specification-writer.agent.md`	`layout-spec-maintainer.md`
`agentic-workflows.agent.md`	`workflow-generator.md`, `craft.md`

Implementation (example for pr-nitpick-reviewer.md):

engine:
  id: copilot
  agent: grumpy-reviewer  # .github/agents/grumpy-reviewer.agent.md

Impact: Specialized agent behaviors applied consistently without duplicating instructions in every workflow.
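
Finding unused agent files amounts to a set difference between the files on disk and the names referenced through `engine.agent`. The sketch below is illustrative only: `unused_agents` is a hypothetical helper, the `.agent.md` suffix is taken from the table above, and the file names for `technical-doc-writer` and `ci-cleaner` are assumed to follow the same pattern.

```python
from typing import Iterable, List

AGENT_SUFFIX = ".agent.md"


def unused_agents(agent_files: Iterable[str], referenced: Iterable[str]) -> List[str]:
    """Return agent names that exist as files but are never referenced."""
    names = {
        name[: -len(AGENT_SUFFIX)]
        for name in agent_files
        if name.endswith(AGENT_SUFFIX)
    }
    return sorted(names - set(referenced))


# A subset of the agent directory; the last two file names are assumptions.
files = [
    "grumpy-reviewer.agent.md",
    "adr-writer.agent.md",
    "technical-doc-writer.agent.md",
    "ci-cleaner.agent.md",
]
print(unused_agents(files, ["technical-doc-writer", "ci-cleaner"]))
```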

Opportunity 4: `max-continuations` for Complex Analysis Workflows

What: Autopilot mode allows Copilot to run in multiple passes, each building on the previous output.

Why It Matters: One-shot runs on complex analysis tasks (architecture review, issue clustering, deep research) often produce incomplete output. Multiple continuation passes improve depth and accuracy.

Candidate workflows (long-running analysis, currently limited to single pass):

deep-report.md (timeout: 30min) → add max-continuations: 5
daily-architecture-diagram.md → max-continuations: 3
delight.md (very complex analysis) → max-continuations: 5
copilot-cli-deep-research.md (this workflow!) → max-continuations: 3
portfolio-analyst.md → max-continuations: 3

Implementation:

engine:
  id: copilot
  max-continuations: 5  # allow up to 5 continuation passes

Opportunity 5: `bare: true` for Context-Clean Workflows

What: Passes --no-custom-instructions so AGENTS.md isn't loaded. Useful for workflows where the repo's development context is irrelevant.

Candidate workflows (simple, focused, self-contained):

daily-fact.md – generates fun facts, doesn't need AGENTS.md
poem-bot.md – creative writing, repo context is noise
dictation-prompt.md – text correction task
delight.md – independent creative agent

Implementation:

engine:
  id: copilot
  bare: true

Opportunity 6: `engine.version` Pinning for Stability

What: Pin to a specific Copilot CLI version to avoid unexpected breakage from releases.

Why It Matters: gh-aw itself is tested against specific CLI versions. Uncontrolled upgrades can break workflows mid-run.

Candidate workflows (production critical, high frequency):

daily-issues-report.md – daily production workflow
auto-triage-issues.md – every 6h trigger
hourly-ci-cleaner.md – hourly trigger

Implementation:

engine:
  id: copilot
  version: "1.0.21"  # pin to current stable

🟢 Low Priority Opportunities

Opportunity 7: `mcp-scripts` for Structured Tool Calls

What: Allows defining shell script tools that the agent can call via MCP protocol. Only daily-cli-performance.md uses this.

Candidates: Any workflow that currently uses tools.bash with a small set of specific commands could convert to mcp-scripts for cleaner tool boundaries and better audit trails.

Opportunity 8: GitHub Toolset Specificity

What: Many workflows use toolsets: [default] which includes repos, issues, pull_requests, and context. Using specific toolsets reduces the agent's surface area.

Pattern observed: architecture-guardian.md correctly uses toolsets: [repos] (read-only repo access). More workflows could follow this pattern.

Opportunity 9: AWF Memory Limits

What: sandbox.agent.memory sets memory limits on the AWF container (e.g., "8g").

Why useful: Long-running workflows doing Python data analysis (daily-issues-report.md, python-data-charts.md) might benefit from explicit memory limits to prevent OOM.

Implementation:

sandbox:
  agent:
    id: awf
    memory: "8g"

Opportunity 10: `engine.api-target` for Multi-Tenant Testing

What: Routes Copilot CLI to a non-default API endpoint. Zero usage currently.

When relevant: Testing workflows against GHEC data-residency tenants or GHE Server instances. Not applicable to the current repo but worth noting for future.

Specific Workflow Recommendations

Top 5 Quick-Win Recommendations

Workflow	Recommended Change	Estimated Effort
`pr-nitpick-reviewer.md`	Add `engine: { id: copilot, agent: grumpy-reviewer }`	2 min
`contribution-check.md`	Add `engine: { id: copilot, agent: contribution-checker }`	2 min
`brave.md`	Add `sandbox: { agent: awf }` + `network: { allowed: [defaults, api.search.brave.com] }`	5 min
`deep-report.md`	Add `engine: { id: copilot, max-continuations: 5 }`	1 min
`daily-fact.md`	Add `engine: { id: copilot, bare: true, model: gpt-5.1-codex-mini }`	2 min

📈 Trends vs. Previous Run (2026-04-11)

Metric	2026-04-11	2026-04-12	Trend
Copilot workflows (explicit)	90	89	→
`playwright` adoption	13%	17%	✅ +4 pp
`web-fetch` adoption	19%	22%	✅ +3 pp
`mcp-scripts`	0%	1%	↑
`engine.agent` (custom file)	6%	3%	↓
`max-continuations`	2%	2%	→
`engine.bare`	2%	2%	→
`engine.env`	11%*	0%	↓*
`engine.args`	10%*	0%	↓*

* Previous run may have counted differently (body text vs. frontmatter-only). Current methodology: frontmatter-only checks.
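
The gap between the two counting methods in the footnote can be reproduced with a small sketch. It is an assumption-laden illustration rather than the workflow's real script: the helper names are hypothetical, frontmatter is assumed to be delimited by `---` lines, and the check uses simple indentation matching instead of a YAML parser.

```python
import re
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Split a workflow markdown file into (frontmatter, body)."""
    match = re.match(r"\A---\r?\n(.*?)\r?\n---\r?\n?(.*)\Z", text, re.DOTALL)
    if match is None:
        return "", text
    return match.group(1), match.group(2)


def mentioned_anywhere(text: str, key: str) -> bool:
    """Body-text style check: matches documentation mentions as well."""
    return re.search(rf"\b{re.escape(key)}:", text) is not None


def engine_has_key(text: str, key: str) -> bool:
    """Frontmatter-only check: is `key` a direct child of `engine:`?"""
    frontmatter, _ = split_frontmatter(text)
    in_engine = False
    child_indent = None  # indentation of direct children of `engine:`
    for line in frontmatter.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent == 0:
            in_engine = stripped == "engine:"
            child_indent = None
            continue
        if not in_engine:
            continue
        if child_indent is None:
            child_indent = indent
        if indent == child_indent and stripped.startswith(key + ":"):
            return True
    return False


# A workflow that only talks about `env:` in its prompt body.
SAMPLE = """---
engine:
  id: copilot
---
Set `env:` under the engine block to inject variables.
"""
print(mentioned_anywhere(SAMPLE, "env"), engine_has_key(SAMPLE, "env"))
```

A file like `SAMPLE` is counted as using `env` by the first check but not by the second, which is one way the earlier 11% figure could drop to 0% without any workflow changing.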

Best Practice Guidelines

Based on this research, here are the recommended best practices for Copilot workflows:

Pair web-fetch with AWF sandbox: Always add sandbox: { agent: awf } when using external URL fetching to enforce network restrictions.
Use custom agent files for specialized roles: Wire existing .github/agents/ files via engine: { id: copilot, agent: <name> } instead of embedding role instructions in every workflow.
Add max-continuations for complex analysis: Workflows with 30min+ timeouts doing deep analysis benefit from max-continuations: 3-5.
Use bare: true for focused tasks: Creative/utility workflows that don't need repo context run cleaner without AGENTS.md loading.
Specify GitHub toolsets explicitly: Use toolsets: [issues] instead of toolsets: [default] when only one resource type is needed.
Set gpt-5.1-codex-mini for simple tasks: Simple workflows (fact generation, short summaries) don't need the full model. Pin to a lighter model for cost savings.

Action Items

Immediate (this week):

Add sandbox: { agent: awf } to brave.md to restrict Brave MCP network access
Wire grumpy-reviewer agent to pr-nitpick-reviewer.md
Wire contribution-checker agent to contribution-check.md

Short-term (this month):

Add max-continuations: 5 to deep-report.md and delight.md
Add bare: true to daily-fact.md and poem-bot.md
Add engine.model: gpt-5.1-codex-mini to simple utility workflows
Review agentic-workflows.agent.md — wire to workflow-generator.md

Long-term (this quarter):

Add AWF memory limits to Python-heavy analysis workflows
Create a shared import, shared/engine/defaults.md, with standard engine config patterns
Evaluate mcp-scripts for high-frequency tool-heavy workflows
Document engine.env use cases in workflow examples

Research Methodology & Supporting Data

Research Methodology

Data Sources:

.github/workflows/*.md — 186 workflow markdown files (non-lock)
pkg/workflow/copilot_engine*.go — Engine implementation
pkg/constants/feature_constants.go — Feature flag definitions
docs/src/content/docs/reference/engines.md — Engine documentation
.github/agents/ — Custom agent file inventory
/tmp/gh-aw/repo-memory/default/copilot-research-latest.json — Historical data

Analysis Scripts: Python3 + grep pattern analysis on frontmatter YAML sections.

Workflow Count Methodology:

Counts all .md files in .github/workflows/ (compiled .lock.yml files are excluded)
"Copilot" = explicit engine: copilot (89) + extended object with id: copilot (~10) + no engine specified (26, defaults to Copilot) = ~125 effective
Feature adoption % calculated against explicit copilot (89) for fairness
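
As a worked example of the adoption calculation, the percentages in the Feature Usage Matrix are consistent with rounding `used / total` to the nearest whole percent. The function name `adoption_percent` is hypothetical, and the rounding rule is inferred from the published numbers rather than taken from the analysis script.

```python
def adoption_percent(used: int, total: int) -> int:
    """Return adoption as a whole-number percentage of `total` workflows."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * used / total)


EXPLICIT_COPILOT = 89  # denominator used for most rows in the matrix

# Usage counts copied from the Feature Usage Matrix.
usage = {
    "engine.model": 6,
    "engine.bare": 2,
    "tools.web-fetch": 20,
    "tools.playwright": 15,
}
for feature, used in usage.items():
    print(f"{feature}: {adoption_percent(used, EXPLICIT_COPILOT)}%")
```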

Limitations:

Body text searches may pick up documentation references vs. actual usage
Shared imports (.github/aw/imports/) not analyzed in full detail
Agent execution data (actual run success/failure) not analyzed

Generated by Copilot CLI Deep Research Agent · ● 2.8M · ◷

expires on Apr 13, 2026, 9:11 PM UTC

2026-04-13T21:18:33Z

github-actions[bot]
bot Apr 13, 2026
Author

This discussion has been marked as outdated by Copilot CLI Deep Research Agent.

A newer discussion is available at Discussion #26089.
