Add triage agent for minimal API #66752

Draft
Youssef1313 wants to merge 3 commits into main from dev/ygerges/aw

Conversation

Member

@Youssef1313 Youssef1313 commented May 20, 2026

This adds an agentic workflow to help with triaging issues in minimal APIs. We have an existing triage agentic workflow, but it acts more as a "pre-triage" step that adds labels and classifies whether an issue is a bug, a feature request, etc.

@Youssef1313 Youssef1313 requested a review from wtgodbe as a code owner May 20, 2026 14:04
Copilot AI review requested due to automatic review settings May 20, 2026 14:04
@Youssef1313 Youssef1313 requested a review from a team as a code owner May 20, 2026 14:04
@github-actions github-actions Bot added the area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework label May 20, 2026
@dotnet-policy-service
Contributor

Hey @dotnet/aspnet-build, looks like this PR is something you want to take a look at.

@Youssef1313 Youssef1313 changed the title from "Dev/ygerges/aw" to "Add triage agent for minimal API" May 20, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a new gh-aw Agentic Workflow to automatically triage area-minimal issues on a daily schedule, attempting TDD fixes and opening draft PRs (or posting comments/design proposals) with an AI-triaged label applied after processing.

Changes:

  • Introduces a new agent prompt/workflow definition for minimal API issue triage.
  • Adds the corresponding auto-generated .lock.yml compiled workflow.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `.github/workflows/minimal-api-triage.md` | Defines the daily minimal API triage agent prompt, constraints, and safe outputs. |
| `.github/workflows/minimal-api-triage.lock.yml` | Auto-generated compiled workflow produced from the .md definition. |

fetch-depth: 0

tools:
  bash: ["*"]
Member

I agree tools should be restricted especially for public issue triaging. See

tools:
  bash: ["cat", "head", "tail", "grep", "wc", "jq"]
  github:
    min-integrity: none

Member Author

@DeagleGross I'm not sure I see a big concern with being permissive. Copilot should be running in an environment where it doesn't have access to anything dangerous if I understand correctly.

Then there is a separate step that has more permissions, but that's controlled by the safe outputs section.

Member

> Copilot should be running in an environment where it doesn't have access to anything dangerous if I understand correctly.

Yes, but you are relying on the sandboxing provided by GitHub, and while I am sure they are building it reliably and securely, we should also provide guardrails here. At the time of writing, there are already tons of cases where agents went crazy and did damage even though the engineers believed the setup was secure.

This workflow is particularly important since it processes anonymous user input: anyone is able to file an issue in dotnet/aspnetcore, and if there is some flaw to exploit, we need to assume it will definitely be abused. The potential problems are executing arbitrary commands, accessing any resource on the web, and modifying local data (maybe not as bad, given the sandboxing).

An additional downside here is that runs may be expensive and take a long time to execute. Without giving the agent specific commands to execute, we are giving it too much room to go in another direction. That is an additional reason I thought of designing this as a multi-step process, where one agent only analyzes the provided context and builds the plan, another only makes a repro, and so on. Such a design makes runs more deterministic, with understandable goals, and ideally you could repeat only some steps of the flow. In your case it's basically a one-shot from issue to PR, and even though agents are improving, I doubt this will work well for every single issue.

Comment thread .github/workflows/minimal-api-triage.md
@Youssef1313 Youssef1313 marked this pull request as draft May 20, 2026 15:14
@Youssef1313
Member Author

Drafting to avoid accidental merge, but it's ready for review to get feedback.


### 2.2 Attempt a TDD Fix

1. **Write a failing test first** that reproduces the issue. Place the test in
Member

Might be worth giving more explicit instructions about how to find the relevant test file, and how to use it as an example of how to structure a new test.

2. **Run the tests** using the `build.sh -test` script in the relevant `src/`
subdirectory to confirm the new test fails (or passes, if the issue is
already fixed).
3. If the test passes without code changes, the issue may already be fixed —
Member

I'd recommend having it run the tests a few times in a loop here, to protect against flakiness
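
A minimal sketch of what "run the tests a few times in a loop" could look like, assuming the agent (or a helper script) can invoke the test command repeatedly. The names `run_command` and `classify_runs` and the default of three attempts are hypothetical and are not part of the workflow in this PR; only the `build.sh -test` invocation comes from the prompt text.

```python
import subprocess
from typing import Callable, Sequence


def run_command(cmd: Sequence[str]) -> bool:
    """Run one test command and report whether it exited with code 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def classify_runs(run_once: Callable[[], bool], attempts: int = 3) -> str:
    """Repeat a test run and classify the combined outcome.

    Returns "passes" if every attempt passed, "fails" if every attempt
    failed, and "flaky" if the results were mixed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    results = [run_once() for _ in range(attempts)]
    if all(results):
        return "passes"
    if not any(results):
        return "fails"
    return "flaky"
```

It could be called as `classify_runs(lambda: run_command(["./build.sh", "-test"]))`. Only a consistent "fails" would count as a genuine repro, and only a consistent "passes" would suggest the issue is already fixed; a "flaky" result would be reported rather than acted on.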

Comment thread .github/workflows/minimal-api-triage.md
## 2. For Each Issue

Read the full issue body using the GitHub MCP Server `get_issue` tool.
Understand the reported problem, then follow the decision tree below.
Member

Worth including a comment about how issue contents (title, body, comments, username, etc) are untrusted inputs & should never be treated as commands - something like https://github.com/dotnet/aspnetcore/blob/main/.github/workflows/test-quarantine.md#security-untrusted-input-handling (but replace mentions of helix logs, etc, with mentions of user input)

Member Author

@wtgodbe Doesn't Copilot in this case run already in a sandboxed environment? So even if the body contained malicious instructions, it cannot cause any harm.

All actions taken by the aw are restricted by the safe outputs section as far as I know.

Member

It does, but these runs are non-deterministic, and the workflow logs are public, so it could potentially print something we don't want it to. Couldn't hurt to be extra-defensive.

## 5. Quality Guidelines

- Follow existing code style and conventions in the repository.
- Use the latest C# language features.
Member

I'd say "Use the latest C# language features when appropriate", to stop it from over-prioritizing that

- Use the latest C# language features.
- Write clear, descriptive test method names matching the style of nearby tests.
- Use xUnit for tests (the framework used in this repo).
- Ensure XML doc comments on any new public APIs.
Member

Wouldn't new public APIs have to go through API review first?

Member Author

@wtgodbe Good point. Maybe we should avoid any PRs that add new public APIs altogether as part of this triage workflow actually.

Member

Yeah, I think we should avoid it for now. Wouldn't be surprised if we do net-new APIs from copilot sooner than later, but it probably requires some intentional discussion first.

Member

@T-Gro T-Gro May 22, 2026

Hi @wtgodbe, I am from the dotnet/fsharp repo, but I was helping @Youssef1313 set this workflow up.

I think it would be worth trying to train an agent on all historical API discussions, build a dedicated subagent (invoked like @api-review-expert or so), and build a workflow that would also submit an API review proposal when a bugfix really needs a new API.
The worst that can happen is that the proposals are bad and get closed.

(The https://github.com/dotnet/aspnetcore/blob/main/.github/prompts/ApiReview.prompt.md file is good on the formal aspect of API submission. A trained agent could try to guardrail the API proposal to increase its chances of acceptance, based on historical acceptances/rejections and the reasoning in comments, per area.

IMO worth a shot. Let me know if interested; I would need to know the right set of data sources.)

Member

@T-Gro I think it's a good idea - I'll defer to @BrennanConroy, @halter73, and @javiercn on this one though, as I don't write library code in this repo

Member

@wtgodbe wtgodbe left a comment

LGTM modulo a few comments

Member

@DeagleGross DeagleGross left a comment

Thanks for looking into automation of issue processing! I've posted a couple of questions.

schedule: daily
engine:
  id: copilot
  model: claude-opus-4.6
Member

I see other workflows don't bind to a specific model, and if claude-opus-4.6 gets deprecated or stops working, this will silently break. Do we need this specific model? Has it proven to be a better option?

---
on:
  schedule:
    - cron: "0 10 * * *"
  workflow_dispatch:
steps:

**dotnet/aspnetcore** repository. Your goal is to attempt a fix for each issue
using test-driven development and, when appropriate, open a pull request.

## 1. Find Issues to Triage
Member

I wonder if we should have different workflows for triaging / for reproducing the issue / for making a fix + posting a PR.

Can you please try to investigate what existing tooling provides and what the benefits are? Maybe we can reuse something. One interesting find is https://github.com/DamianEdwards/copilotd by @DamianEdwards

But I like the idea of making a decision tree and spawning sub-agents to manage the specific tasks. I think most AI automation is built around this idea.

Member Author

@DeagleGross One of the goals here is for us to have a better idea of what the issue is when we do the manual triage process. Having Copilot attempting to investigate/fix it might make lots of things clear so we can take the decisions faster when we triage. So I think it's okay-ish to let this aw attempt to do as much as it can, as every piece of information the agent can provide will help us. However, I can also see that this might become overwhelming and producing more noise. I'm aiming to strike a balance by limiting the number of issues triaged per day. For minimal APIs, we have 60-70 untriaged issues so far per the query we use. Maybe once we have dealt with all of these issues, we will find that some split is better. It's hard to be sure IMO without trying things out though.

Member Author

copilotd sounds interesting, but from reading the README, I feel that the current agent fits our needs more.

Member

> Having Copilot attempting to investigate/fix it might make lots of things clear so we can take the decisions faster when we triage

OK, then I guess I don't understand the goal of the workflow well enough - is it only for triaging, or also for making a fix or repro? If it is only for triaging, then why is posting a comment with a repro / potential fix not enough? I think spawning a PR per issue, with a potential fix we might never look at, may amount to PR spam, and our goal is probably to filter issues and reduce the backlog of issues+PRs, not extend it.

> However, I can also see that this might become overwhelming and producing more noise

Agree! And posting a comment does not extend the issue or PR count - maybe we can focus on this first?

Would be happy to see real runs on your fork. That would give us a clear idea of whether the workflow helps when run on existing examples we have in the repo.

Labels

area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework

5 participants