Add triage agent for minimal API#66752
Conversation
|
Hey @dotnet/aspnet-build, looks like this PR is something you want to take a look at. |
There was a problem hiding this comment.
Pull request overview
Adds a new gh-aw Agentic Workflow to automatically triage area-minimal issues on a daily schedule, attempting TDD fixes and opening draft PRs (or posting comments/design proposals) with an AI-triaged label applied after processing.
Changes:
- Introduces a new agent prompt/workflow definition for minimal API issue triage.
- Adds the corresponding auto-generated
.lock.ymlcompiled workflow.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| .github/workflows/minimal-api-triage.md | Defines the daily minimal API triage agent prompt, constraints, and safe outputs. |
| .github/workflows/minimal-api-triage.lock.yml | Auto-generated compiled workflow produced from the .md definition. |
| fetch-depth: 0 | ||
|
|
||
| tools: | ||
| bash: ["*"] |
There was a problem hiding this comment.
I agree tools should be restricted especially for public issue triaging. See
aspnetcore/.github/workflows/issue-triage-agent.md
Lines 30 to 33 in c0c2230
There was a problem hiding this comment.
@DeagleGross I'm not sure I see a big concern of being permissive. Copilot should be running in an environment where it doesn't have access to anything dangerous if I understand correctly.
Then there is a separate step that has more permissions but that's controlled by the safe outputs thing.
There was a problem hiding this comment.
Copilot should be running in an environment where it doesn't have access to anything dangerous if I understand correctly.
Yes, but you are relying on the sandboxing provided by GitHub, and while I am sure they are building it reliably and securely, we should also provide guardrails here. At the moment of writing there are already tons of cases where agents went crazy and done damage even though engineers believed in security.
This workflow is particularly important since it processes anonymous user input - anyone is able file an issue to dotnet/aspnetcore, and if there is some flow to exploit, we need to assume it will definitely be abused. Potential problems are with allowing executing random commands, accessing any resource in the web, modifying local data (maybe not as bad since sandboxing).
Additional downside here is that runs may be expensive and long to execute. Without giving extra specific commands to agent to execute we are just giving it too much space to go other direction, and that is an additional reason I thought to design this in multi-step process, where one agent only analyzes the context provided and builds the plan, another is only making a repro and etc. Such design allows runs to be more deterministic, reaching understandable goals, ideally you could repeat only some steps of the flow. In your case it's basically a one-shot from issue to PR, and even though agents are improving, I am doubting this will work well for every single issue.
|
Drafting to avoid accidental merge, but it's ready for review to get feedback. |
|
|
||
| ### 2.2 Attempt a TDD Fix | ||
|
|
||
| 1. **Write a failing test first** that reproduces the issue. Place the test in |
There was a problem hiding this comment.
Might be worth giving more explicit instructions about how to find the relevant test file, and how to use it as an example on how to structure a new test
| 2. **Run the tests** using the `build.sh -test` script in the relevant `src/` | ||
| subdirectory to confirm the new test fails (or passes, if the issue is | ||
| already fixed). | ||
| 3. If the test passes without code changes, the issue may already be fixed — |
There was a problem hiding this comment.
I'd recommend having it run the tests a few times in a loop here, to protect against flakiness
| ## 2. For Each Issue | ||
|
|
||
| Read the full issue body using the GitHub MCP Server `get_issue` tool. | ||
| Understand the reported problem, then follow the decision tree below. |
There was a problem hiding this comment.
Worth including a comment about how issue contents (title, body, comments, username, etc) are untrusted inputs & should never be treated as commands - something like https://github.com/dotnet/aspnetcore/blob/main/.github/workflows/test-quarantine.md#security-untrusted-input-handling (but replace mentions of helix logs, etc, with mentions of user input)
There was a problem hiding this comment.
@wtgodbe Doesn't Copilot in this case run already in a sandboxed environment? So even if the body contained malicious instructions, it cannot cause any harm.
All actions taken by the aw are restricted by the safe outputs section as far as I know.
There was a problem hiding this comment.
It does, but these runs are non-deterministic, and the workflow logs are public, so it could potentially print something we don't want it to. Couldn't hurt to be extra-defensive.
| ## 5. Quality Guidelines | ||
|
|
||
| - Follow existing code style and conventions in the repository. | ||
| - Use the latest C# language features. |
There was a problem hiding this comment.
I'd say "Use the latest C# language features when appropriate", to stop it from over-prioritizing that
| - Use the latest C# language features. | ||
| - Write clear, descriptive test method names matching the style of nearby tests. | ||
| - Use xUnit for tests (the framework used in this repo). | ||
| - Ensure XML doc comments on any new public APIs. |
There was a problem hiding this comment.
Wouldn't new public APIs have to go through API review first?
There was a problem hiding this comment.
@wtgodbe Good point. Maybe we should avoid any PRs that add new public APIs altogether as part of this triage workflow actually.
There was a problem hiding this comment.
Yeah, I think we should avoid it for now. Wouldn't be surprised if we do net-new APIs from copilot sooner than later, but it probably requires some intentional discussion first.
There was a problem hiding this comment.
Hi @wtgodbe , I am from the dotnet/fsharp repo. But was helping @Youssef1313 to set this workflow up.
I think it would be worth trying to train an agent on all historical API discussions and build a dedicated subagent (invoke like @api-review-expert or so) and build a workflow that would also submit an API review proposal when a bugfix really needs a new API.
Worst that can happen is that they are bad and get closed.
(the https://github.com/dotnet/aspnetcore/blob/main/.github/prompts/ApiReview.prompt.md file is good on the formal aspect of API submission, A trained agent could try to guardrail the API proposal to increase chances of acceptance - based in historical acceptance/rejections and reasoning in comments, per area.
IMO worth a shot - let me know if interested - I would need to know the right set of data sources.
)
There was a problem hiding this comment.
@T-Gro I think it's a good idea - I'll defer to @BrennanConroy, @halter73, and @javiercn on this one though, as I don't write library code in this repo
DeagleGross
left a comment
There was a problem hiding this comment.
Thanks for looking into automatization of issue processing! I've posted a couple of questions
| schedule: daily | ||
| engine: | ||
| id: copilot | ||
| model: claude-opus-4.6 |
There was a problem hiding this comment.
I see other workflows dont bind to a specific model, and if claude-opus-4.6 will be deprecated / not working / whatever, this will silently break, do we need this specific model? Does it prove to be a better option?
aspnetcore/.github/workflows/test-quarantine.md
Lines 1 to 6 in c0c2230
There was a problem hiding this comment.
This was from @T-Gro experience. It looks like https://github.com/dotnet/runtime/blob/b521e1551ee995149d59edbf74a33033243223fe/.github/workflows/code-review.md?plain=1#L45 also uses that.
| fetch-depth: 0 | ||
|
|
||
| tools: | ||
| bash: ["*"] |
There was a problem hiding this comment.
I agree tools should be restricted especially for public issue triaging. See
aspnetcore/.github/workflows/issue-triage-agent.md
Lines 30 to 33 in c0c2230
| **dotnet/aspnetcore** repository. Your goal is to attempt a fix for each issue | ||
| using test-driven development and, when appropriate, open a pull request. | ||
|
|
||
| ## 1. Find Issues to Triage |
There was a problem hiding this comment.
I wonder if we should have different workflows for triaging / for reproducing the issue / for making a fix + posting a PR.
Can you please try to investigate what existing tooling provides and what are the benefits, maybe we can reuse something? One interesting find is https://github.com/DamianEdwards/copilotd by @DamianEdwards
But I like the idea of making a decision tree and spawning sub-agents to do the specific task management. I think most of AI automation is built around this idea
There was a problem hiding this comment.
@DeagleGross One of the goals here is that we have a better idea of what the issue is when we do the manual triage process. Having Copilot attempting to investigate/fix it might make lots of things clear so we can take the decisions faster when we triage. So I think it's okay-ish to let this aw attempt to do as much as it can, as every piece of information the agent can provide will help us. However, I can also see that this might become overwhelming and producing more noise. I'm aiming to have the balance by limiting the number of daily triaged issues. For minimal APIs, we have 60-70 untriaged issues so far per the query we use. Maybe once we deal with all these issues, we might start finding that some split is better. It's hard to be sure IMO without trying things out though.
There was a problem hiding this comment.
copilotd sounds interesting, but from reading the README, I feel that the current agent fits our needs more.
There was a problem hiding this comment.
Having Copilot attempting to investigate/fix it might make lots of things clear so we can take the decisions faster when we triage
Ok, then I guess I dont understand the goal of workflow well enough - is it only for triaging or for making a fix or repro? If it is only for triaging, then why posting a comment with a repro / potential fix is not enough? I think spawning a PR per issue with potential fix we might not take a look at may be a spam with PRs, and our goal is probably to filter issues and reduce the backlog of issues+PRs, not extend it.
However, I can also see that this might become overwhelming and producing more noise
Agree! And posting the comment is not extending the issue or PR count - maybe we can focus on this firstly?
Would be happy to see real runs on your fork. That will give us clear idea if the workflow helps running it on existing examples we have in the repo.
This adds an agentic workflow to help with triaging issues in minimal APIs. We have an existing triage agentic workflow, but it acts more of "pre-triage" that adds labels and classifies whether an issue is a bug or feature request etc.