Phase 1: Add Copilot SDK foundation alongside existing langchaingo agent#6883
Phase 1: Add Copilot SDK foundation alongside existing langchaingo agent#6883
Conversation
Add the GitHub Copilot SDK (github.com/github/copilot-sdk/go) as a new dependency and create the foundational types for a Copilot SDK-based agent implementation. All new code coexists alongside the existing langchaingo agent — no existing code is modified or deleted. New files: - pkg/llm/copilot_client.go: CopilotClientManager wrapping copilot.Client lifecycle (Start, Stop, GetAuthStatus, ListModels) - pkg/llm/session_config.go: SessionConfigBuilder that reads ai.agent.* config keys and produces copilot.SessionConfig, including MCP server merging (built-in + user-configured) and tool control - internal/agent/copilot_agent.go: CopilotAgent implementing the Agent interface backed by copilot.Session with SendAndWait - internal/agent/copilot_agent_factory.go: CopilotAgentFactory that creates CopilotAgent instances with SDK client, session, permission hooks, MCP servers, and event handlers - internal/agent/logging/session_event_handler.go: SessionEventLogger, SessionFileLogger, and CompositeEventHandler for SDK SessionEvent streaming to UX thought channel and daily log files Config additions (resources/config_options.yaml): - ai.agent.model: Default model for Copilot SDK sessions - ai.agent.mode: Agent mode (autopilot/interactive/plan) - ai.agent.mcp.servers: Additional MCP servers - ai.agent.tools.available/excluded: Tool allow/deny lists - ai.agent.systemMessage: Custom system prompt append - ai.agent.copilot.logLevel: SDK log level Resolves #6871, #6872, #6873, #6874, #6875 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR adds the GitHub Copilot SDK (v0.1.25) as a foundational dependency and creates new agent infrastructure that will eventually replace the existing langchaingo-based implementation. This is Phase 1 of a 3-phase migration, where all new code coexists alongside existing code without any deletions or modifications to current functionality.
Changes:
- Adds GitHub Copilot SDK dependency and creates wrapper types (
CopilotClientManager,SessionConfigBuilder) that bridge azd config to SDK types - Implements
CopilotAgentandCopilotAgentFactorythat satisfy the existingAgentinterface for seamless future switchover - Creates event handling infrastructure (
SessionEventLogger,SessionFileLogger) that maps SDK events to azd's existing UX patterns - Adds 8 new config options (
ai.agent.*) for customizing agent behavior (model, tools, MCP servers, system message)
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
go.mod / go.sum |
Adds copilot-sdk/go v0.1.25 and transitive dependency jsonschema-go v0.4.2 (both marked indirect) |
resources/config_options.yaml |
Adds 8 new config keys for Copilot SDK agent customization |
pkg/llm/copilot_client.go |
Manager wrapping SDK client lifecycle with azd-specific error messages |
pkg/llm/copilot_client_test.go |
Tests for client manager instantiation (2 cases) |
pkg/llm/session_config.go |
Bridge converting azd config → SDK SessionConfig with MCP server merging |
pkg/llm/session_config_test.go |
Tests for config reading, tool control, MCP merging (8 cases) |
internal/agent/copilot_agent.go |
Agent implementation using SDK Session.SendAndWait, reuses existing UX patterns |
internal/agent/copilot_agent_factory.go |
Factory creating agents with SDK client, session, hooks, event handlers |
internal/agent/logging/session_event_handler.go |
Event handlers mapping SDK events to thought channel + file logging |
internal/agent/logging/session_event_handler_test.go |
Tests for event handling, tool input extraction, composite handler (10 cases) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| type: object | ||
| example: '["read_file", "write_file"]' | ||
| - key: ai.agent.tools.excluded | ||
| description: "Denylist of tools blocked from the agent." | ||
| type: object |
There was a problem hiding this comment.
The type field should be array instead of object for ai.agent.tools.available and ai.agent.tools.excluded. These config keys are documented to accept JSON arrays like ["read_file", "write_file"] and the code reads them using getStringSliceFromConfig() which calls cfg.GetSlice(). The object type is misleading.
| type: object | |
| example: '["read_file", "write_file"]' | |
| - key: ai.agent.tools.excluded | |
| description: "Denylist of tools blocked from the agent." | |
| type: object | |
| type: array | |
| example: '["read_file", "write_file"]' | |
| - key: ai.agent.tools.excluded | |
| description: "Denylist of tools blocked from the agent." | |
| type: array |
| userConfig, err := b.userConfigManager.Load() | ||
| if err != nil { | ||
| // Use defaults if config can't be loaded | ||
| return cfg, nil |
There was a problem hiding this comment.
When userConfigManager.Load() fails, the function returns the default config successfully (line 43), which silently swallows the error. Based on established patterns in the codebase (see stored memory), errors from userConfigManager.Load() are typically handled with graceful fallback. However, it would be more robust to log the error at debug level so users can troubleshoot config loading issues if needed. Consider adding a debug log statement before returning the default config.
| // Wire permission hooks | ||
| sessionConfig.Hooks = &copilot.SessionHooks{ | ||
| OnPreToolUse: func(input copilot.PreToolUseHookInput, inv copilot.HookInvocation) ( | ||
| *copilot.PreToolUseHookOutput, error, | ||
| ) { | ||
| // Allow all tools by default — SDK handles its own permission model. | ||
| // In Phase 2, azd-specific security policies (path validation) will be wired here. | ||
| return &copilot.PreToolUseHookOutput{}, nil |
There was a problem hiding this comment.
The OnPreToolUse hook always returns an empty PreToolUseHookOutput (line 107), which allows all tools to execute without any validation. The comment mentions "azd-specific security policies (path validation) will be wired here in Phase 2", but until then, this creates a security gap. Tools could potentially access files outside the project directory, execute arbitrary commands, or perform other dangerous operations without validation. Consider at minimum implementing basic path validation to prevent directory traversal attacks even in Phase 1, or documenting this security limitation prominently in the PR description and release notes.
| type: string | ||
| example: "gpt-4.1" | ||
| - key: ai.agent.mode | ||
| description: "Default agent mode for Copilot SDK sessions." |
There was a problem hiding this comment.
The config option ai.agent.mode is documented in config_options.yaml (lines 89-93) but is never read or used in the SessionConfigBuilder.Build() method. Either this config option should be removed from config_options.yaml, or the code should read it and set it on the SessionConfig. If it's intended for Phase 2, that should be documented in a comment.
| description: "Default agent mode for Copilot SDK sessions." | |
| description: "Default agent mode for Copilot SDK sessions. Reserved for future use (Phase 2) and not yet applied by the runtime." |
| // Skill control | ||
| if dirs := getStringSliceFromConfig(userConfig, "ai.agent.skills.directories"); len(dirs) > 0 { | ||
| cfg.SkillDirectories = dirs | ||
| } | ||
| if disabled := getStringSliceFromConfig(userConfig, "ai.agent.skills.disabled"); len(disabled) > 0 { | ||
| cfg.DisabledSkills = disabled | ||
| } | ||
|
|
There was a problem hiding this comment.
The code reads config options ai.agent.skills.directories (line 68) and ai.agent.skills.disabled (line 72) but these are not documented in resources/config_options.yaml. Either add documentation for these config options or remove the code that reads them if they're not intended for Phase 1.
| // Skill control | |
| if dirs := getStringSliceFromConfig(userConfig, "ai.agent.skills.directories"); len(dirs) > 0 { | |
| cfg.SkillDirectories = dirs | |
| } | |
| if disabled := getStringSliceFromConfig(userConfig, "ai.agent.skills.disabled"); len(disabled) > 0 { | |
| cfg.DisabledSkills = disabled | |
| } |
| // NewCopilotAgent creates a new CopilotAgent backed by the given copilot.Session. | ||
| func NewCopilotAgent( | ||
| session *copilot.Session, | ||
| console input.Console, | ||
| opts ...CopilotAgentOption, | ||
| ) *CopilotAgent { | ||
| agent := &CopilotAgent{ | ||
| session: session, | ||
| console: console, | ||
| watchForFileChanges: true, | ||
| } | ||
|
|
||
| for _, opt := range opts { | ||
| opt(agent) | ||
| } | ||
|
|
||
| return agent | ||
| } |
There was a problem hiding this comment.
CopilotAgent and CopilotAgentFactory lack unit tests. These are core components that implement the Agent interface and manage complex lifecycle operations (cleanup ordering, event subscription, context cancellation). Consider adding tests for: CopilotAgent.SendMessage success/error paths, CopilotAgentFactory.Create cleanup on error paths, and proper resource cleanup ordering. The existing agent code has test coverage (e.g., agent_factory_test.go for the langchaingo implementation), so the same standard should apply here.
| if probe.Type == "http" { | ||
| var remote map[string]any | ||
| if err := json.Unmarshal(data, &remote); err != nil { | ||
| continue | ||
| } | ||
| result[name] = copilot.MCPServerConfig(remote) | ||
| } else { | ||
| var local map[string]any | ||
| if err := json.Unmarshal(data, &local); err != nil { | ||
| continue | ||
| } | ||
| result[name] = copilot.MCPServerConfig(local) | ||
| } |
There was a problem hiding this comment.
The logic for handling http vs stdio MCP servers is redundant. Both branches (lines 161-166 and 168-172) do essentially the same thing: unmarshal into map[string]any and assign to result[name]. The type detection logic is useful, but the separate handling can be simplified to a single code path since both produce the same MCPServerConfig type.
| if probe.Type == "http" { | |
| var remote map[string]any | |
| if err := json.Unmarshal(data, &remote); err != nil { | |
| continue | |
| } | |
| result[name] = copilot.MCPServerConfig(remote) | |
| } else { | |
| var local map[string]any | |
| if err := json.Unmarshal(data, &local); err != nil { | |
| continue | |
| } | |
| result[name] = copilot.MCPServerConfig(local) | |
| } | |
| // Unmarshal into a generic map and store as MCPServerConfig. | |
| // Both HTTP and stdio configurations share the same representation. | |
| var serverCfg map[string]any | |
| if err := json.Unmarshal(data, &serverCfg); err != nil { | |
| continue | |
| } | |
| result[name] = copilot.MCPServerConfig(serverCfg) |
| cleanupTasks := map[string]func() error{} | ||
|
|
||
| cleanup := func() error { | ||
| for name, task := range cleanupTasks { | ||
| if err := task(); err != nil { | ||
| log.Printf("failed to cleanup %s: %v", name, err) | ||
| } | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| // Start the Copilot client (spawns copilot-agent-runtime) | ||
| if err := f.clientManager.Start(ctx); err != nil { | ||
| return nil, err | ||
| } | ||
| cleanupTasks["copilot-client"] = f.clientManager.Stop | ||
|
|
||
| // Create thought channel for UX streaming | ||
| thoughtChan := make(chan logging.Thought) | ||
| cleanupTasks["thoughtChan"] = func() error { | ||
| close(thoughtChan) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
The cleanup order may cause a panic. The thoughtChan is closed before the session is destroyed or events are unsubscribed. If session events are still being processed after close(thoughtChan) executes, event handlers will panic when attempting to send to a closed channel. The cleanup order should be: 1) unsubscribe from session events, 2) destroy session, 3) close thoughtChan, 4) close file logger, 5) stop client. Consider using an ordered cleanup approach (e.g., slice of cleanup tasks executed in reverse order) instead of a map which has undefined iteration order.
| var watcher watch.Watcher | ||
| if a.watchForFileChanges { | ||
| var err error | ||
| watcher, err = watch.NewWatcher(ctx) | ||
| if err != nil { | ||
| cancelCtx() | ||
| return "", fmt.Errorf("failed to start watcher: %w", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
The watcher is created with ctx (line 84) but thoughtsCtx is canceled in the defer (line 98). The watcher goroutine will only stop when the original ctx is done, not when thoughtsCtx is canceled. This could cause the watcher to keep running longer than intended. Consider passing thoughtsCtx to watch.NewWatcher() instead of ctx to ensure the watcher stops when the SendMessage operation completes.
Register CopilotProvider as a named 'copilot' model provider in the IoC container. This makes 'copilot' the default agent type when ai.agent.model.type is not explicitly configured. Changes: - Add LlmTypeCopilot constant and CopilotProvider (copilot_provider.go) - Default GetDefaultModel() to 'copilot' when no model type is set - Register 'copilot' provider in container.go - Update init.go to set 'copilot' instead of 'github-copilot' - Update error message to list copilot as supported type Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add [copilot] and [copilot-event] prefixed log statements throughout the Copilot SDK agent pipeline for troubleshooting: - CopilotClientManager: Start/stop state transitions - CopilotAgentFactory: MCP server count, session config details, PreToolUse/PostToolUse/ErrorOccurred hook invocations - CopilotAgent.SendMessage: prompt size, response size, errors - SessionEventLogger: every event type received, plus detail for assistant.message, tool.execution_start, and assistant.reasoning Run with AZD_DEBUG=true to see log output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AgentFactory.Create() now checks the configured model type. When it's 'copilot' (the new default), it delegates to CopilotAgentFactory which creates a CopilotAgent backed by the Copilot SDK session. No call site changes needed — existing code calling AgentFactory.Create() gets the Copilot SDK agent automatically. Changes: - AgentFactory now takes CopilotAgentFactory as a dependency - Create() checks model type and delegates to CopilotAgentFactory - Register CopilotAgentFactory, CopilotClientManager, and SessionConfigBuilder in IoC container (cmd/container.go) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Set SDK LogLevel to 'debug' by default to surface the command and args the SDK uses when spawning the copilot CLI process. This will help diagnose the 'exit status 1' error during client.Start(). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Azure Dev CLI Install InstructionsInstall scriptsMacOS/Linux
bash: pwsh: WindowsPowerShell install MSI install Standalone Binary
MSI
Documentationlearn.microsoft.com documentationtitle: Azure Developer CLI reference
|
Overview
Add the GitHub Copilot SDK (github.com/github/copilot-sdk/go v0.1.25) as a new dependency and create the foundational types for a Copilot SDK-based agent implementation. All new code coexists alongside the existing langchaingo agent — no existing code is modified or deleted.
Resolves #6871, #6872, #6873, #6874, #6875
Part of Epic #6870
What's New
New Files (7)
pkg/llm/copilot_client.goCopilotClientManagerwrappingcopilot.Clientlifecycle (Start, Stop, GetAuthStatus, ListModels)pkg/llm/session_config.goSessionConfigBuilderreadsai.agent.*config keys →copilot.SessionConfigwith MCP server merging and tool controlinternal/agent/copilot_agent.goCopilotAgentimplementingAgentinterface backed bycopilot.Session.SendAndWait()internal/agent/copilot_agent_factory.goCopilotAgentFactorycreates agents with SDK client, session, permission hooks, MCP servers, event handlersinternal/agent/logging/session_event_handler.goSessionEventLogger+SessionFileLogger+CompositeEventHandlerfor SDK SessionEvent streamingNew Config Options (
resources/config_options.yaml)ai.agent.modelgpt-4.1)ai.agent.modeautopilot/interactive/planai.agent.mcp.serversai.agent.tools.availableai.agent.tools.excludedai.agent.systemMessageai.agent.copilot.logLevelTest Files (3)
pkg/llm/copilot_client_test.go— Client manager instantiation testspkg/llm/session_config_test.go— Config bridge tests (model, system message, tool control, MCP server merging)internal/agent/logging/session_event_handler_test.go— Event handler tests (assistant messages, tool start, reasoning, composite handler)Key Design Decisions
CopilotAgentimplements the sameAgentinterface as existingConversationalAzdAiAgent, so call sites can switch in Phase 2 without API changesmap[string]any: Matches SDK's flexible type definition, withconvertServerConfig()bridging from azd'smcp.ServerConfigPreToolUse/PostToolUsehooks wired as pass-through stubs; azd-specific security policies will be added in Phase 2CopilotAgent.renderThoughts()reuses the same spinner + canvas UX pattern as the existing agentTesting
go build ./...✅pkg/llm,internal/agent/...,pkg/config)