-
Notifications
You must be signed in to change notification settings - Fork 245
Description
Automated semantic analysis of all non-test Go files in pkg/workflow/ (240+ files) and utility packages (pkg/stringutil, pkg/sliceutil, etc.) identified several concrete refactoring opportunities through function clustering and duplicate detection.
Overview
| Metric | Value |
|---|---|
| Go files analyzed | 240+ (pkg/workflow) + 7 utility packages |
| Duplicate patterns found | 4 concrete cases |
| Outlier functions found | 2 |
| Estimated code reduction | ~700 lines |
| Detection method | Serena semantic analysis + naming pattern clustering |
Critical Issues
Issue 1: Duplicate Mount Syntax Validation
Two functions implement nearly identical mount string validation (source:dest:mode format) in separate files:
pkg/workflow/sandbox_validation.go→validateMountsSyntax()pkg/workflow/mcp_config_validation.go→validateMCPMountsSyntax()
Both split on :, check for exactly 3 parts, and validate that mode is "ro" or "rw". The only differences are error types (NewValidationError vs fmt.Errorf) and that the sandbox version also validates non-empty source/dest fields.
Recommendation: Extract a shared validateMountFormat(mounts []string, toolName string) error function into validation_helpers.go or a new mounts_validation.go, then call it from both files with appropriate error wrapping.
Issue 2: Near-Identical missing_data.go and missing_tool.go
pkg/workflow/missing_data.go and pkg/workflow/missing_tool.go are structurally identical (~164 lines each, ~250 duplicated lines):
View duplicate struct definitions
// missing_data.go
type MissingDataConfig struct {
BaseSafeOutputConfig `yaml:",inline"`
CreateIssue bool `yaml:"create-issue,omitempty"`
TitlePrefix string `yaml:"title-prefix,omitempty"`
Labels []string `yaml:"labels,omitempty"`
}
// missing_tool.go (IDENTICAL structure, only type name differs)
type MissingToolConfig struct {
BaseSafeOutputConfig `yaml:",inline"`
CreateIssue bool `yaml:"create-issue,omitempty"`
TitlePrefix string `yaml:"title-prefix,omitempty"`
Labels []string `yaml:"labels,omitempty"`
}Both parseMissingDataConfig() / parseMissingToolConfig() and buildCreateOutputMissingDataJob() / buildCreateOutputMissingToolJob() follow exactly the same pattern with only the output type name and environment variable prefix differing (e.g., GH_AW_MISSING_DATA_MAX vs GH_AW_MISSING_TOOL_MAX).
Recommendation: Consider a shared BaseMissingOutputConfig type and a generic buildCreateOutputMissingJob(outputType string, config BaseMissingOutputConfig) builder, reducing ~250 lines to ~80 lines of shared code. noop.go follows the same outer parse pattern and could share the same boilerplate helper.
Issue 3: Safe Output Handler Parse Function Boilerplate (11 files)
The following files all implement the same three-part pattern — a config struct embedding BaseSafeOutputConfig + SafeOutputTargetConfig, a parseXxxConfig() method on *Compiler, and a build function:
add_comment.go, add_labels.go, add_reviewer.go, assign_milestone.go, assign_to_agent.go, assign_to_user.go, hide_comment.go, link_sub_issue.go, remove_labels.go, reply_to_pr_review_comment.go, resolve_pr_review_thread.go
View duplicated parse function pattern
// add_labels.go (lines 20-40)
func (c *Compiler) parseAddLabelsConfig(outputMap map[string]any) *AddLabelsConfig {
if _, exists := outputMap["add-labels"]; !exists {
return nil
}
addLabelsLog.Print("Parsing add-labels configuration")
var config AddLabelsConfig
if err := unmarshalConfig(outputMap, "add-labels", &config, addLabelsLog); err != nil {
addLabelsLog.Printf("Failed to unmarshal config: %v", err)
return &AddLabelsConfig{}
}
return &config
}
// remove_labels.go (lines 18-38) — near-identical
func (c *Compiler) parseRemoveLabelsConfig(outputMap map[string]any) *RemoveLabelsConfig {
if _, exists := outputMap["remove-labels"]; !exists {
return nil
}
removeLabelsLog.Print("Parsing remove-labels configuration")
var config RemoveLabelsConfig
if err := unmarshalConfig(outputMap, "remove-labels", &config, removeLabelsLog); err != nil {
removeLabelsLog.Printf("Failed to unmarshal config: %v", err)
return &RemoveLabelsConfig{}
}
return &config
}The AssignToUserConfig and UnassignFromUserConfig structs are also nearly identical:
// Both have: BaseSafeOutputConfig, SafeOutputTargetConfig, Allowed []string, Blocked []string
// UnassignFromUser has one extra field: UnassignFirst *stringRecommendation: Extract a generic parse helper (similar to existing unmarshalConfig) that handles the nil-check, logging, and unmarshal boilerplate. Consider also whether assign_to_user.go and unassign_from_user.go could share a common base struct or live in one file.
Issue 4: SanitizeWorkflowName Duplication Across Packages
pkg/workflow/strings.go defines SanitizeWorkflowName() and SanitizeName(), while pkg/stringutil/identifiers.go defines NormalizeWorkflowName() — both sanitize workflow names for use in filenames/identifiers. The workflow package has a general-purpose SanitizeName() function with configurable options that overlaps with pkg/stringutil's sanitization family (SanitizeParameterName, SanitizePythonVariableName, SanitizeToolID).
Recommendation: Evaluate whether SanitizeName() with its options pattern should be moved to pkg/stringutil as the canonical sanitization primitive, with pkg/workflow/strings.go importing it. This would centralize sanitization logic.
Outlier Functions
View outlier details
Outlier 1: extractAddDirPaths() in copilot_engine.go
The function extractAddDirPaths() is a standalone package-level function in copilot_engine.go. All other engine files keep only constructor (NewXxxEngine()) and interface method implementations. This private helper function is an outlier for the file.
Recommendation: Move to copilot_engine_execution.go (where the directory add behavior is likely used), or to engine_helpers.go.
Outlier 2: generateCopilotSessionFileCopyStep() in copilot_logs.go
This function generates a YAML step for copying session files, but it lives in the log parsing file. It is closer in nature to execution/installation step generation than log parsing.
Recommendation: Move to copilot_engine_execution.go where other Copilot execution steps are generated.
Function Clusters (Well-Organized Areas)
View cluster analysis
The following areas show good organization and do not need refactoring:
Engine Interface Pattern (4 files): claude_engine.go, codex_engine.go, gemini_engine.go, copilot_engine.go all correctly implement the same engine interface. The per-engine split is intentional and appropriate.
Bundler Validation (3 files): bundler_runtime_validation.go, bundler_safety_validation.go, bundler_script_validation.go each validate genuinely distinct concerns (runtime mixing, bundle completeness, API compatibility) and are correctly separated.
Firewall Validation (2 files): firewall_validation.go (log-level validation) and network_firewall_validation.go (cross-field dependency validation like allow-urls requires ssl-bump) serve different validation dimensions and are appropriately split.
compiler_safe_outputs_ series (8 files):* Clear responsibility split by concern (job building, step building, environment setup, specialized handlers). Well-organized.
Utility Packages: pkg/sliceutil, pkg/mathutil, pkg/fileutil, pkg/gitutil, pkg/envutil, pkg/timeutil are all cleanly scoped with no cross-package duplication detected.
Recommendations by Priority
Priority 1 — High Impact, Low Risk
- Extract shared mount validation (
validateMountsSyntax/validateMCPMountsSyntax) into a single function invalidation_helpers.go - Move
extractAddDirPaths()fromcopilot_engine.gotocopilot_engine_execution.go - Move
generateCopilotSessionFileCopyStep()fromcopilot_logs.gotocopilot_engine_execution.go
Priority 2 — Medium Impact
- Extract generic parse helper for the repeated
parseXxxConfig()boilerplate across the 11 safe output handler files - Consolidate
missing_data.goandmissing_tool.gobuild functions via a shared base type or factory
Priority 3 — Long-term
- Evaluate moving
SanitizeName()topkg/stringutilas the canonical sanitization primitive
Implementation Checklist
- Extract
validateMountFormat()helper from sandbox and MCP mount validators - Relocate
extractAddDirPaths()tocopilot_engine_execution.go - Relocate
generateCopilotSessionFileCopyStep()tocopilot_engine_execution.go - Extract generic
parseOutputConfig[T]()helper or similar to reduce parse boilerplate - Explore shared base for
MissingDataConfig/MissingToolConfig - Evaluate consolidating sanitization primitives in
pkg/stringutil - Ensure all tests pass after each refactoring step
References
- Workflow run: §22281474970
- Analysis date: 2026-02-22
- Analysis scope: 240+ non-test
.gofiles inpkg/workflow/, 7 utility packages
Generated by Semantic Function Refactoring
- expires on Feb 24, 2026, 5:12 PM UTC