-
Notifications
You must be signed in to change notification settings - Fork 35
Closed
Description
🔍 Duplicate Code Detected: Dual sanitize content implementations
Analysis of commit b8c044b
Assignee: @copilot
Summary
Two separate sanitization pipelines exist: sanitize_content_old.cjs implements a full content sanitizer while sanitize_content_core.cjs provides the current implementation used by sanitize_content.cjs. The old file reimplements the same steps (command/mention neutralization, URL protocol/domain filtering, truncation) over hundreds of lines with no call sites, creating dead but duplicated logic likely to diverge from the active sanitizer.
Duplication Details
Pattern: Parallel sanitization logic maintained in two files
- Severity: Medium
- Occurrences: 2 (two full sanitizer implementations)
- Locations:
pkg/workflow/js/sanitize_content_old.cjs(lines 106-401)pkg/workflow/js/sanitize_content_core.cjs(lines 333-405 plus shared helpers)
- Code Sample:
// sanitize_content_old.cjs (lines ~171-200)
const lines = sanitized.split("\n");
const maxLines = 65000;
maxLength = maxLength || 524288;
if (lines.length > maxLines) {
const truncationMsg = "\n[Content truncated due to line count]";
const truncatedLines = lines.slice(0, maxLines).join("\n") + truncationMsg;
if (truncatedLines.length > maxLength) {
sanitized = truncatedLines.substring(0, maxLength - truncationMsg.length) + truncationMsg;
} else {
sanitized = truncatedLines;
}
} else if (sanitized.length > maxLength) {
sanitized = sanitized.substring(0, maxLength) + "\n[Content truncated due to length]";
}// sanitize_content_core.cjs (lines ~333-351)
function applyTruncation(content, maxLength) {
maxLength = maxLength || 524288;
const lines = content.split("\n");
const maxLines = 65000;
if (lines.length > maxLines) { /* same truncation path */ }
else if (content.length > maxLength) { /* same length truncation */ }
return content;
}Both files also duplicate mention/command neutralization and URL protocol/domain filtering logic.
Impact Analysis
- Maintainability: Bug fixes or safety improvements may be applied to one sanitizer and forgotten in the other, leaving dead code or future misuse.
- Bug Risk: Accidental reuse of the outdated module would diverge from current security rules (e.g., allowed aliases vs full neutralization), making behavior unpredictable.
- Code Bloat: Hundreds of lines of duplicate sanitization logic inflate the JS surface area without runtime benefit.
Refactoring Recommendations
-
Remove or alias legacy module
- Delete
sanitize_content_old.cjsor make it re-export the core implementation to ensure a single sanitizer path. - Estimated effort: <1 hour; benefits: eliminates dead duplicate logic and drift risk.
- Delete
-
Document single entry point
- Clearly export one public sanitizer (from
sanitize_content.cjs) and add a small test ensuring no stale modules are used. - Estimated effort: 1 hour; benefits: future contributors avoid reviving outdated code.
- Clearly export one public sanitizer (from
Implementation Checklist
- Review duplication findings
- Prioritize refactoring tasks
- Create refactoring plan
- Implement changes
- Update tests
- Verify no functionality broken
Analysis Metadata
- Analyzed Files: 2
- Detection Method: Serena semantic code analysis
- Commit: b8c044b
- Analysis Date: 2024-XX-XX
AI generated by Duplicate Code Detector
Copilot