Code Review
ckb review runs 20 quality checks in a single command and returns a verdict, findings, and suggested reviewers. It replaces the need to wire up individual gates manually.
Quick links: CI-CD-Integration for GitHub Action setup · Quality Gates for individual gate thresholds · Workflow Examples for production templates
# Review current branch against main
ckb review
# Custom base branch
ckb review --base=develop
# Review staged changes only (pre-commit)
ckb review --staged
# Scope to a path prefix or symbol
ckb review internal/query/
ckb review --scope=Engine
# Only run specific checks
ckb review --checks=breaking,secrets,health
# New analyzers
ckb review --checks=dead-code,test-gaps,blast-radius --max-fanout=20
# Bug pattern detection
ckb review --checks=bug-patterns
# CI mode (exit 1=fail, 2=warn)
ckb review --ci --format=json
# Markdown for PR comments
ckb review --format=markdown
# With AI-generated narrative summary
ckb review --llm
ckb review orchestrates 20 checks in three priority tiers. All checks run concurrently; tree-sitter calls are serialized via a mutex so git subprocess work overlaps with analysis.
| Check | What It Does | Detector | Gate Type |
|---|---|---|---|
| breaking | API breaking change detection (removed symbols, changed signatures) | SCIP CompareAPI | fail |
| secrets | Credential and secret scanning (entropy ≥ 3.5 + pattern matching) | Entropy scanner + allowlist | fail |
| critical | Safety-critical path enforcement (requires config) | Glob pattern matching | fail |
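The entropy threshold used by the secrets check can be illustrated with a small sketch. It assumes the 3.5 figure refers to Shannon entropy in bits per character, which the table does not state explicitly; `looks_like_secret` and its allowlist argument are hypothetical names, and the pattern-matching half of the check is not modelled.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 3.5  # threshold from the table; unit (bits/char) is an assumption


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_secret(token: str, allowlist=frozenset()) -> bool:
    """Hypothetical helper: flag high-entropy tokens that are not allowlisted."""
    return token not in allowlist and shannon_entropy(token) >= ENTROPY_THRESHOLD
```

A token made of one repeated character has entropy 0, while a token of 16 distinct characters has entropy 4.0 and would be flagged unless it is allowlisted.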
| Check | What It Does | Detector | Gate Type |
|---|---|---|---|
| complexity | Cyclomatic/cognitive complexity delta per function | tree-sitter AST | warn |
| health | 8-factor weighted code health score (A-F grades) | Multi-source (see below) | warn |
| risk | Multi-factor risk score (file count, LOC, hotspots, module spread) | Heuristic | warn |
| coupling | Missing co-changed files (≥70% correlation, 365-day window) | Git history | warn |
| dead-code | Unreferenced symbols and constants in changed files | SCIP + reference counting | warn |
| blast-radius | High fan-out symbol detection; informational unless threshold set | SCIP AnalyzeImpact | warn/info |
| bug-patterns | High-confidence Go AST bug patterns (10 rules, see below) | tree-sitter AST | warn |
| Check | What It Does | Detector | Gate Type |
|---|---|---|---|
| hotspots | Overlap with volatile/high-churn files (score > 0.5) | Churn ranking | info |
| tests | Affected test coverage | GetAffectedTests | warn* |
| test-gaps | Untested functions in changed files (≥5 lines) | tree-sitter + coverage | info |
| comment-drift | Numeric constant vs comment mismatch (e.g., // 256 above const MaxSize = 512) | Regex scan | info |
| format-consistency | Cross-formatter output divergence (Human vs Markdown) | Function pair analysis | info |
| generated | Generated file detection and exclusion from all checks | Glob + header markers | info |
| traceability | Commit-to-ticket linkage (e.g., JIRA-\d+) | Regex + branch name | warn |
| independence | Author != reviewer enforcement (regulated industries) | Commit author extraction | warn |
| classify | Change classification (new/refactor/moved/churn/test/config) | Heuristics | n/a |
| split | Large PR split suggestion with cluster analysis | BFS graph clustering | warn |
* tests only warns when requireTests: true is set.
# Run all (default)
ckb review
# Only security-related
ckb review --checks=breaking,secrets
# Only bug patterns
ckb review --checks=bug-patterns
# Skip slow checks
ckb review --checks=breaking,secrets,tests,coupling
The bug-patterns check uses tree-sitter AST analysis to detect 10 high-confidence Go bug patterns. Only newly introduced patterns are reported; pre-existing issues from the base branch are filtered out.
| # | Rule | What It Catches |
|---|---|---|
| 1 | defer-in-loop | defer inside a loop; cleanup won't run per iteration, only at function exit |
| 2 | unreachable-code | Statements after return or panic |
| 3 | empty-error-branch | if err != nil { } with no body; silently swallowed error |
| 4 | unchecked-type-assert | x.(T) without ok check; panic risk at runtime |
| 5 | self-assignment | a = a, a no-op assignment |
| 6 | nil-after-deref | Pointer dereference before nil check |
| 7 | identical-branches | if/else with identical code in both branches |
| 8 | shadowed-err | Inner := redeclares outer err variable |
| 9 | discarded-error | Function call returning error with no assignment |
| 10 | missing-defer-close | Resource from Open/Create without defer Close() |
Scope: First 20 non-test .go files in the changeset.
Deduplication: Findings are compared against the base branch. Only patterns newly introduced by the PR are reported. Pre-existing bugs show as "pass" with a note.
By default, CKB Review only reports issues on lines actually changed in the PR, not pre-existing problems in unchanged code.
- Parse unified diff to extract changed line numbers per file
- Build a map: file → set of changed line numbers
- After all checks run, filter findings:
  - Keep: File-level findings (StartLine == 0)
  - Keep: Findings on changed lines
  - Keep: Findings on files not in the diff map (safety fallback)
  - Drop: Findings on unchanged lines
In a codebase with historical technical debt, hundreds of pre-existing warnings would flood the report without Hold-the-Line. With it enabled, reviewers see only what this PR introduced.
{
"holdTheLine": true
}
Set holdTheLine to false to see all findings, including pre-existing ones.
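The Hold-the-Line filtering steps can be sketched roughly as follows. This is an illustrative approximation, not CKB's implementation: `changed_lines` and `hold_the_line` are hypothetical names, findings are assumed to be dicts with `file` and `startLine` keys, and the diff parser handles only the common unified-diff layout.

```python
import re
from collections import defaultdict

# Matches hunk headers such as "@@ -1,3 +1,4 @@" and captures the new-file start line.
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text):
    """Map each file path to the set of line numbers added or modified in the diff."""
    changed = defaultdict(set)
    path, new_line = None, 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            target = line[4:].strip()
            if target == "/dev/null":          # deleted file: nothing to track
                path = None
            else:
                path = target[2:] if target.startswith("b/") else target
            continue
        match = HUNK.match(line)
        if match:
            new_line = int(match.group(1))
            continue
        if path is None or line.startswith("---"):
            continue
        if line.startswith("+"):               # added line in the new file
            changed[path].add(new_line)
            new_line += 1
        elif line.startswith(" "):             # context line: advance only
            new_line += 1
    return changed


def hold_the_line(findings, changed):
    """Keep file-level findings, findings on changed lines, and findings on unmapped files."""
    kept = []
    for f in findings:
        lines = changed.get(f["file"])
        if f["startLine"] == 0 or lines is None or f["startLine"] in lines:
            kept.append(f)
    return kept
```

Findings on files absent from the map are kept on purpose, mirroring the safety fallback described in the list.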
| Format | Flag | Use Case |
|---|---|---|
| Human | --format=human | Terminal output with colors and icons (default) |
| JSON | --format=json | Machine-readable, CI pipelines, AI consumption |
| Markdown | --format=markdown | PR comments (GitHub, GitLab) |
| GitHub Actions | --format=github-actions | Inline ::error/::warning annotations |
| SARIF | --format=sarif | GitHub Code Scanning, GitLab SAST |
| CodeClimate | --format=codeclimate | GitLab Code Quality |
| Compliance | --format=compliance | Audit trail (IEC 61508, DO-178C, ISO 26262) |
Configure quality gates via CLI flags or .ckb/config.json:
{
"review": {
"blockBreakingChanges": true,
"blockSecrets": true,
"requireTests": false,
"maxRiskScore": 0.7,
"maxComplexityDelta": 0,
"maxFiles": 0,
"maxBlastRadiusDelta": 0,
"maxFanOut": 0,
"deadCodeMinConfidence": 0.8,
"testGapMinLines": 5,
"failOnLevel": "error",
"holdTheLine": true,
"splitThreshold": 50,
"criticalPaths": ["drivers/**", "protocol/**"],
"criticalSeverity": "error",
"generatedPatterns": ["*.pb.go", "*.generated.*", "*.pb.cc", "parser.tab.c", "lex.yy.c"],
"generatedMarkers": ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"],
"traceabilityPatterns": [],
"traceabilitySources": ["commit-message", "branch-name"],
"requireTraceability": false,
"requireTraceForCriticalPaths": false,
"requireIndependentReview": false,
"minReviewers": 1
}
}
Policy values are resolved in this order, with later sources overriding earlier ones:
- Hardcoded defaults (DefaultReviewPolicy)
- Repository config from .ckb/config.json (overrides defaults)
- CLI flags (override everything)
# Quality gates
ckb review --block-breaking=true --block-secrets=true
ckb review --require-tests --max-risk=0.8
ckb review --max-complexity=10 --max-files=100
ckb review --max-fanout=20 # blast-radius threshold
ckb review --dead-code-confidence=0.9 # stricter dead code
ckb review --test-gap-lines=10 # only flag larger functions
# Verdict control
ckb review --fail-on=warning # fail on warnings too
ckb review --fail-on=none # never fail (informational)
# Safety-critical paths
ckb review --critical-paths="drivers/**,auth/**"
# Traceability
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"
# Reviewer independence
ckb review --require-independent --min-reviewers=2
# AI narrative
ckb review --llm
The health check computes a 0-100 score per file using 8 weighted factors:
| Factor | Weight | Source | Scale |
|---|---|---|---|
| Cyclomatic complexity | 25% | tree-sitter | ≤5: 100, 6-10: 85, 11-20: 65, 21-30: 40, >30: 20 |
| Cognitive complexity | 15% | tree-sitter | Same scale as cyclomatic |
| File size (LOC) | 10% | Line count | ≤100: 100, 101-300: 85, 301-500: 70, 501-1000: 50, >1000: 30 |
| Churn (commits/30d) | 15% | git log | ≤2: 100, 3-5: 80, 6-10: 60, 11-20: 40, >20: 20 |
| Coupling (co-changes) | 10% | git log | ≤2: 100, 3-5: 80, 6-10: 60, >10: 40 |
| Bus factor (authors) | 10% | git blame | ≥5: 100, 3-4: 85, 2: 60, 1: 30 |
| Age (days since change) | 15% | git log | ≤30: 100, 31-90: 85, 91-180: 70, 181-365: 50, >365: 30 |
| Grade | Score Range |
|---|---|
| A | 90-100 |
| B | 70-89 |
| C | 50-69 |
| D | 30-49 |
| F | 0-29 |
The confidence factor (0-1) indicates how reliable the score is:
- Reduced by 0.4 if tree-sitter parsing fails
- Reduced by 0.3 if all metrics are at default (no git data)
- Reduced by 0.2 if only bus factor is at default
Health is computed for both base and head versions. Findings are generated when a file degrades by more than 5 points.
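As a rough illustration of how the weighted score and grade fit together, the sketch below uses the seven factors that the table above actually lists (their weights sum to 100%). The function and key names are hypothetical, and the confidence factor and base/head comparison are not modelled.

```python
def bucket(value, steps, floor):
    """Return the score of the first (upper_bound, score) step the value fits under."""
    for upper, score in steps:
        if value <= upper:
            return score
    return floor


# (steps, floor) pairs transcribed from the Scale column of the table.
COMPLEXITY = ([(5, 100), (10, 85), (20, 65), (30, 40)], 20)
FILE_SIZE = ([(100, 100), (300, 85), (500, 70), (1000, 50)], 30)
CHURN = ([(2, 100), (5, 80), (10, 60), (20, 40)], 20)
COUPLING = ([(2, 100), (5, 80), (10, 60)], 40)
AGE = ([(30, 100), (90, 85), (180, 70), (365, 50)], 30)

# Weights in percent, as listed in the table.
WEIGHTS = {"cyclomatic": 25, "cognitive": 15, "loc": 10, "churn": 15,
           "coupling": 10, "authors": 10, "age_days": 15}


def bus_factor_score(authors):
    if authors >= 5:
        return 100
    if authors >= 3:
        return 85
    return 60 if authors == 2 else 30


def health_score(m):
    """m is a dict of raw metrics keyed like WEIGHTS (hypothetical input shape)."""
    parts = {
        "cyclomatic": bucket(m["cyclomatic"], *COMPLEXITY),
        "cognitive": bucket(m["cognitive"], *COMPLEXITY),
        "loc": bucket(m["loc"], *FILE_SIZE),
        "churn": bucket(m["churn"], *CHURN),
        "coupling": bucket(m["coupling"], *COUPLING),
        "authors": bus_factor_score(m["authors"]),
        "age_days": bucket(m["age_days"], *AGE),
    }
    return sum(WEIGHTS[k] * parts[k] for k in WEIGHTS) / 100


def grade(score):
    for letter, low in (("A", 90), ("B", 70), ("C", 50), ("D", 30)):
        if score >= low:
            return letter
    return "F"
```

For example, a file with cyclomatic complexity 8, cognitive complexity 4, 250 LOC, 3 recent commits, 1 co-change, 2 authors, and 45 days since its last change scores 85.5 under these assumptions, which maps to grade B.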
The review score starts at 100 and deducts points per finding:
| Severity | Points | Cap Per Check | Cap Per Rule |
|---|---|---|---|
| error | -10 | 20 | 10 |
| warning | -3 | 20 | 10 |
| info | -1 | 20 | 10 |
Maximum total deduction: 80 points (score floor: 0)
Three caps prevent noise from dominating the score:
- Per-rule cap (10): A single noisy rule (e.g., ckb/bug/discarded-error) can't consume its check's entire budget.
- Per-check cap (20): A single check (e.g., coupling with 100+ findings) can't overwhelm the score.
- Total cap (80): Large PRs where many checks fire don't produce meaningless scores.
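The interaction of the three caps can be sketched as below. The order of application (rule, then check, then total) is an assumption, as is the shape of a finding (a dict with `check`, `ruleId`, and `severity`); `review_score` is a hypothetical name.

```python
from collections import defaultdict

POINTS = {"error": 10, "warning": 3, "info": 1}
RULE_CAP, CHECK_CAP, TOTAL_CAP = 10, 20, 80


def review_score(findings):
    """Start at 100 and deduct capped points per rule, per check, and in total."""
    per_rule = defaultdict(int)
    for f in findings:
        key = (f["check"], f["ruleId"])
        per_rule[key] = min(RULE_CAP, per_rule[key] + POINTS[f["severity"]])

    per_check = defaultdict(int)
    for (check, _rule), points in per_rule.items():
        per_check[check] = min(CHECK_CAP, per_check[check] + points)

    total = min(TOTAL_CAP, sum(per_check.values()))
    return max(0, 100 - total)
```

Under these assumptions, five errors from the same rule cost only 10 points, and three different rules firing errors in one check cost 20 rather than 30.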
fail-on-level = "error" (default), "warning", or "none"
"none" → always pass
"warning" → fail if any check has status "fail" or "warn"
"error" → fail if any check has status "fail"; warn if any has "warn"; otherwise pass
CI exit codes: 0 = pass, 1 = fail, 2 = warn
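The verdict rules and exit codes above translate into a few lines of code. This sketch assumes each check reports a status string of "pass", "warn", or "fail"; `verdict` and `EXIT_CODES` are illustrative names.

```python
EXIT_CODES = {"pass": 0, "fail": 1, "warn": 2}


def verdict(check_statuses, fail_on="error"):
    """Derive the overall verdict from per-check statuses and the fail-on level."""
    if fail_on == "none":
        return "pass"
    has_fail = "fail" in check_statuses
    has_warn = "warn" in check_statuses
    if fail_on == "warning":
        return "fail" if (has_fail or has_warn) else "pass"
    if fail_on == "error":
        if has_fail:
            return "fail"
        return "warn" if has_warn else "pass"
    raise ValueError("unknown fail-on level: %r" % fail_on)
```

A CI wrapper could then exit with `EXIT_CODES[verdict(statuses, fail_on)]`.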
CKB estimates human review effort based on Microsoft and Google code review research.
| Code Category | Review Speed |
|---|---|
| New code | 200 LOC/hour |
| Refactored/modified | 300 LOC/hour |
| Moved/test/config | 500 LOC/hour (quick scan) |
| Factor | Overhead |
|---|---|
| File switches (> 5 files) | 2 min per file |
| Module context switches (> 1 module) | 5 min per module beyond first |
| Safety-critical files | 10 min each |
| Minimum | 5 minutes |
| Level | Estimated Time |
|---|---|
| Trivial | < 20 min |
| Moderate | 20-60 min |
| Complex | 60-240 min |
| Very complex | > 240 min |
Estimated Review: ~45min (moderate)
· 20 min from 120 LOC
· 10 min from 5 file switches
· 15 min for 3 critical files
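A minimal sketch of how the speed and overhead tables could combine into an estimate follows. Several details are assumptions rather than documented behaviour: the file-switch overhead is charged per file beyond the fifth, results are rounded to whole minutes, and the overlapping level boundaries (60 and 240) are treated as inclusive upper bounds. All names are hypothetical.

```python
# Review speed in LOC/hour per change category, from the table above.
LOC_PER_HOUR = {"new": 200, "refactoring": 300, "modified": 300,
                "moved": 500, "test": 500, "config": 500}


def estimate_minutes(loc_by_category, files, modules, critical_files):
    """Estimate review time in minutes; generated files are skipped entirely."""
    minutes = sum(60 * loc / LOC_PER_HOUR[cat]
                  for cat, loc in loc_by_category.items()
                  if cat != "generated")
    minutes += 2 * max(0, files - 5)      # assumption: per file beyond the fifth
    minutes += 5 * max(0, modules - 1)    # per module beyond the first
    minutes += 10 * critical_files        # safety-critical files
    return max(5, round(minutes))         # 5-minute minimum


def effort_level(minutes):
    if minutes < 20:
        return "trivial"
    if minutes <= 60:
        return "moderate"
    if minutes <= 240:
        return "complex"
    return "very complex"
```

For instance, 100 lines of new code in 3 files within one module comes to 30 minutes, which is labelled moderate.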
CKB generates a 2-3 sentence review summary automatically.
Sentence 1: "Changes N files across M modules (languages)."
Sentence 2: Risk signal (failed checks, top warnings, or "No blocking issues found.")
Sentence 3: Focus area (split suggestion, critical files, or omitted)
Example: "Changes 25 files across 3 modules (Go, TypeScript). 2 breaking API changes detected; 2 safety-critical files changed. 2 safety-critical files need focused review."
With --llm, Claude generates a context-aware summary based on the top 10 findings, verdict, score, and health report.
- Model: claude-sonnet-4-20250514 (configurable)
- Timeout: 30 seconds
- Fallback: deterministic narrative on API failure
The classify check categorizes each file in the changeset:
| Category | Heuristic | Review Priority |
|---|---|---|
| new | File doesn't exist at base | High |
| refactoring | Renamed + changes | Medium |
| moved | Renamed + ≤20% content change | Low |
| churn | ≥3 commits in last 30 days | High |
| config | Makefiles, CI, Docker, etc. | Low |
| test | Test file patterns (*_test.go, test_*.py, etc.) | Medium |
| generated | Matches generated patterns/markers | Skip |
| modified | Default: none of the above | Medium |
The review effort estimate uses classification to adjust time: generated files are skipped, tests review faster, new code gets full review time.
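The classification heuristics can be read as a small decision chain. The table does not say in which order the rules are evaluated, so the precedence below (generated, test, config, new, renamed, churn, modified) is purely an assumption, and `FileChange` is a hypothetical input record.

```python
from dataclasses import dataclass


@dataclass
class FileChange:
    exists_at_base: bool = True
    renamed: bool = False
    change_ratio: float = 0.0   # fraction of content changed, 0..1
    commits_30d: int = 0
    is_generated: bool = False
    is_test: bool = False
    is_config: bool = False


def classify(change):
    """Return the category name for one changed file (assumed precedence)."""
    if change.is_generated:
        return "generated"
    if change.is_test:
        return "test"
    if change.is_config:
        return "config"
    if not change.exists_at_base:
        return "new"
    if change.renamed:
        return "moved" if change.change_ratio <= 0.20 else "refactoring"
    if change.commits_30d >= 3:
        return "churn"
    return "modified"
```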
Generated files are detected early and excluded from all other checks. This is the most important token-saving mechanism: in a monorepo with Protocol Buffers, this alone can eliminate 30-50% of files.
*.generated.*
*.pb.go # Protocol Buffers (Go)
*.pb.cc # Protocol Buffers (C++)
parser.tab.c # Yacc/Bison
lex.yy.c # Flex
"DO NOT EDIT"
"Generated by"
"AUTO-GENERATED"
"This file is generated"
Both patterns and markers are configurable via policy.
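Detection by glob pattern and header marker might look like the sketch below. It assumes patterns are matched against the file's base name and markers against the first lines of the file, neither of which the text specifies; `is_generated` is a hypothetical helper.

```python
import posixpath
from fnmatch import fnmatchcase

# Default patterns and markers as listed above.
PATTERNS = ["*.pb.go", "*.generated.*", "*.pb.cc", "parser.tab.c", "lex.yy.c"]
MARKERS = ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"]


def is_generated(path, head="", patterns=PATTERNS, markers=MARKERS):
    """Return True if the file name matches a pattern or its header contains a marker.

    `head` is the leading text of the file (how much is read is an assumption).
    """
    name = posixpath.basename(path)
    if any(fnmatchcase(name, pattern) for pattern in patterns):
        return True
    return any(marker in head for marker in markers)
```

Marker matching here is case-sensitive, so a header reading "Code generated by protoc. DO NOT EDIT." is caught by the "DO NOT EDIT" marker rather than by "Generated by".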
When a PR exceeds splitThreshold files (default: 50), the split check analyzes the changeset and suggests independent clusters.
- Build adjacency graph of files
- Connect files in same module (fully connected)
- Enrich edges using coupling analysis (top 20 files, correlation ≥ 0.5)
- Find connected components via BFS
- If ≤1 component → no split recommended
- If >1 component → return independent clusters
Each cluster reports:
- Name: dominant module
- Files: list with additions/deletions
- Languages: detected languages
- Independent: always true (connected components are independent by definition)
Sorted by file count descending.
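The clustering steps can be sketched with a plain adjacency map and breadth-first search. This is an approximation under stated assumptions: input is a hypothetical `{module: [files]}` mapping plus a list of already-filtered coupled file pairs, and the top-20 and correlation ≥ 0.5 filters are assumed to have been applied upstream.

```python
from collections import deque


def split_clusters(files_by_module, coupled_pairs=()):
    """Return independent file clusters, largest first, or [] if no split is recommended."""
    # Build the adjacency map; every changed file is a node.
    adj = {f: set() for files in files_by_module.values() for f in files}
    for files in files_by_module.values():
        for a in files:                       # files in one module are fully connected
            adj[a].update(b for b in files if b != a)
    for a, b in coupled_pairs:                # coupling edges may bridge modules
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)

    seen, clusters = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:                          # BFS over one connected component
            node = queue.popleft()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(component))

    clusters.sort(key=len, reverse=True)
    return clusters if len(clusters) > 1 else []
```

Two modules joined by a coupling edge collapse into a single cluster, so a split is suggested only when some group of files has no module or coupling link to the rest.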
The risk check computes a 0-1 risk score from four factors:
| Factor | Contribution |
|---|---|
| > 20 files changed | +0.3 (> 10: +0.15) |
| > 1000 LOC changes | +0.3 (> 500: +0.15) |
| Hotspot overlap | +0.1 per hotspot (max 0.3) |
| > 5 modules affected | +0.2 |
Risk levels: low (< 0.3), medium (0.3-0.6), high (> 0.6)
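The factor table can be turned into a short scoring function. The sketch below clamps the result to 1.0 because the four maximum contributions add up to 1.1 while the score is described as 0-1; the clamp, the rounding to two decimals, and the function names are assumptions.

```python
def risk_score(files, loc, hotspots, modules):
    """Combine the four risk factors into a 0-1 score."""
    score = 0.0
    if files > 20:
        score += 0.3
    elif files > 10:
        score += 0.15
    if loc > 1000:
        score += 0.3
    elif loc > 500:
        score += 0.15
    score += min(0.3, 0.1 * hotspots)   # +0.1 per hotspot, capped at 0.3
    if modules > 5:
        score += 0.2
    return min(1.0, round(score, 2))    # clamp is an assumption


def risk_level(score):
    if score < 0.3:
        return "low"
    if score <= 0.6:
        return "medium"
    return "high"
```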
For regulated industries (IEC 61508, DO-178C, ISO 26262):
# Require ticket references in commits
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"
# Require independent reviewer (author != reviewer)
ckb review --require-independent --min-reviewers=2
# Safety-critical path enforcement
ckb review --critical-paths="drivers/**,protocol/**"
# Full compliance output
ckb review --format=compliance
Configure where to look for ticket references:
- commit-message: scan commit messages
- branch-name: scan the branch name
Both are checked by default when traceability is enabled.
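Ticket detection across the two sources amounts to running each configured regex over the selected texts. The helper below is a hypothetical sketch; its name, arguments, and return shape are not taken from CKB.

```python
import re


def find_ticket_refs(patterns, commit_messages=(), branch_name="",
                     sources=("commit-message", "branch-name")):
    """Return ticket references found in the enabled sources, in first-seen order."""
    compiled = [re.compile(p) for p in patterns]
    texts = []
    if "commit-message" in sources:
        texts.extend(commit_messages)
    if "branch-name" in sources:
        texts.append(branch_name)

    refs = []
    for text in texts:
        for regex in compiled:
            for match in regex.finditer(text):
                if match.group(0) not in refs:
                    refs.append(match.group(0))
    return refs
```

An empty result would correspond to a change with no ticket linkage, which is what a traceability requirement would flag.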
Track finding trends across releases:
# Save current findings as a baseline
ckb review baseline save --tag=v1.0
# List saved baselines
ckb review baseline list
# Compare two baselines
ckb review baseline diff v1.0 v2.0
Baseline diffs classify each finding as new, unchanged, or resolved.
Storage: .ckb/baselines/ (git-ignored)
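Conceptually, a baseline diff is a set comparison over finding identities. How CKB identifies a finding across baselines is not documented here, so the key used below (rule ID, file, message) is an assumption, as are the function names.

```python
def finding_key(finding):
    """Identity of a finding across baselines (assumed: rule, file, and message)."""
    return (finding["ruleId"], finding["file"], finding["message"])


def diff_baselines(old_findings, new_findings):
    """Classify findings as new, unchanged, or resolved between two baselines."""
    old_keys = {finding_key(f) for f in old_findings}
    new_keys = {finding_key(f) for f in new_findings}
    return {
        "new": sorted(new_keys - old_keys),
        "unchanged": sorted(new_keys & old_keys),
        "resolved": sorted(old_keys - new_keys),
    }
```

Line numbers are deliberately left out of the key in this sketch so that a finding that merely moved within a file is not reported as both resolved and new.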
When used via MCP (reviewPR tool), CKB Review dramatically reduces token consumption for AI code reviewers. Five mechanisms work together:
Generated files are excluded before any analysis. Protocol Buffers, build artifacts, lock files — none of this reaches the AI.
Savings: 30-50% of files in typical monorepos.
Instead of a 10KB+ unified diff, the AI receives a structured JSON response:
{
"verdict": "warn",
"score": 72,
"findings": [
{
"check": "breaking",
"severity": "error",
"file": "api/handler.go",
"startLine": 42,
"message": "Removed public function HandleAuth()",
"suggestion": "Update all call sites or provide a wrapper",
"tier": 1
}
]
}
Savings: 80-90% smaller than raw diff.
Only new issues are reported. Historical debt is filtered out.
Savings: 40-70% fewer findings in codebases with existing debt.
Top 10 findings by tier + severity. Tier 3 (informational) suppressed for AI summary.
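Tier filtering can be sketched as a filter followed by a sort. The exact ordering CKB applies within a tier is not specified beyond "tier + severity", so the severity ranking below is an assumption and `top_findings` is a hypothetical name.

```python
SEVERITY_RANK = {"error": 0, "warning": 1, "info": 2}


def top_findings(findings, limit=10):
    """Drop tier-3 findings, then return the first `limit` by tier and severity."""
    visible = [f for f in findings if f["tier"] != 3]
    visible.sort(key=lambda f: (f["tier"], SEVERITY_RANK[f["severity"]]))
    return visible[:limit]
```

Because Python's sort is stable, findings with the same tier and severity keep their original relative order.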
Single numbers replace pages of analysis:
- "riskScore": 0.42 instead of reading all changed files
- "healthDelta": -0.8 instead of per-file health records
| Step | Remaining Volume |
|---|---|
| Raw diff (600 files) | 100% (~500K-1M tokens) |
| After generated-file filtering | ~60% |
| After Hold-the-Line | ~25% |
| Structured findings | ~5% |
| After tier filtering | ~2-3% (~10K-30K tokens) |
Typical savings: 85-95% token reduction on PRs with 50+ files.
CKB Review is optimized for large PRs. The engine runs 20 checks in parallel with a mutex-based tree-sitter scheduler that overlaps git subprocess work with static analysis.
| PR Size | Runtime | Git Calls | Tree-Sitter Calls |
|---|---|---|---|
| 10 files | ~2s | ~15 | ~20 |
| 100 files | ~8s | ~40 | ~60 |
| 600 files | ~15s | ~50 (batched) | ~60 (capped) |
Without lock (run in parallel): breaking, secrets, tests, hotspots, risk, coupling, dead-code, blast-radius, critical, traceability, independence
With tree-sitter mutex (serialized): complexity, health, test-gaps, bug-patterns, comment-drift, format-consistency
Lock-free checks typically complete within 200ms while tree-sitter checks process their queue. Net effect: ~6-8x speedup vs sequential execution.
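The scheduling idea, with every check submitted at once and only the tree-sitter checks contending for a shared lock, can be sketched with a thread pool. This is an analogy in Python rather than CKB's own code; `run_checks` and its argument shapes are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

tree_sitter_lock = threading.Lock()  # serializes parser access


def run_checks(lock_free, tree_sitter):
    """Run all checks concurrently; tree-sitter checks take turns under one mutex.

    Both arguments map a check name to a zero-argument callable returning a result.
    """
    def guarded(fn):
        def wrapper():
            with tree_sitter_lock:
                return fn()
        return wrapper

    jobs = dict(lock_free)
    jobs.update({name: guarded(fn) for name, fn in tree_sitter.items()})

    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {name: pool.submit(fn) for name, fn in jobs.items()}
        return {name: future.result() for name, future in futures.items()}
```

Lock-free checks never touch the mutex, so they can finish while the guarded checks are still queued behind one another.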
- Batched git operations: one git log --name-only replaces 120+ individual calls for health scoring
- Parallel git blame: 5-worker pool instead of sequential calls
- Cached hotspot scores: computed once, shared across all checks
- Capped analysis: health limited to 30 files, coupling to 20, bug-patterns to 20
CKB provides a composite action at action/ckb-review:
- uses: SimplyLiz/CodeMCP/action/ckb-review@main
with:
fail-on: 'error' # or 'warning' / 'none'
comment: 'true' # post PR comment
sarif: 'true' # upload to Code Scanning
checks: '' # all checks (or comma-separated subset)
critical-paths: 'drivers/**'
require-trace: 'false'
trace-patterns: ''
require-independent: 'false'
max-fanout: '0' # blast-radius threshold (0 = disabled)
dead-code-confidence: '0.8'
    test-gap-lines: '5'
Outputs: verdict (pass/warn/fail), score (0-100), findings (count)
For a complete workflow example, see the pr-review.yml template or Workflow Examples#pr-review.
The reviewPR MCP tool exposes the same engine to AI assistants:
{
"tool": "reviewPR",
"arguments": {
"baseBranch": "main",
"checks": ["breaking", "secrets", "health", "bug-patterns"],
"failOnLevel": "error",
"criticalPaths": ["drivers/**"]
}
}
Returns the full review response: verdict, score, checks, findings, health report, split suggestion, change breakdown, reviewers, review effort estimate, and narrative summary.
ckb review [scope]
Flags:
--format=human|json|markdown|github-actions|sarif|codeclimate|compliance
--base=main # Base branch
--head= # Head branch (default: HEAD)
--checks=breaking,secrets,... # Filter to specific checks
--ci # CI mode: exit 0=pass, 1=fail, 2=warn
--staged # Review staged changes instead of branch diff
--scope=internal/query # Filter to path prefix or symbol
# Quality Gates
--fail-on=error|warning|none # Verdict threshold
--block-breaking # Fail on breaking changes
--block-secrets # Fail on secret leaks
--require-tests # Warn if no tests affected
--max-risk=0.7 # Risk score threshold
--max-complexity=10 # Complexity delta threshold
--max-files=0 # Max file count (0=disabled)
--max-blast-radius=0 # Blast radius threshold
--max-fanout=0 # Alias for --max-blast-radius
# Safety-Critical Paths
--critical-paths=drivers/** # Glob patterns
# Traceability
--require-trace # Enforce ticket references
--trace-patterns=JIRA-\d+ # Ticket regex patterns
# Reviewer Independence
--require-independent # Enforce independent review
--min-reviewers=2 # Minimum independent reviewers
# Analyzer Tuning
--dead-code-confidence=0.8 # Dead code confidence threshold
--test-gap-lines=5 # Min function lines for test gap reporting
--llm # Use Claude for narrative generation
# Lint Deduplication
--lint-report=report.json # External lint report to deduplicate against
The full ReviewPRResponse contains:
| Field | Type | Description |
|---|---|---|
| ckbVersion | string | CKB version |
| schemaVersion | string | Response schema version |
| verdict | string | "pass", "warn", or "fail" |
| score | int | 0-100 review score |
| summary | object | File counts, LOC, modules, languages |
| checks | array | 20 check results with status, severity, duration |
| findings | array | Actionable items with file, line, message, suggestion, tier |
| reviewers | array | Suggested reviewers from CODEOWNERS + git blame |
| generated | array | Detected generated files with reasons |
| splitSuggestion | object | PR split clusters (if applicable) |
| changeBreakdown | object | Category counts (new/refactor/test/etc.) |
| reviewEffort | object | Time estimate with factors and complexity label |
| clusterReviewers | array | Per-cluster reviewer assignments |
| healthReport | object | Per-file health scores with deltas |
| narrative | string | 2-3 sentence review summary |
| prTier | string | "small" (< 100 LOC), "medium" (≤ 600), "large" (> 600) |
| provenance | object | Repo state ID, dirty flag, query duration |
Each finding includes:
| Field | Description |
|---|---|
| check | Check name (e.g., "breaking", "bug-patterns") |
| severity | "error", "warning", or "info" |
| file | File path |
| startLine | Line number (0 = file-level) |
| message | Human-readable description |
| suggestion | Optional fix recommendation |
| ruleId | Machine-readable ID (e.g., ckb/breaking/removed-symbol) |
| hint | Optional drilldown hint (e.g., → ckb explain Symbol) |
| tier | 1 (blocking), 2 (important), 3 (informational) |
- Quality Gates — Individual gate thresholds and CI enforcement
- CI-CD-Integration — Full CI/CD integration guide
- Workflow Examples — Production-ready workflow templates
- Impact-Analysis — Blast radius and risk scoring details
- Security — Secret detection configuration
- Configuration — Global configuration options