62 changes: 61 additions & 1 deletion README.md
@@ -65,8 +65,10 @@ This action is not hardened against prompt injection attacks and should only be
| `comment-pr` | Whether to comment on PRs with findings | `true` | No |
| `upload-results` | Whether to upload results as artifacts | `true` | No |
| `exclude-directories` | Comma-separated list of directories to exclude from scanning | None | No |
| `claude-model` | Claude [model name](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-names) to use. Defaults to Opus 4.5. | `claude-opus-4-5-20251101` | No |
| `claude-model` | Claude [model name](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-names) to use. Defaults to Opus 4.5. For large PRs (>400k char diffs), consider using `claude-sonnet-4-5-20250929` (1M context). | `claude-opus-4-5-20251101` | No |
| `claudecode-timeout` | Timeout for ClaudeCode analysis in minutes | `20` | No |
| `max-diff-chars` | Maximum diff characters to include in prompt. Set to `0` for agentic mode (Claude uses git commands to explore). See [Diff Size Configuration](#diff-size-configuration) below. | `400000` | No |
| `max-diff-lines` | **[DEPRECATED]** Use `max-diff-chars` instead. Converts lines to chars (lines × 80). | None | No |
| `run-every-commit` | Run ClaudeCode on every commit (skips cache check). Warning: May increase false positives on PRs with many commits. **Deprecated**: Use `trigger-on-commit` instead. | `false` | No |
| `trigger-on-open` | Run review when PR is first opened | `true` | No |
| `trigger-on-commit` | Run review on every new commit | `false` | No |
@@ -88,6 +90,64 @@ This action is not hardened against prompt injection attacks and should only be
| `findings-count` | Total number of code review findings |
| `results-file` | Path to the results JSON file |

### Diff Size Configuration

The action handles PRs of any size using three review modes:

#### Review Modes

1. **Full Diff Mode** (default for small PRs)
   - Entire diff embedded in prompt
   - Fastest and most comprehensive
   - Works for diffs up to ~400k characters

2. **Partial Diff Mode** (automatic for large PRs)
   - First N files embedded in prompt
   - Claude uses git commands to explore remaining files
   - Balances embedded context with agentic exploration

3. **Full Agentic Mode** (set `max-diff-chars: 0`)
   - No diff embedded
   - Claude uses git commands to explore all changes
   - Most flexible for massive PRs (1000+ files)
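
The three modes above can be sketched as a small selection routine. This is an illustrative sketch only, not the action's actual code: the names `ReviewMode`, `select_review_mode`, and `split_files_for_prompt` are hypothetical, and the exact boundary handling (a diff exactly at the limit counts as fitting, and embedding stops at the first file that would exceed the budget) is an assumption.

```python
from enum import Enum
from typing import Dict, List, Tuple


class ReviewMode(Enum):
    """Hypothetical labels for the three review modes described above."""
    FULL_DIFF = "full_diff"
    PARTIAL_DIFF = "partial_diff"
    FULL_AGENTIC = "full_agentic"


def select_review_mode(diff_chars: int, max_diff_chars: int) -> ReviewMode:
    """Pick a mode from the total diff size and the configured limit."""
    if max_diff_chars == 0:
        # A limit of 0 means "never embed the diff".
        return ReviewMode.FULL_AGENTIC
    if diff_chars <= max_diff_chars:
        # Assumption: a diff exactly at the limit still fits.
        return ReviewMode.FULL_DIFF
    return ReviewMode.PARTIAL_DIFF


def split_files_for_prompt(
    file_diffs: Dict[str, str], max_diff_chars: int
) -> Tuple[List[str], List[str]]:
    """Split files into (embedded, left for agentic exploration).

    Files are taken in order until the next one would exceed the budget;
    every later file is left for exploration via git commands.
    """
    embedded: List[str] = []
    remaining: List[str] = []
    used = 0
    budget_exhausted = False
    for path, diff_text in file_diffs.items():
        if not budget_exhausted and used + len(diff_text) <= max_diff_chars:
            embedded.append(path)
            used += len(diff_text)
        else:
            budget_exhausted = True
            remaining.append(path)
    return embedded, remaining
```

With a limit of `0`, `split_files_for_prompt` embeds nothing and returns every file in the second list, which matches the Full Agentic Mode description.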

#### Configuration

**`max-diff-chars`** - Maximum diff characters to embed (default: 400,000)

```yaml
# Default: 400k chars (fits in 200k token models)
- uses: PSPDFKit-labs/nutrient-code-review@main
  with:
    claude-api-key: ${{ secrets.CLAUDE_API_KEY }}
    max-diff-chars: 400000 # ~100k tokens

# Large PRs: Use 1M context model with higher limit
- uses: PSPDFKit-labs/nutrient-code-review@main
  with:
    claude-api-key: ${{ secrets.CLAUDE_API_KEY }}
    claude-model: claude-sonnet-4-5-20250929 # 1M context
    max-diff-chars: 800000 # ~200k tokens

# Always use agentic mode (no embedded diff)
- uses: PSPDFKit-labs/nutrient-code-review@main
  with:
    claude-api-key: ${{ secrets.CLAUDE_API_KEY }}
    max-diff-chars: 0 # Force agentic exploration
```

**Model Selection for Large Diffs:**

| Diff Size | Recommended Model | Context Window |
|-----------|-------------------|----------------|
| < 400k chars | `claude-opus-4-5-20251101` (default) | 200k tokens |
| 400k - 800k chars | `claude-sonnet-4-5-20250929` | 1M tokens |
| > 800k chars | Set `max-diff-chars: 0` (agentic mode) | Any model |
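
As a rough illustration of the table above, the sketch below maps a diff size to suggested inputs. It is not part of the action: `estimate_tokens` and `recommend_settings` are hypothetical helpers, and the ratio of about 4 characters per token is only inferred from the "400000 chars ≈ 100k tokens" comment in the configuration examples.

```python
from typing import Dict, Union

# Assumption: ~4 characters per token, inferred from "400000 # ~100k tokens".
CHARS_PER_TOKEN_ESTIMATE = 4


def estimate_tokens(diff_chars: int) -> int:
    """Rough token estimate for a diff of the given character count."""
    return diff_chars // CHARS_PER_TOKEN_ESTIMATE


def recommend_settings(diff_chars: int) -> Dict[str, Union[str, int]]:
    """Return suggested action inputs, following the table's three rows."""
    if diff_chars < 400_000:
        return {
            "claude-model": "claude-opus-4-5-20251101",
            "max-diff-chars": 400_000,
        }
    if diff_chars <= 800_000:
        return {
            "claude-model": "claude-sonnet-4-5-20250929",
            "max-diff-chars": 800_000,
        }
    # Above 800k characters the table suggests agentic mode with any model.
    return {"max-diff-chars": 0}
```

The returned keys mirror the action's input names, so the result can be copied into the `with:` block of a workflow step.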

**Backward Compatibility:**

`max-diff-lines` is deprecated but still supported. If set, its value is converted to characters: `max_diff_chars = max_diff_lines × 80`. If both inputs are set, `max-diff-chars` takes precedence.
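
The precedence and conversion rules can be sketched as follows. This is a minimal sketch under stated assumptions, not the action's implementation: `resolve_max_diff_chars` and `_parse_non_negative_int` are hypothetical names, and treating empty or non-numeric values as "not set" is an assumption. The environment variable names and the two constants are the ones this change introduces in `action.yml` and `claudecode/constants.py`.

```python
import os
from typing import Mapping, Optional

DEFAULT_MAX_DIFF_CHARS = 400000  # 400k characters
CHARS_PER_LINE_ESTIMATE = 80     # conversion factor for MAX_DIFF_LINES


def _parse_non_negative_int(raw: Optional[str]) -> Optional[int]:
    """Return the value as an int, or None if unset, empty, or invalid."""
    if raw is None or not raw.strip():
        return None
    try:
        value = int(raw.strip())
    except ValueError:
        return None
    return value if value >= 0 else None


def resolve_max_diff_chars(env: Mapping[str, str] = os.environ) -> int:
    """Resolve the effective character limit from the environment.

    MAX_DIFF_CHARS wins when it holds a usable value; otherwise the
    deprecated MAX_DIFF_LINES is converted; otherwise the default applies.
    """
    chars = _parse_non_negative_int(env.get("MAX_DIFF_CHARS"))
    if chars is not None:
        return chars
    lines = _parse_non_negative_int(env.get("MAX_DIFF_LINES"))
    if lines is not None:
        return lines * CHARS_PER_LINE_ESTIMATE
    return DEFAULT_MAX_DIFF_CHARS
```

For example, `resolve_max_diff_chars({"MAX_DIFF_LINES": "5000"})` yields `400000`, so the previous default of 5000 lines maps onto the new default character limit.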

### Re-Review Trigger Configuration

The action supports multiple triggers for when reviews should be run, allowing fine-grained control over bot behavior:
59 changes: 56 additions & 3 deletions action.yml
@@ -64,10 +64,25 @@ inputs:
required: false
default: ''

max-diff-chars:
description: |
Maximum diff characters to embed in prompt (default: 400000 = 400k chars).
Larger diffs use agentic file reading instead. Set to 0 to always use agentic mode.

IMPORTANT: For large limits (>400k), use a model with larger context like:
- claude-sonnet-4-5-20250929 (1M context) for diffs up to 800k chars
- Set via 'claude-model' input parameter

Note: ~400k chars fits comfortably in 200k token models (Opus/Sonnet standard).
required: false
default: '400000'

max-diff-lines:
description: 'Maximum diff lines to embed in prompt. Larger diffs use agentic file reading instead. Set to 0 to always use agentic mode. Default: 5000'
description: |
[DEPRECATED] Use 'max-diff-chars' instead. This converts lines to chars (lines * 80).
Kept for backward compatibility only. If both set, max-diff-chars takes precedence.
required: false
default: '5000'
default: ''

trigger-on-open:
description: 'Run review when PR is first opened'
@@ -142,6 +157,12 @@ runs:
exit 1
fi

if ! PR_BASE_SHA=$(echo "$PR_DATA" | jq -r '.base.sha' 2>&1); then
echo "Error: Failed to parse PR base SHA from API response"
echo "is_pr=false" >> $GITHUB_OUTPUT
exit 1
fi

# Extract PR labels (array of label names)
if ! PR_LABELS=$(echo "$PR_DATA" | jq -c '[.labels[].name]' 2>&1); then
echo "Error: Failed to parse PR labels from API response"
@@ -158,11 +179,12 @@ runs:

echo "pr_number=${{ github.event.issue.number }}" >> $GITHUB_OUTPUT
echo "pr_sha=$PR_HEAD_SHA" >> $GITHUB_OUTPUT
echo "pr_base_sha=$PR_BASE_SHA" >> $GITHUB_OUTPUT
echo "pr_labels=$PR_LABELS" >> $GITHUB_OUTPUT
echo "is_draft=$IS_DRAFT" >> $GITHUB_OUTPUT
echo "is_pr=true" >> $GITHUB_OUTPUT

echo "Issue comment is on PR #${{ github.event.issue.number }} with SHA $PR_HEAD_SHA"
echo "Issue comment is on PR #${{ github.event.issue.number }} with base SHA $PR_BASE_SHA, head SHA $PR_HEAD_SHA"
else
echo "is_pr=false" >> $GITHUB_OUTPUT
echo "Issue comment is not on a PR, skipping"
@@ -357,6 +379,36 @@ runs:
with:
node-version: '18'

- name: Setup git for diffing
  if: steps.claudecode-check.outputs.enable_claudecode == 'true'
  shell: bash
  env:
    BASE_SHA: ${{ github.event.pull_request.base.sha || steps.pr-info.outputs.pr_base_sha }}
    HEAD_SHA: ${{ github.event.pull_request.head.sha || steps.pr-info.outputs.pr_sha }}
  run: |
    echo "::group::Prepare repository so 'git diff' shows PR changes"

    BASE_COMMIT="$BASE_SHA"
    HEAD_COMMIT="$HEAD_SHA"

    echo "Base SHA (PR snapshot): $BASE_COMMIT"
    echo "Head SHA (current PR): $HEAD_COMMIT"

    # Fetch both commits
    git fetch origin "$BASE_COMMIT" --depth=1
    git fetch origin "$HEAD_COMMIT" --depth=1

    # Reset to base commit (B)
    git checkout "$BASE_COMMIT"

    # Restore PR files to working directory (E) - NOT staged
    git restore --source="$HEAD_COMMIT" --worktree .

    # Now: HEAD=B, index=B, worktree=E
    # So: git diff shows B → E (PR changes)
    echo "Repository prepared. Plain 'git diff' now shows PR changes."
    echo "::endgroup::"

- name: Install dependencies
  if: steps.claudecode-check.outputs.enable_claudecode == 'true'
  shell: bash
@@ -386,6 +438,7 @@ runs:
CUSTOM_SECURITY_SCAN_INSTRUCTIONS: ${{ inputs.custom-security-scan-instructions }}
CLAUDE_MODEL: ${{ inputs.claude-model }}
CLAUDECODE_TIMEOUT: ${{ inputs.claudecode-timeout }}
MAX_DIFF_CHARS: ${{ inputs.max-diff-chars }}
MAX_DIFF_LINES: ${{ inputs.max-diff-lines }}
ACTION_PATH: ${{ github.action_path }}
run: |
5 changes: 5 additions & 0 deletions claudecode/constants.py
@@ -13,6 +13,11 @@
# Token Limits
PROMPT_TOKEN_LIMIT = 16384 # 16k tokens max for claude-opus-4

# Diff Construction Limits
DEFAULT_MAX_DIFF_CHARS = 400000 # 400k characters (suitable for 200k token models)
# Conversion factor for deprecated MAX_DIFF_LINES -> MAX_DIFF_CHARS
CHARS_PER_LINE_ESTIMATE = 80 # Average characters per line for conversion

# Exit Codes
EXIT_SUCCESS = 0
EXIT_GENERAL_ERROR = 1
9 changes: 8 additions & 1 deletion claudecode/evals/eval_engine.py
@@ -451,7 +451,14 @@ def _run_code_review(self, test_case: EvalCase, repo_path: str) -> Tuple[bool, s
)

output = result.stdout


# Display stderr in verbose mode (contains logging from github_action_audit.py)
if self.verbose and result.stderr:
    self.log("Code review stderr output:", prefix="[AUDIT]")
    for line in result.stderr.splitlines():
        if line.strip():
            print(f"[AUDIT] {line}", file=sys.stderr)

# Parse the JSON output first to see if we got valid results
success, parsed_results = parse_json_with_fallbacks(output)
if not success: