Merged
107 commits
04a87cf
fix: add Jest 30 support, fix time limit, and fix async function looping
mohammedahmed18 Feb 3, 2026
4c61d08
Merge branch 'main' into fix/js-jest30-loop-runner
mohammedahmed18 Feb 3, 2026
a3764f1
Merge branch 'main' of github.com:codeflash-ai/codeflash into fix/js-…
mohammedahmed18 Feb 3, 2026
4157534
fix: use getter functions for env var constants in capture.js
mohammedahmed18 Feb 3, 2026
017bde1
refactor: improve code quality and documentation in loop-runner and c…
mohammedahmed18 Feb 3, 2026
71b38d5
fix: Parse timing markers from console output for JavaScript benchmar…
mohammedahmed18 Feb 3, 2026
7273f27
chore: trigger CI workflows
mohammedahmed18 Feb 3, 2026
b83e516
fix: use lazy % formatting for logger.debug to pass ruff G004
mohammedahmed18 Feb 3, 2026
0592d92
Merge branch 'main' into fix/js-jest30-loop-runner
mohammedahmed18 Feb 4, 2026
b4d0b0f
fix: support monorepo hoisted dependencies in JS requirements check
mohammedahmed18 Feb 4, 2026
3b56d24
debug: add extensive Jest execution logging for troubleshooting
mohammedahmed18 Feb 4, 2026
9cd5d5a
fix: calculate correct import paths for JavaScript tests in temp dire…
mohammedahmed18 Feb 4, 2026
0a8d120
fix: preserve ./ prefix in JS import paths and fix TestType enum
mohammedahmed18 Feb 4, 2026
6febd69
debug: add extensive performance test debugging
mohammedahmed18 Feb 4, 2026
6c74adc
fix: disable custom loop-runner to enable basic performance testing
mohammedahmed18 Feb 4, 2026
202bdc4
Merge branch 'main' into fix/js-jest30-loop-runner
mohammedahmed18 Feb 4, 2026
bab3bd4
style: auto-fix linting issues
github-actions[bot] Feb 4, 2026
535c640
fix: resolve all linting issues from ruff and mypy
mohammedahmed18 Feb 4, 2026
d0b859a
Optimize PrComment.to_json
codeflash-ai[bot] Feb 4, 2026
c151b6c
Merge pull request #1383 from codeflash-ai/codeflash/optimize-pr1318-…
claude[bot] Feb 4, 2026
ae31ca7
Fix JavaScript test generation and benchmarking
mohammedahmed18 Feb 5, 2026
9bb05f6
style: auto-fix linting issues
github-actions[bot] Feb 5, 2026
8fcb8cc
cleanup
mohammedahmed18 Feb 5, 2026
67ea0c9
Merge branch 'fix/js-jest30-loop-runner' of github.com:codeflash-ai/c…
mohammedahmed18 Feb 5, 2026
a6b9364
fix: include same-class helper methods inside class wrapper for TypeS…
mohammedahmed18 Feb 6, 2026
f800ae3
Merge branch 'main' into fix/js-jest30-loop-runner
mohammedahmed18 Feb 6, 2026
b65711d
fix: resolve merge conflict in function_optimizer.py
github-actions[bot] Feb 6, 2026
4545b8c
fix: add export keywords to remaining JavaScript/TypeScript tests
mohammedahmed18 Feb 6, 2026
183d800
fix: detect CommonJS exports (module.exports) for function discovery
mohammedahmed18 Feb 6, 2026
6c23255
version upgrade for cf package
mohammedahmed18 Feb 6, 2026
ce13a6d
Merge branch 'main' into fix/js-jest30-loop-runner
Saga4 Feb 9, 2026
599a0e3
fix: resolve merge conflicts in verifier.py
github-actions[bot] Feb 9, 2026
afed6a3
docs: add mypy type checking instructions to CLAUDE.md
KRRT7 Feb 11, 2026
9de2e59
Merge pull request #1453 from codeflash-ai/claude-md-mypy-instructions
KRRT7 Feb 11, 2026
dcd9e2a
some fixes for test runner and instrumentation
mohammedahmed18 Feb 11, 2026
1b8f701
Merge branch 'fix/js-jest30-loop-runner' of github.com:codeflash-ai/c…
mohammedahmed18 Feb 11, 2026
1181f6a
fix: use qualified_name for coverage function identification
KRRT7 Feb 12, 2026
773e5a5
style: fix mypy type annotation in test coverage utils
github-actions[bot] Feb 12, 2026
c4ed6e3
fix: resolve pre-existing mypy errors in PrComment, concolic_utils, p…
KRRT7 Feb 12, 2026
48817d7
Optimize extract_dependent_function
codeflash-ai[bot] Feb 12, 2026
f0a2d4e
Merge pull request #1458 from codeflash-ai/codeflash/optimize-pr1457-…
KRRT7 Feb 12, 2026
0567a09
style: auto-fix ruff formatting issues
github-actions[bot] Feb 12, 2026
0f9c06b
Merge pull request #1457 from codeflash-ai/fix-coverage-qualified-name
KRRT7 Feb 12, 2026
1a3dba2
Update claude.yml
KRRT7 Feb 12, 2026
b38e3b7
Merge branch 'main' into fix-claude-bot
KRRT7 Feb 12, 2026
2e6903a
Merge pull request #1459 from codeflash-ai/fix-claude-bot
KRRT7 Feb 12, 2026
f3718d7
Restore concurrency in testgen and candidate generation
aseembits93 Feb 12, 2026
5a5b16c
style: remove unused sentry_sdk import
github-actions[bot] Feb 12, 2026
175226b
fix: correct loop index calculation in JS performance benchmarking
mohammedahmed18 Feb 12, 2026
536c1d0
remove debug statements
mohammedahmed18 Feb 12, 2026
e9b7154
Merge branch 'main' of github.com:codeflash-ai/codeflash into fix/js-…
mohammedahmed18 Feb 12, 2026
e07fd1d
fix tests
mohammedahmed18 Feb 12, 2026
4c9f4ef
Optimize StandaloneCallTransformer._parse_bracket_standalone_call
codeflash-ai[bot] Feb 12, 2026
6b77be5
ignore calls inside string litrals for instrumentation and fix e2e test
mohammedahmed18 Feb 12, 2026
f5dd109
Merge pull request #1465 from codeflash-ai/codeflash/optimize-pr1318-…
mohammedahmed18 Feb 12, 2026
9937fe0
fixes for unit tests
mohammedahmed18 Feb 12, 2026
4dedb9f
Merge branch 'fix/js-jest30-loop-runner' of github.com:codeflash-ai/c…
mohammedahmed18 Feb 12, 2026
fe63635
fix: filter test_*.py files and pytest fixtures from optimization
KRRT7 Feb 12, 2026
7337d03
style: auto-fix linting issues
github-actions[bot] Feb 12, 2026
037130b
style: remove trailing blank line
github-actions[bot] Feb 12, 2026
4a978b1
Merge pull request #1471 from codeflash-ai/fix-test-function-filtering
KRRT7 Feb 12, 2026
2f36044
Merge branch 'main' into aseembits93-patch-1
KRRT7 Feb 12, 2026
bcffb5e
Merge pull request #1461 from codeflash-ai/aseembits93-patch-1
KRRT7 Feb 12, 2026
c0de087
Fix CrossHair subprocess missing PYTHONPATH for project-relative imports
KRRT7 Feb 13, 2026
544fd7a
Merge pull request #1476 from codeflash-ai/fix-crosshair-pythonpath
KRRT7 Feb 13, 2026
b6b47ff
Unify PYTHONPATH setup into make_env_with_project_root helper
KRRT7 Feb 13, 2026
97c1249
Merge pull request #1477 from codeflash-ai/unify-pythonpath-setup
KRRT7 Feb 13, 2026
3c835d7
Fix package.json config overriding closer pyproject.toml in monorepos
KRRT7 Feb 13, 2026
54a77c2
Merge branch 'main' into fix-package-json-config-override
KRRT7 Feb 13, 2026
2e7ad77
fix: resolve mypy arg-type error for package_json_path
github-actions[bot] Feb 13, 2026
5b43dc9
Merge branch 'main' into fix/js-jest30-loop-runner
Saga4 Feb 13, 2026
6bda49b
Merge pull request #1318 from codeflash-ai/fix/js-jest30-loop-runner
Saga4 Feb 13, 2026
fbd3a81
Merge branch 'main' into fix-package-json-config-override
KRRT7 Feb 13, 2026
1d9824c
Merge pull request #1478 from codeflash-ai/fix-package-json-config-ov…
KRRT7 Feb 13, 2026
5449a32
feat: include __init__ signatures from directly imported external cla…
KRRT7 Feb 13, 2026
f4c0208
test: add unit tests for get_external_class_inits
KRRT7 Feb 13, 2026
8eb1c86
fix: resolve mypy union-attr error in test_get_external_class_inits
github-actions[bot] Feb 13, 2026
e837ad9
feat: resolve transitive type dependencies in get_external_class_inits
KRRT7 Feb 13, 2026
f344789
style: fix ruff B009 getattr-with-constant while preserving mypy safety
github-actions[bot] Feb 13, 2026
6de75e7
chore: disable ruff B009 globally to avoid conflict with mypy [misc]
KRRT7 Feb 13, 2026
c3fe9ec
style: clean up imports in parse_test_output
KRRT7 Feb 13, 2026
83c6d5c
fix: import jest patterns from source module instead of re-export
KRRT7 Feb 13, 2026
29a5324
docs: distinguish local vs CI prek commands in CLAUDE.md
KRRT7 Feb 13, 2026
15c307a
fix: normalize jest mock paths with pathlib for Windows compat
KRRT7 Feb 13, 2026
4f44286
chore: upgrade all dependencies in lockfile
KRRT7 Feb 13, 2026
42a1150
Merge pull request #1481 from codeflash-ai/include-external-class-ini…
KRRT7 Feb 13, 2026
0650973
refactor: restructure CLAUDE.md for effective context usage
KRRT7 Feb 14, 2026
53f8658
Merge pull request #1486 from codeflash-ai/restructure-claude-md
KRRT7 Feb 14, 2026
f819d60
chore: add gh-aw duplicate code detector workflow
KRRT7 Feb 14, 2026
b3c3a30
Merge pull request #1487 from codeflash-ai/add-duplicate-code-detector
KRRT7 Feb 14, 2026
ef66139
fix: configure duplicate code detector for Azure Foundry auth
KRRT7 Feb 14, 2026
9961a02
docs: add new-branch-from-main rule to git guidelines
KRRT7 Feb 14, 2026
0bb62d6
docs: add new-branch-from-main rule to git guidelines
KRRT7 Feb 14, 2026
de78ffe
Merge pull request #1490 from codeflash-ai/fix-duplicate-detector-fou…
KRRT7 Feb 14, 2026
02b9a5e
chore: replace gh-aw duplicate detector with claude-code-action + Serena
KRRT7 Feb 15, 2026
dbba5e0
Merge pull request #1491 from codeflash-ai/replace-ghaw-with-foundry-…
KRRT7 Feb 15, 2026
9af75a6
Initialize tessl.json with matched tiles
tessl-app[bot] Feb 15, 2026
9282e25
Add MCP config for .mcp.json
tessl-app[bot] Feb 15, 2026
90601c3
Merge pull request #1492 from codeflash-ai/tessl/setup-1771114839280
KRRT7 Feb 15, 2026
6718e66
feat: add private tessl tiles for codeflash rules, docs, and skills
KRRT7 Feb 15, 2026
18ad00b
chore: improve skills to 100% review score and bump to v0.2.0
KRRT7 Feb 15, 2026
289b75c
chore: add tessl-managed gitignore for codex and gemini skill symlinks
KRRT7 Feb 15, 2026
ff2abd2
chore: add eval scenarios for codeflash-skills tile
KRRT7 Feb 15, 2026
869fbe1
chore: add eval scenarios for codeflash-docs tile
KRRT7 Feb 15, 2026
d578d99
Merge pull request #1494 from codeflash-ai/add-private-tessl-tiles
KRRT7 Feb 15, 2026
c66953d
Merge commit 'd578d996' into sync-main-batch-2
KRRT7 Feb 20, 2026
8632da0
chore: fix ruff format issue in code_context_extractor
KRRT7 Feb 20, 2026
14 changes: 14 additions & 0 deletions .claude/rules/architecture.md
@@ -26,3 +26,17 @@ codeflash/
├── result/ # Result types and handling
└── version.py # Version information
```

## Key Entry Points

| Task | Start here |
|------|------------|
| CLI arguments & commands | `cli_cmds/cli.py` |
| Optimization orchestration | `optimization/optimizer.py` → `run()` |
| Per-function optimization | `optimization/function_optimizer.py` |
| Function discovery | `discovery/functions_to_optimize.py` |
| Context extraction | `context/code_context_extractor.py` |
| Test execution | `verification/test_runner.py`, `verification/pytest_plugin.py` |
| Performance ranking | `benchmarking/function_ranker.py` |
| Domain types | `models/models.py`, `models/function_types.py` |
| Result handling | `either.py` (`Result`, `Success`, `Failure`, `is_successful`) |
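
The last row of the table names `Result`, `Success`, `Failure`, and `is_successful` from `either.py`. The diff does not show that module, so the following is only a minimal stand-in sketch of how such a result type is typically used; the class bodies and the `parse_port` helper are assumptions made for illustration, not the project's implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Success(Generic[T]):
    """Stand-in for a successful outcome (assumed shape, not the real either.py)."""

    value: T

    def unwrap(self) -> T:
        return self.value


@dataclass(frozen=True)
class Failure(Generic[E]):
    """Stand-in for a failed outcome carrying an error value."""

    error: E

    def unwrap(self) -> object:
        raise ValueError(f"called unwrap() on a Failure: {self.error!r}")


Result = Union[Success[T], Failure[E]]


def is_successful(result: Success[T] | Failure[E]) -> bool:
    # Check this before calling unwrap(), as the rules recommend.
    return isinstance(result, Success)


def parse_port(raw: str) -> Result[int, str]:
    """Hypothetical operation that returns a Result instead of raising."""
    if not raw.isdigit():
        return Failure(f"not a number: {raw}")
    return Success(int(raw))


outcome = parse_port("8080")
if is_successful(outcome):
    print("port:", outcome.unwrap())
```

Returning a value for both outcomes keeps error handling explicit at each call site, which is presumably why the rules ask callers to check `is_successful()` before `unwrap()`.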
1 change: 1 addition & 0 deletions .claude/rules/code-style.md
@@ -2,6 +2,7 @@

- **Line length**: 120 characters
- **Python**: 3.9+ syntax
- **Package management**: Always use `uv`, never `pip`
- **Tooling**: Ruff for linting/formatting, mypy strict mode, prek for pre-commit checks
- **Comments**: Minimal - only explain "why", not "what"
- **Docstrings**: Do not add unless explicitly requested
1 change: 1 addition & 0 deletions .claude/rules/git.md
@@ -1,5 +1,6 @@
# Git Commits & Pull Requests

- **Always create a new branch from `main` before starting any new work** — never commit directly to `main` or reuse an existing feature branch for unrelated changes
- Use conventional commit format: `fix:`, `feat:`, `refactor:`, `docs:`, `test:`, `chore:`
- Keep commits atomic - one logical change per commit
- Commit message body should be concise (1-2 sentences max)
12 changes: 12 additions & 0 deletions .claude/rules/language-patterns.md
@@ -0,0 +1,12 @@
---
paths:
- "codeflash/languages/**/*.py"
---

# Language Support Patterns

- Current language is a module-level singleton in `languages/current.py` — use `set_current_language()` / `current_language()`, never pass language as a parameter through call chains
- Use `get_language_support(identifier)` from `languages/registry.py` to get a `LanguageSupport` instance — never import language classes directly
- New language support classes must use the `@register_language` decorator to register with the extension and language registries
- `languages/__init__.py` uses `__getattr__` for lazy imports to avoid circular dependencies — follow this pattern when adding new exports
- `is_javascript()` returns `True` for both JavaScript and TypeScript
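
The registry and singleton rules above can be sketched roughly as follows. This is an illustrative stand-in, not the contents of `languages/registry.py` or `languages/current.py`: the dictionaries, the class attributes `language` and `file_extensions`, and the two support classes are assumptions chosen to make the pattern runnable.

```python
from __future__ import annotations

# Hypothetical registries; the real ones live in languages/registry.py.
LANGUAGES_BY_NAME: dict[str, type] = {}
LANGUAGES_BY_EXTENSION: dict[str, type] = {}

# Module-level singleton state, as described for languages/current.py.
CURRENT = {"language": "python"}


def register_language(cls: type) -> type:
    """Register a support class under its language name and file extensions."""
    LANGUAGES_BY_NAME[cls.language] = cls
    for extension in cls.file_extensions:
        LANGUAGES_BY_EXTENSION[extension] = cls
    return cls


def get_language_support(identifier: str) -> object:
    """Look up by language name or extension, so callers never import classes directly."""
    cls = LANGUAGES_BY_NAME.get(identifier) or LANGUAGES_BY_EXTENSION.get(identifier)
    if cls is None:
        raise KeyError(f"no language support registered for {identifier!r}")
    return cls()


def set_current_language(language: str) -> None:
    CURRENT["language"] = language


def current_language() -> str:
    return CURRENT["language"]


def is_javascript() -> bool:
    # Mirrors the rule that TypeScript also counts as JavaScript.
    return current_language() in {"javascript", "typescript"}


@register_language
class PythonSupport:
    language = "python"
    file_extensions = (".py",)


@register_language
class JavaScriptSupport:
    language = "javascript"
    file_extensions = (".js", ".jsx")
```

Because the current language is read from module state, helpers deep in a call chain can call `current_language()` without the language being threaded through every function signature.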
17 changes: 17 additions & 0 deletions .claude/rules/optimization-patterns.md
@@ -0,0 +1,17 @@
---
paths:
- "codeflash/optimization/**/*.py"
- "codeflash/verification/**/*.py"
- "codeflash/benchmarking/**/*.py"
- "codeflash/context/**/*.py"
---

# Optimization Pipeline Patterns

- All major operations return `Result[SuccessType, ErrorType]` — construct with `Success(value)` / `Failure(error)`, check with `is_successful()` before calling `unwrap()`
- Code context has token limits (`OPTIMIZATION_CONTEXT_TOKEN_LIMIT`, `TESTGEN_CONTEXT_TOKEN_LIMIT` in `config_consts.py`) — exceeding them rejects the function
- `read_writable_code` can span multiple files; `read_only_context_code` is reference-only
- Code is serialized as markdown code blocks: ` ```language:filepath\ncode\n``` ` (see `CodeStringsMarkdown`)
- Candidates form a forest (DAG): refinements/repairs reference `parent_id` on previous candidates
- Test generation and optimization run concurrently — coordinate through `CandidateEvaluationContext`
- Generated tests are instrumented with `codeflash_capture.py` to record return values and traces
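
The serialization format in the fourth bullet (a fenced block whose info string is `language:filepath`) can be illustrated with a small round trip. This is a sketch under assumptions: the function names and the regular expression are hypothetical and do not come from `CodeStringsMarkdown`.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def to_markdown_block(language: str, filepath: str, code: str) -> str:
    """Serialize one file as a fenced block tagged with language:filepath."""
    return f"{FENCE}{language}:{filepath}\n{code}\n{FENCE}"


BLOCK_PATTERN = re.compile(
    rf"{FENCE}(?P<language>\w+):(?P<filepath>[^\n]+)\n(?P<code>.*?)\n{FENCE}",
    re.DOTALL,
)


def parse_markdown_blocks(markdown: str) -> list[tuple[str, str, str]]:
    """Recover (language, filepath, code) triples from serialized blocks."""
    return [
        (match["language"], match["filepath"], match["code"])
        for match in BLOCK_PATTERN.finditer(markdown)
    ]


serialized = "\n\n".join(
    [
        to_markdown_block("python", "pkg/a.py", "x = 1"),
        to_markdown_block("python", "pkg/b.py", "def f():\n    return 2"),
    ]
)
print(parse_markdown_blocks(serialized))
```

Tagging each block with its path is what lets read-writable code span several files while staying in a single string.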
3 changes: 0 additions & 3 deletions .claude/rules/source-code.md
@@ -6,6 +6,3 @@ paths:
 # Source Code Rules

 - Use `libcst` for code modification/transformation to preserve formatting. `ast` is acceptable for read-only analysis and parsing.
-- NEVER use leading underscores for function names (e.g., `_helper`). Python has no true private functions. Always use public names.
-- Any new feature or bug fix that can be tested automatically must have test cases.
-- If changes affect existing test expectations, update the tests accordingly. Tests must always pass after changes.
2 changes: 2 additions & 0 deletions .claude/rules/testing.md
@@ -13,3 +13,5 @@ paths:
- Use `.as_posix()` when converting resolved paths to strings (normalizes to forward slashes).
- Any new feature or bug fix that can be tested automatically must have test cases.
- If changes affect existing test expectations, update the tests accordingly. Tests must always pass after changes.
- The pytest plugin patches `time`, `random`, `uuid`, and `datetime` for deterministic test execution — never assume real randomness or real time in verification tests.
- `conftest.py` uses an autouse fixture that calls `reset_current_language()` — tests always start with Python as the default language.
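
To illustrate why verification tests must not assume real time or randomness, here is a small sketch of deterministic patching. It uses `unittest.mock` from the standard library as a stand-in; how the project's pytest plugin actually patches these modules is not shown in this diff, and the fixed values below are arbitrary assumptions.

```python
import random
import time
import uuid
from unittest import mock

FIXED_TIME = 1_700_000_000.0
FIXED_UUID = uuid.UUID(int=0)


def run_deterministically(func):
    """Call func with time, random and uuid4 pinned to fixed values."""
    with mock.patch("time.time", return_value=FIXED_TIME), \
         mock.patch("random.random", return_value=0.5), \
         mock.patch("uuid.uuid4", return_value=FIXED_UUID):
        return func()


def sample():
    # Looks non-deterministic, but every source is patched while it runs.
    return (time.time(), random.random(), str(uuid.uuid4()))


first = run_deterministically(sample)
second = run_deterministically(sample)
print(first == second)
```

With the sources pinned, two runs of the same function produce identical return values, which is what makes comparing an original function against an optimized candidate meaningful.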
12 changes: 12 additions & 0 deletions .claude/skills/fix-mypy.md
@@ -0,0 +1,12 @@
# Fix mypy errors

When modifying code, fix any mypy type errors in the files you changed:

```bash
uv run mypy --non-interactive --config-file pyproject.toml <changed_files>
```

- Fix type annotation issues: missing return types, incorrect types, Optional/None unions, import errors for type hints
- Do NOT add `# type: ignore` comments — always fix the root cause
- Do NOT fix type errors that require logic changes, complex generic type rework, or anything that could change runtime behavior
- Files in `mypy_allowlist.txt` are checked in CI — ensure they remain error-free
9 changes: 9 additions & 0 deletions .claude/skills/fix-prek.md
@@ -0,0 +1,9 @@
# Fix prek failures

When prek (pre-commit) checks fail:

1. Run `uv run prek run` to see failures (local, checks staged files)
2. In CI, the equivalent is `uv run prek run --from-ref origin/main`
3. prek runs ruff format, ruff check, and mypy on changed files
4. Fix issues in order: formatting → lint → type errors
5. Re-run `uv run prek run` to verify all checks pass
2 changes: 2 additions & 0 deletions .codex/skills/.gitignore
@@ -0,0 +1,2 @@
# Managed by Tessl
tessl:*
2 changes: 2 additions & 0 deletions .gemini/skills/.gitignore
@@ -0,0 +1,2 @@
# Managed by Tessl
tessl:*
2 changes: 1 addition & 1 deletion .github/workflows/claude.yml
@@ -48,7 +48,7 @@ jobs:
         with:
           use_foundry: "true"
           use_sticky_comment: true
-          allowed_bots: "claude[bot]"
+          allowed_bots: "claude[bot],codeflash-ai[bot]"
           prompt: |
             REPO: ${{ github.repository }}
             PR NUMBER: ${{ github.event.pull_request.number }}
114 changes: 114 additions & 0 deletions .github/workflows/duplicate-code-detector.yml
@@ -0,0 +1,114 @@
name: Duplicate Code Detector

on:
  workflow_dispatch:
  pull_request:
    types: [opened, synchronize]

jobs:
  detect-duplicates:
    if: github.event.pull_request.head.repo.full_name == github.repository || github.event_name == 'workflow_dispatch'
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      issues: write
      id-token: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.event.pull_request.head.ref || github.ref }}

      - name: Start Serena MCP server
        run: |
          docker pull ghcr.io/github/serena-mcp-server:latest
          docker run -d --name serena \
            --network host \
            -v "${{ github.workspace }}:${{ github.workspace }}:rw" \
            ghcr.io/github/serena-mcp-server:latest \
            serena start-mcp-server --context codex --project "${{ github.workspace }}"

          mkdir -p /tmp/mcp-config
          cat > /tmp/mcp-config/mcp-servers.json << 'EOF'
          {
            "mcpServers": {
              "serena": {
                "command": "docker",
                "args": ["exec", "-i", "serena", "serena", "start-mcp-server", "--context", "codex", "--project", "${{ github.workspace }}"]
              }
            }
          }
          EOF

      - name: Run Claude Code
        uses: anthropics/claude-code-action@v1
        with:
          use_foundry: "true"
          use_sticky_comment: true
          allowed_bots: "claude[bot],codeflash-ai[bot]"
          claude_args: '--mcp-config /tmp/mcp-config/mcp-servers.json --allowedTools "Read,Glob,Grep,Bash(git diff:*),Bash(git log:*),Bash(git show:*),Bash(wc *),Bash(find *),mcp__serena__*"'
          prompt: |
            You are a duplicate code detector with access to Serena semantic code analysis.

            ## Setup

            First activate the project in Serena:
            - Use `mcp__serena__activate_project` with the workspace path `${{ github.workspace }}`

            ## Steps

            1. Get the list of changed .py files (excluding tests):
               `git diff --name-only origin/main...HEAD -- '*.py' | grep -v -E '(test_|_test\.py|/tests/|/test/)'`

            2. Use Serena's semantic analysis on changed files:
               - `mcp__serena__get_symbols_overview` to understand file structure
               - `mcp__serena__find_symbol` to search for similarly named symbols across the codebase
               - `mcp__serena__find_referencing_symbols` to understand usage patterns
               - `mcp__serena__search_for_pattern` to find similar code patterns

            3. For each changed file, look for:
               - **Exact Duplication**: Identical code blocks (>10 lines) in multiple locations
               - **Structural Duplication**: Same logic with minor variations (different variable names)
               - **Functional Duplication**: Different implementations of the same functionality
               - **Copy-Paste Programming**: Similar blocks that could be extracted into shared utilities

            4. Cross-reference against the rest of the codebase using Serena:
               - Search for similar function signatures and logic patterns
               - Check if new code duplicates existing utilities or helpers
               - Look for repeated patterns across modules

            ## What to Report

            - Identical or nearly identical functions in different files
            - Repeated code blocks that could be extracted to utilities
            - Similar classes or modules with overlapping functionality
            - Copy-pasted code with minor modifications
            - Duplicated business logic across components

            ## What to Skip

            - Standard boilerplate (imports, __init__, etc.)
            - Test setup/teardown code
            - Configuration with similar structure
            - Language-specific patterns (constructors, getters/setters)
            - Small snippets (<5 lines) unless highly repetitive
            - Workflow files under .github/

            ## Output

            Post a single PR comment with your findings. For each pattern found:
            - Severity (High/Medium/Low)
            - File locations with line numbers
            - Code samples showing the duplication
            - Concrete refactoring suggestion

            If no significant duplication is found, say so briefly. Do not create issues — just comment on the PR.
        env:
          ANTHROPIC_FOUNDRY_API_KEY: ${{ secrets.AZURE_ANTHROPIC_API_KEY }}
          ANTHROPIC_FOUNDRY_BASE_URL: ${{ secrets.AZURE_ANTHROPIC_ENDPOINT }}

      - name: Stop Serena
        if: always()
        run: docker stop serena && docker rm serena || true
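
Step 1 of the prompt in this workflow filters changed files with `grep -v -E '(test_|_test\.py|/tests/|/test/)'`. The sketch below reproduces that pattern in Python to show which paths it keeps. The helper name is hypothetical, and every sample path except `codeflash/optimization/optimizer.py` is invented for illustration.

```python
import re

# Same alternation as the grep -E pattern in the workflow prompt.
TEST_PATH_PATTERN = re.compile(r"(test_|_test\.py|/tests/|/test/)")


def non_test_python_files(paths):
    """Keep .py paths that the grep -v filter would let through."""
    return [
        path
        for path in paths
        if path.endswith(".py") and not TEST_PATH_PATTERN.search(path)
    ]


changed = [
    "codeflash/optimization/optimizer.py",
    "tests/test_optimizer.py",
    "codeflash/verification/tests/helper.py",
    "tests/conftest.py",
    "codeflash/latest_utils.py",
    "README.md",
]
print(non_test_python_files(changed))
```

Because the pattern is an unanchored substring match, a top-level `tests/conftest.py` passes through (there is no leading slash before `tests/`), while a non-test file such as the invented `latest_utils.py` is dropped because its name happens to contain `test_`.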
12 changes: 12 additions & 0 deletions .mcp.json
@@ -0,0 +1,12 @@
{
  "mcpServers": {
    "tessl": {
      "type": "stdio",
      "command": "tessl",
      "args": [
        "mcp",
        "start"
      ]
    }
  }
}
47 changes: 21 additions & 26 deletions CLAUDE.md
@@ -1,37 +1,32 @@
 # CLAUDE.md

-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
 ## Project Overview

 CodeFlash is an AI-powered Python code optimizer that automatically improves code performance while maintaining correctness. It uses LLMs to generate optimization candidates, verifies correctness through test execution, and benchmarks performance improvements.

-## Common Commands
-
-```bash
-# Package management (NEVER use pip)
-uv sync # Install dependencies
-uv sync --group dev # Install dev dependencies
-uv add <package> # Add a package
-
-# Running tests
-uv run pytest tests/ # Run all tests
-uv run pytest tests/test_foo.py # Run specific test file
-uv run pytest tests/test_foo.py::test_bar -v # Run single test
-
-# Type checking and linting
-uv run mypy codeflash/ # Type check
-uv run ruff check codeflash/ # Lint
-uv run ruff format codeflash/ # Format
+## Optimization Pipeline

-# Linting (run before committing)
-uv run prek run --from-ref origin/main
-
-# Running the CLI
-uv run codeflash --help
-uv run codeflash init # Initialize in a project
-uv run codeflash --all # Optimize entire codebase
 ```
+Discovery → Ranking → Context Extraction → Test Gen + Optimization → Baseline → Candidate Evaluation → PR
+```
+
+1. **Discovery** (`discovery/`): Find optimizable functions across the codebase
+2. **Ranking** (`benchmarking/function_ranker.py`): Rank functions by addressable time using trace data
+3. **Context** (`context/`): Extract code dependencies (read-writable code + read-only imports)
+4. **Optimization** (`optimization/`, `api/`): Generate candidates via AI service, run in parallel with test generation
+5. **Verification** (`verification/`): Run candidates against tests, compare outputs via custom pytest plugin
+6. **Benchmarking** (`benchmarking/`): Measure performance, select best candidate by speedup
+7. **Result** (`result/`, `github/`): Create PR with winning optimization
+
+## Domain Glossary
+
+- **Optimization candidate**: A generated code variant that might be faster (`OptimizedCandidate`)
+- **Function context**: All code needed for optimization — split into read-writable (modifiable) and read-only (reference)
+- **Addressable time**: Time a function spends that could be optimized (own time + callee time / call count)
+- **Candidate forest**: DAG of candidates where refinements/repairs build on previous candidates
+- **Replay test**: Test generated from recorded benchmark data to reproduce real workloads
+- **Tracer**: Profiling system that records function call trees and timings (`tracing/`, `tracer.py`)
+- **Worktree mode**: Git worktree-based parallel optimization (`--worktree` flag)

 <!-- Section below is auto-generated by `tessl install` - do not edit manually -->
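
The glossary entry for addressable time gives the formula "own time + callee time / call count". Read with ordinary operator precedence, that is own time plus callee time divided by call count; the sketch below assumes that reading, which may not match `benchmarking/function_ranker.py` exactly. The function names, the nanosecond units, and the sample numbers are all hypothetical.

```python
from __future__ import annotations


def addressable_time_ns(own_time_ns: int, callee_time_ns: int, call_count: int) -> float:
    """own time + (callee time / call count), one reading of the glossary formula."""
    if call_count <= 0:
        return 0.0
    return own_time_ns + callee_time_ns / call_count


def rank_by_addressable_time(stats: dict[str, tuple[int, int, int]]) -> list[str]:
    """Order function names from most to least addressable time."""
    return sorted(stats, key=lambda name: addressable_time_ns(*stats[name]), reverse=True)


# Invented trace data: name -> (own_time_ns, callee_time_ns, call_count)
trace_stats = {
    "parse": (600, 400, 4),
    "render": (200, 9_000, 3),
    "log": (50, 0, 100),
}
print(rank_by_addressable_time(trace_stats))
```

Under this reading, a function with little own time can still rank first when its callees are expensive and it is called rarely.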

19 changes: 13 additions & 6 deletions code_to_optimize/js/code_to_optimize_js/bubble_sort.js
@@ -11,14 +11,21 @@ function bubbleSort(arr) {
     const result = arr.slice();
     const n = result.length;

-    for (let i = 0; i < n; i++) {
-        for (let j = 0; j < n - 1; j++) {
-            if (result[j] > result[j + 1]) {
-                const temp = result[j];
-                result[j] = result[j + 1];
-                result[j + 1] = temp;
+    if (n <= 1) return result;
+
+    for (let i = 0; i < n - 1; i++) {
+        let swapped = false;
+        const limit = n - i - 1;
+        for (let j = 0; j < limit; j++) {
+            const a = result[j];
+            const b = result[j + 1];
+            if (a > b) {
+                result[j] = b;
+                result[j + 1] = a;
+                swapped = true;
             }
         }
+        if (!swapped) break;
     }

     return result;
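
The change to `bubbleSort` shrinks the inner loop each pass and stops early once a pass makes no swaps. As a quick way to reason about whether behavior is preserved, here is a Python port of both versions with a randomized equivalence check. The port and the function names are mine, written for illustration; they are not part of the PR.

```python
import random


def bubble_sort_before(arr):
    """Port of the removed version: always n passes over n - 1 pairs."""
    result = list(arr)
    n = len(result)
    for _ in range(n):
        for j in range(n - 1):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
    return result


def bubble_sort_after(arr):
    """Port of the added version: shrinking inner bound plus early exit."""
    result = list(arr)
    n = len(result)
    if n <= 1:
        return result
    for i in range(n - 1):
        swapped = False
        limit = n - i - 1  # the last i elements are already in place
        for j in range(limit):
            a, b = result[j], result[j + 1]
            if a > b:
                result[j], result[j + 1] = b, a
                swapped = True
        if not swapped:
            break  # no swaps means the list is sorted
    return result


rng = random.Random(0)
for _ in range(200):
    data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
    assert bubble_sort_before(data) == bubble_sort_after(data) == sorted(data)
print("both versions agree on 200 random inputs")
```

The early exit makes the new version linear on already sorted input, while the worst case remains quadratic.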
9 changes: 2 additions & 7 deletions codeflash/benchmarking/trace_benchmarks.py
@@ -1,23 +1,18 @@
 from __future__ import annotations

-import os
 import re
 import subprocess
 from pathlib import Path

 from codeflash.cli_cmds.console import logger
 from codeflash.code_utils.compat import SAFE_SYS_EXECUTABLE
-from codeflash.code_utils.shell_utils import get_cross_platform_subprocess_run_args
+from codeflash.code_utils.shell_utils import get_cross_platform_subprocess_run_args, make_env_with_project_root


 def trace_benchmarks_pytest(
     benchmarks_root: Path, tests_root: Path, project_root: Path, trace_file: Path, timeout: int = 300
 ) -> None:
-    benchmark_env = os.environ.copy()
-    if "PYTHONPATH" not in benchmark_env:
-        benchmark_env["PYTHONPATH"] = str(project_root)
-    else:
-        benchmark_env["PYTHONPATH"] += os.pathsep + str(project_root)
+    benchmark_env = make_env_with_project_root(project_root)
     run_args = get_cross_platform_subprocess_run_args(
         cwd=project_root, env=benchmark_env, timeout=timeout, check=False, text=True, capture_output=True
     )
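
The body of `make_env_with_project_root` lives in `shell_utils.py` and is not visible in this excerpt. A plausible sketch, assuming it simply wraps the inline logic that this hunk removes, would look like the following; the real helper may differ, for example in whether it appends or prepends the project root.

```python
import os
from pathlib import Path


def make_env_with_project_root(project_root: Path) -> dict[str, str]:
    """Copy the current environment and add project_root to PYTHONPATH.

    Assumed behavior, mirroring the code removed from trace_benchmarks_pytest.
    """
    env = os.environ.copy()
    if "PYTHONPATH" not in env:
        env["PYTHONPATH"] = str(project_root)
    else:
        env["PYTHONPATH"] += os.pathsep + str(project_root)
    return env


env = make_env_with_project_root(Path("/tmp/example_project"))
print(env["PYTHONPATH"])
```

Centralizing this in one helper means every subprocess launch (benchmarks, CrossHair, test runs) resolves project-relative imports the same way, which matches the intent stated in the "Unify PYTHONPATH setup" commit.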