490 changes: 490 additions & 0 deletions docs/FRICTIONLESS_SETUP_PLAN.md

Large diffs are not rendered by default.

151 changes: 151 additions & 0 deletions docs/JS_PROMPT_PARITY_RECOMMENDATIONS.md
@@ -0,0 +1,151 @@
# JavaScript Test Generation Prompt - Parity Recommendations

This document outlines gaps between Python and JavaScript test generation prompts and provides recommendations for improving the JavaScript prompts.

## Current State Comparison

### Python Prompt (76 lines) - Comprehensive sections:
1. **PRESERVE ORIGINAL FUNCTION** - Don't modify function being tested
2. **USE REAL CLASSES - NO STUBS OR FAKES** - Import actual domain classes
3. **HANDLING INSTANCE METHODS** - Proper instantiation patterns
4. **USE CONFTEST.PY FIXTURES** - Leverage existing test fixtures
5. **DO NOT MOCK THE FUNCTION UNDER TEST** - Critical rule
6. **IMPORT CLASSES FROM THEIR REAL MODULES** - Proper import sources
7. **IMPORT EVERYTHING YOU USE** - No missing imports
8. **ONLY IMPORT WHAT YOU USE** - No unused imports
9. **USE CORRECT IMPORT SOURCES** - Match context provided
10. **DO NOT USE MOCK OBJECTS FOR DOMAIN CLASSES** - Real instances
11. **USE CORRECT CONSTRUCTOR SIGNATURES** - Proper instantiation
12. **VALID PYTHON STRING LITERALS** - Escape sequences, raw strings

### JavaScript Prompt (44 lines) - Current sections:
1. Basic/Edge/Large Scale test structure ✓
2. DO NOT MOCK THE FUNCTION UNDER TEST ✓
3. IMPORT FROM REAL MODULES ✓
4. HANDLE ASYNC PROPERLY ✓
5. IMPORT PATH RULES (no extensions) ✓
6. MOCKING RULES (Jest vs Vitest) ✓

## Gap Analysis

| Python Section | JS Status | Priority |
|----------------|-----------|----------|
| PRESERVE ORIGINAL FUNCTION | Missing | High |
| USE REAL CLASSES - NO STUBS | Missing | High |
| HANDLING INSTANCE METHODS | Missing | High |
| USE CONFTEST.PY FIXTURES | N/A (JS uses different patterns) | - |
| IMPORT EVERYTHING YOU USE | Missing | Medium |
| ONLY IMPORT WHAT YOU USE | Missing | Medium |
| USE CORRECT IMPORT SOURCES | Missing | High |
| USE CORRECT CONSTRUCTOR SIGNATURES | Missing | High |
| VALID STRING LITERALS | Missing | Medium |

## Recommended Additions to JavaScript Prompt

### Core Sections (Port from Python)

````markdown
**CRITICAL: PRESERVE ORIGINAL FUNCTION**:
- Do NOT modify or rewrite the function being tested.
- Your task is ONLY to write tests for the function as given.

**CRITICAL: USE REAL CLASSES - NO STUBS OR FAKES**:
- When the function uses custom classes/types, import and use the REAL class.
- **WRONG**: Creating inline stub classes or interfaces
- **CORRECT**: `import { UserProfile } from '../models/UserProfile'`

**CRITICAL: HANDLING CLASS METHODS**:
- For instance methods, properly instantiate the class first.
- Use the constructor signature shown in the context.
- Example:
```javascript
const processor = new DataProcessor(config);
const result = processor.processData(input);
```

**CRITICAL: IMPORT EVERYTHING YOU USE**:
- Every class, type, function, or constant used in tests MUST be imported.
- Do not assume anything is globally available.

**CRITICAL: ONLY IMPORT WHAT YOU USE**:
- Do not add unused imports - they cause TypeScript/linting errors.

**CRITICAL: USE CORRECT IMPORT SOURCES**:
- Import from the exact module paths shown in the provided context.
- Do not guess or infer import paths.

**CRITICAL: USE CORRECT CONSTRUCTOR SIGNATURES**:
- Check the class definition for required constructor parameters.
- Do not omit required parameters or add non-existent ones.

**CRITICAL: VALID STRING LITERALS**:
- Use proper escaping for special characters in strings.
- For multiline strings, use template literals (`backticks`).
- Escape backslashes properly: `\\n` for literal `\n`.
````

### JS/TS-Specific Sections (New)

````markdown
**CRITICAL: HANDLE TYPESCRIPT TYPES**:
- If testing TypeScript code, ensure the test file follows the `.test.ts` naming pattern.
- Respect type annotations - don't pass wrong types to functions.
- Use type assertions (`as Type`) only when necessary.

**CRITICAL: HANDLE PROMISES AND CALLBACKS**:
- For callback-based APIs, wrap them with `promisify` or use the `done()` callback.
- Never leave floating promises - always await or return them.
- Use `expect.assertions(n)` for async error testing.

**CRITICAL: ES MODULES VS COMMONJS**:
- Check if the project uses ESM (`import/export`) or CJS (`require/module.exports`).
- Match the import style to the project configuration.

**CRITICAL: HANDLE THIS BINDING**:
- Arrow functions don't have their own `this` context.
- For methods that use `this`, test with proper binding:
```javascript
// Correct - preserves this context
const instance = new MyClass();
expect(instance.method()).toBe(expected);

// Wrong - loses this context
const { method } = new MyClass();
expect(method()).toBe(expected); // May fail!
```

**CRITICAL: NULL VS UNDEFINED**:
- JavaScript distinguishes between `null` and `undefined`.
- Test both cases when the function handles optional values.
- Use `toBe(null)` vs `toBeUndefined()` appropriately.
````

## Implementation Priority

1. **Phase 1 (High Priority)**: Add core sections ported from Python
   - PRESERVE ORIGINAL FUNCTION
   - USE REAL CLASSES
   - HANDLING CLASS METHODS
   - USE CORRECT IMPORT SOURCES
   - USE CORRECT CONSTRUCTOR SIGNATURES

2. **Phase 2 (Medium Priority)**: Add import management and string-literal rules
   - IMPORT EVERYTHING YOU USE
   - ONLY IMPORT WHAT YOU USE
   - VALID STRING LITERALS

3. **Phase 3 (JS-Specific)**: Add JavaScript-specific guidance
   - HANDLE TYPESCRIPT TYPES
   - HANDLE PROMISES AND CALLBACKS
   - ES MODULES VS COMMONJS
   - HANDLE THIS BINDING
   - NULL VS UNDEFINED

## Metrics to Track

After implementing these changes, track:
- Test generation success rate (tests that compile)
- Test execution pass rate (tests that run without errors)
- Import error frequency
- Type error frequency (TypeScript projects)
- Mock-related failures
18 changes: 12 additions & 6 deletions docs/codeflash-concepts/how-codeflash-works.mdx
@@ -3,25 +3,31 @@ title: "How Codeflash Works"
description: "Understand Codeflash's generate-and-verify approach to code optimization and correctness verification"
icon: "gear"
sidebarTitle: "How It Works"
keywords: ["architecture", "verification", "correctness", "testing", "optimization", "LLM", "benchmarking"]
keywords: ["architecture", "verification", "correctness", "testing", "optimization", "LLM", "benchmarking", "javascript", "typescript", "python"]
---
# How Codeflash Works

Codeflash follows a "generate and verify" approach to optimize code. It uses LLMs to generate optimizations, then it rigorously verifies if those optimizations are indeed
faster and if they have the same behavior. The basic unit of optimization is a function—Codeflash tries to speed up the function, and tries to ensure that it still behaves the same way. This way if you merge the optimized code, it simply runs faster without breaking any functionality.

Codeflash supports **Python**, **JavaScript**, and **TypeScript** projects.

## Analysis of your code

Codeflash scans your codebase to identify all available functions. It locates existing unit tests in your projects and maps which functions they test. When optimizing a function, Codeflash runs these discovered tests to verify nothing has broken.

For Python, code analysis uses `libcst` and `jedi`. For JavaScript/TypeScript, it uses `tree-sitter` for AST parsing.

#### What kind of functions can Codeflash optimize?

Codeflash works best with self-contained functions that have minimal side effects (like communicating with external systems or sending network requests). Codeflash optimizes a group of functions - consisting of an entry point function and any other functions it directly calls.
Codeflash supports optimizing async functions.
Codeflash supports optimizing async functions in all supported languages.

#### Test Discovery

Codeflash currently only runs tests that directly call the target function in their test body. To discover tests that indirectly call the function, you can use the Codeflash Tracer. The Tracer analyzes your test suite and identifies all tests that eventually call a function.
Codeflash discovers tests that directly call the target function in their test body. For Python, it finds pytest and unittest tests. For JavaScript/TypeScript, it finds Jest and Vitest test files.

To discover tests that indirectly call the function, you can use the Codeflash Tracer. The Tracer analyzes your test suite and identifies all tests that eventually call a function.

## Optimization Generation

@@ -48,12 +54,12 @@ We recommend manually reviewing the optimized code since there might be importan

Codeflash generates two types of tests:

- LLM Generated tests - Codeflash uses LLMs to create several regression test cases that cover typical function usage, edge cases, and large-scale inputs to verify both correctness and performance.
- Concolic coverage tests - Codeflash uses state-of-the-art concolic testing with an SMT Solver (a theorem prover) to explore execution paths and generate function arguments. This aims to maximize code coverage for the function being optimized. Codeflash runs the resulting test file to verify correctness. Currently, this feature only supports pytest.
- **LLM Generated tests** - Codeflash uses LLMs to create several regression test cases that cover typical function usage, edge cases, and large-scale inputs to verify both correctness and performance. This works for Python, JavaScript, and TypeScript.
- **Concolic coverage tests** - Codeflash uses state-of-the-art concolic testing with an SMT Solver (a theorem prover) to explore execution paths and generate function arguments. This aims to maximize code coverage for the function being optimized. Currently, this feature only supports Python (pytest).

## Code Execution

Codeflash runs tests for the target function using either pytest or unittest frameworks. The tests execute on your machine, ensuring access to the Python environment and any other dependencies associated to let Codeflash run your code properly. Running on your machine also ensures accurate performance measurements since runtime varies by system.
Codeflash runs tests for the target function on your machine. For Python, it uses pytest or unittest. For JavaScript/TypeScript, it uses Jest or Vitest. Running on your machine ensures access to your environment and dependencies, and provides accurate performance measurements since runtime varies by system.

#### Performance benchmarking

79 changes: 11 additions & 68 deletions docs/configuration.mdx
Original file line number Diff line number Diff line change
@@ -1,83 +1,26 @@
---
title: "Manual Configuration"
description: "Configure Codeflash for your project with pyproject.toml settings and advanced options"
description: "Configure Codeflash for your project"
icon: "gear"
sidebarTitle: "Manual Configuration"
keywords:
[
"configuration",
"pyproject.toml",
"setup",
"settings",
"pytest",
"formatter",
]
---

# Manual Configuration

Codeflash is installed and configured on a per-project basis.
`codeflash init` should guide you through the configuration process, but if you need to manually configure Codeflash or set advanced settings, you can do so by editing the `pyproject.toml` file in the root directory of your project.

## Configuration Options

Codeflash config looks like the following

```toml
[tool.codeflash]
module-root = "my_module"
tests-root = "tests"
formatter-cmds = ["black $file"]
# optional configuration
benchmarks-root = "tests/benchmarks" # Required when running with --benchmark
ignore-paths = ["my_module/build/"]
pytest-cmd = "pytest"
disable-imports-sorting = false
disable-telemetry = false
git-remote = "origin"
override-fixtures = false
```

All file paths are relative to the directory of the `pyproject.toml` file.

Required Options:

- `module-root`: The Python module you want Codeflash to optimize going forward. Only code under this directory will be optimized. It should also have an `__init__.py` file to make the module importable.
- `tests-root`: The directory where your tests are located. Codeflash will use this directory to discover existing tests as well as generate new tests.

Optional Configuration:

- `benchmarks-root`: The directory where your benchmarks are located. Codeflash will use this directory to discover existing benchmarks. Note that this option is required when running with `--benchmark`.
- `ignore-paths`: A list of paths within the `module-root` to ignore when optimizing code. Codeflash will not optimize code in these paths. Useful for ignoring build directories or other generated code. You can also leave this empty if not needed.
- `pytest-cmd`: The command to run your tests. Defaults to `pytest`. You can specify extra commandline arguments here for pytest.
- `formatter-cmds`: The command line to run your code formatter or linter. Defaults to `["black $file"]`. In the command line `$file` refers to the current file being optimized. The assumption with using tools here is that they overwrite the same file and returns a zero exit code. You can also specify multiple tools here that run in a chain as a toml array. You can also disable code formatting by setting this to `["disabled"]`.
- `ruff` - A recommended way to run ruff linting and formatting is `["ruff check --exit-zero --fix $file", "ruff format $file"]`. To make `ruff check --fix` return a 0 exit code please add a `--exit-zero` argument.
- `disable-imports-sorting`: By default, codeflash uses isort to organize your imports before creating suggestions. You can disable this by setting this field to `true`. This could be useful if you don't sort your imports or while using linters like ruff that sort imports too.
- `disable-telemetry`: Disable telemetry data collection. Defaults to `false`. Set this to `true` to disable telemetry data collection. Codeflash collects anonymized telemetry data to understand how users are using Codeflash and to improve the product. Telemetry does not collect any code data.
- `git-remote`: The git remote to use for pull requests. Defaults to `"origin"`.
- `override-fixtures`: Override pytest fixtures during optimization. Defaults to `false`.

## Example Configuration

Here's an example project with the following structure:

```text
acme-project/
|- foo_module/
| |- __init__.py
| |- foo.py
| |- main.py
|- tests/
| |- __init__.py
| |- test_script.py
|- pyproject.toml
```

Here's a sample `pyproject.toml` file for the above project:

```toml
[tool.codeflash]
module-root = "foo_module"
tests-root = "tests"
ignore-paths = []
```
`codeflash init` should guide you through the configuration process, but if you need to manually configure Codeflash or set advanced settings, follow the guide for your language:

<CardGroup cols={2}>
<Card title="Python Configuration" icon="python" href="/configuration/python">
Configure via `pyproject.toml`
</Card>
<Card title="JavaScript / TypeScript Configuration" icon="js" href="/configuration/javascript">
Configure via `package.json`
</Card>
</CardGroup>