2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -24,3 +24,5 @@ coverage.html
# OS
.DS_Store
Thumbs.db

.sisyphus
9 changes: 7 additions & 2 deletions CONTRIBUTING.md
@@ -61,13 +61,18 @@ pkg/
lexer/ # tokenizer
parser/ # parser
ast/ # AST definitions
types/ # type checker
types/ # type checker (ownership tracking, borrow checking)
eval/ # interpreter
codegen/ # C code generator
codegen/ # C code generator (ownership-aware, borrow support, vtable dispatch)
module/ # module loader & carv.toml config
examples/ # example programs
docs/ # documentation
```

**Ownership & Borrowing**: The type checker (`pkg/types`) tracks ownership (move/drop) and enforces borrow rules. The codegen (`pkg/codegen`) emits ownership-aware C code with proper drop calls and borrow support (&T / &mut T).
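
A rough sketch of what the borrow mapping could look like in emitted C, assuming `&T` lowers to `const T*` and `&mut T` to `T*` as docs/architecture.md describes. The `carv_string` layout comes from that doc; the `shout` function, the choice of `long` for Carv's `int`, and the exact function shapes are assumptions for illustration, not the generator's real output.

```c
#include <stdbool.h>
#include <stddef.h>

/* String layout as described in docs/architecture.md. */
typedef struct {
    char  *data;
    size_t len;
    bool   owned;
} carv_string;

/* Carv: fn print_len(s: &string) -> int
 * An immutable borrow is passed as a pointer to const,
 * so the callee can read but not write through it. */
long print_len(const carv_string *s) {
    return (long)s->len;
}

/* Carv (hypothetical): fn shout(s: &mut string)
 * A mutable borrow is passed as a plain pointer, so the callee may write. */
void shout(carv_string *s) {
    for (size_t i = 0; i < s->len; i++) {
        if (s->data[i] >= 'a' && s->data[i] <= 'z') {
            s->data[i] = (char)(s->data[i] - 'a' + 'A');
        }
    }
}
```

A call site such as `print_len(&msg)` in Carv would then map onto `print_len(&msg)` in C more or less directly, with the C compiler rejecting writes through the `const` pointer.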

**Interfaces**: Defined with `interface`, implemented with `impl ... for`. The checker verifies impl method signatures match the interface. Codegen emits vtable structs, fat pointer typedefs (`_ref`/`_mut_ref`), wrapper functions, and static vtable instances. Dynamic dispatch uses `obj.vt->method(obj.data, args)`. Interface refs are created via cast: `&obj as &Interface`.

## Response Time

I work on this when I have time and energy, so response times vary. Don't take it personally if I'm slow - I'll get to it eventually.
25 changes: 24 additions & 1 deletion README.md
@@ -21,13 +21,19 @@ Carv compiles to C and runs natively. It has a tree-walking interpreter too for
Features that actually work:
- Static typing with inference
- Pipe operator (`|>`) - my favorite part and why not
- `let` / `mut` / `const` with proper immutability enforcement
- Compound assignment (`+=`, `-=`, `*=`, `/=`, `%=`, `&=`, `|=`, `^=`)
- Classes with methods
- Result types (`Ok`/`Err`) with pattern matching cause **RUST**
- Hash maps
- `for-in` loops over arrays, strings, and maps
- **Module system** with `require` (Rust-inspired, package manager ready)
- **String interpolation** with `f"hello {name}"`
- **Ownership system** (move semantics, `clone()` for deep copy)
- **Borrowing** (`&T` / `&mut T`)
- **Interfaces** (`interface` / `impl` with vtable-based dynamic dispatch)
- Project config via `carv.toml`
- Basic standard library
- 40+ built-in functions (strings, files, process, environment, etc.)

---

@@ -42,6 +48,18 @@ println(f"2 + 2 = {2 + 2}");
// pipes make everything nicer
10 |> double |> add(5) |> print;

// ownership: move semantics
let s = "hello";
let t = s; // s is moved, now invalid
let u = s.clone(); // explicit deep copy

Comment on lines +50 to +54

⚠️ Potential issue | 🟡 Minor

Fix moved-value usage in the ownership quick-look example.
Line 53 calls s.clone() after s has been moved on Line 52, which should be invalid. Clone the valid binding instead (or clone before the move).

✅ Suggested fix
 let s = "hello";
 let t = s;              // s is moved, now invalid
-let u = s.clone();      // explicit deep copy
+let u = t.clone();      // explicit deep copy
🤖 Prompt for AI Agents
In `@README.md` around lines 50 - 54, The ownership example incorrectly calls
s.clone() after s has been moved to t; update the snippet so the clone is
performed on a valid binding (e.g., call clone on t or call s.clone() before
assigning t) and assign that result to u so that u receives a deep copy;
reference the bindings s, t, and u and the clone operation when making the
change.

// borrowing: safe references
fn print_len(s: &string) -> int {
return len(s);
}
let msg = "world";
print_len(&msg); // immutable borrow

// error handling without exceptions
fn divide(a: int, b: int) {
if b == 0 {
@@ -91,6 +109,7 @@ Then:
```bash
./build/carv run file.carv # interpret
./build/carv build file.carv # compile to binary
./build/carv emit-c file.carv # emit generated C source
./build/carv init # create new project with carv.toml
./build/carv repl # mess around
```
@@ -105,7 +124,11 @@ Then:
- [x] Result types, classes, maps
- [x] Module system (`require`)
- [x] String interpolation (`f"..."`)
- [x] Ownership system (move + drop)
- [x] Borrowing (&T / &mut T)
- [x] Project config (`carv.toml`)
- [x] Interfaces (interface/impl)
- [ ] Async/await
- [ ] Package manager
- [ ] Self-hosting

21 changes: 21 additions & 0 deletions cmd/carv/main.go
@@ -181,6 +181,11 @@ func runFile(filename string, programArgs []string) {
os.Exit(1)
}

// Print ownership warnings (non-fatal for interpreter mode)
for _, msg := range checker.Warnings() {
fmt.Fprintln(os.Stderr, msg)
}

env := eval.NewEnvironment()
result := eval.Eval(program, env)

@@ -261,7 +266,15 @@ func emitC(filename string) {
os.Exit(1)
}

if len(checker.Warnings()) > 0 {
for _, msg := range checker.Warnings() {
fmt.Fprintln(os.Stderr, msg)
}
os.Exit(1)
}

gen := codegen.NewCGenerator()
gen.SetTypeInfo(checker.TypeInfo())
cCode := gen.Generate(program)
fmt.Print(cCode)
}
@@ -292,7 +305,15 @@ func buildFile(filename string) {
os.Exit(1)
}

if len(checker.Warnings()) > 0 {
for _, msg := range checker.Warnings() {
fmt.Fprintln(os.Stderr, msg)
}
os.Exit(1)
}

gen := codegen.NewCGenerator()
gen.SetTypeInfo(checker.TypeInfo())
cCode := gen.Generate(program)

baseName := strings.TrimSuffix(filename, ".carv")
86 changes: 43 additions & 43 deletions docs/architecture.md
@@ -8,42 +8,9 @@ How the compiler is structured. Mostly notes for myself but might be useful if y

## Pipeline

```
Source Code (.carv)
        │
        ▼
Lexer (pkg/lexer)
        │
        ▼
     Tokens
        │
        ▼
Parser (pkg/parser)
        │
        ▼
  AST (pkg/ast)
        │
        ▼
Type Checker (pkg/types)
        │
        ├──────────────────┐
        ▼                  ▼
   Interpreter       Code Generator
   (pkg/eval)        (pkg/codegen)
        │                  │
        │                  │
  Module Loader            │
  (pkg/module)             │
        │                  │
        ▼                  ▼
     Output            C Source
                           │
                           ▼
                       GCC/Clang
                           │
                           ▼
                        Binary
```
Source → Lexer → Tokens → Parser → AST → Type Checker → Interpreter or C Codegen → GCC/Clang → Binary

The type checker produces a `CheckResult` with type info, ownership tracking, and warnings. Both the interpreter and codegen consume this result.

## Package Overview

@@ -72,9 +39,16 @@ The parser is probably the messiest part of the codebase. It works but could use

### `pkg/types`

Type checker. Walks the AST and validates types, builds symbol tables.
Type checker. Walks the AST and validates types, builds symbol tables, tracks ownership.

Produces a `CheckResult` with:
- `NodeTypes`: type of every expression
- `FuncSigs`: function signatures
- `ClassInfo`: class field/method info
- `Errors`: type errors (fatal in codegen)
- `Warnings`: ownership/borrow violations (warnings in interpreter, fatal in codegen)

Currently pretty basic - doesn't do full type inference, mostly just checks that operations are valid.
Implements ownership tracking (move/drop), borrow checking (&T / &mut T), and a warnings system for non-fatal violations.

### `pkg/eval`

@@ -90,7 +64,28 @@ Key files:

Generates C code from the AST. The generated C is not pretty but it works.

Currently targets C99. The runtime is minimal - just some helper macros and a simple GC (eventually).
Currently targets C99. Key features:
- **Scope stack**: tracks variable lifetimes for drop insertion
- **Preamble buffer**: emits runtime helpers (carv_string, carv_array, etc.)
- **carv_string struct**: `{char* data; size_t len; bool owned;}`
- **Single-exit functions**: all returns become `goto __carv_exit` with drops at exit label
- **Ownership-aware code generation**: emits `carv_string_move()`, `carv_string_drop()`, `carv_string_clone()`
- **Borrow support**: `&T` → `const T*`, `&mut T` → `T*`
- **Interface dispatch**: vtable-based dynamic dispatch via fat pointers
- **Arena allocator**: used for all owned heap values
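
The ownership helpers and the single-exit shape listed above might look roughly like the following. This is a sketch under assumptions: the signatures of `carv_string_move()`, `carv_string_drop()`, and `carv_string_clone()` are guessed from their names, `demo_len` is a hypothetical function, and plain `malloc`/`free` stand in for the arena allocator mentioned in the list. The exit label is spelled `carv_exit` here because identifiers with a leading double underscore are reserved in C; the doc names it `__carv_exit`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *data;
    size_t len;
    bool   owned;
} carv_string;

/* Deep copy: the result owns a fresh heap buffer (malloc stands in for the arena). */
carv_string carv_string_clone(const carv_string *s) {
    carv_string out = { malloc(s->len + 1), s->len, true };
    if (out.data == NULL) abort();
    if (s->len > 0) memcpy(out.data, s->data, s->len);
    out.data[s->len] = '\0';
    return out;
}

/* Move: hand over the buffer and empty the source so a later drop is a no-op. */
carv_string carv_string_move(carv_string *src) {
    carv_string out = *src;
    src->data = NULL;
    src->len = 0;
    src->owned = false;
    return out;
}

/* Drop: free only buffers this value owns, then reset it. */
void carv_string_drop(carv_string *s) {
    if (s->owned) free(s->data);
    s->data = NULL;
    s->len = 0;
    s->owned = false;
}

/* Single-exit shape: every return becomes a jump to one label,
 * and all live locals are dropped there exactly once. */
long demo_len(const carv_string *input, bool early) {
    long result = 0;
    carv_string tmp = carv_string_clone(input);
    carv_string kept = carv_string_move(&tmp);  /* tmp is now empty */
    if (early) {
        result = -1;
        goto carv_exit;                         /* was: return -1; */
    }
    result = (long)kept.len;
carv_exit:
    carv_string_drop(&kept);
    carv_string_drop(&tmp);                     /* no-op after the move */
    return result;
}
```

Funnelling every return through one label means the generator only has to emit the drop sequence once per function, rather than before each `return`.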

#### Interface Codegen

Interfaces compile to a vtable + fat pointer pattern:

1. **Vtable struct**: one function pointer per interface method, all taking `const void* self` as first param
2. **Fat pointer**: `{ const void* data; const Vtable* vt; }` — `_ref` (immutable) and `_mut_ref` (mutable) variants
3. **Impl wrappers**: static functions that cast `const void*` back to the concrete type and call the real method
4. **Vtable instances**: one `static const` vtable per impl, initialized with wrapper function pointers
5. **Cast expressions**: `&obj as &Interface` produces a fat pointer literal `{ .data = obj, .vt = &VT }`
6. **Dynamic dispatch**: `obj.method(args)` on an interface ref becomes `obj.vt->method(obj.data, args)`

Generation order: interface typedefs → impl forward decls → impl bodies → wrappers + vtable instances (all before `main()`)
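
To make the six steps concrete, here is a minimal hand-written sketch of the vtable and fat pointer pattern in C. The `Shape` interface, the `Rect` and `Square` types, all symbol names, and the `*_as_Shape` helpers are hypothetical; the real generator may name and order things differently, and it emits the cast as a fat pointer literal rather than a helper call.

```c
#include <stddef.h>

/* 1. Vtable struct: one function pointer per interface method,
 *    each taking const void* self as its first parameter. */
typedef struct {
    long (*area)(const void *self);
} Shape_vtable;

/* 2. Fat pointer for an immutable interface reference (&Shape). */
typedef struct {
    const void         *data;
    const Shape_vtable *vt;
} Shape_ref;

/* Concrete types and their real methods. */
typedef struct { long w, h; } Rect;
typedef struct { long side; } Square;

static long Rect_area(const Rect *self)     { return self->w * self->h; }
static long Square_area(const Square *self) { return self->side * self->side; }

/* 3. Impl wrappers: cast const void* back to the concrete type. */
static long Rect_Shape_area(const void *self) {
    return Rect_area((const Rect *)self);
}
static long Square_Shape_area(const void *self) {
    return Square_area((const Square *)self);
}

/* 4. One static const vtable instance per impl. */
static const Shape_vtable Rect_Shape_VT   = { .area = Rect_Shape_area };
static const Shape_vtable Square_Shape_VT = { .area = Square_Shape_area };

/* 5. `&obj as &Shape` builds a fat pointer from the object and its vtable. */
Shape_ref Rect_as_Shape(const Rect *r) {
    return (Shape_ref){ .data = r, .vt = &Rect_Shape_VT };
}
Shape_ref Square_as_Shape(const Square *s) {
    return (Shape_ref){ .data = s, .vt = &Square_Shape_VT };
}

/* 6. Dynamic dispatch: obj.method(args) becomes obj.vt->method(obj.data, args). */
long total_area(const Shape_ref *shapes, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += shapes[i].vt->area(shapes[i].data);
    }
    return sum;
}
```

Because the fat pointer carries the vtable alongside the data, `total_area` can walk a mixed array of `Rect` and `Square` references without knowing either concrete type.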

### `pkg/module`

@@ -107,7 +102,7 @@ Supports:

### `cmd/carv`

CLI entry point. Handles `run`, `build`, `emit-c`, `repl` commands.
CLI entry point. Handles `run`, `build`, `emit-c`, `repl`, and `init` commands.

## Design Decisions

@@ -129,9 +124,14 @@ The goal is self-hosting - writing the Carv compiler in Carv. That means I need:

1. ~~Module/import system~~ ✓ Done!
2. ~~String interpolation~~ ✓ Done!
3. Package manager (for external dependencies)
4. Better standard library
5. Then rewrite lexer, parser, codegen in Carv
3. ~~Ownership system (move + drop)~~ ✓ Done!
4. ~~Borrowing (&T / &mut T)~~ ✓ Done!
5. ~~Interfaces (interface/impl)~~ ✓ Done!
6. Package manager (for external dependencies)
7. Better standard library
8. Then rewrite lexer, parser, codegen in Carv

6. Async/await — Next

It's a long road but that's half the fun. Getting closer though!

89 changes: 89 additions & 0 deletions docs/builtins.md
@@ -236,6 +236,66 @@ Get character at index.
char_at("hello", 1) // 'e'
```

## Parsing

### `parse_int(str) -> int`
Parse a string as an integer.

```carv
parse_int("42") // 42
parse_int("-10") // -10
```

### `parse_float(str) -> float`
Parse a string as a float.

```carv
parse_float("3.14") // 3.14
parse_float("2.0") // 2
```
Comment on lines +249 to +255

⚠️ Potential issue | 🟡 Minor

Minor inconsistency in example output comment.

The comment on line 254 shows // 2 for parse_float("2.0"), but since parse_float returns a float, the expected output should be // 2.0 to be consistent with float representation.

```carv
parse_float("3.14")   // 3.14
-parse_float("2.0")    // 2
+parse_float("2.0")    // 2.0
```

🤖 Prompt for AI Agents
In `@docs/builtins.md` around lines 249 - 255, Update the example in the docs for
the parse_float built-in: change the output comment for the call
parse_float("2.0") from "// 2" to "// 2.0" so the example consistently reflects
that parse_float returns a float; edit the example block under the parse_float
description to adjust only that comment.


## Process & Environment

### `args() -> array`
Get command-line arguments passed to the script.

```carv
let a = args();
print(a); // ["arg1", "arg2", ...]
```

### `exec(command, ...args) -> int`
Run an external command. Returns the exit code.

```carv
let code = exec("echo", "hello"); // prints "hello", returns 0
```

### `exec_output(command, ...args) -> Result`
Run an external command and capture output. Returns `Ok(stdout)` or `Err(stderr)`.

```carv
let result = exec_output("echo", "hello");
match result {
Ok(out) => print(trim(out)),
Err(e) => print("failed: " + e),
}
```

### `getenv(key) -> string`
Get an environment variable. Returns empty string if not set.

```carv
let home = getenv("HOME");
```

### `setenv(key, value)`
Set an environment variable.

```carv
setenv("MY_VAR", "hello");
```

## File I/O

### `read_file(path) -> string`
@@ -252,6 +312,13 @@ Write string to file.
write_file("out.txt", "hello");
```

### `append_file(path, content)`
Append string to file. Creates the file if it doesn't exist.

```carv
append_file("log.txt", "new line\n");
```

### `file_exists(path) -> bool`
Check if file exists.

@@ -261,6 +328,13 @@ if file_exists("config.txt") {
}
```

### `mkdir(path)`
Create a directory (and parent directories).

```carv
mkdir("build/output");
```

## Control Flow

### `exit(code?)`
@@ -278,6 +352,21 @@ Crash with error message.
panic("something went wrong");
```

## Ownership

### `clone(value) -> value`
Deep copy of any move type (string, array, map, or class instance).

```carv
let original = "hello";
let copy = original.clone();
print(original); // OK: "hello"
print(copy); // OK: "hello"

let arr = [1, 2, 3];
let arr_copy = arr.clone();
```
Comment on lines +355 to +368

⚠️ Potential issue | 🟡 Minor

Inconsistent clone() syntax between header and example.

The header declares clone(value) -> value as a function call, but the example uses method syntax original.clone(). Please clarify which syntax is canonical, or document both if both are supported.

// Header shows function syntax:
clone(value) -> value

// But example uses method syntax:
let copy = original.clone();
🤖 Prompt for AI Agents
In `@docs/builtins.md` around lines 355 - 368, The docs show inconsistent syntax
for the clone API: the header uses a function form "clone(value) -> value" while
the example uses method form "original.clone()"; update the documentation to be
explicit and consistent by either changing the header to show the method form
(e.g., "value.clone() -> value") to match the example, or document both forms
(add a brief note that both "clone(value)" and "value.clone()" are supported)
and provide matching examples for each (include both "let copy =
original.clone();" and "let copy = clone(original);" if both exist); ensure the
symbol "clone" and the example "original.clone()" are referenced so readers can
find the correct usage.


---

[← Architecture](architecture.md) | **Built-ins** | [Contributing →](../CONTRIBUTING.md)