Merged
50 changes: 50 additions & 0 deletions AGENTS.md
@@ -401,6 +401,56 @@ Template variables available:
- `context.conn.*`: Connection properties (paths, credentials)
- `context.auth.*`: Authentication context

### 6. Self-Packaging (single-binary deploy)

The same `flapi` binary that serves the API can also fold an entire
config tree into itself, producing a self-contained executable
deployable via `scp`.

```bash
# Pack a config tree into a new bundled binary
flapi pack --in ./examples --out flapi-prod

# Inspect what's bundled
./flapi-prod info

# Extract the bundle for debugging
./flapi-prod unpack --to /tmp/extracted

# Run it -- serves the bundled config from any cwd
cd /tmp && ./flapi-prod
```

**How it works:**

- A ZIP archive is appended after the executable. On macOS it is
instead written into a reserved `__FLAPI/__bundle` Mach-O section
allocated at link time (16 MiB by default, configurable via
`FLAPI_RESERVED_BUNDLE_MIB`), and the binary is then re-codesigned
so the output is notarisable.
- At startup, `bundle_locator` either reverse-scans from EOF for the
ZIP end-of-central-directory (EOCD) signature (Linux / Windows) or
probes the reserved section (macOS). On a hit, the entries are
decompressed once into a shared `ArchiveEntries` map.
- `EmbeddedArchiveFileProvider` (implements `IFileProvider`) serves
config / SQL templates from that map. `FileProviderFactory`
dispatches non-remote paths to it when a bundle is present.
- For SQL templates that use `read_csv()` / `read_parquet()`, an
`EmbeddedFileSystem` is registered with DuckDB on the `embed://`
scheme, so `read_csv('embed://data/cities.csv')` resolves to the
same in-memory bytes.
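
The Linux / Windows lookup described above can be illustrated with a short Python sketch. It is not the C++ `bundle_locator` code: `locate_bundle` and `read_entries` are hypothetical names, and the sketch assumes the ZIP was built as a standalone archive and appended unchanged, so its central-directory offsets are relative to the start of the archive.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory (EOCD) marker
EOCD_FIXED_SIZE = 22            # EOCD record size without its comment
MAX_COMMENT = 0xFFFF            # the comment length is a 16-bit field


def locate_bundle(image: bytes):
    """Return (start, end) of a trailing ZIP inside image, or None."""
    # The EOCD record can only live in the last 22 + 65535 bytes.
    window_start = max(0, len(image) - (EOCD_FIXED_SIZE + MAX_COMMENT))
    pos = image.rfind(EOCD_SIGNATURE, window_start)
    while pos != -1:
        if pos + EOCD_FIXED_SIZE <= len(image):
            (_sig, _disk, _cd_disk, _n_disk, _n_total,
             cd_size, cd_offset, comment_len) = struct.unpack_from(
                "<4s4H2LH", image, pos)
            end = pos + EOCD_FIXED_SIZE + comment_len
            # The central directory ends where the EOCD begins, so the
            # archive starts cd_size + cd_offset bytes before it.
            start = pos - cd_size - cd_offset
            if start >= 0 and end <= len(image):
                return start, end
        # Candidate was not a plausible EOCD; keep scanning backwards.
        pos = image.rfind(EOCD_SIGNATURE, window_start, pos)
    return None


def read_entries(image: bytes):
    """Decompress every entry once into a name -> bytes map."""
    span = locate_bundle(image)
    if span is None:
        return None  # no bundle: caller falls back to the filesystem
    start, end = span
    with zipfile.ZipFile(io.BytesIO(image[start:end])) as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```

Returning `None` when nothing is found mirrors the documented fallback to filesystem mode; the macOS section probe is out of scope for this sketch.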

**Secrets never go in the bundle.** `pack` refuses files matching
`*.env`, `secrets/*`, `*.pem`, `*.key` by default. Credentials come
from environment variables at runtime (`AWS_*`, `GOOGLE_*`, `AZURE_*`,
`FLAPI_CONFIG_SERVICE_TOKEN`, `{{env.VARNAME}}` interpolation in
YAML). See [DESIGN_DECISIONS §9](docs/spec/DESIGN_DECISIONS.md#9-self-packaging-via-appended-zip)
for the rationale and [CLI_REFERENCE §3](docs/CLI_REFERENCE.md#3-self-packaging-subcommands)
for full subcommand options.

**Reproducibility.** Set `SOURCE_DATE_EPOCH` (epoch seconds) before
running `flapi pack`; given the same input, the output is then
bit-identical across runs.

## Key Patterns

### Safe Query Building Pattern
139 changes: 131 additions & 8 deletions docs/CLI_REFERENCE.md
@@ -19,14 +19,19 @@ This document provides a complete reference for the `flapi` server executable's
- [Validate Configuration](#validate-configuration---validate-config)
- [Configuration Service](#configuration-service---config-service)
- [Configuration Service Token](#configuration-service-token---config-service-token)
3. [Environment Variables](#3-environment-variables)
4. [Usage Examples](#4-usage-examples)
3. [Self-Packaging Subcommands](#3-self-packaging-subcommands)
- [`flapi pack`](#flapi-pack----create-a-self-contained-binary)
- [`flapi info`](#flapi-info----inspect-the-running-binarys-bundle)
- [`flapi unpack`](#flapi-unpack---to-dir----dump-the-bundle-for-debugging)
- [macOS notarisation specifics](#macos-notarisation-specifics)
4. [Environment Variables](#4-environment-variables)
5. [Usage Examples](#5-usage-examples)
- [Basic Startup](#basic-startup)
- [Development Mode](#development-mode)
- [Production Mode](#production-mode)
- [CI/CD Validation](#cicd-validation)
5. [Signal Handling](#5-signal-handling)
6. [Exit Codes](#6-exit-codes)
6. [Signal Handling](#6-signal-handling)
7. [Exit Codes](#7-exit-codes)
- [Related Documentation](#related-documentation)

---
@@ -396,12 +401,130 @@ export FLAPI_NO_TELEMETRY=1

---

## 3. Environment Variables
## 3. Self-Packaging Subcommands

flapi can fold its config tree (flapi.yaml + endpoint YAMLs + SQL
templates + small data files) into the binary itself, producing a
single self-contained artifact deployable via `scp`. The same binary
that serves the API also produces new bundled artifacts -- there is
no separate packager.

See [DESIGN_DECISIONS.md §9](./spec/DESIGN_DECISIONS.md#9-self-packaging-via-appended-zip)
for the architectural rationale.

### `flapi pack` -- create a self-contained binary

```
flapi pack --in <config-dir> --out <new-binary> [--allow-secrets] [--macos-append]
```

| Option | Required | Description |
|--------|----------|-------------|
| `--in` | yes | Directory containing `flapi.yaml` and friends. Walked recursively. |
| `--out` | yes | Path for the bundled output binary. Overwritten if it exists. |
| `--allow-secrets` | no | Bypass the default secret deny list. Testing only -- production users must never set this. |
| `--macos-append` | no | macOS only: append the archive after `__LINKEDIT` instead of overwriting the reserved `__FLAPI/__bundle` section. **Not notarisable.** |

**Default secret deny list** (`pack` refuses and exits non-zero when
any input file matches, unless `--allow-secrets` is set):

- `*.env` at any depth
- any path containing a `secrets/` directory segment
- `*.pem` at any depth
- `*.key` at any depth
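
As a rough illustration of how such a deny list can be evaluated, here is a Python sketch. The function names are hypothetical and the matching rules are an assumption based only on the four patterns above (case-sensitive, file-name globs plus a `secrets` directory segment); the actual check is implemented in C++ in `src/pack.cpp`.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the list above; matching semantics are assumed.
DENIED_NAME_PATTERNS = ("*.env", "*.pem", "*.key")
DENIED_DIR_SEGMENT = "secrets"


def is_denied(relative_path: str) -> bool:
    """True if a bundle-relative path matches the default deny list."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    # A directory named `secrets` anywhere above the file is a match.
    if DENIED_DIR_SEGMENT in parts[:-1]:
        return True
    # Otherwise test the file name itself against the glob patterns.
    return any(fnmatchcase(parts[-1], pat) for pat in DENIED_NAME_PATTERNS)


def find_denied(paths):
    """Return the offending paths so the caller can refuse to pack."""
    return [p for p in paths if is_denied(p)]
```

A packer built on this would abort with a non-zero exit as soon as `find_denied` returns a non-empty list, rather than silently skipping the files.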

**Reproducible builds.** Set `SOURCE_DATE_EPOCH` to stamp every
archive entry with a deterministic mtime; the produced binary is
then bit-identical across runs given the same input.
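
The effect of a fixed mtime can be demonstrated with Python's standard `zipfile` module. This is a stand-in for illustration, not the libarchive-based writer flapi uses; `build_archive` and `source_date_epoch` are hypothetical helpers, and both the sorted entry order and the fallback to the current time when the variable is unset are assumptions of the sketch.

```python
import io
import os
import time
import zipfile


def source_date_epoch() -> int:
    # Falling back to "now" when unset is an assumption of this sketch.
    return int(os.environ.get("SOURCE_DATE_EPOCH", int(time.time())))


def build_archive(entries, epoch: int) -> bytes:
    """Build a ZIP whose bytes depend only on the entries and the epoch."""
    stamp = time.gmtime(epoch)[:6]  # (year, month, day, hour, min, sec) in UTC
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Sort names so directory-walk order cannot leak into the output.
        for name in sorted(entries):
            info = zipfile.ZipInfo(filename=name, date_time=stamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permission bits
            zf.writestr(info, entries[name])
    return buf.getvalue()
```

Calling `build_archive(entries, source_date_epoch())` twice with the same inputs yields identical bytes, which is the property that makes the packed binary comparable by checksum.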

**Re-pack idempotence.** If the host binary already has a trailing
bundle, `pack` strips it from the _copy_ (not the running binary)
before appending the new one, so repeated invocations don't grow the
output.
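
The strip-then-append behaviour for the trailing-bytes layout can be sketched as follows. `strip_bundle` and `pack` here are hypothetical Python helpers, not flapi's API, and the sketch assumes an appended ZIP without an archive comment whose offsets are relative to the start of the archive.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory marker
EOCD_FIXED_SIZE = 22


def strip_bundle(image: bytes) -> bytes:
    """Return image without a trailing ZIP bundle, if one is present."""
    pos = image.rfind(EOCD_SIGNATURE)
    if pos == -1 or pos + EOCD_FIXED_SIZE > len(image):
        return image  # nothing appended yet
    # Bytes 12..19 of the EOCD hold the central directory size and offset.
    cd_size, cd_offset = struct.unpack_from("<2L", image, pos + 12)
    start = pos - cd_size - cd_offset
    return image[:start] if start >= 0 else image


def pack(host_image: bytes, archive: bytes) -> bytes:
    """Append archive to a copy of the host, replacing any old bundle."""
    return strip_bundle(host_image) + archive
```

Because the old bundle is removed from the copy first, packing an already packed binary produces the same size as packing the pristine one.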

**Example:**

```bash
SOURCE_DATE_EPOCH=1700000000 flapi pack --in ./examples --out flapi-prod
chmod +x flapi-prod                          # normally a no-op: pack preserves the exec bits
scp flapi-prod user@host:/opt/flapi/ # one-file deploy
ssh user@host '/opt/flapi/flapi-prod' # serves bundled config from any cwd
```

> **Implementation:** `src/pack.cpp`, `src/archive_io.cpp` | **Tests:** `test/cpp/pack_test.cpp`, `test/integration/test_self_packaging.py`

### `flapi info` -- inspect the running binary's bundle

```
flapi info
```

Prints the EOCD offset, bundle size, and entry list (with byte
counts). If the binary carries no bundle (appended, or in-section on
macOS), it prints `"Bundle: none (filesystem mode)"` and exits
non-zero.

**Example:**

```bash
$ ./flapi-prod info
Binary: /opt/flapi/flapi-prod
Bundle offset: 70123456
Bundle size: 12534 bytes
Entries (17):
flapi.yaml (1024 bytes)
sqls/customers.yaml (412 bytes)
...
```

### `flapi unpack --to <dir>` -- dump the bundle for debugging

```
flapi unpack --to <dir>
```

Writes every bundle entry to `<dir>` (creating intermediate
directories as needed), preserving paths. Useful for diffing a
deployed bundle against a development tree.

**Example:**

```bash
$ ./flapi-prod unpack --to /tmp/extracted
Unpacked 17 entries to /tmp/extracted

$ diff -ru ./examples /tmp/extracted
# (empty -- bundle matches source tree)
```

### macOS notarisation specifics

On Darwin, `flapi` is linked with a reserved `__FLAPI/__bundle`
Mach-O section (16 MiB by default, set via `FLAPI_RESERVED_BUNDLE_MIB`
at CMake configure time). `flapi pack` overwrites this section in
place and then re-invokes `codesign --force --sign $CODESIGN_IDENTITY`
(defaulting to `-`, an ad-hoc signature) so that the bundled binary
carries a valid signature. When `CODESIGN_IDENTITY` names a Developer
ID identity, the output is suitable for `notarytool submit`; Apple's
notary service does not accept ad-hoc signatures.

The `--macos-append` flag falls back to the Linux/Windows-style
trailing-bytes layout. Use it only for local debugging -- the
signature is intentionally invalid and the binary will fail
notarisation.

> **Implementation:** `src/macho_bundle.cpp` | **Tests:** `test/cpp/macho_bundle_test.cpp`, `test/integration/test_self_packaging_macos.py`

---

## 4. Environment Variables

| Variable | Description | Used By |
|----------|-------------|---------|
| `FLAPI_CONFIG` | Path to `flapi.yaml` (fallback for `-c`) | `--config` fallback |
| `FLAPI_LOG_LEVEL` | Log verbosity (fallback for `--log-level`); invalid values exit 1 | `--log-level` fallback |
| `FLAPI_CONFIG_SERVICE_TOKEN` | Authentication token for configuration service API | `--config-service-token` fallback |
| `FLAPI_NO_TELEMETRY` | Disable telemetry when set to `1`, `true`, or `yes` | `--no-telemetry` fallback |
| `SOURCE_DATE_EPOCH` | Mtime stamped on every entry by `flapi pack` (reproducible builds) | `flapi pack` |
| `CODESIGN_IDENTITY` | macOS only: identity passed to `codesign --sign` after `flapi pack`. Defaults to `-` (ad-hoc). | `flapi pack` |

**Configuration File Variables:**

@@ -418,7 +541,7 @@ See [Configuration Reference - Environment Variables](./CONFIG_REFERENCE.md#10-e

---

## 4. Usage Examples
## 5. Usage Examples

### Basic Startup

@@ -474,7 +597,7 @@ fi

---

## 5. Signal Handling
## 6. Signal Handling

| Signal | Behavior |
|--------|----------|
@@ -499,7 +622,7 @@ On receiving a shutdown signal, the server:

---

## 6. Exit Codes
## 7. Exit Codes

| Code | Description |
|------|-------------|
47 changes: 47 additions & 0 deletions docs/spec/ARCHITECTURE.md
@@ -139,6 +139,23 @@ Storage and external data access:
| **AuthMiddleware** | `src/auth_middleware.cpp` | JWT/Basic/OIDC authentication |
| **RateLimitMiddleware** | `src/rate_limit_middleware.cpp` | Request rate limiting |

### Self-Packaging (optional)

These components are loaded only when the running binary contains an
appended (or, on macOS, in-section) ZIP bundle. They let the same
artifact serve the API _and_ produce new bundled artifacts via
`flapi pack`.

| Component | File | Purpose |
|-----------|------|---------|
| **archive_io** | `src/archive_io.cpp` | RAII wrapper around libarchive; reads/writes ZIPs in memory, with `bytes_in_last_block=1` and `SOURCE_DATE_EPOCH` mtime stamping for reproducible builds. |
| **selfpath** | `src/selfpath.cpp` | Cross-platform self-binary path (`/proc/self/exe`, `_NSGetExecutablePath`, `GetModuleFileNameW`). |
| **bundle_locator** | `src/bundle_locator.cpp` | Reverse-scans for the ZIP EOCD record (Linux/Windows); on macOS prefers the reserved `__FLAPI/__bundle` Mach-O section. Tolerates trailing zero padding. |
| **macho_bundle** | `src/macho_bundle.cpp` | 64-bit Mach-O header + LC_SEGMENT_64 parser; writes the archive into the reserved section in place and re-invokes `codesign`. |
| **EmbeddedArchiveFileProvider** | `src/embedded_archive_file_provider.cpp` | `IFileProvider` implementation backed by a `std::shared_ptr<const ArchiveEntries>`. Sibling of `LocalFileProvider` / `DuckDBVFSProvider`. |
| **EmbeddedFileSystem** | `src/duckdb_embed_fs.cpp` | `duckdb::FileSystem` for the `embed://` scheme. Lets SQL templates do `read_csv('embed://data/x.csv')`. Same `ArchiveEntries` instance as `EmbeddedArchiveFileProvider`. |
| **pack** | `src/pack.cpp` | `flapi pack` / `info` / `unpack` subcommand logic. Enforces a default secret deny list (`*.env`, `secrets/*`, `*.pem`, `*.key`). |

## Data Flow

### REST Request Flow
@@ -167,6 +184,36 @@ Storage and external data access:

For detailed request flows with sequence diagrams, see [REQUEST_LIFECYCLE.md](./REQUEST_LIFECYCLE.md).

### Self-Packaging Bootstrap

When `flapi` starts, _before_ loading the config:

```
1. main() calls detectAndRegisterEmbeddedBundle()
├─ bundle_locator::LocateBundleInSelf()
│ macOS: probe __FLAPI/__bundle Mach-O section first
│ fallback: reverse-scan EOCD signature from EOF
├─ if bundle found: read slice → archive_io::ReadArchive()
└─ store entries in FileProviderFactory (process-wide shared_ptr)

2. main() proceeds to initializeConfig()
ConfigLoader.loadYamlFile("flapi.yaml")
├─ FileProviderFactory::CreateProvider("flapi.yaml")
│ bundle present + non-remote path → EmbeddedArchiveFileProvider
│ no bundle + non-remote → LocalFileProvider
│ any remote scheme → DuckDBVFSProvider
└─ provider.ReadFile() returns bytes (from bundle or disk)

3. After DatabaseManager is up, main() calls
RegisterEmbeddedFileSystem() which adds the embed:// VFS to
DuckDB so SQL templates can `read_csv('embed://data/foo.csv')`.
```

If no bundle is present (for example, a Linux/Windows binary that was
never run through `pack`, or one whose trailing 1 KiB has been
truncated), all bundle-aware components silently return
`std::nullopt` and the binary serves from the local filesystem,
exactly as it did before this feature existed.
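
The provider selection in step 2 boils down to a three-way decision, sketched here in Python for illustration. The real dispatch lives in the C++ `FileProviderFactory`; `choose_provider` is a hypothetical name, and the set of remote schemes is an assumption, since this document does not list them.

```python
from typing import Dict, Optional
from urllib.parse import urlparse

# Assumed set of remote schemes; the authoritative list is in the C++ code.
REMOTE_SCHEMES = {"s3", "gs", "az", "http", "https"}


def choose_provider(path: str, bundle: Optional[Dict[str, bytes]]) -> str:
    """Name the provider that would serve path, per the rules above."""
    scheme = urlparse(path).scheme.lower()
    if scheme in REMOTE_SCHEMES:
        # Remote paths always go through DuckDB's VFS, bundle or not.
        return "DuckDBVFSProvider"
    if bundle is not None:
        # A bundle is registered: non-remote paths are read from memory.
        return "EmbeddedArchiveFileProvider"
    # No bundle: unchanged filesystem behaviour.
    return "LocalFileProvider"
```

Checking the scheme before the bundle keeps remote configuration working identically in bundled and unbundled binaries.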

## Protocol Support

flAPI supports two protocols from a unified configuration: