Commit fd19470

v0.6.92: enrichment table column type, table run fixes, scheduled jitter, hosted-key queueing
2 parents e532e0a + 92fd17c commit fd19470

69 files changed

Lines changed: 38369 additions & 320 deletions


.claude/commands/add-enrichment.md

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
---
description: Add a code-defined table enrichment (registry entry) backed by a provider cascade, ensuring each provider tool has hosted-key support
argument-hint: <enrichment-name>
---

# Adding a Table Enrichment

Enrichments are code-defined entries in `apps/sim/enrichments/` that run **directly per table row** (no workflow). Each enrichment declares inputs, outputs, and an ordered list of **providers**; the cascade runner tries providers in order and the first non-empty result fills the cell. Each provider calls one existing Sim tool via `executeTool`, which injects the workspace's BYOK key or a **hosted key** and bills usage automatically.

Because enrichments run on Sim's hosted keys by default, **every provider tool you reference must have hosted-key support** — otherwise it can only run when the workspace brings its own key. This command makes that check a required step.

## Overview

| Step | What | Where |
|------|------|-------|
| 1 | Pick the data-source tool(s) for each output | `tools/{service}/` + `tools/registry.ts` |
| 2 | **Verify each tool has `hosting`; if not, run `/add-hosted-key`** | `tools/{service}/{action}.ts` |
| 3 | Write the enrichment definition | `enrichments/{name}/{name}.ts` + `index.ts` |
| 4 | Register it | `enrichments/registry.ts` |
| 5 | Verify | tsc / biome / manual run |

## Architecture (what you're plugging into)

- **`enrichments/types.ts`**: `EnrichmentConfig { id, name, description, icon, inputs, outputs, providers }` and `EnrichmentProvider { id, label, toolId, buildParams, mapOutput }`. Providers are **plain data** (no `@/tools` import) so the catalog stays client-safe.
- **`enrichments/providers.ts`**: `toolProvider(...)` (typed passthrough) plus shared input helpers: `str(v)`, `normalizeDomain(v)`, `firstNonEmpty(arr)`, `splitName(fullName)`.
- **`enrichments/run.ts`**: the server-only cascade runner. Calls `executeTool(provider.toolId, { ...params, _context: { workspaceId } })`, accumulates hosted-key cost, returns the first non-empty mapped result. **You do not edit this**; it works for any registry entry.
- **`enrichments/registry.ts`**: `ENRICHMENT_REGISTRY` / `ALL_ENRICHMENTS` / `getEnrichment`. Register new entries here.

Outputs automatically become table columns; billing, the catalog/sidebar UI, the column meta-header icon, and per-row execution all work with no extra wiring.
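
The cascade behaviour described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the contents of `enrichments/run.ts` (which this diff does not show): `Provider`, `ToolRunner`, `CascadeOutcome`, and `runCascade` are hypothetical names, the tool call is modeled as a synchronous function for brevity, tool errors are not modeled, and cost is summed whenever `output.cost.total` is present rather than only when a hosted key was injected.

```typescript
// Hypothetical, trimmed-down shapes; the real ones live in enrichments/types.ts.
type Row = Record<string, unknown>
type Result = Record<string, unknown>

interface Provider {
  id: string
  toolId: string
  buildParams: (inputs: Row) => Record<string, unknown> | null
  mapOutput: (output: Record<string, unknown>) => Result | null
}

// Stand-in for executeTool, modeled synchronously to keep the sketch short.
type ToolRunner = (toolId: string, params: Record<string, unknown>) => {
  output: Record<string, unknown>
}

interface CascadeOutcome {
  result: Result | null
  providerId: string | null
  cost: number
}

function runCascade(providers: Provider[], inputs: Row, runTool: ToolRunner): CascadeOutcome {
  let cost = 0
  for (const provider of providers) {
    const params = provider.buildParams(inputs)
    if (params === null) continue // not enough inputs: skip this provider

    const { output } = runTool(provider.toolId, params)
    const spent = (output.cost as { total?: number } | undefined)?.total
    if (typeof spent === 'number') cost += spent

    const mapped = provider.mapOutput(output)
    if (mapped !== null && Object.keys(mapped).length > 0) {
      return { result: mapped, providerId: provider.id, cost } // first non-empty result wins
    }
  }
  return { result: null, providerId: null, cost } // every provider was skipped or missed
}

// Usage: the first provider is skipped (no domain), the second misses, the third hits.
const providers: Provider[] = [
  {
    id: 'needs-domain',
    toolId: 'tool_a',
    buildParams: (inputs) => (inputs.domain ? { domain: inputs.domain } : null),
    mapOutput: (output) => ({ email: output.email }),
  },
  {
    id: 'always-misses',
    toolId: 'tool_b',
    buildParams: (inputs) => ({ name: inputs.fullName }),
    mapOutput: () => null,
  },
  {
    id: 'hits',
    toolId: 'tool_c',
    buildParams: (inputs) => ({ name: inputs.fullName }),
    mapOutput: (output) => ({ email: output.email }),
  },
]

const fakeTool: ToolRunner = (toolId) => ({
  output: { email: `${toolId}@example.com`, cost: { total: 1 } },
})

const outcome = runCascade(providers, { fullName: 'Ada Lovelace' }, fakeTool)
```

In this sketch a skipped provider costs nothing, while a provider that ran and missed still contributes its cost, which is one reason to order providers cheapest first.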

## Step 1: Pick the data-source tool(s)

For each output the enrichment produces, decide which existing tool provides it. Look up the service's API and the tool in `apps/sim/tools/{service}/` (e.g. `hunter_email_finder`, `pdl_person_enrich`, `pdl_company_enrich`). Confirm:

- The tool id is registered in `apps/sim/tools/registry.ts`.
- Its `params` accept what you can derive from table columns (read the tool's `params`).
- Its `outputs` / `transformResponse` actually expose the field you need (read the real output shape — don't assume).

Order providers **cheapest / most-likely-to-hit first**; the cascade stops at the first non-empty result. Apollo / LinkedIn are not hosted-safe (ToS) — don't use them.

## Step 2: Verify hosted-key support — chain to `/add-hosted-key` if missing

**This is the required gate.** For every tool a provider calls, open `apps/sim/tools/{service}/{action}.ts` and check for a `hosting` block:

```typescript
hosting: {
  envKeyPrefix: 'SERVICE_API_KEY',
  apiKeyParam: 'apiKey',
  byokProviderId: 'service',
  pricing: { /* ... */ },
  rateLimit: { /* ... */ },
}
```

- **If `hosting` is present** — good. Note the `envKeyPrefix`; the deployment needs `{PREFIX}_COUNT` + `{PREFIX}_1..N` env vars set for the hosted key to actually resolve at runtime (ops concern, not code). If those env vars aren't set in the target environment, the provider will only run with a workspace BYOK key.
- **If `hosting` is absent** — the tool can't use a Sim-provided key, so the enrichment would silently produce blank cells on hosted Sim. **Stop and run `/add-hosted-key <service>`** to add hosted-key support to that tool first, then come back. Do this for every provider tool that lacks it.

Why it matters: the cascade runner only bills (and only reads `output.cost.total`) when `executeTool` injected a hosted key, which requires the tool's `hosting` config. No `hosting` → no hosted key → the enrichment depends entirely on per-workspace BYOK.
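
To make the `{PREFIX}_COUNT` + `{PREFIX}_1..N` convention concrete, here is a small sketch of checking whether any hosted key is configured for a given `envKeyPrefix`. It is an assumption-laden illustration: `listHostedKeys` and `Env` are hypothetical names, and how Sim actually selects or rotates among the numbered keys is not shown in this diff.

```typescript
type Env = Record<string, string | undefined>

// Hypothetical helper: collect the numbered keys declared by `${prefix}_COUNT`.
function listHostedKeys(prefix: string, env: Env): string[] {
  const count = Number.parseInt(env[`${prefix}_COUNT`] ?? '', 10)
  if (!Number.isInteger(count) || count <= 0) return [] // missing or malformed count

  const keys: string[] = []
  for (let i = 1; i <= count; i++) {
    const key = env[`${prefix}_${i}`]
    if (key) keys.push(key) // ignore unset or empty slots
  }
  return keys
}

// Usage with an in-memory env; a deployment would pass process.env instead.
const env: Env = {
  SERVICE_API_KEY_COUNT: '2',
  SERVICE_API_KEY_1: 'key-one',
  SERVICE_API_KEY_2: 'key-two',
}

const configured = listHostedKeys('SERVICE_API_KEY', env)
const unconfigured = listHostedKeys('OTHER_API_KEY', env)
```

An empty list corresponds to the case described above where the provider can only run with a workspace BYOK key.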

## Step 3: Write the enrichment definition

Create `apps/sim/enrichments/{name}/{name}.ts` and a barrel `index.ts`. Mirror the existing entries (`work-email`, `phone-number`, `company-domain`, `company-info`).

```typescript
import { SomeIcon } from 'lucide-react'
import { filterUndefined } from '@sim/utils/object'
import { normalizeDomain, splitName, str, toolProvider } from '@/enrichments/providers'
import type { EnrichmentConfig } from '@/enrichments/types'

export const myEnrichment: EnrichmentConfig = {
  id: 'my-enrichment',
  name: 'My Enrichment',
  description: 'One concise sentence describing what it finds.',
  icon: SomeIcon,
  inputs: [
    // Person enrichments take a single canonical `fullName` (Clay-style);
    // split it with splitName() for tools that need first/last.
    { id: 'fullName', name: 'Full name', type: 'string', required: true },
    { id: 'companyDomain', name: 'Company domain', type: 'string' },
  ],
  outputs: [{ id: 'value', name: 'value', type: 'string' }],
  providers: [
    toolProvider({
      id: 'provider-a',
      label: 'Provider A',
      toolId: 'service_action', // must have `hosting` (Step 2)
      buildParams: (inputs) => {
        // Return null when there aren't enough inputs → cascade skips this provider.
        const name = splitName(inputs.fullName)
        const domain = normalizeDomain(inputs.companyDomain)
        if (!name || !domain) return null
        return { domain, first_name: name.firstName, last_name: name.lastName }
      },
      mapOutput: (output) => {
        // Return { [outputId]: value } on a hit, or null to fall through.
        const value = str(output.value)
        return value ? { value } : null
      },
    }),
    // ...additional fallback providers, in priority order.
  ],
}
```

```typescript
// apps/sim/enrichments/{name}/index.ts
export { myEnrichment } from './my-enrichment'
```

Rules:
- Keep the file **client-safe**: import only `lucide-react`, `@sim/utils/*`, `@/enrichments/providers`, and the types. **Never import `@/tools`** here — the runner does the tool call.
- `buildParams` returns `null` when inputs are insufficient (provider skipped). `mapOutput` returns `null`/empty for a miss (falls through). Use `filterUndefined` when assembling optional tool params; coerce numbers explicitly (don't pass `''` to number outputs).
- Output `id`s are the keys `mapOutput` returns; output `name`s are the default column names (the user can rename them in the config).
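
The second rule (optional params and explicit number coercion) can be illustrated as follows. This is a sketch under assumptions: the local `filterUndefined` is a stand-in whose behaviour is assumed to match the helper in `@sim/utils/object`, and `toNumber`, `jobTitle`, `employee_count`, and `employeeCount` are hypothetical names chosen for the example.

```typescript
// Local stand-in for `filterUndefined` from '@sim/utils/object' (behaviour assumed).
function filterUndefined(obj: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const key of Object.keys(obj)) {
    if (obj[key] !== undefined) out[key] = obj[key]
  }
  return out
}

// Hypothetical helper: coerce a raw tool value to a finite number, or undefined.
function toNumber(value: unknown): number | undefined {
  if (typeof value === 'number') return Number.isFinite(value) ? value : undefined
  if (typeof value !== 'string' || value.trim() === '') return undefined
  const parsed = Number(value)
  return Number.isFinite(parsed) ? parsed : undefined
}

function buildParams(inputs: Record<string, unknown>): Record<string, unknown> | null {
  const domain = typeof inputs.companyDomain === 'string' ? inputs.companyDomain.trim() : ''
  if (!domain) return null // insufficient inputs: the cascade skips this provider

  const title =
    typeof inputs.jobTitle === 'string' && inputs.jobTitle.trim() !== ''
      ? inputs.jobTitle.trim()
      : undefined
  return filterUndefined({ domain, title }) // optional `title` is dropped when unset
}

function mapOutput(output: Record<string, unknown>): { employeeCount: number } | null {
  const employeeCount = toNumber(output.employee_count)
  return employeeCount === undefined ? null : { employeeCount } // '' is a miss, not 0
}

const skipped = buildParams({})
const minimal = buildParams({ companyDomain: ' example.com ' })
const miss = mapOutput({ employee_count: '' })
const hit = mapOutput({ employee_count: '42' })
```

Treating an empty string as a miss lets the cascade fall through to the next provider instead of writing a blank or zero into a number column.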

## Step 4: Register it

In `apps/sim/enrichments/registry.ts`, import and add the entry (catalog order is registration order):

```typescript
import { myEnrichment } from '@/enrichments/my-enrichment'

export const ENRICHMENT_REGISTRY: EnrichmentRegistry = {
  // ...existing
  [myEnrichment.id]: myEnrichment,
}
```
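
For orientation, a registry of this shape might relate `ENRICHMENT_REGISTRY`, `ALL_ENRICHMENTS`, and `getEnrichment` roughly as sketched below. The real `registry.ts` is not shown in this diff, so the trimmed `EnrichmentEntry` type, the entry names, and the function bodies are all assumptions.

```typescript
// Hypothetical, trimmed entry shape; the real EnrichmentConfig has more fields.
interface EnrichmentEntry {
  id: string
  name: string
}
type EnrichmentRegistry = Record<string, EnrichmentEntry>

const workEmail: EnrichmentEntry = { id: 'work-email', name: 'Work Email' }
const myEnrichment: EnrichmentEntry = { id: 'my-enrichment', name: 'My Enrichment' }

const ENRICHMENT_REGISTRY: EnrichmentRegistry = {
  [workEmail.id]: workEmail,
  [myEnrichment.id]: myEnrichment,
}

// Object.keys returns non-numeric string keys in insertion order, so the
// catalog order follows the order in which entries were registered.
const ALL_ENRICHMENTS: EnrichmentEntry[] = Object.keys(ENRICHMENT_REGISTRY).map(
  (id) => ENRICHMENT_REGISTRY[id]
)

function getEnrichment(id: string): EnrichmentEntry | undefined {
  // Own-property check so inherited names such as 'toString' are not matched.
  return Object.prototype.hasOwnProperty.call(ENRICHMENT_REGISTRY, id)
    ? ENRICHMENT_REGISTRY[id]
    : undefined
}
```

Keying by `[myEnrichment.id]` keeps the registry key and the entry's own `id` from drifting apart.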

## Step 5: Verify

1. `bunx tsc --noEmit` (from `apps/sim`, `NODE_OPTIONS=--max-old-space-size=8192`) and `bunx biome check` on the changed files.
2. In a table → **+ New column → Enrichments** → pick the new enrichment, map its inputs to columns, name the output column(s), Save. Confirm it appears in the catalog with its icon/description.
3. With hosted keys (or a workspace BYOK key) configured for each provider's service, run a row and confirm the cell fills; the dev-server log shows `Enrichment hit { provider }`. A row whose providers all miss completes blank; a row where every provider errored shows an error cell.

## Checklist

- [ ] Each output mapped to a real tool field (verified against the tool's `params`/`outputs`)
- [ ] **Every provider tool has a `hosting` block — ran `/add-hosted-key` for any that didn't**
- [ ] Providers ordered cheapest / most-likely-first; Apollo/LinkedIn not used
- [ ] Enrichment file is client-safe (no `@/tools` import); uses `toolProvider` + shared helpers
- [ ] `buildParams` returns `null` on insufficient inputs; `mapOutput` returns `null` on a miss
- [ ] Registered in `enrichments/registry.ts`
- [ ] tsc + biome clean; created and ran the column end-to-end

apps/sim/app/api/schedules/execute/route.ts

Lines changed: 9 additions & 0 deletions
@@ -26,6 +26,14 @@ const JOB_CHUNK_SIZE = 100
 const MAX_TICK_DURATION_MS = 3 * 60 * 1000
 const STALE_SCHEDULE_CLAIM_MS = getMaxExecutionTimeout()

+/**
+ * Upper bound (ms) for the random start delay applied to each scheduled
+ * execution. Cron schedules all fire on the same boundary (e.g. every `:00`),
+ * which stampedes the database connection pool at the top of each minute/hour.
+ * Spreading starts across a [0, 30s) window smooths that burst.
+ */
+const SCHEDULE_JITTER_MAX_MS = 30_000
+
 const dueFilter = (queuedAt: Date) =>
   and(
     isNull(workflowSchedule.archivedAt),
@@ -217,6 +225,7 @@ async function processScheduleItem(
     const jobId = await jobQueue.enqueue('schedule-execution', payload, {
       jobId: scheduleJobId,
       concurrencyKey: scheduleJobId,
+      delayMs: Math.floor(Math.random() * SCHEDULE_JITTER_MAX_MS),
       metadata: {
         workflowId: schedule.workflowId ?? undefined,
         workspaceId: resolvedWorkspaceId ?? undefined,
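
The jitter expression in this hunk can be examined in isolation. The sketch below wraps it in a hypothetical `jitterDelayMs` helper (the commit inlines the expression) with an injectable random source, an assumption made only so that the bounds of the `[0, 30s)` window can be checked deterministically.

```typescript
const SCHEDULE_JITTER_MAX_MS = 30_000

// `random` must return a value in [0, 1), like Math.random.
function jitterDelayMs(maxMs: number, random: () => number = Math.random): number {
  // Flooring a value in [0, maxMs) yields an integer in 0..maxMs - 1.
  return Math.floor(random() * maxMs)
}

const lowest = jitterDelayMs(SCHEDULE_JITTER_MAX_MS, () => 0)
const highest = jitterDelayMs(SCHEDULE_JITTER_MAX_MS, () => 0.999999)

// Sample the default source to confirm every delay stays inside the window.
const samples = Array.from({ length: 1000 }, () => jitterDelayMs(SCHEDULE_JITTER_MAX_MS))
const allInWindow = samples.every(
  (ms) => Number.isInteger(ms) && ms >= 0 && ms < SCHEDULE_JITTER_MAX_MS
)
```

Because the delay is uniform over the window, jobs that all become due on the same cron boundary start at roughly evenly spread offsets instead of at once.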

apps/sim/app/api/table/[tableId]/columns/run/route.ts

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,7 @@ export const POST = withRouteHandler(async (request: NextRequest, { params }: Ro
   const parsed = await parseRequest(runColumnContract, request, { params })
   if (!parsed.success) return parsed.response
   const { tableId } = parsed.data.params
-  const { workspaceId, groupIds, runMode, rowIds } = parsed.data.body
+  const { workspaceId, groupIds, runMode, rowIds, limit } = parsed.data.body
   const access = await checkAccess(tableId, auth.userId, 'write')
   if (!access.ok) return accessError(access, requestId, tableId)

@@ -35,6 +35,7 @@ export const POST = withRouteHandler(async (request: NextRequest, { params }: Ro
     groupIds,
     mode: runMode,
     rowIds,
+    limit,
     requestId,
   })

apps/sim/app/api/table/[tableId]/dispatches/route.ts

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ export const GET = withRouteHandler(async (request: NextRequest, { params }: Rou
     isManualRun: r.isManualRun,
     cursor: r.cursor,
     scope: r.scope,
+    ...(r.limit ? { limit: r.limit } : {}),
   }))

   return NextResponse.json({

apps/sim/app/api/table/[tableId]/groups/route.ts

Lines changed: 4 additions & 0 deletions
@@ -113,6 +113,10 @@ export const PATCH = withRouteHandler(async (request: NextRequest, { params }: R
         ...(validated.mappingUpdates !== undefined
           ? { mappingUpdates: validated.mappingUpdates }
           : {}),
+        ...(validated.inputMappings !== undefined
+          ? { inputMappings: validated.inputMappings }
+          : {}),
+        ...(validated.type !== undefined ? { type: validated.type } : {}),
         ...(validated.autoRun !== undefined ? { autoRun: validated.autoRun } : {}),
       },
       requestId
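
The `dispatches` and `groups` hunks above use two variants of the conditional-spread idiom: a truthiness test (`r.limit ? ... : {}`) and an explicit `!== undefined` test. The sketch below shows how they differ for falsy values. The `Summary` type and both function names are hypothetical, and whether `0` or `null` are possible values for `limit` is not stated in this diff.

```typescript
// Hypothetical record shape for illustration only.
interface Summary {
  scope: string
  limit?: number
}

// Truthiness test: omits the key for undefined, null, and 0.
function spreadIfTruthy(limit: number | null | undefined): Summary {
  return { scope: 'all', ...(limit ? { limit } : {}) }
}

// Explicit test: omits the key only when the value is undefined.
function spreadIfDefined(limit: number | undefined): Summary {
  return { scope: 'all', ...(limit !== undefined ? { limit } : {}) }
}

const truthyZero = spreadIfTruthy(0)
const truthyNull = spreadIfTruthy(null)
const definedZero = spreadIfDefined(0)
const definedFive = spreadIfDefined(5)
```

The truthiness form suits a nullable column where "no limit" should simply be absent from the response; the explicit form suits a PATCH payload where a provided falsy value (such as `autoRun: false`) must still be forwarded.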

apps/sim/app/workspace/[workspaceId]/tables/[tableId]/components/context-menu/context-menu.tsx

Lines changed: 17 additions & 4 deletions
@@ -42,6 +42,10 @@ interface ContextMenuProps {
   runningInSelectionCount?: number
   /** Whether the table has any workflow columns; gates the run-workflows item. */
   hasWorkflowColumns?: boolean
+  /** True when the menu was opened on a workflow-output cell, so Run / Re-run
+   * act on that cell's group only (the cascade handles dependents). Switches
+   * the labels from row-wide ("all cells") to cell-scoped ("cell"). */
+  workflowCellScoped?: boolean
   disableEdit?: boolean
   disableInsert?: boolean
   disableDelete?: boolean
@@ -64,17 +68,26 @@ export function ContextMenu({
   onStopWorkflows,
   runningInSelectionCount = 0,
   hasWorkflowColumns = false,
+  workflowCellScoped = false,
   disableEdit = false,
   disableInsert = false,
   disableDelete = false,
 }: ContextMenuProps) {
   const deleteLabel = selectedRowCount > 1 ? `Delete ${selectedRowCount} rows` : 'Delete row'
-  const runLabel =
-    selectedRowCount > 1
+  const runLabel = workflowCellScoped
+    ? selectedRowCount > 1
+      ? `Run cell on ${selectedRowCount} rows`
+      : 'Run cell'
+    : selectedRowCount > 1
       ? `Run empty or failed cells on ${selectedRowCount} rows`
       : 'Run empty or failed cells'
-  const refreshLabel =
-    selectedRowCount > 1 ? `Re-run all cells on ${selectedRowCount} rows` : 'Re-run all cells'
+  const refreshLabel = workflowCellScoped
+    ? selectedRowCount > 1
+      ? `Re-run cell on ${selectedRowCount} rows`
+      : 'Re-run cell'
+    : selectedRowCount > 1
+      ? `Re-run all cells on ${selectedRowCount} rows`
+      : 'Re-run all cells'
   const stopLabel =
     runningInSelectionCount === 1
       ? 'Stop running workflow'
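
The two nested ternaries in this hunk follow the same pattern: a verb, a target that depends on `workflowCellScoped`, and an optional row-count suffix. As a reading aid, the sketch below expresses that pattern as one function. `workflowLabel` and `Verb` are hypothetical names, and this refactor is an illustration rather than something the commit contains.

```typescript
type Verb = 'Run' | 'Re-run'

function workflowLabel(verb: Verb, selectedRowCount: number, workflowCellScoped: boolean): string {
  // Cell-scoped menus act on one cell; row-wide menus name the affected cells.
  const target = workflowCellScoped
    ? 'cell'
    : verb === 'Run'
      ? 'empty or failed cells'
      : 'all cells'
  const suffix = selectedRowCount > 1 ? ` on ${selectedRowCount} rows` : ''
  return `${verb} ${target}${suffix}`
}

const scopedRun = workflowLabel('Run', 1, true)
const scopedRerunMany = workflowLabel('Re-run', 2, true)
const rowWideRunMany = workflowLabel('Run', 3, false)
const rowWideRerun = workflowLabel('Re-run', 1, false)
```

Each call produces the same string as the corresponding branch of `runLabel` or `refreshLabel` in the hunk.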
