From 568dee7d9cf69a55d2d27f8ca442a660f2c751f2 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Wed, 26 Nov 2025 12:39:32 +0200 Subject: [PATCH 1/7] Add SQLite JSON querying guide --- docs.json | 1 + .../query-json-in-sqlite.mdx | 466 ++++++++++++++++++ 2 files changed, 467 insertions(+) create mode 100644 usage/use-case-examples/query-json-in-sqlite.mdx diff --git a/docs.json b/docs.json index c2b9238d..ac4f84c3 100644 --- a/docs.json +++ b/docs.json @@ -192,6 +192,7 @@ "usage/use-case-examples/offline-only-usage", "usage/use-case-examples/postgis", "usage/use-case-examples/prioritized-sync", + "usage/use-case-examples/query-json-in-sqlite", "usage/use-case-examples/raw-tables", "usage/use-case-examples/custom-write-checkpoints" ] diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx new file mode 100644 index 00000000..9a37a620 --- /dev/null +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -0,0 +1,466 @@ +--- +title: "Querying JSON Data in SQLite" +description: "How to query JSON data synced from your backend and stored as strings in SQLite" +--- + +# Overview + +When syncing data from your backend to PowerSync, JSON columns (whether from MongoDB embedded documents, PostgreSQL JSONB fields, or other JSON data types) are stored as JSON strings in SQLite. This guide shows you how to effectively query and filter this nested data using SQLite's powerful JSON functions. + +## Understanding JSON Storage in PowerSync + +Your backend database might store structured data in various ways: +- **MongoDB**: Embedded documents and nested arrays +- **PostgreSQL**: JSONB or JSON column types +- **MySQL**: JSON column type + +Regardless of the source, PowerSync syncs these JSON structures to SQLite as string columns. On the client side, you use SQLite's built-in JSON functions to query this data without parsing it yourself. + +## Example Data Structure + +Let's use a task management system where tasks have nested metadata: +```json +{ + "id": "task_123", + "title": "Redesign homepage", + "assignees": [ + { + "user_id": "user_001", + "role": "designer", + "hours_allocated": 20 + }, + { + "user_id": "user_002", + "role": "developer", + "hours_allocated": 40 + } + ], + "tags": ["urgent", "frontend", "design"], + "metadata": { + "priority": 1, + "sprint": "2024-Q1", + "dependencies": ["task_100", "task_101"] + } +} +``` + +In SQLite, the `assignees`, `tags`, and `metadata` columns are stored as JSON strings. + +## JSON Extraction Basics + +### Standard `json_extract()` Function + +Extract values from JSON using path expressions: +```sql +SELECT + id, + title, + json_extract(metadata, '$.priority') AS priority, + json_extract(metadata, '$.sprint') AS sprint +FROM tasks; +``` + +**Path syntax:** +- `$` - root element +- `.` - object member access +- `[index]` - array element access + +### Shorthand: The -> and ->> Operators + +SQLite provides convenient shorthand operators for JSON extraction: +```sql +SELECT + id, + title, + metadata -> '$.priority' AS priority, + metadata ->> '$.sprint' AS sprint +FROM tasks; +``` + +**The difference between -> and ->>:** +- `->` returns JSON (preserves type information, quotes strings) +- `->>` returns TEXT (unquoted, for final values) +```sql +-- Using -> (returns JSON) +SELECT metadata -> '$.priority' FROM test; +-- Result: '1' (as JSON string with quotes) + +-- Using ->> (returns actual value) +SELECT metadata ->> '$.priority' FROM test; +-- Result: 1 (as actual number without quotes) + +-- For strings, the difference is clearer: +SELECT metadata -> '$.sprint' FROM test; +-- Result: "2024-Q1" (with quotes, as JSON) + +SELECT metadata ->> '$.sprint' FROM test; +-- Result: 2024-Q1 (without quotes, as text) +``` + + +**When to use which:** +- Use `->>` when extracting **final values** for display or comparison +- Use `->` when extracting **intermediate JSON** for further processing +- `->>` preserves data types (numbers stay numbers, not strings) + + +### Nested Path Access + +Access deeply nested values: +```sql +-- All three are equivalent: +json_extract(metadata, '$.dependencies[0]') +metadata -> '$.dependencies[0]' +metadata -> '$.dependencies' -> '$[0]' +``` + +## Querying Arrays with `json_each()` + +### Flattening Simple Arrays + +For the `tags` array, use `json_each()` to create one row per element: +```sql +SELECT + t.id, + t.title, + tag.value AS tag +FROM tasks t, + json_each(t.tags) AS tag +WHERE tag.value = 'urgent'; +``` + +**What's happening:** +- `json_each(t.tags)` creates a virtual table with one row per tag +- `tag.value` contains each individual tag string +- You can filter, join, or aggregate these expanded rows + +### Querying Nested Objects in Arrays + +For complex objects like `assignees`: +```sql +SELECT + t.id, + t.title, + assignee.value ->> '$.user_id' AS user_id, + assignee.value ->> '$.role' AS role, + assignee.value -> '$.hours_allocated' AS hours +FROM tasks t, + json_each(t.assignees) AS assignee +WHERE assignee.value ->> '$.role' = 'developer'; +``` + +**Key points:** +- Each `assignee.value` is a JSON object representing one assignee +- Use `->>` to extract text values for comparison +- Use `->` when you need numeric values for calculations + +## Real-World Query Examples + +### Example 1: Finding Tasks by Assignee + +**Use case:** Show all tasks assigned to a specific user. +```sql +SELECT DISTINCT + t.id, + t.title, + t.metadata ->> '$.priority' AS priority +FROM tasks t, + json_each(t.assignees) AS assignee +WHERE assignee.value ->> '$.user_id' = 'user_001' +ORDER BY CAST(t.metadata -> '$.priority' AS INTEGER); +``` + + +**Note the CAST**: JSON numbers are extracted as text. When doing numeric comparisons or sorting, cast to the appropriate type. + + +### Example 2: Calculating Total Hours by Role + +**Use case:** Aggregate hours across all tasks grouped by role. +```sql +SELECT + assignee.value ->> '$.role' AS role, + SUM(CAST(assignee.value -> '$.hours_allocated' AS INTEGER)) AS total_hours, + COUNT(DISTINCT t.id) AS task_count +FROM tasks t, + json_each(t.assignees) AS assignee +GROUP BY role +ORDER BY total_hours DESC; +``` + +### Example 3: Tasks with Specific Tags + +**Use case:** Find tasks tagged with multiple specific tags. +```sql +-- Tasks with BOTH 'urgent' AND 'frontend' tags +SELECT DISTINCT t.* +FROM tasks t +WHERE EXISTS ( + SELECT 1 FROM json_each(t.tags) + WHERE value = 'urgent' +) +AND EXISTS ( + SELECT 1 FROM json_each(t.tags) + WHERE value = 'frontend' +); +``` + +Or using a simpler approach for single tags: +```sql +-- Tasks with 'urgent' tag +SELECT * +FROM tasks t, + json_each(t.tags) AS tag +WHERE tag.value = 'urgent'; +``` + +### Example 4: Filtering by Array Contents + +**Use case:** Find tasks that depend on a specific task ID. +```sql +SELECT * +FROM tasks t, + json_each(t.metadata -> '$.dependencies') AS dep +WHERE dep.value = 'task_100'; +``` + +### Example 5: Checking for Array Membership + +**Use case:** Check if a task has any dependencies. +```sql +SELECT + id, + title, + json_array_length(metadata -> '$.dependencies') AS dep_count +FROM tasks +WHERE json_array_length(metadata -> '$.dependencies') > 0; +``` + +## Working with Comma or Delimiter-Separated Values + +Sometimes JSON strings contain delimiter-separated values (like `"NYC;LAX;MIA"`). Here's how to query them efficiently: +```sql +-- Assume tasks have a field: "approved_by": "user_001;user_002;user_003" + +-- Find tasks approved by a specific user +SELECT * +FROM tasks +WHERE instr( + ';' || (metadata ->> '$.approved_by') || ';', + ';user_001;' +) > 0; +``` + +**Why this pattern works:** +- Wraps the value: `";user_001;user_002;user_003;"` +- Searches for `;user_001;` ensuring exact delimiter-bounded match +- Prevents false matches (won't match "user_0011" when searching for "user_001") + +**Avoid `LIKE` for delimited strings:** +```sql +-- ❌ WRONG - can match partial values +WHERE metadata ->> '$.approved_by' LIKE '%user_001%' +-- This would incorrectly match "user_0011" or "user_001_archive" + +-- ✅ CORRECT - exact delimiter match +WHERE instr(';' || (metadata ->> '$.approved_by') || ';', ';user_001;') > 0 +``` + +## Advanced Techniques + +### Using CTEs for Cleaner Queries + +Common Table Expressions make complex JSON queries more readable: +```sql +WITH task_assignees AS ( + SELECT + t.id, + t.title, + assignee.value ->> '$.user_id' AS user_id, + assignee.value ->> '$.role' AS role, + CAST(assignee.value -> '$.hours_allocated' AS INTEGER) AS hours + FROM tasks t, + json_each(t.assignees) AS assignee +) +SELECT + user_id, + role, + SUM(hours) AS total_hours, + COUNT(*) AS assignment_count +FROM task_assignees +WHERE hours > 10 +GROUP BY user_id, role; +``` + +### Combining Multiple JSON Arrays + +Query across multiple nested arrays: +```sql +SELECT DISTINCT + t.id, + t.title, + assignee.value ->> '$.user_id' AS assigned_to, + tag.value AS tag +FROM tasks t, + json_each(t.assignees) AS assignee, + json_each(t.tags) AS tag +WHERE tag.value IN ('urgent', 'high-priority') + AND assignee.value ->> '$.role' = 'developer'; +``` + + +**Cartesian product warning:** When using multiple `json_each()` calls, you create a Cartesian product. A task with 3 assignees and 4 tags creates 12 rows. Use `DISTINCT` when needed and filter early to minimize row expansion. + + +### Checking for Key Existence + +Verify if a JSON key exists: +```sql +-- Check if 'sprint' key exists +SELECT * +FROM tasks +WHERE json_extract(metadata, '$.sprint') IS NOT NULL; + +-- Or using shorthand +SELECT * +FROM tasks +WHERE metadata -> '$.sprint' IS NOT NULL; +``` + +## Performance Optimization + +**Important Performance Considerations** + +1. **Index JSON columns for better performance**: If you frequently query JSON fields, add indexes to the JSON string columns in your `AppSchema`: +```typescript +const tasks = new Table( + { + id: column.text, + title: column.text, + metadata: column.text, + tags: column.text, + }, + { + indexes: { + tagsIndex: ['tags'] + } + } +); +``` + +2. **Minimize `json_each()` usage**: Each `json_each()` call expands rows. For a table with 10,000 tasks averaging 5 assignees each, you're processing 50,000 rows. + +3. **Filter before expanding**: Apply non-JSON filters first: +```sql +-- ✅ GOOD - Filter by title first, then expand JSON +SELECT t.*, assignee.value +FROM tasks t, + json_each(t.assignees) AS assignee +WHERE t.title LIKE '%homepage%' + AND assignee.value ->> '$.role' = 'developer'; + +-- ❌ LESS EFFICIENT - Expands all rows first +SELECT t.*, assignee.value +FROM tasks t, + json_each(t.assignees) AS assignee +WHERE assignee.value ->> '$.role' = 'developer' + AND t.title LIKE '%homepage%'; +``` + +4. **Use EXISTS for membership checks**: More efficient than joining: +```sql +-- ✅ BETTER for large datasets +SELECT * FROM tasks t +WHERE EXISTS ( + SELECT 1 FROM json_each(t.tags) WHERE value = 'urgent' +); + +-- vs joining which creates all row combinations +``` + +5. **Cache extracted values in CTEs**: Extract once, use multiple times: +```sql +WITH enriched_tasks AS ( + SELECT + *, + metadata ->> '$.sprint' AS sprint, + CAST(metadata -> '$.priority' AS INTEGER) AS priority_num + FROM tasks +) +SELECT * FROM enriched_tasks +WHERE sprint = '2024-Q1' AND priority_num < 3; +``` + +## Useful JSON Functions + +Beyond extraction, SQLite offers many JSON utilities: +```sql +-- Get array length +SELECT json_array_length(tags) FROM tasks; + +-- Check JSON validity +SELECT json_valid(metadata) FROM tasks; + +-- Get all object keys +SELECT json_each.key, json_each.value +FROM tasks, + json_each(tasks.metadata) +WHERE id = 'task_123'; + +-- Get JSON type of a value +SELECT json_type(metadata -> '$.priority') FROM tasks; +-- Returns: 'integer', 'text', 'array', 'object', 'null', etc. + +-- Aggregate JSON arrays +SELECT json_group_array(tag.value) +FROM tasks t, + json_each(t.tags) AS tag +WHERE t.id = 'task_123'; +``` + +## Common Gotchas + +**Watch out for these common issues:** + +1. **NULL vs missing keys**: `json_extract()` returns `NULL` for non-existent paths. Always check for NULL: +```sql +WHERE COALESCE(metadata ->> '$.priority', '999') = '1' +``` + +2. **Type mismatches**: JSON numbers are returned as text with `->`. Always cast for numeric operations: +```sql +-- ❌ String comparison (wrong!) +WHERE metadata -> '$.priority' > 5 + +-- ✅ Numeric comparison (correct!) +WHERE CAST(metadata -> '$.priority' AS INTEGER) > 5 +``` + +3. **Array index bounds**: Out-of-bounds array access returns NULL, not an error: +```sql +SELECT metadata -> '$.dependencies[99]' -- Returns NULL if not enough elements +``` + +4. **Quotes in JSON strings**: Use `->>` to get unquoted text, not `->`: +```sql +-- ❌ Returns: "2024-Q1" (with quotes) +WHERE metadata -> '$.sprint' = '2024-Q1' + +-- ✅ Returns: 2024-Q1 (without quotes) +WHERE metadata ->> '$.sprint' = '2024-Q1' +``` + +5. **Performance on large arrays**: `json_each()` on arrays with thousands of elements can be slow. Consider data restructuring for such cases. + +## Summary + +Querying JSON data in SQLite effectively requires: +- Understanding that JSON is stored as strings but queryable with built-in functions +- Using `json_extract()` or the shorthand `->` and `->>` operators +- Leveraging `json_each()` to flatten arrays for filtering and aggregation +- Being mindful of type conversions and NULL handling +- Optimizing queries by filtering early and considering denormalization for critical paths + +With these techniques, you can query complex nested data structures synced from your backend while maintaining good performance on mobile and edge devices. + +For complete SQLite JSON function reference, see the [SQLite JSON documentation](https://www.sqlite.org/json1.html). \ No newline at end of file From 257d6068bd94f4fa03489d6c426c3a5ca0cc3f65 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Wed, 26 Nov 2025 16:36:30 +0200 Subject: [PATCH 2/7] Refine SQLite JSON querying documentation for clarity and detail. Update text column reference and add links for further information on JSON handling and database type mapping. --- usage/use-case-examples/query-json-in-sqlite.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx index 9a37a620..9c0368f2 100644 --- a/usage/use-case-examples/query-json-in-sqlite.mdx +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -14,7 +14,7 @@ Your backend database might store structured data in various ways: - **PostgreSQL**: JSONB or JSON column types - **MySQL**: JSON column type -Regardless of the source, PowerSync syncs these JSON structures to SQLite as string columns. On the client side, you use SQLite's built-in JSON functions to query this data without parsing it yourself. +Regardless of the source, PowerSync syncs these JSON structures to SQLite as text columns. On the client side, you can query this data using SQLite's built-in JSON functions without needing to parse it yourself. Learn more about [how PowerSync handles JSON, arrays, and custom types](/usage/use-case-examples/custom-types-arrays-and-json#javascript). ## Example Data Structure @@ -44,7 +44,7 @@ Let's use a task management system where tasks have nested metadata: } ``` -In SQLite, the `assignees`, `tags`, and `metadata` columns are stored as JSON strings. +In SQLite, the `assignees`, `tags`, and `metadata` columns are stored as JSON strings. For details on how different backend types map to SQLite, see [database types and mapping](/usage/sync-rules/types). ## JSON Extraction Basics From d2447cdcb859b5fc6252eb14f8b05d33c72b14c2 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Tue, 2 Dec 2025 10:03:25 +0200 Subject: [PATCH 3/7] Clarified data storage descriptions, updating terminology, and adding links to relevant SQLite functions for better user guidance. --- usage/use-case-examples/query-json-in-sqlite.mdx | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx index 9c0368f2..aec4794e 100644 --- a/usage/use-case-examples/query-json-in-sqlite.mdx +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -5,12 +5,12 @@ description: "How to query JSON data synced from your backend and stored as stri # Overview -When syncing data from your backend to PowerSync, JSON columns (whether from MongoDB embedded documents, PostgreSQL JSONB fields, or other JSON data types) are stored as JSON strings in SQLite. This guide shows you how to effectively query and filter this nested data using SQLite's powerful JSON functions. +When syncing data from your source backend database to PowerSync, JSON columns (whether from MongoDB, documents, PostgreSQL JSONB columns, or other JSON data types) are stored as TEXT in SQLite. This guide shows you how to effectively query and filter JSON data using SQLite's powerful JSON functions on the client. ## Understanding JSON Storage in PowerSync Your backend database might store structured data in various ways: -- **MongoDB**: Embedded documents and nested arrays +- **MongoDB**: Documents with nested documents and arrays - **PostgreSQL**: JSONB or JSON column types - **MySQL**: JSON column type @@ -48,7 +48,7 @@ In SQLite, the `assignees`, `tags`, and `metadata` columns are stored as JSON st ## JSON Extraction Basics -### Standard `json_extract()` Function +### Standard [`json_extract()`](https://sqlite.org/json1.html#jex) Function Extract values from JSON using path expressions: ```sql @@ -67,7 +67,8 @@ FROM tasks; ### Shorthand: The -> and ->> Operators -SQLite provides convenient shorthand operators for JSON extraction: +SQLite provides convenient [shorthand operators](https://sqlite.org/json1.html#jptr) for JSON extraction: + ```sql SELECT id, @@ -114,7 +115,7 @@ metadata -> '$.dependencies[0]' metadata -> '$.dependencies' -> '$[0]' ``` -## Querying Arrays with `json_each()` +## Querying Arrays with [`json_each()`](https://sqlite.org/json1.html#jeach) ### Flattening Simple Arrays From a091bedea9f72fccca8239739face0e2ca402347 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Thu, 11 Dec 2025 17:56:33 +0200 Subject: [PATCH 4/7] Updated SQLite JSON querying documentation to clarify the usage of JSON functions, corrected example references to 'tasks', and added a link to the SQLite JSON functions for enhanced user guidance. --- usage/use-case-examples/query-json-in-sqlite.mdx | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx index aec4794e..5467fe3a 100644 --- a/usage/use-case-examples/query-json-in-sqlite.mdx +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -5,7 +5,7 @@ description: "How to query JSON data synced from your backend and stored as stri # Overview -When syncing data from your source backend database to PowerSync, JSON columns (whether from MongoDB, documents, PostgreSQL JSONB columns, or other JSON data types) are stored as TEXT in SQLite. This guide shows you how to effectively query and filter JSON data using SQLite's powerful JSON functions on the client. +When syncing data from your source backend database to PowerSync, JSON columns (whether from MongoDB documents, PostgreSQL JSONB columns, or other JSON data types) are stored as `TEXT` in SQLite. See the [sync rule type mapping guide](/usage/sync-rules/types#types) for more details. This guide shows you how to effectively query and filter JSON data using SQLite's powerful JSON functions on the client. ## Understanding JSON Storage in PowerSync @@ -78,23 +78,23 @@ SELECT FROM tasks; ``` -**The difference between -> and ->>:** +**The difference between [-> and ->>](https://sqlite.org/json1.html#jptr):** - `->` returns JSON (preserves type information, quotes strings) - `->>` returns TEXT (unquoted, for final values) ```sql -- Using -> (returns JSON) -SELECT metadata -> '$.priority' FROM test; +SELECT metadata -> '$.priority' FROM tasks; -- Result: '1' (as JSON string with quotes) -- Using ->> (returns actual value) -SELECT metadata ->> '$.priority' FROM test; +SELECT metadata ->> '$.priority' FROM tasks; -- Result: 1 (as actual number without quotes) -- For strings, the difference is clearer: -SELECT metadata -> '$.sprint' FROM test; +SELECT metadata -> '$.sprint' FROM tasks; -- Result: "2024-Q1" (with quotes, as JSON) -SELECT metadata ->> '$.sprint' FROM test; +SELECT metadata ->> '$.sprint' FROM tasks; -- Result: 2024-Q1 (without quotes, as text) ``` From 044f79158bffd164a30c96e35a38a0b7a25bbb44 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Fri, 12 Dec 2025 09:58:55 +0200 Subject: [PATCH 5/7] Enhance SQLite JSON querying documentation by refining data storage descriptions, updating example queries for clarity, and correcting references to JSON functions. Added SQL Server JSON column type and improved formatting for better readability. --- .../query-json-in-sqlite.mdx | 71 ++++++++----------- 1 file changed, 29 insertions(+), 42 deletions(-) diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx index 5467fe3a..eeabe651 100644 --- a/usage/use-case-examples/query-json-in-sqlite.mdx +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -9,12 +9,13 @@ When syncing data from your source backend database to PowerSync, JSON columns ( ## Understanding JSON Storage in PowerSync -Your backend database might store structured data in various ways: -- **MongoDB**: Documents with nested documents and arrays -- **PostgreSQL**: JSONB or JSON column types -- **MySQL**: JSON column type +Your backend database might store structured data as JSON in various ways: +- **MongoDB**: Nested documents and arrays +- **PostgreSQL**: JSONB, JSON, array, or custom types +- **MySQL**: JSON columns +- **SQL Server**: JSON columns -Regardless of the source, PowerSync syncs these JSON structures to SQLite as text columns. On the client side, you can query this data using SQLite's built-in JSON functions without needing to parse it yourself. Learn more about [how PowerSync handles JSON, arrays, and custom types](/usage/use-case-examples/custom-types-arrays-and-json#javascript). +Regardless of the source, PowerSync syncs these JSON structures to SQLite as `TEXT` columns. On the client side, you can query this data using SQLite's built-in JSON functions without needing to parse it yourself. Learn more about [how PowerSync handles JSON, arrays, and custom types](/usage/use-case-examples/custom-types-arrays-and-json#javascript). ## Example Data Structure @@ -80,7 +81,7 @@ FROM tasks; **The difference between [-> and ->>](https://sqlite.org/json1.html#jptr):** - `->` returns JSON (preserves type information, quotes strings) -- `->>` returns TEXT (unquoted, for final values) +- `->>` extracts the value unquoted (strings as TEXT, numbers/booleans as their native types) ```sql -- Using -> (returns JSON) SELECT metadata -> '$.priority' FROM tasks; @@ -147,7 +148,7 @@ SELECT assignee.value -> '$.hours_allocated' AS hours FROM tasks t, json_each(t.assignees) AS assignee -WHERE assignee.value ->> '$.role' = 'developer'; +WHERE (assignee.value ->> '$.role') = 'developer'; ``` **Key points:** @@ -167,21 +168,17 @@ SELECT DISTINCT t.metadata ->> '$.priority' AS priority FROM tasks t, json_each(t.assignees) AS assignee -WHERE assignee.value ->> '$.user_id' = 'user_001' -ORDER BY CAST(t.metadata -> '$.priority' AS INTEGER); +WHERE (assignee.value ->> '$.user_id') = 'user_001' +ORDER BY t.metadata ->> '$.priority'; ``` - -**Note the CAST**: JSON numbers are extracted as text. When doing numeric comparisons or sorting, cast to the appropriate type. - - ### Example 2: Calculating Total Hours by Role **Use case:** Aggregate hours across all tasks grouped by role. ```sql SELECT assignee.value ->> '$.role' AS role, - SUM(CAST(assignee.value -> '$.hours_allocated' AS INTEGER)) AS total_hours, + SUM(assignee.value ->> '$.hours_allocated') AS total_hours, COUNT(DISTINCT t.id) AS task_count FROM tasks t, json_each(t.assignees) AS assignee @@ -260,7 +257,7 @@ WHERE instr( **Avoid `LIKE` for delimited strings:** ```sql -- ❌ WRONG - can match partial values -WHERE metadata ->> '$.approved_by' LIKE '%user_001%' +WHERE (metadata ->> '$.approved_by') LIKE '%user_001%' -- This would incorrectly match "user_0011" or "user_001_archive" -- ✅ CORRECT - exact delimiter match @@ -279,7 +276,7 @@ WITH task_assignees AS ( t.title, assignee.value ->> '$.user_id' AS user_id, assignee.value ->> '$.role' AS role, - CAST(assignee.value -> '$.hours_allocated' AS INTEGER) AS hours + assignee.value ->> '$.hours_allocated' AS hours FROM tasks t, json_each(t.assignees) AS assignee ) @@ -351,24 +348,7 @@ const tasks = new Table( 2. **Minimize `json_each()` usage**: Each `json_each()` call expands rows. For a table with 10,000 tasks averaging 5 assignees each, you're processing 50,000 rows. -3. **Filter before expanding**: Apply non-JSON filters first: -```sql --- ✅ GOOD - Filter by title first, then expand JSON -SELECT t.*, assignee.value -FROM tasks t, - json_each(t.assignees) AS assignee -WHERE t.title LIKE '%homepage%' - AND assignee.value ->> '$.role' = 'developer'; - --- ❌ LESS EFFICIENT - Expands all rows first -SELECT t.*, assignee.value -FROM tasks t, - json_each(t.assignees) AS assignee -WHERE assignee.value ->> '$.role' = 'developer' - AND t.title LIKE '%homepage%'; -``` - -4. **Use EXISTS for membership checks**: More efficient than joining: +3. **Use EXISTS for membership checks**: More efficient than joining: ```sql -- ✅ BETTER for large datasets SELECT * FROM tasks t @@ -379,17 +359,24 @@ WHERE EXISTS ( -- vs joining which creates all row combinations ``` -5. **Cache extracted values in CTEs**: Extract once, use multiple times: +4. **Cache extracted values in CTEs**: Extract once, use multiple times: ```sql -WITH enriched_tasks AS ( +WITH task_metrics AS ( SELECT - *, - metadata ->> '$.sprint' AS sprint, - CAST(metadata -> '$.priority' AS INTEGER) AS priority_num - FROM tasks + t.id, + t.title, + t.metadata, + COUNT(assignee.value) AS assignee_count, + SUM(assignee.value ->> '$.hours_allocated') AS total_hours + FROM tasks t, + json_each(t.assignees) AS assignee + GROUP BY t.id, t.title, t.metadata ) -SELECT * FROM enriched_tasks -WHERE sprint = '2024-Q1' AND priority_num < 3; +SELECT * +FROM task_metrics +WHERE metadata ->> '$.sprint' = '2024-Q1' + AND assignee_count > 1 +ORDER BY total_hours DESC; ``` ## Useful JSON Functions From fe4d551eda34b0204b5fbab4460a97a614336b5e Mon Sep 17 00:00:00 2001 From: bean1352 Date: Fri, 12 Dec 2025 10:07:22 +0200 Subject: [PATCH 6/7] Added examples demonstrating correct casting and usage of JSON extraction for numeric values. --- usage/use-case-examples/query-json-in-sqlite.mdx | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx index eeabe651..6bf5176a 100644 --- a/usage/use-case-examples/query-json-in-sqlite.mdx +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -415,13 +415,16 @@ WHERE t.id = 'task_123'; WHERE COALESCE(metadata ->> '$.priority', '999') = '1' ``` -2. **Type mismatches**: JSON numbers are returned as text with `->`. Always cast for numeric operations: +2. **Type mismatches**: ```sql -- ❌ String comparison (wrong!) WHERE metadata -> '$.priority' > 5 --- ✅ Numeric comparison (correct!) +-- ✅ Cast to numeric type WHERE CAST(metadata -> '$.priority' AS INTEGER) > 5 + +-- ✅ BEST: Use ->> for direct numeric extraction +WHERE metadata ->> '$.priority' > 5 ``` 3. **Array index bounds**: Out-of-bounds array access returns NULL, not an error: From b397514fc053e8888a1bc91b769b0763c1efac0d Mon Sep 17 00:00:00 2001 From: bean1352 Date: Fri, 12 Dec 2025 12:17:52 +0200 Subject: [PATCH 7/7] Updated SQLite JSON querying examples to reflect changes in JSON extraction syntax and data types, ensuring accurate representation of numeric values and improving clarity in usage instructions. --- usage/use-case-examples/query-json-in-sqlite.mdx | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/usage/use-case-examples/query-json-in-sqlite.mdx b/usage/use-case-examples/query-json-in-sqlite.mdx index 6bf5176a..9df31605 100644 --- a/usage/use-case-examples/query-json-in-sqlite.mdx +++ b/usage/use-case-examples/query-json-in-sqlite.mdx @@ -74,7 +74,7 @@ SQLite provides convenient [shorthand operators](https://sqlite.org/json1.html#j SELECT id, title, - metadata -> '$.priority' AS priority, + metadata -> '$.dependencies' AS dependencies_array, -- maintains JSON array metadata ->> '$.sprint' AS sprint FROM tasks; ``` @@ -85,7 +85,7 @@ FROM tasks; ```sql -- Using -> (returns JSON) SELECT metadata -> '$.priority' FROM tasks; --- Result: '1' (as JSON string with quotes) +-- Result: 1 (as JSON without quotes) -- Using ->> (returns actual value) SELECT metadata ->> '$.priority' FROM tasks; @@ -412,7 +412,7 @@ WHERE t.id = 'task_123'; 1. **NULL vs missing keys**: `json_extract()` returns `NULL` for non-existent paths. Always check for NULL: ```sql -WHERE COALESCE(metadata ->> '$.priority', '999') = '1' +WHERE COALESCE(metadata ->> '$.priority', 999) = 1 ``` 2. **Type mismatches**: @@ -420,9 +420,6 @@ WHERE COALESCE(metadata ->> '$.priority', '999') = '1' -- ❌ String comparison (wrong!) WHERE metadata -> '$.priority' > 5 --- ✅ Cast to numeric type -WHERE CAST(metadata -> '$.priority' AS INTEGER) > 5 - -- ✅ BEST: Use ->> for direct numeric extraction WHERE metadata ->> '$.priority' > 5 ```