feat: provision + export documents table with generated SQL, SDK, and CLI #35

Merged
pyramation merged 2 commits into devin/1777497144-add-documents-table from feat/documents-provision-export on Apr 30, 2026

@pyramation commented Apr 30, 2026

Summary

Completes the provision → export → codegen pipeline for the documents table and all existing schemas. This PR builds on #34 (blueprint definitions) and adds:

  1. DataId node restored — all schema files now include 'DataId' alongside 'DataTimestamps' in their nodes arrays. Removing ORG_NODES previously dropped DataId, which meant tables had no primary key constraint registered in metaschema. This caused chunk table FK generation to produce invalid SQL (REFERENCES parent_table () instead of REFERENCES parent_table (id)).

  2. Removed duplicate BM25 indexes: SearchUnified already creates a BM25 index on embedding_text (which includes content via source_fields). The explicit content BM25 index on documents and notes was redundant and caused:

    • Duplicate filter/score fields in the GraphQL API (bm25Content + bm25EmbeddingText)
    • Double-counted content relevance in the composite searchScore
  3. Provisioned all schemas — 11 blueprints (CRM, Agent, Runtime, Projects, Life OS, Email & Calendar, Staging, Autonomy, Documents, Cross-Relations, Spatial Relations) successfully applied to a live database.

  4. Generated SQL migrations (packages/agentic-db/, packages/agentic-db-services/) — deterministic pgpm export from the provisioned database.

  5. Regenerated SDK + CLI — GraphQL schema (1398 types), ORM client (95 tables including Document, DocumentsChunk, CompanyDocument, ProjectDocument), and CLI commands.
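
To illustrate the failure described in item 1, here is a minimal sketch of how a foreign key clause can come out with an empty column list when no primary key is registered for the parent table. The helper names `buildForeignKeyClause` and `buildForeignKeyClauseChecked` are hypothetical and are not taken from the project's generator; only the two `REFERENCES` strings come from this PR.

```typescript
// Hypothetical helper for illustration only; not the project's actual generator.
// It renders the REFERENCES part of a foreign key from the parent's PK columns.
function buildForeignKeyClause(parentTable: string, pkColumns: string[]): string {
  // With no registered primary key, join() returns "" and the column list is empty.
  return `REFERENCES ${parentTable} (${pkColumns.join(", ")})`;
}

// A guarded variant (also hypothetical) that fails early instead of emitting invalid SQL.
function buildForeignKeyClauseChecked(parentTable: string, pkColumns: string[]): string {
  if (pkColumns.length === 0) {
    throw new Error(`no primary key registered for ${parentTable}`);
  }
  return buildForeignKeyClause(parentTable, pkColumns);
}

// Without DataId: no PK columns are known for the parent table.
console.log(buildForeignKeyClause("parent_table", []));
// With DataId: the id column is registered as the primary key.
console.log(buildForeignKeyClause("parent_table", ["id"]));
```

The first call prints the `REFERENCES parent_table ()` form that this PR reports as invalid, and the second prints `REFERENCES parent_table (id)`. Whether the real generator should guard against an empty list, as the checked variant does, is a design choice this PR does not address.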

New tables added:

  • documents — with SearchUnified (embedding + BM25 + chunks), repo/path/commit tracking
  • documents_chunks — chunked embeddings for RAG (auto-generated by SearchUnified chunks: {})
  • company_documents — M2M junction table
  • project_documents — M2M junction table

Review & Testing Checklist for Human

  • Verify documents table columns match expectations: title, content, metadata (jsonb), repo_name, file_path, commit_hash, tags (citext[])
  • Confirm DataId is present in all schema files' nodes arrays
  • Confirm only one BM25 index per table (on embedding_text, not on raw content)
  • Deploy to a fresh database (pgpm docker start --recreate → deploy → provision) to verify migrations apply cleanly
  • Check that the generated SDK includes Document, DocumentsChunk, CompanyDocument, ProjectDocument models with expected fields
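
The DataId and BM25 items in the checklist above could be checked with a small script along these lines. This is a sketch under assumptions: the `TableDef` and `IndexDef` shapes, the `"bm25"` method string, and the `checkTable` function are hypothetical stand-ins, and the real blueprint format may differ.

```typescript
// Hypothetical shapes for illustration; the real schema files may be structured differently.
interface IndexDef { method: string; column: string; }
interface TableDef { name: string; nodes: string[]; indexes: IndexDef[]; }

// Returns a list of human-readable problems; an empty list means the table passes.
function checkTable(table: TableDef): string[] {
  const problems: string[] = [];
  // Checklist: 'DataId' must be present in the nodes array.
  if (table.nodes.indexOf("DataId") === -1) {
    problems.push(`${table.name}: 'DataId' missing from nodes`);
  }
  // Checklist: at most one BM25 index per table, and only on embedding_text.
  const bm25 = table.indexes.filter((i) => i.method === "bm25");
  if (bm25.length > 1) {
    problems.push(`${table.name}: ${bm25.length} BM25 indexes (expected at most 1)`);
  }
  for (const idx of bm25) {
    if (idx.column !== "embedding_text") {
      problems.push(`${table.name}: BM25 index on '${idx.column}' (expected embedding_text)`);
    }
  }
  return problems;
}

// Example input mirroring the state after this PR (one BM25 index, DataId present).
const documentsTable: TableDef = {
  name: "documents",
  nodes: ["DataId", "DataTimestamps"],
  indexes: [{ method: "bm25", column: "embedding_text" }],
};

// Example input mirroring the state before this PR (DataId dropped, extra content index).
const notesBefore: TableDef = {
  name: "notes",
  nodes: ["DataTimestamps"],
  indexes: [
    { method: "bm25", column: "embedding_text" },
    { method: "bm25", column: "content" },
  ],
};

console.log(checkTable(documentsTable));
console.log(checkTable(notesBefore));
```

The first example reports no problems, while the second reports the missing DataId node, the duplicate BM25 index, and the index on raw content.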

Notes

  • The alteration numbering in packages/agentic-db/ changed because DataId now creates the id column explicitly (previously it was implicit), which shifts the deterministic ID sequence for all tables.
  • packages/agentic-db-services/ had some tables removed (apis, api_schemas, domains) — this appears to be a cleanup from the latest constructive-db platform.

Link to Devin session: https://app.devin.ai/sessions/2a4faca18f1a4b4aaf3d8e1162f67e0d
Requested by: @pyramation

… CLI

- Add 'DataId' node to all schema files (required for PK constraint creation)
- Run provision: all 11 schemas (including Documents) applied successfully
- Run export:pgpm: SQL migrations generated in packages/agentic-db/
- Run generate:all: GraphQL schema, ORM SDK, and CLI regenerated
- Documents table includes SearchUnified (embedding + BM25 + chunks)
- Junction tables: company_documents, project_documents
SearchUnified already creates a BM25 index on embedding_text (which
includes content via source_fields). The explicit content BM25 index
caused duplicate filter/score fields in the GraphQL API and
double-counted content relevance in the composite searchScore.

Removed from:
- documents.ts: content BM25 index
- crm.ts: notes content BM25 index

Re-provisioned and regenerated all output.
@pyramation merged commit d190546 into devin/1777497144-add-documents-table on Apr 30, 2026
2 checks passed