feat(intelligence): add Subject-Predicate Specialisation (mutual information)#2995
feat(intelligence): add Subject-Predicate Specialisation (mutual information)#2995aashir-athar wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughThis PR introduces Subject-Predicate Mutual Information (MI) analytics: a pure computation engine for information-theoretic specialization metrics, an RPC facade integrating with memory graph queries, container and presentational React components with async state management and multi-state UI, comprehensive test coverage, and multilingual translations integrated into the Intelligence page. ChangesSubject-Predicate Mutual Information Analytics
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches⚔️ Resolve merge conflicts
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
app/src/components/intelligence/SubjectPredicateMITab.tsx (1)
40-46: ⚡ Quick winUse
async/awaitfor the namespace bootstrap path.This effect is the only place in the component still using a
.then()/.catch()chain. Converting it to an inner async function keeps promise handling consistent and matches the repo rule for TSX files.♻️ Suggested refactor
useEffect(() => { - // Namespaces are optional UI sugar; a failure to list them must not block - // the specialisation view, so swallow that error specifically. - loadNamespaces() - .then(setNamespaces) - .catch(() => setNamespaces([])); - void load(''); + const bootstrap = async () => { + // Namespaces are optional UI sugar; a failure to list them must not block + // the specialisation view, so swallow that error specifically. + try { + setNamespaces(await loadNamespaces()); + } catch { + setNamespaces([]); + } + await load(''); + }; + + void bootstrap(); }, [load]);As per coding guidelines "Always use async/await for promises instead of
.then()chains".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/components/intelligence/SubjectPredicateMITab.tsx` around lines 40 - 46, Refactor the useEffect to define and immediately invoke an inner async function that awaits loadNamespaces and sets state via setNamespaces inside a try/catch (on error call setNamespaces([])) instead of using .then/.catch; keep the existing call to load('') (you can await it or keep it prefixed with void) so the behavior of load remains unchanged; update references to useEffect, loadNamespaces, setNamespaces, and load accordingly.app/src/components/intelligence/SubjectPredicateMITab.test.tsx (1)
33-63: ⚡ Quick winAdd a stale-response regression test for the request token guard.
The container’s most subtle behavior is ignoring an older load that resolves after a newer namespace selection. This suite covers happy-path/error rendering, but not the out-of-order case the
latestRequestIdref is protecting.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/components/intelligence/SubjectPredicateMITab.test.tsx` around lines 33 - 63, Add a regression test to SubjectPredicateMITab that verifies the request-token guard (latestRequestId ref) ignores stale responses: mock mockLoadNamespaces to return some namespaces, then mock mockLoadMI so the first call returns a promise that resolves only after the second call, trigger a namespace change to cause a second (newer) mockLoadMI call, resolve the older promise afterwards and assert the UI reflects only the newer response (and that mockLoadMI results from the stale call do not update the DOM); reference the component <SubjectPredicateMITab /> and the mock function mockLoadMI and the latestRequestId behavior when implementing the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/src/services/api/subjectPredicateMIApi.ts`:
- Around line 18-26: Change the debug prefix to a stable bracketed grep-friendly
prefix and include explicit correlation fields in the log call: replace the
current logger instantiation (log = debug('subject-predicate-mi:api')) with a
bracketed prefix like '[domain] subject-predicate-mi:api' and update the log
call inside loadSubjectPredicateMI to log structured correlation values (e.g.
requestId, method: 'loadSubjectPredicateMI', namespace) along with
relations.length so searches can reliably find and correlate traces.
---
Nitpick comments:
In `@app/src/components/intelligence/SubjectPredicateMITab.test.tsx`:
- Around line 33-63: Add a regression test to SubjectPredicateMITab that
verifies the request-token guard (latestRequestId ref) ignores stale responses:
mock mockLoadNamespaces to return some namespaces, then mock mockLoadMI so the
first call returns a promise that resolves only after the second call, trigger a
namespace change to cause a second (newer) mockLoadMI call, resolve the older
promise afterwards and assert the UI reflects only the newer response (and that
mockLoadMI results from the stale call do not update the DOM); reference the
component <SubjectPredicateMITab /> and the mock function mockLoadMI and the
latestRequestId behavior when implementing the test.
In `@app/src/components/intelligence/SubjectPredicateMITab.tsx`:
- Around line 40-46: Refactor the useEffect to define and immediately invoke an
inner async function that awaits loadNamespaces and sets state via setNamespaces
inside a try/catch (on error call setNamespaces([])) instead of using
.then/.catch; keep the existing call to load('') (you can await it or keep it
prefixed with void) so the behavior of load remains unchanged; update references
to useEffect, loadNamespaces, setNamespaces, and load accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 208a3d56-b804-46fa-b6fa-6045bc67c83d
📒 Files selected for processing (23)
app/src/components/intelligence/SubjectPredicateMIPanel.test.tsxapp/src/components/intelligence/SubjectPredicateMIPanel.tsxapp/src/components/intelligence/SubjectPredicateMITab.test.tsxapp/src/components/intelligence/SubjectPredicateMITab.tsxapp/src/lib/i18n/chunks/ar-1.tsapp/src/lib/i18n/chunks/bn-1.tsapp/src/lib/i18n/chunks/de-1.tsapp/src/lib/i18n/chunks/en-1.tsapp/src/lib/i18n/chunks/es-1.tsapp/src/lib/i18n/chunks/fr-1.tsapp/src/lib/i18n/chunks/hi-1.tsapp/src/lib/i18n/chunks/id-1.tsapp/src/lib/i18n/chunks/it-1.tsapp/src/lib/i18n/chunks/ko-1.tsapp/src/lib/i18n/chunks/pt-1.tsapp/src/lib/i18n/chunks/ru-1.tsapp/src/lib/i18n/chunks/zh-CN-1.tsapp/src/lib/i18n/en.tsapp/src/lib/memory/subjectPredicateMI.test.tsapp/src/lib/memory/subjectPredicateMI.tsapp/src/pages/Intelligence.tsxapp/src/services/api/subjectPredicateMIApi.test.tsapp/src/services/api/subjectPredicateMIApi.ts
| const log = debug('subject-predicate-mi:api'); | ||
|
|
||
| /** Fetch graph relations for a namespace (or all) and compute MI + specialisation. */ | ||
| export async function loadSubjectPredicateMI( | ||
| namespace?: string | ||
| ): Promise<SubjectPredicateMIResult> { | ||
| const relations = await memoryGraphQuery(namespace); | ||
| log('loadSubjectPredicateMI namespace=%s relations=%d', namespace ?? '(all)', relations.length); | ||
| return computeSubjectPredicateMI(relations); |
There was a problem hiding this comment.
Use grep-friendly log prefix and explicit correlation fields.
The debug message should follow the repo’s bracketed prefix convention ([rpc], [domain], etc.) for stable log searchability.
Proposed change
-const log = debug('subject-predicate-mi:api');
+const log = debug('subject-predicate-mi:api');
@@
- log('loadSubjectPredicateMI namespace=%s relations=%d', namespace ?? '(all)', relations.length);
+ log(
+ '[rpc] method=openhuman.memory_graph_query namespace=%s relations=%d',
+ namespace ?? '(all)',
+ relations.length
+ );As per coding guidelines, **/*.{rs,ts,tsx}: “Use stable grep-friendly log prefixes ([domain], [rpc], [ui-flow]) and correlation fields (request IDs, method names, entity IDs)”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/src/services/api/subjectPredicateMIApi.ts` around lines 18 - 26, Change
the debug prefix to a stable bracketed grep-friendly prefix and include explicit
correlation fields in the log call: replace the current logger instantiation
(log = debug('subject-predicate-mi:api')) with a bracketed prefix like '[domain]
subject-predicate-mi:api' and update the log call inside loadSubjectPredicateMI
to log structured correlation values (e.g. requestId, method:
'loadSubjectPredicateMI', namespace) along with relations.length so searches can
reliably find and correlate traces.
01838ec to
7784945
Compare
…rmation) A new read-only "Specialisation" tab. The vocabulary lens (tinyhumansai#17 Predicate Diversity) tells you how varied the global predicate set is. The thickness lens (tinyhumansai#18 Predicate Bundles) tells you which entity pairs share many predicates. The frequency lens (tinyhumansai#7 Relationship Types) tells you which predicates dominate overall. None of them answers the question this lens does: how much does knowing the SUBJECT predict which predicate it will use? And which entities speak in a specialised vocabulary versus a generalist one? Engine (pure, deterministic — no React/RPC/clock/RNG): - Global: I(S; P) in bits with canonical-order summation; H(S) and H(P); normalisedMI = I / min(H(S), H(P)). - Per-subject: specialisation = 1 - H(P|S=s)/log2(D_s) in [0,1] (1 when D_s <= 1 — single-predicate subject is maximally specialised by convention), plus dominantPredicate (tie-broken by predicate ASC) and dominantPredicateShare. - All p·log2(p) and joint·log2 ratio sums walk pairs in canonical (sortedSubjects, sortedPredicates) order so the result is byte-identical regardless of relation insertion order (lesson from tinyhumansai#2978 Cohesion's float-order bug). Selected from the prior loop-19 design workflow's runner-up (8.35/10 — a genuinely new lens not covered by any of the 19 shipped features). Adds ZERO new core surface: composes the already-shipped memoryGraphQuery / memoryListNamespaces wrappers. Container/presentational split with a monotonic request-token race guard for load-on-mount; i18n across all 13 locales. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
7784945 to
89bce44
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
app/src/components/intelligence/SubjectPredicateMIPanel.tsx (1)
9-9: ⚡ Quick winRename constant to camelCase to match TS naming rules.
MAX_ROWSviolates the repository’s camelCase variable naming rule for TS/TSX files.Proposed fix
-const MAX_ROWS = 25; +const maxRows = 25; ... - const rows = result.subjects.slice(0, MAX_ROWS); + const rows = result.subjects.slice(0, maxRows);As per coding guidelines: "
**/*.{ts,tsx,js,jsx}: Use camelCase for variable names in TypeScript/JavaScript code".Also applies to: 100-100
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/components/intelligence/SubjectPredicateMIPanel.tsx` at line 9, Rename the exported/used constant MAX_ROWS to camelCase (e.g., maxRows) throughout SubjectPredicateMIPanel.tsx: update the declaration of MAX_ROWS to maxRows and update every reference where MAX_ROWS is used (in component logic, props, or JSX) to the new name so it conforms to the repository's camelCase rule.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@app/src/components/intelligence/SubjectPredicateMIPanel.tsx`:
- Line 9: Rename the exported/used constant MAX_ROWS to camelCase (e.g.,
maxRows) throughout SubjectPredicateMIPanel.tsx: update the declaration of
MAX_ROWS to maxRows and update every reference where MAX_ROWS is used (in
component logic, props, or JSX) to the new name so it conforms to the
repository's camelCase rule.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: db181071-ed1b-4e1e-b372-c68730aa0a6d
📒 Files selected for processing (23)
app/src/components/intelligence/SubjectPredicateMIPanel.test.tsxapp/src/components/intelligence/SubjectPredicateMIPanel.tsxapp/src/components/intelligence/SubjectPredicateMITab.test.tsxapp/src/components/intelligence/SubjectPredicateMITab.tsxapp/src/lib/i18n/ar.tsapp/src/lib/i18n/bn.tsapp/src/lib/i18n/de.tsapp/src/lib/i18n/en.tsapp/src/lib/i18n/es.tsapp/src/lib/i18n/fr.tsapp/src/lib/i18n/hi.tsapp/src/lib/i18n/id.tsapp/src/lib/i18n/it.tsapp/src/lib/i18n/ko.tsapp/src/lib/i18n/pl.tsapp/src/lib/i18n/pt.tsapp/src/lib/i18n/ru.tsapp/src/lib/i18n/zh-CN.tsapp/src/lib/memory/subjectPredicateMI.test.tsapp/src/lib/memory/subjectPredicateMI.tsapp/src/pages/Intelligence.tsxapp/src/services/api/subjectPredicateMIApi.test.tsapp/src/services/api/subjectPredicateMIApi.ts
💤 Files with no reviewable changes (19)
- app/src/lib/i18n/ru.ts
- app/src/lib/i18n/hi.ts
- app/src/lib/i18n/ko.ts
- app/src/lib/i18n/bn.ts
- app/src/lib/i18n/fr.ts
- app/src/lib/i18n/ar.ts
- app/src/lib/i18n/zh-CN.ts
- app/src/lib/i18n/de.ts
- app/src/lib/i18n/it.ts
- app/src/lib/i18n/pl.ts
- app/src/lib/i18n/id.ts
- app/src/lib/i18n/es.ts
- app/src/services/api/subjectPredicateMIApi.test.ts
- app/src/lib/i18n/pt.ts
- app/src/pages/Intelligence.tsx
- app/src/lib/memory/subjectPredicateMI.test.ts
- app/src/lib/i18n/en.ts
- app/src/services/api/subjectPredicateMIApi.ts
- app/src/lib/memory/subjectPredicateMI.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- app/src/components/intelligence/SubjectPredicateMITab.test.tsx
- app/src/components/intelligence/SubjectPredicateMIPanel.test.tsx
- app/src/components/intelligence/SubjectPredicateMITab.tsx
Summary
Adds a new read-only "Specialisation" tab. The vocabulary lens (#17 Predicate Diversity) tells you how varied the global predicate set is. The thickness lens (#18 Predicate Bundles) tells you which entity pairs share many predicates. The frequency lens (#7 Relationship Types) tells you which predicates dominate overall.
None of them answers the question this lens does: how much does knowing the subject predict which predicate it will use? And which entities speak in a specialised vocabulary versus a generalist one?
The classical information-theoretic answer is mutual information
I(S; P)between the subject and predicate distributions, in bits, and the normalised formI / min(H(S), H(P))(0 = independent, 1 = perfect correspondence). Per entity, the same machinery gives a specialisation index in [0, 1] (1 = always uses one predicate, 0 = predicates perfectly evenly spread).Provenance & design discipline
This is the runner-up from the prior loop-19 judge-panel design workflow (8.35/10, the closest competitor to Document Provenance) — a genuinely distinct lens not covered by any of the 19 shipped features. Re-using vetted design work; built solo following the proven pattern; then adversarial-reviewed before push (5 raw findings, 0 confirmed).
Engine design (pure, deterministic — no React/RPC/clock/RNG)
(subject, predicate)co-occurrences (every relation contributes one observation; parallels not collapsed).H(S),H(P),I(S; P)summed in canonical sorted order (sortedSubjects × sortedPredicates) so the entropy / MI is byte-identical regardless of relation input order — float addition is not associative andp·log2(p)summands are not exactly representable; the cohesion-bug lesson applied throughout.specialisation = 1 - H(P|S=s)/log2(D_s), with the conventional fill1.0whenD_s <= 1(single-predicate subject is maximally specialised).dominantPredicatetie-broken by predicate ASC for stable output.Zero new core surface
Composes the already-shipped
memoryGraphQuery/memoryListNamespacesJSON-RPC wrappers. Container/presentational split with a monotonic request-token race guard for load-on-mount; i18n across all 13 locales.Test plan
vitest— 24 tests (engine: 4 boundary fixtures with hand-derived MI (1×1 = 0; fixed-subject = 0; perfect-bijection = 1; specialist-vs-generalist) + specialisation rank + non-string s/p drop + no case-folding + byte-identical MI/entropy across permutations + tie-break ordering + duplicate-relation parallel-count behaviour — plus api facade, panel metric tiles + ranked table, container load/namespace-requery/error)tsc --noEmit— cleaneslint— 0 errorsprettier --check— clean🤖 Generated with Claude Code
Summary by CodeRabbit