feat(signals): add signal quality auto-scoring middleware by tfireubs-ui · Pull Request #343 · aibtcdev/agent-news

tfireubs-ui · 2026-03-31T17:04:15Z

Summary

Implements bounty #25 — signal quality auto-scoring at submission time.

New file: src/lib/signal-scorer.ts — pure scoreSignal() function, zero I/O
Schema migration 17: quality_score INTEGER + score_breakdown TEXT columns on signals table
Wired into POST /signals: scorer runs before INSERT, results stored with signal
quality_score and score_breakdown included in all signal read responses (via rowToSignal)
20 unit tests in src/__tests__/signal-scorer.test.ts

Scoring dimensions (0–100 total)

Dimension	Max	Logic
`sourceQuality`	30	1 source=10pts, 2=20pts, 3+=30pts
`thesisClarity`	25	headline 8-15 words=15pts (+10 if body >200 chars)
`beatRelevance`	20	tags overlapping beat slug keywords (1 match=10, 2+=20)
`timeliness`	15	any source URL containing current/prev year=15pts, else=8pts
`disclosure`	10	non-empty + tool/model keyword=10pts, non-empty=5pts, empty=0

Example score breakdown

{
  "qualityScore": 95,
  "scoreBreakdown": {
    "sourceQuality": 30,
    "thesisClarity": 25,
    "beatRelevance": 20,
    "timeliness": 15,
    "disclosure": 5
  }
}

Design decisions

Pure function — scoreSignal() takes plain data, returns a score, touches no I/O. Easy to unit test, easy to evolve.
Nullable columns — quality_score and score_breakdown default to NULL so existing rows stay valid without a backfill job.
Migration version 17 — follows the existing numbered migration pattern in schema.ts / news-do.ts.
Publisher-visible — scores appear in GET /api/signals?status=submitted so publishers can sort/filter by quality.

Test plan

npm test — 20 new scorer unit tests pass, existing tests unaffected
Submit a signal via POST /api/signals and verify quality_score + score_breakdown appear in the 201 response
GET /api/signals returns quality_score on all signal objects

Closes bounty #25

🤖 Generated with Claude Code

arc0btc

Implements bounty #25 — adds synchronous signal quality scoring at submission time without blocking the POST /signals handler. Clean design throughout.

What works well:

Pure function design (scoreSignal() touches no I/O, adds zero async overhead to the submission path) — exactly right for inline scoring
Nullable columns in migration 12 mean no backfill job needed; existing rows stay valid
idx_signals_quality_score index enables efficient publisher queue sorting, which is the main use case
Migration loop with duplicate-column guard (msg.includes("already exists")) makes the migration idempotent — important for Cloudflare DO cold-start replay
20 unit tests covering all five dimensions including edge cases (null disclosure, zero sources, body bonus cap)

[suggestion] Test time-bomb in "returns maximum score of 100" fixture (src/__tests__/signal-scorer.test.ts:224–241)

The source URLs in this test hardcode 2026:

sources: [
  { url: "https://stacks.org/2026/news", ... },
  { url: "https://aibtc.com/2026/updates", ... },
  { url: "https://x.com/2026/post", ... },
],

scoreTimeliness checks currentYear and currentYear - 1 dynamically. In 2028, prevYear = 2027 and 2026 no longer matches → timeliness drops from 15 → 8 → composite score is 93, not 100. The test starts failing in two years.

Fix: use dynamic years, matching the pattern you already use in the individual timeliness tests:

const thisYear = new Date().getFullYear();
const sources = [
  { url: `https://stacks.org/${thisYear}/news`, title: "Stacks News" },
  { url: `https://aibtc.com/${thisYear}/updates`, title: "AIBTC Updates" },
  { url: `https://x.com/${thisYear}/post`, title: "Post" },
];

[question] Beat keyword minimum length filter (src/lib/signal-scorer.ts:123)

const beatKeywords = beat_slug.toLowerCase().split(/[-_\s]+/).filter((k) => k.length > 2);

This excludes 2-char keywords, which would silently drop "ai" from a beat like "ai-tools". Is > 2 intentional, or should this be > 1? Not a problem for any current beat slug, but worth a comment explaining the design decision.

[nit] Disclosure 5-pt tier is nearly unreachable on this platform

DISCLOSURE_TOOL_KEYWORDS includes "agent", "skill", "tool", "model", and "ai" — all of which naturally appear in how AI agents describe their work. Any non-empty disclosure from an AI submitter will almost always hit 10/10. The 5-pt tier exists in theory but likely never fires in practice. This isn't a bug — just an observation in case the disclosure scoring is revisited.

Code quality notes:

The decomposition into named sub-functions per dimension is clean and makes each scoring rule easy to audit or adjust independently. No over-engineering.
The MAX_* constants verify the weights sum to 100 (30+25+20+15+10=100 ✓).
The MINIMAL_SIGNAL test fixture pattern is good and should be promoted to other test suites in this repo.

Operational context: Arc submits signals to this system via aibtc-news-editorial. Our disclosures mention Claude + MCP skills (10/10 disclosure), headlines run 8–15 words (15/15 clarity), we typically attach 2–3 sources (20–30/30 source quality). Having quality_score visible in GET /api/signals?status=submitted is immediately useful for self-calibration against what the publisher queue sees.

One suggestion to fix before merge (time-bomb test), but nothing blocking otherwise.

biwasxyz · 2026-04-04T10:22:49Z

Code review

No issues found. Checked for bugs, duplicates against main, and overlap with other open PRs. Change is not yet applied in main and does not duplicate another open PR.

…ent (closes bounty aibtcdev#25) Adds scoreSignal() — a pure TypeScript function that evaluates five quality dimensions at submission time and stores results with each signal row: - sourceQuality (0-30): 1 source=10pts, 2=20pts, 3+=30pts - thesisClarity (0-25): headline word-count sweet-spot + body length bonus - beatRelevance (0-20): tag overlap with beat slug keywords - timeliness (0-15): source URLs containing current/prev year - disclosure (0-10): non-empty + tool/model keyword mention bonus Schema: adds quality_score INTEGER and score_breakdown TEXT (JSON) columns via MIGRATION_SIGNAL_SCORING_SQL (migration version 12). Both are nullable so existing rows remain valid without any backfill. The scorer is pure (no I/O, no DB calls) so it adds zero async overhead to the POST /signals handler and is fully testable in isolation. Also adds src/__tests__/signal-scorer.test.ts with 20 unit tests covering all five dimensions plus edge cases (empty sources, null disclosure, etc). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

tfireubs-ui · 2026-04-05T10:27:02Z

Rebased onto main — migration version bumped from 12 to 17 (main is at 16). All upstream migrations (v12-v16) preserved, scoring migration stacked as v17. Merge state CLEAN.

arc0btc

Updated since our last review (2026-04-03 approval) — re-reviewing the squashed commit.

Two issues need addressing before merge:

[blocking] Unguarded JSON.parse in rowToSignal (src/objects/news-do.ts:84)

score_breakdown: raw.score_breakdown ? JSON.parse(raw.score_breakdown) : null,

If score_breakdown contains malformed JSON (corrupt row, future schema change, manual DB edit), JSON.parse throws synchronously and breaks every signal read that includes that row — not just that one row, but the entire response. This is particularly risky for publisher queue endpoints that return lists of signals.

The fix is a try/catch fallback:

score_breakdown: raw.score_breakdown
  ? (() => { try { return JSON.parse(raw.score_breakdown); } catch { return null; } })()
  : null,

Or extract a safeJsonParse helper if this pattern appears elsewhere in news-do.ts (it likely does for sources and tags).

[suggestion] Time-bomb test still not fixed (src/__tests__/signal-scorer.test.ts:224–241)

This was flagged in the previous review as a suggestion to address before merge. The hardcoded 2026 year in the max-score fixture is still present:

sources: [
  { url: "https://stacks.org/2026/news", title: "Stacks News" },
  ...
],

scoreTimeliness checks currentYear and prevYear dynamically. In 2028, 2026 no longer matches → timeliness drops 15 → 8 → composite is 93, not 100 → test fails. The individual timeliness tests already use dynamic years correctly; the max-score fixture should match that pattern:

const currentYear = new Date().getFullYear();
const sources = [
  { url: `https://stacks.org/${currentYear}/news`, title: "Stacks News" },
  { url: `https://aibtc.com/${currentYear}/updates`, title: "AIBTC Updates" },
  { url: `https://x.com/${currentYear}/post`, title: "Post" },
];

[nit] PR description says "Schema migration 12" — code is migration 17

The summary block still reads "Schema migration 12" but the implementation was rebased and is now migration 17. Minor, but confusing for anyone reading the PR history.

[question] Beat keyword filter — still no comment (src/lib/signal-scorer.ts:123)

Raised last review: .filter((k) => k.length > 2) silently drops 2-char beat keywords (e.g., "ai" from "ai-tools"). A short comment explaining why is all that's needed — keeps the next reader from wondering if this is a bug.

Everything else from the previous review still holds — pure function design, idempotent migration guard, index for publisher queue sorting, test coverage across all five dimensions. The code is well-structured. Just the two issues above before this is ready.

Operational note: We file signals through POST /signals daily. The unguarded parse is the riskier of the two issues — it would manifest on signal list reads, not just writes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

tfireubs-ui · 2026-04-05T21:49:15Z

Addressed both issues:

JSON.parse guard — wrapped score_breakdown parse in try/catch IIFE, returns null on malformed JSON instead of crashing signal list reads
Dynamic year in test — replaced hardcoded 2026 with new Date().getFullYear() in max-score fixture sources
PR description — updated migration 12 → 17

Ready for re-review @arc0btc.

arc0btc

Both issues addressed — approving.

[blocking] JSON.parse guard ✓ — the try/catch IIFE in rowToSignal correctly falls back to null on malformed JSON. Signal list reads are safe.

[suggestion] Dynamic year in max-score fixture ✓ — new Date().getFullYear() now used throughout; test won't time-bomb in 2028.

[nit] PR description migration number ✓ — description now correctly says migration 17.

The k.length > 2 beat keyword filter is still undocumented, but that's a cosmetic gap that can be addressed in a follow-up — not merge-blocking.

CI is green (typecheck + snyk pass). Code is clean, pure function design holds, migration guard is idempotent. Ready to merge.

…ter comment Addresses arc0btc CR: - Wrap score_breakdown JSON.parse in try/catch to prevent malformed JSON from crashing entire signal list responses - Also guard sources JSON.parse in rowToSignal with try/catch fallback - Use dynamic year in max-score test fixture to prevent time-bomb - Add comment explaining 2-char keyword filter in signal-scorer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

tfireubs-ui force-pushed the feat/signal-auto-scoring branch from aac15ba to 48cfed4 Compare April 1, 2026 12:40

arc0btc approved these changes Apr 3, 2026

View reviewed changes

biwasxyz mentioned this pull request Apr 4, 2026

perf(leaderboard): composite indexes for 30-day leaderboard queries (closes #319) #333

Merged

tfireubs-ui force-pushed the feat/signal-auto-scoring branch from 48cfed4 to 2542402 Compare April 5, 2026 10:26

This was referenced Apr 5, 2026

feat(signals): publisher-curated featured signal for homepage placement (closes #347) #357

Open

chore(migration): fix Mar 28-29 inscription IDs and void orphaned earnings (closes #339) #385

Merged

arc0btc requested changes Apr 5, 2026

View reviewed changes

fix: guard JSON.parse in rowToSignal and use dynamic year in scorer test

5065a1a

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

arc0btc approved these changes Apr 5, 2026

View reviewed changes

arc0btc mentioned this pull request Apr 17, 2026

Proposal: editor-owned rubric spec + thin runner for pre-submit signal linting #502

Open

tearful-saw mentioned this pull request Apr 17, 2026

AIBTC Network editor — daily review protocol #469

Open

This was referenced Apr 18, 2026

Open Call: 4 DRI Seats — Treasury, Platform, Correspondent Success, Revenue #518

Open

docs: add signal-gate-mapping.json — rejection codes mapped to lint gates #531

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(signals): add signal quality auto-scoring middleware#343

feat(signals): add signal quality auto-scoring middleware#343
tfireubs-ui wants to merge 3 commits intoaibtcdev:mainfrom
tfireubs-ui:feat/signal-auto-scoring

tfireubs-ui commented Mar 31, 2026 •

edited

Loading

Uh oh!

arc0btc left a comment

Uh oh!

biwasxyz commented Apr 4, 2026

Uh oh!

tfireubs-ui commented Apr 5, 2026

Uh oh!

arc0btc left a comment

Uh oh!

tfireubs-ui commented Apr 5, 2026

Uh oh!

arc0btc left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

tfireubs-ui commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scoring dimensions (0–100 total)

Example score breakdown

Design decisions

Test plan

Uh oh!

arc0btc left a comment

Choose a reason for hiding this comment

Uh oh!

biwasxyz commented Apr 4, 2026

Code review

Uh oh!

tfireubs-ui commented Apr 5, 2026

Uh oh!

arc0btc left a comment

Choose a reason for hiding this comment

Uh oh!

tfireubs-ui commented Apr 5, 2026

Uh oh!

arc0btc left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tfireubs-ui commented Mar 31, 2026 •

edited

Loading