Filter moderation-blocked slugs from /api/index #15

Open
LynnColeArt wants to merge 9 commits into BaderBC:master from LynnColeArt:fix/api-index-moderation

LynnColeArt commented May 14, 2026

What this PR fixes

Closes a content-safety gap in the public index endpoint.

Summary

  • Added a moderation-aware filter in GET /api/index.
  • Index items are checked against article_moderation and any slug with status pending, checking, or banned is excluded from responses.
  • total is reconciled when the first page contains the entire dataset so it reflects the filtered set instead of stale or pre-moderation counts.
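
The filtering and `total` reconciliation described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `IndexItem`, `ModerationStatus`, the `"approved"` status, `filterBlocked`, and `reconcileTotal` are hypothetical names, and treating a slug with no moderation row as visible is an assumption.

```typescript
// Hypothetical shapes; the real types in src/worker/index.ts may differ.
type IndexItem = { slug: string; title: string };
type ModerationStatus = "approved" | "pending" | "checking" | "banned";

// Statuses the PR description says must be hidden from /api/index.
const BLOCKED: ReadonlySet<ModerationStatus> = new Set<ModerationStatus>([
  "pending",
  "checking",
  "banned",
]);

// Drop every item whose slug has a blocked moderation status.
// `statuses` stands in for rows read from article_moderation.
// Assumption: a slug with no row at all stays visible.
function filterBlocked(
  items: IndexItem[],
  statuses: Map<string, ModerationStatus>,
): IndexItem[] {
  return items.filter((item) => {
    const status = statuses.get(item.slug);
    return status === undefined || !BLOCKED.has(status);
  });
}

// Use the filtered count only when the first page already holds the whole
// dataset; otherwise keep the stored total.
function reconcileTotal(
  storedTotal: number,
  filtered: IndexItem[],
  isFirstPage: boolean,
  listComplete: boolean,
): number {
  return isFirstPage && listComplete ? filtered.length : storedTotal;
}
```

On later pages the filtered page length says nothing about the full dataset, which is why the sketch leaves the stored total untouched there.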

Validation notes

  • No schema changes.
  • No client changes.

Related issues

Summary by CodeRabbit

  • Bug Fixes

    • Exclude moderated items (banned/pending/checking) from index/search results. If the moderation lookup fails, the endpoint returns 503 (temporarily unavailable) with empty items; on first-page responses, totals are reconciled against the filtered set.
  • New Features

    • Page generation now enforces a pre-generation title moderation check and rejects disallowed titles with 403.
    • Comment and article voting enforce configurable per-user/IP/subnet and short-burst rate limits, returning 429 with retry info when exceeded.
  • Chores

    • Added a small IP utility dependency.
  • Documentation

    • Added per-IP comment rate-limit config to README and deployment vars.

coderabbitai Bot commented May 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds moderation-aware index filtering and an early title-moderation gate in page generation; introduces configurable article-vote and comment-creation rate limits with IP/subnet normalization, enforcement that throws 429-style errors, handler integration, and adds ipaddr.js.

Changes

Moderation and page-generation gating

| Layer / File(s) | Summary |
| --- | --- |
| Moderation imports & filter implementation (src/worker/index.ts) | Adds isTitleModerationApproved and topicRejectedMessage imports and implements filterModeratedIndexItems() to exclude slugs with banned, pending, or checking statuses and throw on DB errors. |
| Index endpoint integration (src/worker/index.ts) | /api/index now runs KV-derived items through filterModeratedIndexItems, returns items: filteredItems, reconciles stored __total against filteredItems.length on first-page when KV list is complete, and returns 503 with items: [] on filter DB errors. |
| Page generation gating and rate-limit reorder (src/worker/index.ts) | In /api/page/:slug resolves client IP earlier, validates OPENROUTER_API_KEY earlier, computes title, runs isTitleModerationApproved(title, env) (403 + topicRejectedMessage() on rejection), then applies per-IP generation rate limiting after approval; removes duplicate checks/assignments. |

Comment/article vote rate limiting

| Layer / File(s) | Summary |
| --- | --- |
| Dependency and import (package.json, src/worker/comments.ts) | Adds ipaddr.js to dependencies and imports it for IP parsing used in subnet normalization. |
| Env types and defaults (src/worker/comments.ts) | Extends CommentsEnv with ARTICLE_VOTE_PER_USER_PER_HOUR, ARTICLE_VOTE_PER_IP_PER_HOUR, ARTICLE_VOTE_PER_SUBNET_PER_HOUR, ARTICLE_VOTE_PER_USER_PER_MINUTE, COMMENT_PER_IP_PER_HOUR, COMMENT_PER_IP_PER_MINUTE and adds numeric default fallbacks. |
| Ranking clause refactor and helpers (src/worker/comments.ts) | Refactors the recommended ranking ORDER BY construction and adds parsePositiveInt and ipToSubnet helpers for env parsing and stable subnet key derivation; implements enforceArticleVoteRateLimits and enforceCommentCreationRateLimits. |
| Rate-limit enforcement and handler integration (src/worker/comments.ts) | Calls enforceCommentCreationRateLimits after ensureUser(...) in comment POST, and enforceArticleVoteRateLimits after ensureUser(...) in comment upvote toggle and article upvote toggle handlers. |
| Docs and wrangler vars (README.md, wrangler.toml) | Adds COMMENT_PER_IP_PER_HOUR and COMMENT_PER_IP_PER_MINUTE to README and wrangler.toml vars. |
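
To illustrate the "env parsing with numeric default fallbacks" mentioned in the summary, here is a minimal sketch of what a helper like parsePositiveInt might look like. The signature, the strict digits-only rule, and the fallback values (60 and 10) are assumptions; the PR's actual helper is not shown in this thread.

```typescript
// Hypothetical signature; the PR's parsePositiveInt may differ.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const trimmed = raw.trim();
  // Accept plain decimal digits only, so "15abc", "1e3" and "-5" fall back.
  if (!/^\d+$/.test(trimmed)) return fallback;
  const n = Number.parseInt(trimmed, 10);
  return Number.isSafeInteger(n) && n > 0 ? n : fallback;
}

// Example env values; the fallbacks passed below are made up for the sketch.
const env: Record<string, string | undefined> = {
  COMMENT_PER_IP_PER_HOUR: "90",
  COMMENT_PER_IP_PER_MINUTE: "not-a-number",
};
const perHour = parsePositiveInt(env.COMMENT_PER_IP_PER_HOUR, 60);
const perMinute = parsePositiveInt(env.COMMENT_PER_IP_PER_MINUTE, 10);
```

Falling back on anything that is not a clean positive integer keeps a typo in wrangler.toml from silently disabling a limit.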

Sequence Diagrams

sequenceDiagram
  participant Client
  participant Worker
  participant DB
  Client->>Worker: GET /api/index
  Worker->>DB: query article_moderation for slugs with banned/pending/checking
  DB-->>Worker: moderation rows
  Worker->>Client: return filtered items or 503 + items:[]

sequenceDiagram
  participant Client
  participant Worker
  participant Moderation
  participant RateLimiter
  Client->>Worker: GET /api/page/:slug
  Worker->>Worker: resolve client IP
  Worker->>Worker: validate OPENROUTER_API_KEY presence
  Worker->>Moderation: isTitleModerationApproved(title)
  Moderation-->>Worker: approved / rejected
  alt approved
    Worker->>RateLimiter: check per-IP generation rate
    RateLimiter-->>Worker: allowed / throttled
    Worker->>Client: generate / return page
  else rejected
    Worker->>Client: 403 + topicRejectedMessage()
  end

Possibly related issues

  • #16: Implements pre-generation title moderation and index filtering similar to the issue’s objectives.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through slugs and count each one,

I hide the banned until the day is done,
I sniff each title, keep the garden clean,
I pace the votes so the meadow stays green,
Hooray — small hops keep the patch serene.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: filtering moderation-blocked slugs from the /api/index endpoint, which is the main focus of this PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker/index.ts`:
- Around line 151-153: The catch block currently logs the moderation error and
returns the original items (variable items), which reintroduces blocked content;
instead fail closed: remove "return items" and rethrow the caught error (throw
e) so the caller can surface a 503, or alternatively return a safe
empty/filtered set (e.g., []) and ensure the caller converts that to a 503
response; update the catch in the moderation/filtering code (the catch handling
e) accordingly and keep the console.error message with the error for
diagnostics.
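
A minimal sketch of the fail-closed behavior requested here, under stated assumptions: `BlockedLookup`, `filterModerated`, and `indexResponse` are hypothetical names, and the lookup function stands in for the real article_moderation query.

```typescript
type IndexItem = { slug: string };

// Hypothetical stand-in for the database lookup of blocked slugs.
type BlockedLookup = (slugs: string[]) => Promise<Set<string>>;

// Fail closed: a lookup error propagates instead of returning unfiltered items.
async function filterModerated(
  items: IndexItem[],
  lookupBlocked: BlockedLookup,
): Promise<IndexItem[]> {
  try {
    const blocked = await lookupBlocked(items.map((i) => i.slug));
    return items.filter((i) => !blocked.has(i.slug));
  } catch (e) {
    console.error("moderation filter failed", e);
    throw e; // never `return items` here
  }
}

// The caller maps the failure to a 503 body with no items.
async function indexResponse(
  items: IndexItem[],
  lookupBlocked: BlockedLookup,
): Promise<{ status: number; body: { items: IndexItem[] } }> {
  try {
    const filtered = await filterModerated(items, lookupBlocked);
    return { status: 200, body: { items: filtered } };
  } catch {
    return { status: 503, body: { items: [] } };
  }
}
```

Rethrowing keeps the decision about the response in one place (the route handler) while the filter itself can never leak unmoderated items.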

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 686165e2-308f-47cd-bb3c-200c1309832e

📥 Commits

Reviewing files that changed from the base of the PR and between 46759b0 and 7787f79.

📒 Files selected for processing (1)
  • src/worker/index.ts

LynnColeArt (Author) commented

Why this is necessary

We observed scripted abuse creating offensive and zero-heavy gibberish topics on the live site. Moving from reactive moderation to pre-generation gating keeps bad topics from ever reaching KV, which reduces both the attack surface and generation cost.

What was changed

  • Added pre-generation moderation gate in the article generation path using isTitleModerationApproved(...).
  • Added deterministic rejection for numeric/gibberish abuse via isObviousGibberishTitle(...).
  • Expanded permanent slug blocklist for zero-heavy / numeric pattern spam.
  • Excluded permanently blocked slugs from /api/index results.
  • Updated client UX so blocked requests display the refusal message.

Evidence

Screenshot reference:

/home/lynn/Downloads/screenship-All-entries-Halupedia-2026-05-14T07-55-13.png

Related issue

LynnColeArt (Author) commented

Added upvote-abuse protections in this same branch/PR: article and comment vote endpoints now use shared server-side rate limits to reduce upvote gaming (per-user/hour, per-user/minute burst, per-IP/hour, per-subnet/hour). Added a helper with configurable env vars and 429 retry-after responses. Commit: 5a1259a.

coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker/comments.ts`:
- Around line 544-545: The article vote POST handler bypasses the rate limiter;
call enforceArticleVoteRateLimits(c, c.env, user.id) at the start of the
app.post("/api/articles/:slug/vote", ...) handler (before any writes to
article_votes or articles) so the route is protected from burst abuse; also add
the same call in the other missing site around the article vote logic referenced
near the other call sites (the ones touching article_votes/articles) so both
spots invoke enforceArticleVoteRateLimits with the same c, c.env, and user.id
parameters.
- Around line 242-257: The ipToSubnet function currently uses string splitting
which breaks IPv6 compression semantics; replace the manual IPv6 logic in
ipToSubnet with a proper IPv6 parsing/normalization library (for example
ipaddr.js or ip6addr) to parse the input, handle IPv4-mapped IPv6 addresses, and
then produce a canonical /64 prefix for IPv6 and /24 for IPv4 while preserving
the "unknown" return. Locate ipToSubnet and: use the library to parse the IP
string, call its normalization/first-n-bits or subnet function to derive the /64
(for IPv6) or /24 (for IPv4) prefix, and return that canonical prefix string
instead of the current split-join result.
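
To show why naive string splitting breaks on compressed IPv6, here is a dependency-free sketch that expands `::` before taking the prefix. It is an illustration of the expected output only and is not the ipaddr.js-based approach the review recommends; `expandIPv6` is a hypothetical helper, the output string format is an assumption, and IPv4-mapped addresses written entirely in hex groups are not mapped back to IPv4 here.

```typescript
// Expand an IPv6 string (with optional "::" compression) into 8 numeric groups.
// Returns null for input this sketch does not understand.
function expandIPv6(ip: string): number[] | null {
  const halves = ip.split("::");
  if (halves.length > 2) return null;
  const parse = (s: string): number[] | null => {
    if (s === "") return [];
    const groups: number[] = [];
    for (const part of s.split(":")) {
      if (!/^[0-9a-fA-F]{1,4}$/.test(part)) return null;
      groups.push(Number.parseInt(part, 16));
    }
    return groups;
  };
  const head = parse(halves[0]);
  const tail = halves.length === 2 ? parse(halves[1]) : [];
  if (head === null || tail === null) return null;
  if (halves.length === 1) return head.length === 8 ? head : null;
  const missing = 8 - head.length - tail.length;
  if (missing < 1) return null;
  return [...head, ...new Array<number>(missing).fill(0), ...tail];
}

// Derive a stable subnet key: /24 for IPv4, /64 for IPv6, "unknown" otherwise.
function ipToSubnet(ip: string): string {
  const v4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (v4) {
    const octets = v4.slice(1).map(Number);
    if (octets.some((o) => o > 255)) return "unknown";
    return `${octets[0]}.${octets[1]}.${octets[2]}.0/24`;
  }
  // IPv4-mapped IPv6 in dotted form, e.g. ::ffff:198.51.100.9
  const mapped = /^::ffff:(\d{1,3}(?:\.\d{1,3}){3})$/i.exec(ip);
  if (mapped) return ipToSubnet(mapped[1]);
  const groups = expandIPv6(ip);
  if (groups === null) return "unknown";
  const prefix = groups.slice(0, 4).map((g) => g.toString(16)).join(":");
  return `${prefix}::/64`;
}
```

With this expansion, `2001:db8::1` and `2001:db8:0:0:0:0:0:1` produce the same key, which a plain split on `:` would not guarantee.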

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b6ce2387-d28b-4e85-b9d6-69891c552b3a

📥 Commits

Reviewing files that changed from the base of the PR and between 7787f79 and 5a1259a.

📒 Files selected for processing (1)
  • src/worker/comments.ts

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker/index.ts`:
- Line 153: The function currently returns items when results is empty, which
skips the earlier permanent-block filtering; change the return to return
safeItems instead of items in the early-exit case (the branch that checks if
(!results || results.length === 0)) so that the isPermanentlyBlockedSlug filter
applied to produce safeItems is always respected; update the return at that
condition to return safeItems (referencing safeItems, items, results, and
isPermanentlyBlockedSlug to locate the code).
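
The early-exit fix can be sketched as below. The names safeItems, items, results, and isPermanentlyBlockedSlug come from the review text; the blocklist rule (all-zero slugs) and the `filterIndex` wrapper are placeholders invented for this illustration.

```typescript
type IndexItem = { slug: string };

// Placeholder rule; the PR's isPermanentlyBlockedSlug has its own patterns.
const isPermanentlyBlockedSlug = (slug: string): boolean => /^0+$/.test(slug);

function filterIndex(
  items: IndexItem[],
  results: { slug: string }[] | null, // rows for banned/pending/checking slugs
): IndexItem[] {
  const safeItems = items.filter((i) => !isPermanentlyBlockedSlug(i.slug));
  // The early exit must return the already-filtered list, not the raw input.
  if (!results || results.length === 0) return safeItems;
  const blocked = new Set(results.map((r) => r.slug));
  return safeItems.filter((i) => !blocked.has(i.slug));
}
```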

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b86c643f-0e6a-4238-ad25-aab9d041b474

📥 Commits

Reviewing files that changed from the base of the PR and between 5a1259a and e4b1466.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (3)
  • package.json
  • src/worker/comments.ts
  • src/worker/index.ts
✅ Files skipped from review due to trivial changes (1)
  • package.json

LynnColeArt (Author) commented

@coderabbitai please re-run review for the latest push

coderabbitai Bot commented May 14, 2026

@LynnColeArt Sure! I'll kick off a fresh review of the latest push now.

✅ Actions performed

Full review triggered.

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker/index.ts`:
- Around line 566-591: The rate limit logic should run before calling the
expensive moderation model to avoid wasting moderation calls; move the rateLimit
call (using rateLimit with kv: c.env.ARTICLES, bucket: "gen", ip, limit parsed
from c.env.GEN_PER_IP_PER_HOUR, windowSec: 3600) and its rl.ok check so it
executes prior to calling isTitleModerationApproved(title, c.env). Keep the same
response behavior when rl.ok is false (return the 429 JSON with retry-after and
x-robots-tag headers) and ensure ip and perHour parsing remain identical when
relocated.
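
The ordering requested here (cheap rate-limit check first, moderation call second) might look like the sketch below. `handleGenerate` and `Verdict` are hypothetical; the limiter and the moderation call are injected as plain functions so the order is easy to observe, whereas the real handler uses rateLimit with the KV binding and isTitleModerationApproved(title, c.env).

```typescript
type Verdict = { status: number; retryAfter?: number };

// `rateLimit` models the KV-backed limiter and `moderate` models the
// moderation-model call; both are stand-ins for the real functions.
async function handleGenerate(
  title: string,
  rateLimit: () => Promise<{ ok: boolean; retryAfter: number }>,
  moderate: (title: string) => Promise<boolean>,
): Promise<Verdict> {
  // 1. Throttle first, so a throttled client never triggers a moderation call.
  const rl = await rateLimit();
  if (!rl.ok) return { status: 429, retryAfter: rl.retryAfter };
  // 2. Only then run the comparatively expensive moderation check.
  if (!(await moderate(title))) return { status: 403 };
  return { status: 200 };
}
```

One trade-off worth noting: with this order, requests for rejected titles also consume the per-IP generation quota.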

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b2f940e9-5775-4f5e-9d17-75100862af93

📥 Commits

Reviewing files that changed from the base of the PR and between e4b1466 and 4397aa0.

📒 Files selected for processing (1)
  • src/worker/index.ts

coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker/comments.ts`:
- Around line 557-560: The comment route is incorrectly consuming the article
voting rate limit by calling enforceArticleVoteRateLimits during comment
creation; remove that call and instead implement and call a separate comment
creation limiter (e.g., enforceCommentCreationRateLimits) that reads its own env
knobs from c.env (like COMMENT_CREATION_LIMIT, COMMENT_CREATION_WINDOW) and
accepts the same user.id/context signature; update the try block to call
ensureUser(c, c.env) then await enforceCommentCreationRateLimits(c, c.env,
user.id) so comment publishing is throttled independently of voting.
- Around line 282-345: The current checks array constructs all rateLimit calls
up-front (result: rateLimit(...)), starting concurrent mutations that can
consume shared IP/subnet buckets even after an earlier check fails; change the
implementation so rateLimit is invoked lazily and sequentially: instead of
storing result: rateLimit(...) in each object, store a function (e.g., run: ()
=> rateLimit(...)) or move the rateLimit(...) calls into the for-loop and await
each call one-by-one, then preserve the existing error construction (err.status
= 429, err.retryAfter = result.retryAfter) when a check fails; update references
to check.result to call/await the function (e.g., const result = await
check.run()) and keep scope, parsePositiveInt and ARTICLE_VOTE_DEFAULTS usage
unchanged.
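
The lazy, sequential evaluation suggested in the second comment can be sketched as follows. `Check`, `LimitResult`, `enforceSequentially`, and `rateLimitError` are hypothetical names; the error fields (status 429 and retryAfter) follow the review text, while building the error with Object.assign is a choice made for this sketch.

```typescript
type LimitResult = { ok: boolean; retryAfter: number };

// Each check carries a thunk, so no bucket is touched until `run` is called.
type Check = { scope: string; run: () => Promise<LimitResult> };

function rateLimitError(
  scope: string,
  retryAfter: number,
): Error & { status: number; retryAfter: number } {
  return Object.assign(new Error(`rate limit exceeded: ${scope}`), {
    status: 429,
    retryAfter,
  });
}

// Run checks one at a time; the first failure stops the loop, so the
// remaining buckets are never incremented.
async function enforceSequentially(checks: Check[]): Promise<void> {
  for (const check of checks) {
    const result = await check.run();
    if (!result.ok) throw rateLimitError(check.scope, result.retryAfter);
  }
}
```

Storing `run: () => rateLimit(...)` instead of `result: rateLimit(...)` is the key difference: a promise starts its work when it is created, a thunk only when it is invoked.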

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 02b5c569-a038-4fdf-932d-35033cbd0b01

📥 Commits

Reviewing files that changed from the base of the PR and between 46759b0 and 4397aa0.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (3)
  • package.json
  • src/worker/comments.ts
  • src/worker/index.ts

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker/comments.ts`:
- Around line 617-619: Move the per-IP comment creation throttling call to run
after user identity resolution: call ensureUser(c, c.env) first to perform
identity-based rate checks and only then invoke
enforceCommentCreationRateLimits(c, c.env). In other words, swap the two calls
so ensureUser executes before enforceCommentCreationRateLimits (so identity
rejections don't consume the shared comment-create quota), keeping the same
error handling and awaits around both functions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a2100802-3c32-429f-ad60-bed2d876771a

📥 Commits

Reviewing files that changed from the base of the PR and between 4397aa0 and f123787.

📒 Files selected for processing (4)
  • README.md
  • src/worker/comments.ts
  • src/worker/index.ts
  • wrangler.toml
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/worker/index.ts

BaderBC (Owner) commented May 16, 2026

I tried to run it locally but hit this error:

➜ halupedia (fix/api-index-moderation) pnpm dev

> hallupedia@0.1.0 dev /Users/bstrama/work_dir/halupedia
> concurrently -n vite,wrangler -c cyan,magenta "pnpm dev:client" "pnpm dev:worker"

[vite] 
[vite] > hallupedia@0.1.0 dev:client /Users/bstrama/work_dir/halupedia
[vite] > vite build --watch --mode development
[vite] 
[wrangler] 
[wrangler] > hallupedia@0.1.0 dev:worker /Users/bstrama/work_dir/halupedia
[wrangler] > wrangler dev
[wrangler] 
[vite] vite v6.4.2 building for development...
[vite] 
[vite] watching for file changes...
[vite] 
[vite] build started...
[vite] transforming...
[wrangler] 
[wrangler]  ⛅️ wrangler 3.114.17 (update available 4.92.0)
[wrangler] ---------------------------------------------------------
[wrangler] 
[wrangler] ▲ [WARNING] The version of Wrangler you are using is now out-of-date.
[wrangler] 
[wrangler]   Please update to the latest version to prevent critical errors.
[wrangler]   Run `npm install --save-dev wrangler@4` to update to the latest version.
[wrangler]   After installation, run Wrangler with `npx wrangler`.
[wrangler] 
[wrangler] 
[wrangler] Using vars defined in .dev.vars
[wrangler] Your Worker and resources are simulated locally via Miniflare. For more information, see: https://developers.cloudflare.com/workers/testing/local-development.
[wrangler] 
[wrangler] Your worker has access to the following bindings:
[wrangler] - Durable Objects:
[wrangler]   - PRESENCE: PresenceDO
[wrangler] - KV Namespaces:
[wrangler]   - ARTICLES: 722fb41fc7af44f6bf6210a884025781 [simulated locally]
[wrangler] - D1 Databases:
[wrangler]   - DB: hallupedia (7e6d3ae5-a12a-4aaf-9e79-a4bb5c53616a) [simulated locally]
[wrangler] - Vars:
[wrangler]   - OPENROUTER_MODEL: "google/gemini-2.5-flash-lite"
[wrangler]   - MAX_ARTICLES_PER_DAY: "5000"
[wrangler]   - GEN_PER_IP_PER_HOUR: "100"
[wrangler]   - IDENT_PER_IP_PER_HOUR: "10"
[wrangler]   - SEARCH_PER_IP_PER_HOUR: "15"
[wrangler]   - COMMENT_PER_IP_PER_HOUR: "90"
[wrangler]   - COMMENT_PER_IP_PER_MINUTE: "15"
[wrangler]   - OPENROUTER_API_KEY: "(hidden)"
[vite] ✓ 33 modules transformed.
[wrangler] 
[wrangler] ✘ [ERROR] Build failed with 2 errors:
[wrangler] 
[wrangler]   ✘ [ERROR] No matching export in "src/worker/moderation.ts" for import "isTitleModerationApproved"
[wrangler]   
[wrangler]       src/worker/index.ts:20:2:
[wrangler]         20 │   isTitleModerationApproved,
[wrangler]            ╵   ~~~~~~~~~~~~~~~~~~~~~~~~~
[wrangler]   
[wrangler]   
[wrangler]   ✘ [ERROR] No matching export in "src/worker/moderation.ts" for import "topicRejectedMessage"
[wrangler]   
[wrangler]       src/worker/index.ts:21:2:
[wrangler]         21 │   topicRejectedMessage,
[wrangler]            ╵   ~~~~~~~~~~~~~~~~~~~~
[wrangler]   
[wrangler]   
[wrangler] 
[wrangler] 
[wrangler] 🪵  Logs were written to "/Users/bstrama/Library/Preferences/.wrangler/logs/wrangler-2026-05-16_15-15-24_998.log"
[vite] rendering chunks...
[vite] computing gzip size...
[vite] ../../dist/index.html                   1.16 kB │ gzip:  0.62 kB
[vite] ../../dist/assets/index-DAo9p5-V.css   24.65 kB │ gzip:  4.82 kB
[vite] ../../dist/assets/index-CEPVEGQA.js   174.33 kB │ gzip: 55.61 kB
[vite] built in 446ms.
[wrangler]  ELIFECYCLE  Command failed with exit code 1.
[wrangler] pnpm dev:worker exited with code 1

LynnColeArt (Author) commented

Huh. Nice. Okay, I'll take a look.
