Fix SEO: www base URL + skip middleware for tool rewrites #832

Draft
PavelMakarchuk wants to merge 2 commits into main from fix/seo-www-base-url-middleware-bypass

PavelMakarchuk commented Mar 13, 2026

Summary

  • Fix BASE_URL: Change from policyengine.org to www.policyengine.org. The non-www domain 307-redirects to www, so all canonical URLs and OG tags generated by middleware pointed to a redirecting URL. This affects all pages served by the middleware.
  • Fix hardcoded DEFAULT_OG.image: Also used non-www.
  • Skip middleware for rewritten tool paths: Paths like /us/keep-your-pay-act, /us/taxsim, and /us/api are Vercel-rewritten to external Next.js apps that have their own SSR, meta tags, OG images, and JSON-LD. The middleware was intercepting Googlebot and replacing these rich SSR pages with a 3-line HTML stub containing only a title, description, and link — preventing Google from seeing the actual content.
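The two changes above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `BYPASS_PREFIXES`, `isBypassedPath`, and `canonicalUrl` are hypothetical, and only the `BASE_URL` value and the three bypassed paths come from this description.

```typescript
// Base URL with www, so canonical URLs do not point at a redirecting host.
const BASE_URL = "https://www.policyengine.org";

// Hypothetical list of path prefixes that Vercel rewrites to external apps
// with their own SSR. /us/watca is deliberately absent (no OG tags of its own).
const BYPASS_PREFIXES = ["/us/keep-your-pay-act", "/us/taxsim", "/us/api"];

// True when the path is exactly a bypassed prefix or nested under one.
// The "/" check keeps a path like /us/apiary from matching /us/api.
function isBypassedPath(pathname: string): boolean {
  return BYPASS_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}

// Canonical URL for a page the middleware still serves.
function canonicalUrl(pathname: string): string {
  return new URL(pathname, BASE_URL).toString();
}
```

In a middleware handler, a bot request for which `isBypassedPath` returns true would be passed through untouched, while every other bot request would keep receiving the middleware response with `canonicalUrl(pathname)` as its canonical.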

Context

/us/keep-your-pay-act is not appearing in Google search results despite being live for 4+ days. Investigation found:

  1. curl -A "Googlebot" https://www.policyengine.org/us/keep-your-pay-act returns a minimal stub HTML page from middleware instead of the full SSR calculator page
  2. The canonical in that stub points to https://policyengine.org/... (no www), which 307-redirects. This sends a confusing signal to Google's indexer

The middleware stub Googlebot currently receives is just an h1, a one-line description, and a link — while the actual SSR page has the full calculator with policy overview, charts, tables, form inputs, semantic HTML, heading hierarchy, and rich content.
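The canonical mismatch from finding 2 can be checked mechanically on any HTML returned to a bot. The sketch below is an assumption-laden illustration: `extractCanonical` and `canonicalUsesWww` are hypothetical helpers, and the regex only handles a simple `<link rel="canonical" href="...">` tag, not arbitrary HTML.

```typescript
// Return the href of the first <link rel="canonical"> tag, or null if absent.
function extractCanonical(html: string): string | null {
  const tag = html.match(/<link[^>]*rel=["']canonical["'][^>]*>/i);
  if (!tag) return null;
  const href = tag[0].match(/href=["']([^"']+)["']/i);
  return href?.[1] ?? null;
}

// True only when a canonical exists and its host is the www domain.
function canonicalUsesWww(html: string): boolean {
  const canonical = extractCanonical(html);
  if (canonical === null) return false;
  try {
    return new URL(canonical).hostname === "www.policyengine.org";
  } catch {
    // A relative or malformed canonical cannot be confirmed as www.
    return false;
  }
}
```

Feeding the saved output of the `curl -A "Googlebot"` command into `canonicalUsesWww` would flag the non-www canonical described above.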

Companion PR with app-side fixes: PolicyEngine/keep-your-pay-act#31

Risk assessment

| Concern | Risk | Detail |
| --- | --- | --- |
| Analytics | None | GA fires in client-side apps for real users. Middleware only serves bots, which do not execute JS. Zero change to analytics. |
| Existing SEO rankings | Minimal, temporary | Canonical URLs change from non-www to www for all middleware-served pages. Google handles this gracefully: a brief re-indexing period, no lasting impact. This is the correct fix since non-www 307-redirects to www. |
| Social share previews | None | See tool-by-tool breakdown below. |
| Deployment | Safe | No config changes, no env var changes, no build changes. Just middleware logic and meta tags. |

Tool-by-tool bypass assessment

Each rewritten tool was checked to see if it has its own meta tags before bypassing middleware:

| Tool | Has own OG tags? | Has og:image? | Bypassed? | Notes |
| --- | --- | --- | --- | --- |
| /us/keep-your-pay-act | Yes (PR 31 adds them) | Yes (PR 31 adds it) | Yes | Full SSR with all meta tags |
| /us/watca | Title and description only | No | No | Lacks OG tags, canonical, and og:image; bypassing would break social previews |
| /us/taxsim | Yes (title, desc, canonical, og) | No og:image | Yes | Has full OG tags, just missing image (same as before via middleware) |
| /us/api | Yes (title, desc, canonical, og) | No og:image | Yes | Has full OG tags, just missing image (same as before via middleware) |
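One way to read the assessment above is as a simple rule: a tool is bypassed only if it serves its own OG tags, and a missing og:image alone does not block the bypass. The sketch below encodes that reading; the `ToolMeta` shape and `bypassList` function are hypothetical, and the keep-your-pay-act values assume the companion PR 31 has landed.

```typescript
// Meta tag status of each rewritten tool, transcribed from the assessment.
interface ToolMeta {
  path: string;
  hasOwnOgTags: boolean;
  hasOgImage: boolean;
}

const TOOLS: ToolMeta[] = [
  // Assumes PR 31 has added the OG tags and og:image.
  { path: "/us/keep-your-pay-act", hasOwnOgTags: true, hasOgImage: true },
  // Title and description only, so it stays behind the middleware.
  { path: "/us/watca", hasOwnOgTags: false, hasOgImage: false },
  { path: "/us/taxsim", hasOwnOgTags: true, hasOgImage: false },
  { path: "/us/api", hasOwnOgTags: true, hasOgImage: false },
];

// Assumed rule: bypass when the tool has its own OG tags, regardless of og:image.
function bypassList(tools: ToolMeta[]): string[] {
  return tools.filter((tool) => tool.hasOwnOgTags).map((tool) => tool.path);
}
```

Under this rule, WATCA would join the list only once it serves its own OG tags.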

Known remaining issues (not in scope)

  • app/public/sitemap.xml and app/scripts/generate-sitemap.ts use non-www URLs for all 253 pages — same mismatch but a larger change, should be a separate PR
  • app/public/robots.txt sitemap directive uses non-www
  • WATCA needs its own OG tags before it can be added to the bypass list
  • TAXSIM and API docs canonicals use non-www (in their own repos)

Test plan

  • Verify curl -A "Googlebot" https://www.policyengine.org/us/keep-your-pay-act returns the full SSR page (not the 3-line stub)
  • Verify curl -A "Googlebot" https://www.policyengine.org/us/research/some-post still returns the pre-rendered/OG response (middleware still works for non-rewrite paths)
  • Verify curl -A "Twitterbot" https://www.policyengine.org/us/watca still returns OG stub with image (not bypassed)
  • Verify social share previews still work for research posts
  • After deploy, request indexing in Google Search Console for https://www.policyengine.org/us/keep-your-pay-act
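The curl checks in the test plan could be kept as data so they are easy to rerun after each deploy. This is only a sketch: `PlanCheck`, `CHECKS`, and `curlCommand` are hypothetical names, and the `expect` strings simply restate the plan.

```typescript
// One manual check from the test plan: which bot to imitate, where, and what to expect.
interface PlanCheck {
  userAgent: string;
  path: string;
  expect: string;
}

const SITE = "https://www.policyengine.org";

const CHECKS: PlanCheck[] = [
  { userAgent: "Googlebot", path: "/us/keep-your-pay-act", expect: "full SSR page, not the stub" },
  { userAgent: "Googlebot", path: "/us/research/some-post", expect: "pre-rendered/OG response from middleware" },
  { userAgent: "Twitterbot", path: "/us/watca", expect: "OG stub with image (not bypassed)" },
];

// Build the curl command for a check, in the same form the test plan uses.
function curlCommand(check: PlanCheck): string {
  return `curl -A "${check.userAgent}" ${SITE}${check.path}`;
}

// Print each command with its expected result for manual verification.
for (const check of CHECKS) {
  console.log(`${curlCommand(check)}  # expect: ${check.expect}`);
}
```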

Generated with Claude Code

- Change BASE_URL from policyengine.org to www.policyengine.org
  (non-www 307-redirects to www, so all canonical URLs and OG tags
  generated by middleware pointed to a redirecting URL)
- Skip middleware interception for paths that Vercel rewrites to
  external apps (KYPA, WATCA, TAXSIM, API docs). These apps have
  their own SSR with full content, meta tags, OG images, and JSON-LD.
  The middleware was replacing rich SSR pages with 3-line stubs,
  preventing Google from seeing the actual content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel bot commented Mar 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| policyengine-app-v2 | Ready | Preview, Comment | Mar 13, 2026 0:11am |
| policyengine-calculator | Ready | Preview, Comment | Mar 13, 2026 0:11am |

WATCA lacks its own OG tags — bypassing middleware would break
social share previews. Also fixes DEFAULT_OG.image to use www.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>