align GPT-5 model routing with current OpenAI defaults by ndycode · Pull Request #309 · ndycode/codex-multi-auth

ndycode · 2026-03-22T07:32:46Z

Summary

add first-class GPT-5.4, GPT-5.4-pro, GPT-5.4-mini, and GPT-5.4-nano routing metadata
stop silently downgrading generic GPT-5 aliases to older GPT-5.1-era mappings
surface normalized model routing and capability info in CLI diagnostics

Stack

base: main

coderabbitai · 2026-03-22T07:33:02Z

📝 Walkthrough

The changes introduce a centralized model mapping and profiling system that replaces scattered model normalization logic with canonical model profiles, reasoning effort defaults, and tool capability metadata. New functions resolveNormalizedModel(), getModelProfile(), and getModelCapabilities() standardize model identification across the codebase, while updating default model routing from gpt-5.1 to gpt-5.4.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model Profiling System `lib/request/helpers/model-map.ts` | Major refactor introducing `MODEL_PROFILES` record mapping canonical model IDs to normalized names, prompt families, reasoning effort defaults/supported tiers, and tool capabilities. New exported types (`PromptModelFamily`, `ModelReasoningEffort`, `ModelCapabilities`, `ModelProfile`) and functions (`resolveNormalizedModel()`, `getModelProfile()`, `getModelCapabilities()`) provide structured access to model metadata. Updated `getNormalizedModel()` to strip provider prefixes and handle case-insensitive lookup; added new `DEFAULT_MODEL` constant set to `gpt-5.4`. |
| Model Normalization Integration `lib/capability-policy.ts`, `lib/request/request-transformer.ts` | `capability-policy.ts:line` renames helper to `resolveNormalizedModel()` and removes fallback to `withoutProvider`. `request-transformer.ts:line` delegates normalization entirely to `resolveNormalizedModel()` and replaces multi-branch effort downgrade/upgrade logic with generic `coerceReasoningEffort()` that checks model profile support and applies predefined fallback ordering per effort tier. |
| Prompt Family Centralization `lib/prompts/codex.ts`, `lib/codex-manager.ts` | `codex.ts:line` replaces local `getModelFamily()` implementation with lookup via `getModelProfile(normalizedModel).promptFamily`, changes `ModelFamily` type to imported `PromptModelFamily`, and updates default prewarm candidates from `["gpt-5-codex", "gpt-5.1"]` to `["gpt-5-codex", "gpt-5.4"]`. `codex-manager.ts:line` adds `ModelInspection` helpers to compute normalized names, detect remapping, derive prompt family, and capture capabilities, then surfaces `modelSelection` object in JSON reports with requested/normalized/remapped/promptFamily/capabilities breakdown. |
| Test Expectations `test/codex-manager-cli.test.ts`, `test/codex-prompts.test.ts`, `test/codex.test.ts`, `test/config.test.ts`, `test/model-map.test.ts`, `test/property/transformer.property.test.ts`, `test/request-transformer.test.ts` | Updated test coverage across model normalization, family routing, and reasoning effort tiers. `codex.test.ts:line` consolidates assertions and adds gpt-5.4 general model routing to `gpt-5.2` prompt family. `config.test.ts:line` reflects new effort defaults for lightweight models (`gpt-5-mini` now defaults to `medium` instead of `minimal`). `model-map.test.ts:line` adds `resolveNormalizedModel()` test block with new "unknown gpt-5-ish" fallback to `gpt-5.4`. All tests updated to reflect gpt-5.4 as default normalized model instead of gpt-5.1. |
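
To make the lookup flow in the table concrete, here is a minimal sketch of a profile record with prefix-stripping, case-insensitive resolution, and a `DEFAULT_MODEL` fallback. Only the exported names (`MODEL_PROFILES`, `DEFAULT_MODEL`, `resolveNormalizedModel`, `getModelProfile`, `getModelCapabilities`) and the gpt-5.4 facts come from the walkthrough; the field names, the `toolSearch` flag, the supported effort tiers, and the `lookupProfile` helper are assumptions made for illustration and may not match the code in this PR.

```typescript
// Simplified, hypothetical shapes; the real definitions in model-map.ts may differ.
type PromptModelFamily = "gpt-5-codex" | "gpt-5.2";
type ModelReasoningEffort = "none" | "low" | "medium" | "high" | "xhigh";

interface ModelCapabilities {
  toolSearch: boolean; // illustrative capability flag, not taken from the PR
}

interface ModelProfile {
  normalizedModel: string;
  promptFamily: PromptModelFamily;
  defaultReasoningEffort: ModelReasoningEffort;
  supportedReasoningEfforts: readonly ModelReasoningEffort[];
  capabilities: ModelCapabilities;
}

const DEFAULT_MODEL = "gpt-5.4";

// Two illustrative entries; effort tiers and capability values are placeholders.
const MODEL_PROFILES: Record<string, ModelProfile> = {
  "gpt-5.4": {
    normalizedModel: "gpt-5.4",
    promptFamily: "gpt-5.2",
    defaultReasoningEffort: "none",
    supportedReasoningEfforts: ["none", "low", "medium", "high", "xhigh"],
    capabilities: { toolSearch: true },
  },
  "gpt-5-codex": {
    normalizedModel: "gpt-5-codex",
    promptFamily: "gpt-5-codex",
    defaultReasoningEffort: "medium",
    supportedReasoningEfforts: ["low", "medium", "high"],
    capabilities: { toolSearch: false },
  },
};

// Hypothetical helper: own-property lookup so keys like "constructor" never match.
function lookupProfile(id: string): ModelProfile | undefined {
  return Object.prototype.hasOwnProperty.call(MODEL_PROFILES, id)
    ? MODEL_PROFILES[id]
    : undefined;
}

// Strip any "provider/" prefix, lowercase, then fall back to DEFAULT_MODEL.
function resolveNormalizedModel(requested: string | undefined): string {
  const lowered = (requested ?? "").trim().toLowerCase();
  const withoutProvider = lowered.slice(lowered.lastIndexOf("/") + 1);
  return lookupProfile(withoutProvider)?.normalizedModel ?? DEFAULT_MODEL;
}

function getModelProfile(normalizedModel: string): ModelProfile {
  const profile = lookupProfile(normalizedModel) ?? lookupProfile(DEFAULT_MODEL);
  if (!profile) throw new Error(`no profile registered for ${DEFAULT_MODEL}`);
  return profile;
}

function getModelCapabilities(normalizedModel: string): ModelCapabilities {
  return getModelProfile(normalizedModel).capabilities;
}
```

In this sketch every unknown ID silently resolves to `DEFAULT_MODEL`, which is exactly the situation the review asks to cover with an explicit regression test.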

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

key review vectors:

- the model profile system in lib/request/helpers/model-map.ts:line introduces dense type definitions and interconnected lookup logic that cascades through multiple subsystems.
- the reasoning effort coercion in request-transformer.ts:line replaces branching pattern-matching with profile-driven fallback logic; verify that the REASONING_FALLBACKS ordering matches intent for each effort tier.
- missing regression coverage: there are no explicit tests for the edge case where MODEL_PROFILES lacks an entry and the DEFAULT_MODEL fallback is invoked; verify this doesn't mask provider-prefix stripping failures.
- concurrency risk: __clearCacheForTesting() is called in tests, but there is no evidence that the caching mechanism itself is thread-safe if async model loading is added in the future.
- windows edge case: provider-prefix stripping at lib/request/helpers/model-map.ts:line uses a hardcoded `/`; verify this doesn't collide with path handling on windows if model IDs ever encode paths.
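
The profile-driven coercion called out above can be sketched as follows. The names `coerceReasoningEffort` and `REASONING_FALLBACKS` come from the review text; the signature, the list of tiers, and in particular the fallback ordering are assumptions chosen for illustration, not the ordering used in this PR.

```typescript
type Effort = "none" | "minimal" | "low" | "medium" | "high" | "xhigh";

// Hypothetical ordering: for each requested tier, the candidates to try next,
// nearest tier first. The real table in request-transformer.ts may differ.
const REASONING_FALLBACKS: Record<Effort, readonly Effort[]> = {
  none: ["minimal", "low", "medium", "high", "xhigh"],
  minimal: ["low", "none", "medium", "high", "xhigh"],
  low: ["minimal", "medium", "none", "high", "xhigh"],
  medium: ["low", "high", "minimal", "xhigh", "none"],
  high: ["medium", "xhigh", "low", "minimal", "none"],
  xhigh: ["high", "medium", "low", "minimal", "none"],
};

// Return the requested effort if the model supports it, otherwise the first
// supported candidate from the fallback list, otherwise the model default.
function coerceReasoningEffort(
  requested: Effort,
  supported: readonly Effort[],
  modelDefault: Effort,
): Effort {
  if (supported.includes(requested)) return requested;
  for (const candidate of REASONING_FALLBACKS[requested]) {
    if (supported.includes(candidate)) return candidate;
  }
  return modelDefault;
}
```

Keeping the ordering in one table means a reviewer can check each tier's intent by reading six lines instead of tracing per-model branches.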

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | the pull request description is minimal and generic. it lacks required sections including validation checklist completion, docs/governance checklist status, and risk assessment details. | complete all checkbox items (npm run lint/typecheck/test/build), mark docs checklist items as done or explicitly explain why they don't apply, and provide concrete rollback steps beyond the commit hash reference. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | title follows conventional commits format (lowercase imperative, 54 chars), directly summarizes the core change aligning gpt-5 routing. |

lib/capability-policy.ts

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/codex-manager.ts`:
- Around line 1937-1938: Normalize the model alias once before any live-probe
paths and reuse that normalized value everywhere instead of passing raw
options.model into fetchCodexQuotaSnapshot; update the code that currently
creates probeModel/inspectRequestedModel at lines like const probeModel =
options.model?.trim() || "gpt-5-codex" and ensure that the same normalized
variable is used for all calls to fetchCodexQuotaSnapshot (symbols: probeModel,
modelInspection, fetchCodexQuotaSnapshot) across the other live paths
(previously at the spots calling fetchCodexQuotaSnapshot) so routing is
consistent; add a vitest that takes a single alias and invokes the CLI handlers
for check, report, forecast, best, and fix, asserting the same resolved backend
model is used for each.
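
A minimal sketch of the "normalize once, reuse everywhere" idea from this comment. The `"gpt-5-codex"` default and the five command names come from the comment itself; `ProbeOptions`, `resolveProbeModel`, `planProbes`, and the injected `normalize` callback are hypothetical stand-ins, since the real signature of `fetchCodexQuotaSnapshot` is not shown here.

```typescript
interface ProbeOptions {
  model?: string;
}

const LIVE_COMMANDS = ["check", "report", "forecast", "best", "fix"] as const;
type LiveCommand = (typeof LIVE_COMMANDS)[number];

// Resolve the alias exactly once. `normalize` stands in for the project's
// resolver (an assumption; it is injected so the sketch stays self-contained).
function resolveProbeModel(
  options: ProbeOptions,
  normalize: (model: string) => string,
): string {
  const requested = options.model?.trim() || "gpt-5-codex";
  return normalize(requested);
}

// Every live path receives the same resolved value instead of raw options.model.
function planProbes(
  options: ProbeOptions,
  normalize: (model: string) => string,
): Record<LiveCommand, string> {
  const probeModel = resolveProbeModel(options, normalize);
  const plan = {} as Record<LiveCommand, string>;
  for (const command of LIVE_COMMANDS) {
    plan[command] = probeModel;
  }
  return plan;
}
```

A regression test in this style would assert that all five entries are identical for a single alias, which is the property the comment asks the vitest case to pin.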

In `@lib/prompts/codex.ts`:
- Around line 94-95: The prewarm path is currently firing per-model targets
(e.g., both "gpt-5.4" and "gpt-5.2") which map to the same prompt family via
getModelFamily; update the prewarm logic (the function that builds prewarm
targets around the code referenced at lines ~372-379) to deduplicate targets by
prompt family using getModelFamily so only one request per promptFamily is
issued, add retry/backoff handling for EBUSY/429 in that queue, and ensure no
sensitive tokens/emails are logged; then add a deterministic vitest that covers:
multiple models mapping to the same family produce a single prewarm request,
retries on simulated EBUSY/429, and that gpt-5.1 still gets prewarmed when
present to prevent coverage loss.
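
The deduplication and retry behaviour requested here could look roughly like the sketch below. All helper names (`dedupeByFamily`, `isBusyOrRateLimited`, `withRetry`) are hypothetical, the error shape (`code` / `status` fields) is an assumption about how EBUSY and 429 surface, and the family lookup is passed in rather than calling the project's `getModelFamily` directly.

```typescript
// Keep the first model seen for each prompt family so only one prewarm
// request is issued per family.
function dedupeByFamily(
  models: readonly string[],
  familyOf: (model: string) => string,
): string[] {
  const seen = new Set<string>();
  const targets: string[] = [];
  for (const model of models) {
    const family = familyOf(model);
    if (seen.has(family)) continue;
    seen.add(family);
    targets.push(model);
  }
  return targets;
}

// Assumed error shape: filesystem errors carry `code`, HTTP errors carry `status`.
function isBusyOrRateLimited(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const candidate = error as { code?: unknown; status?: unknown };
  return candidate.code === "EBUSY" || candidate.status === 429;
}

// Retry a task with exponential backoff, rethrowing non-retryable errors
// immediately and the last error once maxAttempts is reached.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= maxAttempts || !isBusyOrRateLimited(error)) throw error;
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Note that first-seen deduplication means a model such as gpt-5.1 is only prewarmed if its family is not already covered by an earlier entry, so the candidate order matters for the coverage concern raised in the comment.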

In `@lib/request/helpers/model-map.ts`:
- Around line 183-184: Replace the direct addAlias calls for the chat-latest
entries with addReasoningAliases so wildcard legacy variants (e.g.,
"gpt-5-chat-latest-high") map to the correct base model and reasoning defaults;
specifically update the lines that call addAlias("gpt-5.1-chat-latest",
"gpt-5.1") and addAlias("gpt-5-chat-latest", "gpt-5") to use addReasoningAliases
instead, and add a vitest regression that asserts variants like
"gpt-5-chat-latest-high" and "gpt-5.1-chat-latest-foo" resolve to the intended
model IDs and reasoning settings (matching the behavior of addReasoningAliases).
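
As an illustration of the difference between the two helpers, the sketch below registers effort-suffixed variants alongside the base alias. The function names `addAlias` and `addReasoningAliases` come from the comment, but their bodies, the `aliases` map, and the suffix list are assumptions; the real helper may also handle arbitrary suffixes such as `-foo`, which this sketch does not attempt.

```typescript
// Hypothetical suffix list; the real set of legacy variants may differ.
const EFFORT_SUFFIXES = ["none", "minimal", "low", "medium", "high", "xhigh"] as const;

const aliases = new Map<string, string>();

// Register a single exact alias.
function addAlias(alias: string, target: string): void {
  aliases.set(alias, target);
}

// Register the alias plus one "<alias>-<effort>" variant per effort suffix,
// all pointing at the same base model.
function addReasoningAliases(alias: string, target: string): void {
  addAlias(alias, target);
  for (const effort of EFFORT_SUFFIXES) {
    addAlias(`${alias}-${effort}`, target);
  }
}

addReasoningAliases("gpt-5.1-chat-latest", "gpt-5.1");
addReasoningAliases("gpt-5-chat-latest", "gpt-5");
```

With plain `addAlias` only the exact string would resolve, so a request for `gpt-5-chat-latest-high` would miss the map and drop through to whatever default applies.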

In `@test/codex-prompts.test.ts`:
- Around line 467-486: The direct-family tests for codex variants only assert
the mocked response body and miss verifying the raw GitHub fetch path; update
the two earlier tests that call getCodexInstructions for the direct families
(the tests around the 'codex' and 'codex-max' cases that currently only check
the mocked body) to inspect mockFetch.mock.calls, find the
raw.githubusercontent.com call (like in the gpt-5.4 test), and add an assertion
that the fetched URL contains the expected raw prompt filename (e.g.,
"gpt_5_codex_prompt.md" for codex and "gpt_5_codex_max_prompt.md" for codex-max)
so regressions that collapse families into the wrong prompt file will fail.
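
The assertion described here boils down to scanning the recorded fetch calls for the raw.githubusercontent.com request and checking its filename. Below is a framework-free sketch of that scan; `findRawPromptUrl` is a hypothetical helper, and the calls array has the same shape as `mockFetch.mock.calls` in vitest (one argument list per call). The URLs in the usage note are placeholders, not the repository's real paths.

```typescript
// One entry per recorded call; each entry is that call's argument list.
type RecordedCall = readonly unknown[];

// Return the first requested URL that targets raw.githubusercontent.com,
// or undefined if no such call was made.
function findRawPromptUrl(calls: readonly RecordedCall[]): string | undefined {
  for (const [input] of calls) {
    const url = String(input);
    if (url.includes("raw.githubusercontent.com")) return url;
  }
  return undefined;
}
```

In a test for the codex family, one would pass `mockFetch.mock.calls` to the helper and assert that the result contains `gpt_5_codex_prompt.md`; for codex-max the expected filename is `gpt_5_codex_max_prompt.md`.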

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 183f50c9-5a91-43e8-91a0-f5088cfe0930

📥 Commits

Reviewing files that changed from the base of the PR and between 1be5e95 and 9e13a24.

📒 Files selected for processing (12)

lib/capability-policy.ts
lib/codex-manager.ts
lib/prompts/codex.ts
lib/request/helpers/model-map.ts
lib/request/request-transformer.ts
test/codex-manager-cli.test.ts
test/codex-prompts.test.ts
test/codex.test.ts
test/config.test.ts
test/model-map.test.ts
test/property/transformer.property.test.ts
test/request-transformer.test.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (2)

lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

lib/capability-policy.ts
lib/request/request-transformer.ts
lib/codex-manager.ts
lib/prompts/codex.ts
lib/request/helpers/model-map.ts

test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

test/property/transformer.property.test.ts
test/config.test.ts
test/codex-prompts.test.ts
test/codex-manager-cli.test.ts
test/codex.test.ts
test/request-transformer.test.ts
test/model-map.test.ts

🔇 Additional comments (5)

test/property/transformer.property.test.ts (1)

48-56: good property coverage for the new defaults.

test/property/transformer.property.test.ts:48-56 and test/property/transformer.property.test.ts:206-217 pin the gpt-5.4 default and the xhigh downgrade path cleanly.

Also applies to: 206-217

test/codex-prompts.test.ts (1)

89-97: good family-routing regression coverage.

test/codex-prompts.test.ts:89-97 pins the gpt-5.4-era general models to the gpt-5.2 prompt family cleanly.

test/codex-manager-cli.test.ts (1)

5705-5776: good json report coverage for model diagnostics.

test/codex-manager-cli.test.ts:5705-5776 locks the new modelSelection payload for both the no-remap and remap cases.
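
For orientation, the `modelSelection` payload covered by these tests can be pictured with the sketch below. The five field names (requested, normalized, remapped, promptFamily, capabilities) come from the walkthrough; the types, the `ModelLookup` helper interface, the injected `lookup` callback, and the rule used to decide `remapped` are assumptions for illustration.

```typescript
// Hypothetical result of resolving a model against the profile table.
interface ModelLookup {
  normalized: string;
  promptFamily: string;
  capabilities: Record<string, boolean>;
}

interface ModelSelection extends ModelLookup {
  requested: string;
  remapped: boolean;
}

// Build the diagnostics object. `remapped` is assumed to mean "the normalized
// ID differs from what the user typed, ignoring case and surrounding spaces".
function inspectRequestedModel(
  requested: string,
  lookup: (model: string) => ModelLookup,
): ModelSelection {
  const trimmed = requested.trim();
  const resolved = lookup(trimmed);
  return {
    requested: trimmed,
    normalized: resolved.normalized,
    remapped: resolved.normalized !== trimmed.toLowerCase(),
    promptFamily: resolved.promptFamily,
    capabilities: resolved.capabilities,
  };
}
```

The no-remap and remap cases mentioned above correspond to `remapped: false` and `remapped: true` respectively in this shape.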

test/config.test.ts (1)

93-118: good reasoning-tier regression coverage.

test/config.test.ts:93-118 and test/config.test.ts:191-196 cover the new lightweight defaults, the gpt-5.4 none default, and the gpt-5.4-pro clamp.

Also applies to: 175-196

test/codex.test.ts (1)

14-30: good prompt-family routing coverage.

test/codex.test.ts:14-30 pins the gpt-5.4-era routing and the default-family fallback in one place.

ndycode · 2026-03-22T11:43:01Z

@coderabbitai review
@greptile @cubic review

coderabbitai · 2026-03-22T11:43:15Z

@ndycode: on it — kicking off a full review of PR #309 now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

ndycode · 2026-03-22T11:43:29Z

@coderabbitai review

ndycode · 2026-03-22T11:43:30Z

@greptile @cubic review

coderabbitai · 2026-03-22T11:43:33Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

ndycode · 2026-03-22T12:02:03Z

@coderabbitai review
@greptile @cubic review

coderabbitai · 2026-03-22T12:02:17Z

@ndycode: on it — triggering a full review of PR #309 now. will cover the new model catalog in lib/request/helpers/model-map.ts, the updated routing and reasoning coercion in lib/request/request-transformer.ts, the capability-policy fallback change in lib/capability-policy.ts, the diagnostics additions in lib/codex-manager.ts, and all associated test coverage.

✅ Actions performed

Full review triggered.

All review threads are resolved and follow-up commits addressed this stale automated change request.

align gpt-5 model routing with current OpenAI defaults

9e13a24

ndycode added the needs-human-review label ("Flagged by automated PR quality screening; maintainer review required.") Mar 22, 2026

greptile-apps bot reviewed Mar 22, 2026

lib/capability-policy.ts (outdated, resolved)

coderabbitai bot previously requested changes Mar 22, 2026

lib/codex-manager.ts (resolved)

lib/prompts/codex.ts (resolved)

lib/request/helpers/model-map.ts (outdated, resolved)

test/codex-prompts.test.ts (resolved)

ndycode mentioned this pull request Mar 22, 2026

add previous_response_id continuation and responses contract typing #310

Closed

ndycode and others added 3 commits March 22, 2026 18:21

Fix parity model normalization regressions

16bbcc6

Harden prompt raw-path assertions

e17a5c5

warn on reasoning effort coercion

163377e

ndycode removed the needs-human-review label ("Flagged by automated PR quality screening; maintainer review required.") Mar 22, 2026

ndycode mentioned this pull request Mar 22, 2026

release: merge current main-target PR set #318

Merged

ndycode merged commit e820c32 into main Mar 23, 2026
2 checks passed
