feat: add image generation support (gemini-3.1-flash-image) #522
bentcc wants to merge 1 commit into NoeFabris:dev from …
…lags, and nanobanana output
- Add antigravity-gemini-3-pro-image and antigravity-gemini-3.1-flash-image model definitions
- Per-model aspect ratio validation (flash supports 1:4, 4:1, 1:8, 8:1; pro does not)
- Per-model imageSize validation (flash supports 0.5K; pro does not)
- Parse --resolution and --aspect-ratio flags from prompt text (stripped before sending to Gemini)
- Prompt flag overrides take priority over env vars (OPENCODE_IMAGE_SIZE, OPENCODE_IMAGE_ASPECT_RATIO)
- Change image output directory from ~/.opencode/generated-images/ to ./nanobanana/
- Flash-image thinking support (minimal/high)
- Update README with image model config and usage docs
Walkthrough
This pull request introduces a new image-generation model called antigravity-gemini-3.1-flash-image with thinking capabilities. It adds a prompt flag parser to extract …
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/plugin/transform/prompt-flags.ts (1)
45-56: Prefer last-occurrence precedence when the same flag appears multiple times.
Current extraction captures only the first `--resolution`/`--aspect-ratio`. If users append a later override, it is ignored.
♻️ Suggested change
```diff
-  const resolutionMatch = RESOLUTION_PATTERN.exec(prompt)
-  if (resolutionMatch?.[1]) {
-    result.resolution = resolutionMatch[1]
-  }
+  let resolutionMatch: RegExpExecArray | null
+  while ((resolutionMatch = RESOLUTION_PATTERN.exec(prompt)) !== null) {
+    if (resolutionMatch[1]) {
+      result.resolution = resolutionMatch[1]
+    }
+  }
   // Reset lastIndex for global regex
   RESOLUTION_PATTERN.lastIndex = 0

   // Extract --aspect-ratio
-  const aspectRatioMatch = ASPECT_RATIO_PATTERN.exec(prompt)
-  if (aspectRatioMatch?.[1]) {
-    result.aspectRatio = aspectRatioMatch[1]
-  }
+  let aspectRatioMatch: RegExpExecArray | null
+  while ((aspectRatioMatch = ASPECT_RATIO_PATTERN.exec(prompt)) !== null) {
+    if (aspectRatioMatch[1]) {
+      result.aspectRatio = aspectRatioMatch[1]
+    }
+  }
   ASPECT_RATIO_PATTERN.lastIndex = 0
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/plugin/transform/prompt-flags.ts` around lines 45 - 56, Current code uses RESOLUTION_PATTERN.exec(prompt) and ASPECT_RATIO_PATTERN.exec(prompt) which only return the first match; change each extraction to iterate over all matches and assign the capture group on each iteration so the last occurrence wins (e.g., use a while loop with RESOLUTION_PATTERN.exec(prompt) and set result.resolution = match[1] each iteration, then reset RESOLUTION_PATTERN.lastIndex = 0; do the same for ASPECT_RATIO_PATTERN/result.aspectRatio), or use prompt.matchAll(...) and take the last match; ensure you still reset lastIndex for global regexes after scanning.
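To illustrate the last-occurrence behavior this nitpick asks for, here is a minimal sketch. The two patterns are copied from the `prompt-flags.ts` excerpt quoted elsewhere in this review; `lastCapture` is a hypothetical helper name, not something in the PR, and it scans with a private copy of the regex so the shared constant is never advanced.

```typescript
// Patterns as quoted in the review excerpt of src/plugin/transform/prompt-flags.ts.
const RESOLUTION_PATTERN = /--resolution[=\s]+["']?([^\s"']+)["']?/gi;
const ASPECT_RATIO_PATTERN = /--aspect-ratio[=\s]+["']?([^\s"']+)["']?/gi;

// Hypothetical helper (not in the PR): returns the capture group of the
// final match so that a later flag overrides an earlier one.
function lastCapture(pattern: RegExp, text: string): string | undefined {
  // Scan with a private global copy; the shared pattern's lastIndex is untouched.
  const scanner = new RegExp(pattern.source, "gi");
  let last: string | undefined;
  let match: RegExpExecArray | null;
  while ((match = scanner.exec(text)) !== null) {
    if (match[1]) {
      last = match[1];
    }
  }
  return last;
}

const prompt = "a red fox --resolution=1K --aspect-ratio 16:9 --resolution=4K";
const resolution = lastCapture(RESOLUTION_PATTERN, prompt);
const aspectRatio = lastCapture(ASPECT_RATIO_PATTERN, prompt);
```

With this shape, appending `--resolution=4K` to a prompt that already carries `--resolution=1K` yields the later value, and no manual `lastIndex` reset is needed afterwards.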
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (13)
- README.md
- src/plugin/accounts.test.ts
- src/plugin/config/models.test.ts
- src/plugin/config/models.ts
- src/plugin/image-saver.ts
- src/plugin/request.ts
- src/plugin/transform/gemini.test.ts
- src/plugin/transform/gemini.ts
- src/plugin/transform/index.ts
- src/plugin/transform/model-resolver.test.ts
- src/plugin/transform/model-resolver.ts
- src/plugin/transform/prompt-flags.test.ts
- src/plugin/transform/prompt-flags.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Greptile Review
- GitHub Check: Greptile Review
🧰 Additional context used
🧬 Code graph analysis (4)
src/plugin/transform/prompt-flags.ts (1)
- src/plugin/transform/index.ts (3): ParsedPromptFlags (69-69), parsePromptFlags (66-66), extractLastUserPrompt (67-67)
src/plugin/transform/prompt-flags.test.ts (1)
- src/plugin/transform/prompt-flags.ts (2): parsePromptFlags (39-73), extractLastUserPrompt (82-109)
src/plugin/transform/gemini.test.ts (1)
- src/plugin/transform/gemini.ts (5): isImageGenerationModel (152-158), isFlashImageModel (245-247), buildImageGenerationConfig (271-302), getValidAspectRatios (225-230), getValidImageSizes (235-240)
src/plugin/transform/model-resolver.test.ts (2)
- src/plugin/transform/index.ts (1): resolveModelWithTier (22-22)
- src/plugin/transform/model-resolver.ts (1): resolveModelWithTier (169-296)
🔇 Additional comments (17)
src/plugin/accounts.test.ts (1)
1166-1177: Model name migration in strict-header wait-time tests looks correct.
This update keeps coverage intact while moving the test to the flash-image model key.
src/plugin/image-saver.ts (1)
13-16: Output directory migration to cwd-relative `nanobanana` is cleanly implemented.
The path change and doc updates are consistent, and directory bootstrapping remains intact.
src/plugin/config/models.ts (1)
70-81: New flash-image model definition is well-structured and consistent with the catalog.
The added entry cleanly expresses limits, modalities, and variant thinking levels.
src/plugin/transform/prompt-flags.ts (1)
82-109: Backward scan for the last user text prompt is solid.
The function correctly prioritizes the most recent actionable user text and returns stable indices for in-place mutation.
src/plugin/transform/gemini.ts (1)
225-302: Model-aware image config validation and override precedence are implemented cleanly.
The helper split (`getValidAspectRatios`, `getValidImageSizes`, `buildImageGenerationConfig`) makes behavior explicit and testable.
src/plugin/transform/index.ts (1)
53-69: Transform index exports are updated coherently for the new image/prompt utilities.
The re-export surface now cleanly exposes the new flash-image and prompt-flag capabilities.
src/plugin/request.ts (2)
976-1010: Prompt-flag extraction + imageConfig override wiring is well integrated.
This segment cleanly parses inline image flags, removes them from user text, and applies model-aware config with correct flash-image thinking handling.
1787-1801: Image-size rejection annotation is a useful recovery hook.
Setting `x-antigravity-image-error=imagesize_unsupported` here is a good signal for downstream fallback behavior.
README.md (3)
125-125: Model table addition looks correct.
Line 125 cleanly introduces the new image model and its variants in the reference list.
152-175: Image-generation usage docs are clear and actionable.
The examples and precedence explanation are easy to follow and match the expected UX for overrides and defaults.
231-239: Full configuration block is consistent with the new model contract.
Good addition for copy-paste onboarding.
src/plugin/config/models.test.ts (1)
22-22: Model definition coverage update is correct.
Line 22 appropriately extends the expected key list for the new model.
src/plugin/transform/gemini.test.ts (2)
688-838: Great coverage for image config precedence and validation.
These tests thoroughly cover override priority, normalization, and model-specific acceptance/rejection paths.
841-879: Helper API tests are well-scoped and valuable.
Nice addition of direct assertions for base vs flash-specific option sets.
src/plugin/transform/prompt-flags.test.ts (1)
4-196: Comprehensive parser/extractor test coverage.
The scenarios cover practical prompt shapes and edge cases well, especially multi-part user content traversal.
src/plugin/transform/model-resolver.test.ts (2)
82-85: CLI-first routing expectation for image models is correctly enforced.
Good guard that image models keep antigravity quota semantics.
158-190: Flash-image resolver behavior is well covered.
These assertions meaningfully lock in tier-to-thinking mapping for the new model variants.
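The backward scan credited above to `extractLastUserPrompt` is not shown anywhere in this review, so the following is only a sketch of the idea under assumptions: the `Content`/`Part` shapes are simplified guesses at the Gemini request format, and `findLastUserPrompt` is a hypothetical stand-in that returns the `{ text, contentIndex, partIndex }` triple mentioned in the sequence diagram.

```typescript
// Assumed, simplified request shapes; the real types in the plugin may differ.
interface Part { text?: string }
interface Content { role?: string; parts?: Part[] }
interface LastUserPrompt { text: string; contentIndex: number; partIndex: number }

// Hypothetical stand-in for extractLastUserPrompt: walk contents from the end
// and return the most recent non-empty text part of a user turn.
function findLastUserPrompt(contents: Content[]): LastUserPrompt | undefined {
  for (let i = contents.length - 1; i >= 0; i--) {
    const content = contents[i];
    if (content.role !== "user" || !content.parts) continue;
    for (let j = content.parts.length - 1; j >= 0; j--) {
      const text = content.parts[j].text;
      if (typeof text === "string" && text.trim() !== "") {
        return { text, contentIndex: i, partIndex: j };
      }
    }
  }
  return undefined;
}

const contents: Content[] = [
  { role: "user", parts: [{ text: "earlier prompt" }] },
  { role: "model", parts: [{ text: "earlier reply" }] },
  { role: "user", parts: [{}, { text: "a fox --aspect-ratio=16:9" }] },
];

const found = findLastUserPrompt(contents);
if (found) {
  // The indices let the caller write the cleaned prompt back in place.
  const cleaned = found.text.replace(/--aspect-ratio[=\s]+["']?([^\s"']+)["']?/gi, "").trim();
  const parts = contents[found.contentIndex].parts;
  if (parts) parts[found.partIndex].text = cleaned;
}
```

Returning indices rather than a copy of the part is what makes the later in-place mutation of `contents[i].parts[j].text` straightforward.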
src/plugin/transform/gemini.test.ts, lines 648-651
```ts
it("returns false for gemini-2.5-flash-image", () => {
  // 2.5 flash image is not a gemini-3 flash image model
  expect(isFlashImageModel("gemini-2.5-flash-image")).toBe(true);
});
```
Align the assertion with the test intent and flash-model gating.
Line 650 currently asserts true, but the test title/comment says false. This contradiction can hide incorrect classification behavior for non-3.1 models.
🔧 Proposed fix
```diff
-  it("returns false for gemini-2.5-flash-image", () => {
-    // 2.5 flash image is not a gemini-3 flash image model
-    expect(isFlashImageModel("gemini-2.5-flash-image")).toBe(true);
-  });
+  it("returns false for gemini-2.5-flash-image", () => {
+    expect(isFlashImageModel("gemini-2.5-flash-image")).toBe(false);
+  });
```
```ts
// src/plugin/transform/gemini.ts (outside this line range)
export function isFlashImageModel(model: string): boolean {
  return /gemini-3(?:\.1)?-flash-image(?:-preview)?/i.test(model);
}
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
it("returns false for gemini-2.5-flash-image", () => {
  expect(isFlashImageModel("gemini-2.5-flash-image")).toBe(false);
});
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/plugin/transform/gemini.test.ts` around lines 648 - 651, The test
"returns false for gemini-2.5-flash-image" currently asserts true, contradicting
the test intent and flash-model gating; update the assertion in the test (in
src/plugin/transform/gemini.test.ts) to expect(false) for the input
"gemini-2.5-flash-image" so it aligns with the isFlashImageModel(model: string)
behavior and the regex that only matches gemini-3-flash-image and gemini-3.1-flash-image variants.
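To make the gating concrete, the quoted regex can be checked against a few model names. This is a small sketch: the function body is copied from the `gemini.ts` excerpt in this comment, while the list of sample names is chosen here for illustration.

```typescript
// Body copied from the gemini.ts excerpt quoted in the review comment.
function isFlashImageModel(model: string): boolean {
  return /gemini-3(?:\.1)?-flash-image(?:-preview)?/i.test(model);
}

// Illustrative inputs; the pattern is unanchored, so prefixed names also match.
const cases: Array<[string, boolean]> = [
  ["gemini-3.1-flash-image", true],
  ["antigravity-gemini-3.1-flash-image", true],
  ["gemini-3-flash-image-preview", true],
  ["gemini-2.5-flash-image", false],
  ["gemini-3-pro-image", false],
];

const mismatches = cases.filter(([model, expected]) => isFlashImageModel(model) !== expected);
```

An empty `mismatches` array confirms that `gemini-2.5-flash-image` is rejected, which is why the assertion in the test should be `toBe(false)`.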
src/plugin/transform/model-resolver.ts, lines 221-228
```ts
const flashImageThinkingLevel = tier === "high" ? "high" : "minimal";
return {
  actualModel: resolvedModel,
  isThinkingModel: true,
  isImageModel: true,
  thinkingLevel: flashImageThinkingLevel,
  ...(tier ? { tier } : {}),
  quotaPreference,
```
Normalize returned tier to the effective flash-image level.
In this branch, tier="low" / tier="medium" is coerced to thinkingLevel="minimal", but the original tier is still returned. That creates inconsistent resolved state.
🔧 Suggested fix
```diff
-      const flashImageThinkingLevel = tier === "high" ? "high" : "minimal";
+      const flashImageThinkingLevel = tier === "high" ? "high" : "minimal";
+      const normalizedTier: ThinkingTier | undefined =
+        tier ? (flashImageThinkingLevel === "high" ? "high" : "minimal") : undefined;
       return {
         actualModel: resolvedModel,
         isThinkingModel: true,
         isImageModel: true,
         thinkingLevel: flashImageThinkingLevel,
-        ...(tier ? { tier } : {}),
+        ...(normalizedTier ? { tier: normalizedTier } : {}),
         quotaPreference,
         explicitQuota,
       };
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
const flashImageThinkingLevel = tier === "high" ? "high" : "minimal";
const normalizedTier: ThinkingTier | undefined =
  tier ? (flashImageThinkingLevel === "high" ? "high" : "minimal") : undefined;
return {
  actualModel: resolvedModel,
  isThinkingModel: true,
  isImageModel: true,
  thinkingLevel: flashImageThinkingLevel,
  ...(normalizedTier ? { tier: normalizedTier } : {}),
  quotaPreference,
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/plugin/transform/model-resolver.ts` around lines 221 - 228, The returned
object leaves the original tier (e.g., "low"/"medium") while coercing
thinkingLevel to flashImageThinkingLevel ("high" or "minimal"), causing
inconsistent state; update the returned tier to the effective flash-image level
by replacing the spread that conditionally returns tier with one that sets tier:
flashImageThinkingLevel (or only include tier when it matches the effective
level), so in the function that constructs the result (references:
resolvedModel, flashImageThinkingLevel, thinkingLevel, tier, quotaPreference)
ensure the tier property reflects flashImageThinkingLevel instead of the
original tier value.
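The inconsistency is easier to see in isolation. The sketch below reduces the branch to the two fields in question; `ThinkingTier` and `FlashImageResolution` are simplified, assumed types (the real `ResolvedModel` carries more fields), and `resolveFlashImageTier` is a hypothetical name used only here.

```typescript
// Assumed, trimmed-down types for illustration only.
type ThinkingTier = "minimal" | "low" | "medium" | "high";
interface FlashImageResolution {
  thinkingLevel: "minimal" | "high";
  tier?: "minimal" | "high";
}

// Hypothetical reduction of the flash-image branch: anything other than
// "high" is coerced to "minimal", and the reported tier follows that value.
function resolveFlashImageTier(tier?: ThinkingTier): FlashImageResolution {
  const thinkingLevel = tier === "high" ? "high" : "minimal";
  // Only report a tier when the caller supplied one, and report the effective level.
  return tier ? { thinkingLevel, tier: thinkingLevel } : { thinkingLevel };
}

const fromLow = resolveFlashImageTier("low");
const fromHigh = resolveFlashImageTier("high");
const fromNone = resolveFlashImageTier();
```

With this shape a caller that passes `"low"` or `"medium"` gets back `tier: "minimal"`, so `tier` and `thinkingLevel` can no longer disagree.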
Greptile Summary
This PR adds image generation support via a new …
Key changes:
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant request.ts
    participant prompt-flags.ts
    participant gemini.ts
    participant model-resolver.ts
    participant Antigravity API
    participant image-saver.ts
    Client->>request.ts: POST /v1/messages (image model)
    request.ts->>model-resolver.ts: resolveModelWithTier(model)
    model-resolver.ts-->>request.ts: ResolvedModel { isImageModel, thinkingLevel }
    request.ts->>prompt-flags.ts: extractLastUserPrompt(contents)
    prompt-flags.ts-->>request.ts: { text, contentIndex, partIndex }
    request.ts->>prompt-flags.ts: parsePromptFlags(text)
    prompt-flags.ts-->>request.ts: { cleanedPrompt, resolution, aspectRatio }
    request.ts->>request.ts: mutate contents[i].parts[j].text = cleanedPrompt
    request.ts->>gemini.ts: buildImageGenerationConfig(model, overrides)
    note over gemini.ts: Priority: flags > env vars > default
    gemini.ts->>gemini.ts: getValidAspectRatios / getValidImageSizes
    gemini.ts-->>request.ts: ImageConfig { aspectRatio, imageSize }
    request.ts->>request.ts: generationConfig.imageConfig = imageConfig
    request.ts->>Antigravity API: POST with imageConfig + thinkingConfig
    Antigravity API-->>request.ts: SSE stream with inlineData (base64 image)
    request.ts->>image-saver.ts: processImageData(inlineData)
    image-saver.ts->>image-saver.ts: saveImageToDisk → ./nanobanana/
    image-saver.ts-->>request.ts: markdown with file path
    request.ts-->>Client: transformed SSE response
```
Last reviewed commit: 22ad11b
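The "flags > env vars > default" note in the diagram can be sketched as a first-valid-candidate lookup. Several things here are assumptions rather than facts from the PR: the base aspect-ratio and size lists, the `1:1` and `1K` defaults, and the choice to fall through to the next source when a value is not valid for the model. Only the flash-only extras (`1:4`, `4:1`, `1:8`, `8:1`, `0.5K`) and the env var names come from the PR text. `sketchImageConfig` is a hypothetical name, not the real `buildImageGenerationConfig`.

```typescript
interface ImageConfig { aspectRatio?: string; imageSize?: string }
interface PromptOverrides { aspectRatio?: string; resolution?: string }

const BASE_ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4"]; // assumed list
const FLASH_ONLY_ASPECT_RATIOS = ["1:4", "4:1", "1:8", "8:1"]; // from the PR description
const BASE_IMAGE_SIZES = ["1K", "2K", "4K"]; // assumed list
const FLASH_ONLY_IMAGE_SIZES = ["0.5K"]; // from the PR description

// Returns the first candidate that is defined and allowed for the model.
function firstValid(candidates: Array<string | undefined>, allowed: string[]): string | undefined {
  for (const candidate of candidates) {
    if (candidate !== undefined && allowed.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return undefined;
}

// Hypothetical sketch: prompt flags first, then env vars, then an assumed default.
function sketchImageConfig(
  isFlash: boolean,
  flags: PromptOverrides,
  env: Record<string, string | undefined>,
): ImageConfig {
  const ratios = isFlash ? BASE_ASPECT_RATIOS.concat(FLASH_ONLY_ASPECT_RATIOS) : BASE_ASPECT_RATIOS;
  const sizes = isFlash ? BASE_IMAGE_SIZES.concat(FLASH_ONLY_IMAGE_SIZES) : BASE_IMAGE_SIZES;
  return {
    aspectRatio: firstValid([flags.aspectRatio, env["OPENCODE_IMAGE_ASPECT_RATIO"], "1:1"], ratios),
    imageSize: firstValid([flags.resolution, env["OPENCODE_IMAGE_SIZE"], "1K"], sizes),
  };
}

const flashWide = sketchImageConfig(true, { aspectRatio: "8:1" }, { OPENCODE_IMAGE_ASPECT_RATIO: "16:9" });
const nonFlashWide = sketchImageConfig(false, { aspectRatio: "8:1" }, { OPENCODE_IMAGE_ASPECT_RATIO: "16:9" });
```

In this sketch the flash model keeps the `8:1` flag, while a non-flash model drops it and uses the env var instead; whether the real code falls back or rejects the request is not stated in the review.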
src/plugin/image-saver.ts, line 16
```diff
 function getImageOutputDir(): string {
   const homeDir = os.homedir();
-  const outputDir = path.join(homeDir, '.opencode', 'generated-images');
+  const outputDir = path.join(process.cwd(), 'nanobanana');
```
**CWD-relative output path causes inconsistent image locations**
`path.join(process.cwd(), 'nanobanana')` means images are saved relative to wherever the user runs the CLI. If users invoke the CLI from their home directory, a project subdirectory, or a script, images land in a different place each time, making them hard to find. The previous path (`~/.opencode/generated-images/`) was always the same regardless of CWD.
Consider using an absolute, stable path instead:
```suggestion
const outputDir = path.join(require('os').homedir(), '.opencode', 'generated-images');
```
or at minimum document prominently that the directory is CWD-relative, and expose a config option so users can override it.
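One way to read the "expose a config option" suggestion is an environment override with a stable fallback. The sketch below is hypothetical throughout: `OPENCODE_IMAGE_OUTPUT_DIR` is an invented variable name that does not appear in the PR, and `pickImageOutputDir` is kept free of Node imports by taking the fallback as a parameter.

```typescript
// Hypothetical env var name; it is not defined anywhere in the PR.
const OUTPUT_DIR_ENV = "OPENCODE_IMAGE_OUTPUT_DIR";

// Prefer an explicit, non-blank override; otherwise use the stable default
// that the caller computed (for example a directory under the home folder).
function pickImageOutputDir(
  env: Record<string, string | undefined>,
  stableDefault: string,
): string {
  const override = env[OUTPUT_DIR_ENV]?.trim();
  return override ? override : stableDefault;
}

const fallbackDir = "/home/example/.opencode/generated-images";
const withOverride = pickImageOutputDir({ OPENCODE_IMAGE_OUTPUT_DIR: "  /tmp/images " }, fallbackDir);
const withoutOverride = pickImageOutputDir({}, fallbackDir);
```

In the plugin, the fallback could be built with `path.join(os.homedir(), '.opencode', 'generated-images')` as in the suggestion, which keeps the location independent of the working directory unless the user opts out.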
src/plugin/transform/prompt-flags.ts, lines 30-31
```ts
const RESOLUTION_PATTERN = /--resolution[=\s]+["']?([^\s"']+)["']?/gi
const ASPECT_RATIO_PATTERN = /--aspect-ratio[=\s]+["']?([^\s"']+)["']?/gi
```
**Module-level global regex constants carry mutable `lastIndex` state across calls**
`RESOLUTION_PATTERN` and `ASPECT_RATIO_PATTERN` are module-level singletons with the `g` flag. Every call to `exec()` advances the shared `lastIndex`. The manual `lastIndex = 0` resets placed throughout the function guard against this today, but the pattern is fragile: any future refactoring that forgets a reset, or an early return/exception before a reset fires, will leave the regex in a dirty state and cause the *next* call to miss its match.
A safer approach for the `exec()` step is to create a fresh, non-global regex (or a scoped copy) so `lastIndex` is never an issue, while keeping the `g` flag only for the `.replace()` step:
```ts
// Use a scoped copy for exec (no shared state)
const resolutionMatch = new RegExp(RESOLUTION_PATTERN.source, "i").exec(prompt)
```
Alternatively, split the constants into a non-global extractor and a global replacer.
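The failure mode described here can be reproduced in a few lines, followed by the split-constant alternative. This is a sketch: the pattern text is taken from the excerpt above, while `RESOLUTION_EXTRACT`, `RESOLUTION_STRIP`, and `parseResolution` are hypothetical names introduced for the example.

```typescript
const GLOBAL_PATTERN = /--resolution[=\s]+["']?([^\s"']+)["']?/gi;

// The first exec matches and leaves lastIndex pointing past the match.
const first = GLOBAL_PATTERN.exec("cat --resolution=4K");
const staleIndex = GLOBAL_PATTERN.lastIndex;

// Without a reset, the next exec starts at the stale index. The new prompt
// is shorter than that index, so the flag is missed and null is returned.
const second = GLOBAL_PATTERN.exec("--resolution=2K");

// Split constants (hypothetical names): exec on a non-global regex always
// starts at index 0, and replace() resets lastIndex on a global regex itself.
const RESOLUTION_EXTRACT = /--resolution[=\s]+["']?([^\s"']+)["']?/i;
const RESOLUTION_STRIP = /--resolution[=\s]+["']?([^\s"']+)["']?/gi;

function parseResolution(prompt: string): { resolution?: string; cleanedPrompt: string } {
  const match = RESOLUTION_EXTRACT.exec(prompt);
  const cleanedPrompt = prompt.replace(RESOLUTION_STRIP, "").replace(/\s+/g, " ").trim();
  return { resolution: match?.[1], cleanedPrompt };
}

const a = parseResolution("cat --resolution=4K");
const b = parseResolution("--resolution=2K dog");
```

Because neither constant is ever passed to `exec()` with the `g` flag, there is no reset to forget, including on early returns or exceptions.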
Additional Comments (1)
src/plugin/image-saver.ts, line 91
**macOS-only `open` command embedded in returned markdown**
`open "${filePath}"` only works on macOS. On Linux the equivalent is `xdg-open`, on Windows it is `start`. Embedding a platform-specific command in the returned markdown will be confusing or broken for non-macOS users.
```suggestion
return `\n\nImage saved to: \`${filePath}\``;
```
(Drop the platform-specific hint, or detect the platform and choose the right command.)
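If the hint is kept rather than dropped, the "detect the platform" option might look like the sketch below. It is an illustration only: `openHintFor` and `savedImageMessage` are hypothetical names, the platform strings are the values Node.js reports in `process.platform`, and the empty title argument after `start` is the usual `cmd.exe` idiom rather than something stated in the review.

```typescript
// Maps a Node.js process.platform value to a command that opens a file.
// Returns an empty string for platforms without a known opener.
function openHintFor(platform: string, filePath: string): string {
  switch (platform) {
    case "darwin":
      return `open "${filePath}"`;
    case "win32":
      return `start "" "${filePath}"`;
    case "linux":
      return `xdg-open "${filePath}"`;
    default:
      return "";
  }
}

// Hypothetical message builder: always reports the path, and appends the
// opener command only when one is known for the platform.
function savedImageMessage(platform: string, filePath: string): string {
  const base = `\n\nImage saved to: \`${filePath}\``;
  const hint = openHintFor(platform, filePath);
  return hint ? `${base}\n\nTo view: \`${hint}\`` : base;
}

const macMessage = savedImageMessage("darwin", "/tmp/a.png");
const otherMessage = savedImageMessage("freebsd", "/tmp/a.png");
```

The caller would pass `process.platform`; taking it as a parameter keeps the function easy to test on any machine.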
Summary
- `antigravity-gemini-3.1-flash-image` model definition with per-model validation for aspect ratios and image sizes
- `--resolution` and `--aspect-ratio` prompt flags that are parsed from user input and stripped before sending to Gemini
- Change image output directory from `~/.opencode/generated-images/` to `./nanobanana/` (relative to `process.cwd()`)
- Remove `gemini-3-pro-image` references (model was removed by Google from Antigravity; returns 404)
Details
New model: `antigravity-gemini-3.1-flash-image`
- `minimal` (default) and `high` variants
- … `4:1`, `8:1`) and image size `0.5K` (flash-only)
Prompt flags
Users can inline `--resolution=4K` and `--aspect-ratio=16:9` in their prompt text. Flags are extracted, validated per-model, and stripped before the prompt reaches Gemini. Priority: prompt flags > env vars (`OPENCODE_IMAGE_SIZE`, `OPENCODE_IMAGE_ASPECT_RATIO`) > defaults.
Per-model validation
- … `4:1`, `8:1`) and `0.5K` image size
Files changed (14 files, +805 / -53)
- `models.ts`: flash-image model definition added, pro-image removed
- `gemini.ts`: `buildImageGenerationConfig()` now accepts model + overrides, per-model validation via `getValidAspectRatios()` / `getValidImageSizes()`
- `prompt-flags.ts` (new): parses `--resolution` and `--aspect-ratio` from prompt text
- `model-resolver.ts`: flash-image thinking support (minimal/high), pro-image cleanup
- `request.ts`: integrates prompt flag parsing into image pipeline, flash-image thinking config
- `image-saver.ts`: output dir changed to `./nanobanana/`
- `index.ts`: barrel exports for new modules
- `README.md`: model table, JSON config, usage examples with flag documentation
Test Plan
- `npm run build` compiles cleanly
- `npm run typecheck` passes
Attribution
All code in this PR was written by Claude Opus 4.6 (Anthropic), pair-programmed with a human operator via OpenCode.