feat(docs): add advanced document tools #278

Open
oskarcode wants to merge 4 commits into gemini-cli-extensions:main from oskarcode:feat/docs-advanced

Conversation

@oskarcode
Contributor

Summary

Adds four new Google Docs tools for advanced document manipulation.

New Tools

docs.insertImage

Inserts an image into a Google Doc at a specified position or at the end of the document.

  • Parameters: documentId, imageUrl, positionIndex (optional), tabId (optional), widthPt (optional), heightPt (optional)

docs.insertTable

Inserts a table into a Google Doc at a specified position or at the end of the document.

  • Parameters: documentId, rows, columns, tabId (optional), positionIndex (optional)

docs.createHeaderFooter

Creates a header or footer in a Google Doc with optional initial text.

  • Parameters: documentId, type ("header" or "footer"), text (optional)

docs.addComment

Adds a comment to a Google Doc's comment thread.

  • Parameters: documentId, content

Changes

  • Added insertImage, insertTable, createHeaderFooter, addComment methods to DocsService.ts
  • Added getDriveClient helper method for Drive API access (needed by addComment)
  • Registered all four new tools in index.ts

Testing

Tested locally with Google Docs documents.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the Google Docs integration by introducing a suite of advanced document manipulation tools. These additions empower users to programmatically insert images and tables, manage headers and footers, and add comments within Google Docs, thereby enhancing automation and content generation workflows.

Highlights

  • New Google Docs Tools: Four new tools (insertImage, insertTable, createHeaderFooter, addComment) have been added to enhance Google Docs manipulation capabilities.
  • Image Insertion: A new docs.insertImage tool allows inserting images into Google Docs at specified positions with optional sizing.
  • Table Creation: A new docs.insertTable tool enables creating tables with defined rows and columns at a given position.
  • Header/Footer Management: A new docs.createHeaderFooter tool facilitates creating headers or footers, optionally with initial text.
  • Comment Functionality: A new docs.addComment tool provides the ability to add comments to Google Docs.
  • Drive API Integration: A getDriveClient helper method was introduced to support Drive API access, specifically for the addComment tool.
  • Tool Registration: All newly implemented tools have been registered in the server's tool registry.

Changelog
  • workspace-server/src/index.ts
    • Registered docs.insertImage tool with its description and input schema.
    • Registered docs.insertTable tool with its description and input schema.
    • Registered docs.createHeaderFooter tool with its description and input schema.
    • Registered docs.addComment tool with its description and input schema.
  • workspace-server/src/services/DocsService.ts
    • Imported drive_v3 from googleapis to support Google Drive API interactions.
    • Added a private getDriveClient method to obtain an authenticated Google Drive client.
    • Implemented the insertImage method to handle image insertion into Google Docs, including position and size options.
    • Implemented the insertTable method to create tables in Google Docs with specified dimensions and position.
    • Implemented the createHeaderFooter method to add headers or footers to documents, with an option to include initial text.
    • Implemented the addComment method to post comments to a Google Doc's comment thread using the Drive API.
Activity
  • No human activity (comments, reviews, etc.) has been recorded on this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces four new tools for advanced document manipulation in Google Docs: insertImage, insertTable, createHeaderFooter, and addComment. The implementation is solid, but I've identified a few areas for improvement. Specifically, there's a high-severity issue with image resizing that could cause distortion, some code duplication that affects maintainability, and a minor inconsistency in calculating insertion points. My review includes suggestions to address these points.

Comment on lines +713 to +722
if (widthPt || heightPt) {
  imageRequest.insertInlineImage!.objectSize = {
    width: widthPt
      ? { magnitude: widthPt, unit: 'PT' }
      : { magnitude: 300, unit: 'PT' },
    height: heightPt
      ? { magnitude: heightPt, unit: 'PT' }
      : { magnitude: 200, unit: 'PT' },
  };
}
Contributor

high

The current logic for setting the image size can lead to distorted images. If only widthPt is provided, a default heightPt of 200 is used, and vice-versa. This will not preserve the image's aspect ratio. The code should only set the dimensions that are explicitly provided by the user, allowing the Docs API to scale the other dimension to maintain the aspect ratio.

Suggested change
- if (widthPt || heightPt) {
-   imageRequest.insertInlineImage!.objectSize = {
-     width: widthPt
-       ? { magnitude: widthPt, unit: 'PT' }
-       : { magnitude: 300, unit: 'PT' },
-     height: heightPt
-       ? { magnitude: heightPt, unit: 'PT' }
-       : { magnitude: 200, unit: 'PT' },
-   };
- }
+ if (widthPt || heightPt) {
+   const objectSize: docs_v1.Schema$Size = {};
+   if (widthPt) {
+     objectSize.width = { magnitude: widthPt, unit: 'PT' };
+   }
+   if (heightPt) {
+     objectSize.height = { magnitude: heightPt, unit: 'PT' };
+   }
+   imageRequest.insertInlineImage!.objectSize = objectSize;
+ }

Comment on lines +681 to +701
if (!insertIndex) {
  const res = await docs.documents.get({
    documentId: id,
    fields: 'tabs',
    includeTabsContent: true,
  });
  const tabs = res.data.tabs || [];
  let content: docs_v1.Schema$StructuralElement[] | undefined;
  if (tabId) {
    const tab = tabs.find((t) => t.tabProperties?.tabId === tabId);
    if (!tab) {
      throw new Error(`Tab with ID ${tabId} not found.`);
    }
    content = tab.documentTab?.body?.content;
  } else if (tabs.length > 0) {
    content = tabs[0].documentTab?.body?.content;
  }
  const lastElement = content?.[content.length - 1];
  const endIndex = lastElement?.endIndex || 2;
  insertIndex = Math.max(1, endIndex - 1);
}
Contributor

medium

This logic to determine the insertIndex when it's not provided is duplicated in the insertTable function (lines 778-798). To improve maintainability and avoid future inconsistencies, consider extracting this logic into a private helper method. This will make the code more DRY (Don't Repeat Yourself).
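
As a rough illustration of the extraction suggested here, the duplicated block could be reduced to one pure function over the already-fetched tabs. This is only a sketch: the name calculateInsertionIndex and the local Tab and StructuralElement interfaces are assumptions standing in for the real docs_v1 types, and the fetch via docs.documents.get is left out.

```typescript
// Minimal local stand-ins for the relevant parts of the docs_v1 types (assumption).
interface StructuralElement {
  endIndex?: number | null;
}
interface Tab {
  tabProperties?: { tabId?: string | null };
  documentTab?: { body?: { content?: StructuralElement[] } };
}

// Hypothetical helper: returns the index just before the end of the target
// tab's body, mirroring the duplicated block quoted above.
function calculateInsertionIndex(tabs: Tab[], tabId?: string): number {
  let content: StructuralElement[] | undefined;
  if (tabId) {
    const tab = tabs.find((t) => t.tabProperties?.tabId === tabId);
    if (!tab) {
      throw new Error(`Tab with ID ${tabId} not found.`);
    }
    content = tab.documentTab?.body?.content;
  } else if (tabs.length > 0) {
    // No tab requested: fall back to the first tab, as the original code does.
    content = tabs[0].documentTab?.body?.content;
  }
  const lastElement =
    content && content.length > 0 ? content[content.length - 1] : undefined;
  const endIndex = lastElement?.endIndex || 2;
  return Math.max(1, endIndex - 1);
}
```

Both insertImage and insertTable could then call the same function after fetching the document, so a later fix to the index logic lands in one place.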

content = tabs[0].documentTab?.body?.content;
}
const lastElement = content?.[content.length - 1];
const endIndex = lastElement?.endIndex || 2;
Contributor

medium

There's an inconsistency in how the default endIndex is calculated here compared to the existing writeText function. This line uses || 2, while writeText uses || 1 (see line 330 in the full file). To ensure predictable behavior and maintain consistency, it would be best to align this with the existing pattern.

Suggested change
- const endIndex = lastElement?.endIndex || 2;
+ const endIndex = lastElement?.endIndex || 1;
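
For context on what this change affects: assuming the next line of the quoted hunk, insertIndex = Math.max(1, endIndex - 1), stays as it is, both fallbacks resolve to the same insertion index, so the suggestion is about consistency rather than behavior. The small check below works through the arithmetic; insertIndexFor is a hypothetical wrapper written only for this illustration.

```typescript
// Reproduces the two lines from the quoted snippet with the fallback as a parameter.
function insertIndexFor(endIndex: number | undefined, fallback: number): number {
  const resolved = endIndex || fallback;
  return Math.max(1, resolved - 1);
}

const withTwo = insertIndexFor(undefined, 2); // Math.max(1, 2 - 1)
const withOne = insertIndexFor(undefined, 1); // Math.max(1, 1 - 1)
console.log(withTwo, withOne); // prints "1 1"
```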

content = tabs[0].documentTab?.body?.content;
}
const lastElement = content?.[content.length - 1];
const endIndex = lastElement?.endIndex || 2;
Contributor

medium

Similar to the insertImage function, there's an inconsistency here in calculating the default endIndex (|| 2) compared to the existing writeText function (|| 1 on line 330). Please align this for consistency.

Suggested change
- const endIndex = lastElement?.endIndex || 2;
+ const endIndex = lastElement?.endIndex || 1;

- Add detailed logging for image dimension handling to preserve aspect ratio
- Extract duplicate insertIndex calculation logic into reusable helper method
- Consolidate identical logic between insertImage and insertTable methods
- Add comprehensive documentation for aspect ratio preservation behavior
- Improve code maintainability and reduce duplication
*/

- import { google, docs_v1 } from 'googleapis';
+ import { google, docs_v1, drive_v3 } from 'googleapis';
Contributor

See #255: we are keeping a strict separation of concerns for OAuth scopes. All of our commenting features (which require the Drive scope and API) are in the drive.* namespace. Please move the commenting features there.

Contributor Author

Thanks for the feedback.

  • Removed Drive/comment-specific pieces from DocsService

Contributor

@allenhutchison allenhutchison left a comment

Thanks for this PR! These are great additions to the Docs toolset. I did a review and have some feedback — mostly small things, with a couple of items worth addressing before merge.

Must Fix

1. _calculateInsertionIndex should flatten nested tabs

The new helper searches res.data.tabs directly without calling this._flattenTabs(), but every other tab-aware method in the class (writeText, getText, replaceText) flattens the tab tree first. This means insertImage and insertTable will throw "Tab not found" for any nested/child tab.

-    const tabs = res.data.tabs || [];
+    const tabs = this._flattenTabs(res.data.tabs || []);
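
To show why the flattening matters, here is a minimal sketch of a depth-first flatten over a tab tree. It is not the class's actual _flattenTabs implementation, which is not shown in this thread; the flattenTabs function, the trimmed Tab interface, and the tab IDs are assumptions made for the example. Only the childTabs nesting comes from the Docs API tab structure.

```typescript
// Trimmed-down stand-in for the Docs API Tab type (assumption for this sketch).
interface Tab {
  tabProperties?: { tabId?: string };
  childTabs?: Tab[];
}

// Hypothetical flatten: depth-first, each parent listed before its children.
function flattenTabs(tabs: Tab[]): Tab[] {
  const flat: Tab[] = [];
  for (const tab of tabs) {
    flat.push(tab);
    if (tab.childTabs && tab.childTabs.length > 0) {
      flat.push(...flattenTabs(tab.childTabs));
    }
  }
  return flat;
}

// Made-up IDs: 't.1' is nested under 't.0'.
const tree: Tab[] = [
  {
    tabProperties: { tabId: 't.0' },
    childTabs: [{ tabProperties: { tabId: 't.1' } }],
  },
];

// Searching only the top level misses the child tab; searching the flattened list finds it.
const topLevelHit = tree.find((t) => t.tabProperties?.tabId === 't.1');
const flattenedHit = flattenTabs(tree).find((t) => t.tabProperties?.tabId === 't.1');
console.log(topLevelHit === undefined, flattenedHit !== undefined);
```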

2. createHeaderFooter silently drops text when segmentId is missing

If the API doesn't return a headerId/footerId for some reason, the if (text && segmentId) check silently skips text insertion and reports success. The user asked for a header with text but gets one without — and no indication anything went wrong.

Could we surface this as an error instead?

if (text && !segmentId) {
  throw new Error(
    `Created ${type} but could not retrieve its ID from the API response. The provided text was not inserted.`,
  );
}
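
One possible way to place this guard, sketched under assumptions: the reply shape below follows the Docs API batchUpdate response fields createHeader.headerId and createFooter.footerId, while resolveSegmentId, buildInsertTextRequest, and the local CreateReply interface are hypothetical names for this illustration, not code from the PR.

```typescript
type HeaderFooterType = 'header' | 'footer';

// Local stand-in for the relevant part of a batchUpdate reply (assumption).
interface CreateReply {
  createHeader?: { headerId?: string | null };
  createFooter?: { footerId?: string | null };
}

// Hypothetical helper: extracts the new segment ID and fails loudly when
// text was requested but no ID came back.
function resolveSegmentId(
  type: HeaderFooterType,
  reply: CreateReply | undefined,
  text?: string,
): string | undefined {
  const segmentId =
    type === 'header'
      ? reply?.createHeader?.headerId
      : reply?.createFooter?.footerId;
  if (text && !segmentId) {
    throw new Error(
      `Created ${type} but could not retrieve its ID from the API response. The provided text was not inserted.`,
    );
  }
  return segmentId ?? undefined;
}

// Request that appends text at the end of the new header or footer segment.
function buildInsertTextRequest(segmentId: string, text: string) {
  return { insertText: { text, endOfSegmentLocation: { segmentId } } };
}
```

Keeping the check next to the ID extraction means the text insertion step only ever runs with a defined segmentId.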

Should Fix

3. Consider adding isError: true to error responses

The existing getSuggestions method includes isError: true in its error return, which is the MCP-correct way to signal tool failures to clients. The new methods omit it. Worth adding for consistency (or removing from getSuggestions if the project intentionally omits it — but including it everywhere is the better practice).
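
A minimal sketch of what a shared error helper might look like, assuming the tools return MCP-style results with a content array of text items. The ToolResult interface, the helper names, and the JSON payload format are assumptions for illustration; the project's actual return shape may differ.

```typescript
// Simplified MCP-style tool result (assumption: text-only content).
interface ToolResult {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
}

// Hypothetical helper for failures: sets isError so clients can detect the failure.
function toolError(message: string): ToolResult {
  return {
    content: [{ type: 'text', text: JSON.stringify({ error: message }) }],
    isError: true,
  };
}

// Hypothetical helper for successes: isError is simply left unset.
function toolSuccess(payload: unknown): ToolResult {
  return {
    content: [{ type: 'text', text: JSON.stringify(payload) }],
  };
}
```

Routing every catch block through one helper like toolError would keep the flag consistent across the new and existing methods.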

4. Aspect ratio comment may be misleading

The comment at line 725 says "Google Docs API will automatically preserve aspect ratio" when only one dimension is provided, and the log messages repeat this claim. I don't believe this is a documented API guarantee — the behavior when only one dimension is specified isn't well-defined. Maybe soften to something like:

// Only set explicit dimensions if provided. If only one dimension is given,
// the API behavior for the missing dimension is not well-documented.

5. imageUrl could use .url() validation

Small one — z.string() for imageUrl could be z.string().url() to catch invalid URLs at the schema level instead of producing a confusing API error downstream.

Nits (take or leave)

  • rows/columns schema could benefit from a .max() to prevent opaque API failures with huge tables
  • widthPt || heightPt on line 727 treats 0 as "not provided" — !== undefined would be more precise
  • The _calculateInsertionIndex JSDoc doesn't mention the tabId parameter
  • Line 150 uses || 1 where ?? 1 would be more semantically correct (handles endIndex: 0)
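
The two operator nits above come down to how JavaScript treats 0. The snippet below is a small standalone comparison; the function names are made up for the example and do not appear in the PR.

```typescript
// Truthiness check: 0 is falsy, so an explicit 0 counts as "not provided".
function hasSizeLoose(widthPt?: number, heightPt?: number): boolean {
  return Boolean(widthPt || heightPt);
}

// Explicit undefined check: 0 counts as a provided value.
function hasSizeStrict(widthPt?: number, heightPt?: number): boolean {
  return widthPt !== undefined || heightPt !== undefined;
}

// || falls back on any falsy value, including 0.
const endIndexWithOr = (endIndex?: number | null): number => endIndex || 1;

// ?? falls back only on null or undefined, so 0 is kept.
const endIndexWithNullish = (endIndex?: number | null): number => endIndex ?? 1;

console.log(hasSizeLoose(0), hasSizeStrict(0)); // prints "false true"
console.log(endIndexWithOr(0), endIndexWithNullish(0)); // prints "1 0"
```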

Overall this is solid work — the _calculateInsertionIndex extraction is a nice cleanup, and the createHeaderFooter approach with endOfSegmentLocation is well done. Looking forward to getting these tools in! 🎉
