
Add sam2 edit buttons #1308

Open
GlenCarpenter wants to merge 6 commits into mcmonkeyprojects:master from GlenCarpenter:add-sam2-edit-buttons

Conversation

@GlenCarpenter

PR Summary

This PR adds end-to-end SAM2 interactive masking support in SwarmUI, including frontend editor tools, backend workflow wiring, and custom Comfy nodes/utilities.

What Was Added

1. Backend parameter registration for SAM2 inputs

  • Added new internal T2I params in src/BuiltinExtensions/ComfyUIBackend/ComfyUIBackendExtension.cs:
      • SAM2 Point Image
      • SAM2 Positive Points
      • SAM2 Negative Points
      • SAM2 BBox
      • SAM2 Mask Padding
  • These are hidden/internal, retained for workflow use, and gated behind the sam2 feature flag.
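As an illustration only, the hidden point parameters could carry frontend state as small JSON strings. This sketch is a pure assumption: the names, the normalized `[x, y]` pair format, and the helpers below are not SwarmUI's actual wire format.

```python
import json

def encode_points(points):
    """Serialize (x, y) click coordinates for a hidden string parameter
    (hypothetical format: JSON list of normalized [x, y] pairs)."""
    return json.dumps([[round(x, 4), round(y, 4)] for x, y in points])

def decode_points(payload):
    """Parse the serialized points back into (x, y) tuples."""
    return [tuple(p) for p in json.loads(payload)]

payload = encode_points([(0.25, 0.5), (0.75, 0.33333)])
```

Passing coordinates as plain strings keeps the params compatible with the generic T2I parameter plumbing while staying hidden from the normal generate tab.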

2. Workflow generation steps for SAM2 masking

  • Added new SAM2 processing steps in src/BuiltinExtensions/ComfyUIBackend/WorkflowGeneratorSteps.cs:
      • Point-based segmentation path using Sam2Segmentation with positive/negative coordinates.
      • Bounding-box segmentation path using Sam2BBoxFromJson + Sam2Segmentation.
      • Optional mask padding support via sammaskpadding.
      • Both paths convert mask output to image and save as output.
      • Workflow short-circuits after SAM2 mask generation (SkipFurtherSteps = true) so mask generation acts as the terminal operation for that request.
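The point path described above can be sketched as a ComfyUI prompt graph. The node class names (Sam2Segmentation, and the mask-to-image/save conversion) follow the PR description; the input keys, node id scheme, and built-in node names used here are illustrative assumptions, not the generator's real output.

```python
def build_point_mask_graph(image_node_id, positive, negative):
    """Hypothetical sketch of the terminal SAM2 point-masking subgraph:
    segment -> convert MASK to IMAGE -> save, then stop (SkipFurtherSteps)."""
    return {
        "10": {
            "class_type": "Sam2Segmentation",
            "inputs": {
                "image": [image_node_id, 0],
                "positive_points": positive,   # e.g. "[[0.4, 0.5]]"
                "negative_points": negative,
            },
        },
        "11": {
            # convert the MASK output to an IMAGE for saving
            "class_type": "MaskToImage",
            "inputs": {"mask": ["10", 0]},
        },
        "12": {
            # terminal step: the workflow short-circuits after this
            "class_type": "SaveImage",
            "inputs": {"images": ["11", 0]},
        },
    }
```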

3. New Comfy extra node package for SAM2

  • Added src/BuiltinExtensions/ComfyUIBackend/ExtraNodes/Sam2BBoxNode/__init__.py:
      • Sam2BBoxFromJson node (JSON bbox string -> BBOX type).
      • Sam2PointSegmentation node (point-prompt segmentation).
      • Sam2BBoxSegmentation node (bbox-prompt segmentation).
  • Registered node class and display-name mappings.
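A Comfy node of the Sam2BBoxFromJson shape could look like the sketch below. The class name comes from the PR; the input name, BBOX representation, and corner normalization are assumptions about what such a node might do, not the actual implementation.

```python
import json

class Sam2BBoxFromJson:
    """Hypothetical sketch: parse a JSON bbox string into a BBOX tuple."""

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"bbox_json": ("STRING", {"default": "[0,0,0,0]"})}}

    RETURN_TYPES = ("BBOX",)
    FUNCTION = "parse"
    CATEGORY = "SwarmUI/SAM2"

    def parse(self, bbox_json):
        x1, y1, x2, y2 = json.loads(bbox_json)
        # normalize so (x1, y1) is the top-left corner regardless of drag direction
        return ((min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)),)
```

Normalizing the corners means the frontend can send the drag start/end points directly without worrying about which direction the user dragged.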

4. SAM2 utility implementation

  • Added src/BuiltinExtensions/ComfyUIBackend/ExtraNodes/Sam2BBoxNode/sam2.py:
      • SAM model loading/caching with GPU/CPU detection.
      • Point and bbox inference helpers.
      • Morphological hole filling and mask dilation/padding.
      • Crop-to-mask with optional aspect ratio logic.
      • Background removal producing RGBA PNG output.
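The two mask post-processing steps can be sketched with standard morphology operations, assuming a binary mask as a 2-D uint8 array (0 = background, 255 = subject). The real sam2.py may use different libraries or structuring elements.

```python
import numpy as np
from scipy import ndimage

def fill_mask_holes(mask):
    """Close interior holes so raw SAM2 output is less splotchy/jagged."""
    filled = ndimage.binary_fill_holes(mask > 0)
    return (filled * 255).astype(np.uint8)

def add_mask_padding(mask, padding):
    """Grow the mask outward by `padding` pixels via binary dilation,
    so it fully contains the subject for inpainting."""
    if padding <= 0:
        return mask
    grown = ndimage.binary_dilation(mask > 0, iterations=padding)
    return (grown * 255).astype(np.uint8)
```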

5. Frontend image editor tooling for SAM2

  • Added major editor functionality in src/wwwroot/js/genpage/helpers/image_editor.js:
      • New tool: ImageEditorToolSam2Points
          • Left click adds positive points, right click adds negative points.
          • Requests mask generation after each interaction.
          • Handles in-flight request ordering and stale response protection.
          • Includes clear-mask and mask-padding controls.
      • New tool: ImageEditorToolSam2BBox
          • Click-drag bbox selection and mask generation on release.
          • Includes warmup behavior and mask-padding controls.
      • Added context menu interception support (onContextMenu) so right-click can be used for negative points.
      • Added applyMaskFromImage(...) helper to apply generated masks to mask layers and normalize black pixels to transparent.
      • Reset SAM2 tool state when opening/loading images.
      • Registered both tools in editor tool initialization.
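The black-to-transparent normalization done by applyMaskFromImage(...) is, in essence, a per-pixel alpha rewrite. The actual helper is JavaScript canvas code; this mirrors the pixel logic on an RGBA numpy array, with the near-black threshold being an assumed value.

```python
import numpy as np

def normalize_black_to_transparent(rgba, threshold=8):
    """Zero the alpha channel wherever a pixel is (near-)black, so the
    black background of a generated mask image becomes transparent."""
    out = rgba.copy()
    black = np.all(out[..., :3] <= threshold, axis=-1)
    out[black, 3] = 0
    return out
```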

6. New toolbar/tool icons

  • Added new icon assets:
      • src/wwwroot/imgs/crosshair.png
      • src/wwwroot/imgs/rectangle.png

User-visible impact

  • Users can now create masks interactively with SAM2 directly in the image editor using:
      • Point prompts (positive/negative clicks)
      • Bounding box prompts (drag-select)
  • Generated SAM2 masks are applied into the editor mask layer and can be padded to improve inpainting/edge blending workflows.

- Implemented SAM2 point segmentation tool allowing users to place positive and negative points to generate masks.
- Added SAM2 bounding box segmentation tool for users to draw bounding boxes and generate masks.
- Integrated SAM2 model loading and inference into the workflow generator.
- Introduced new UI elements for mask padding configuration and clear mask functionality.
- Added crosshair and rectangle images for cursor representation in the editor.
- Enhanced image editor to handle mask application from generated images.
@mcmonkey4eva
Member

Before even properly reviewing this: I see code about downloading and running sam, but swarm already has sam2 handling in the image editor, so I wouldn't expect you to need to add new core execution handling for it?

@GlenCarpenter
Author

> Before even properly reviewing this: I see code about downloading and running sam, but swarm already has sam2 handling in the image editor, so I wouldn't expect you to need to add new core execution handling for it?

I forgot about that, I'll point the downloader to the existing functionality

@GlenCarpenter
Author

If any of the backend stuff needs to be moved, just let me know. This was migrated from the extension, and I tried not to modify any existing SAM2 classes until you confirmed you wanted it there.

@mcmonkey4eva
Member

ideally as much as possible existing functionality should be used and the new feature should be entirely in the UI, excluding only where the UI can't perform some part of it and some backend functionality is needed, in which case it should sit as close as possible to where the existing sam handling is

@GlenCarpenter
Author

> ideally as much as possible existing functionality should be used and the new feature should be entirely in the UI, excluding only where the UI can't perform some part of it and some backend functionality is needed, in which case it should sit as close as possible to where the existing sam handling is

I consolidated the SAM2 downloader.

The Autosegment follows a CreatePreprocessor() pattern while the Points and BBox follow a CreateNode() pattern, so they cannot be easily consolidated.

@mcmonkey4eva
Member

You still have entire large sections of custom sam handling in the python

@GlenCarpenter
Author

> You still have entire large sections of custom sam handling in the python

My mistake. I have broken out the fill_mask_holes and add_mask_padding to a post-processing node and cleaned up all of the old downloader Python.

@mcmonkey4eva
Member

simple image processing on the sam image should ideally be done on the browser (image editor code) not needing a call to python backend. Can just load the sam map once and operate on it continually

@GlenCarpenter
Author

GlenCarpenter commented Mar 9, 2026

> simple image processing on the sam image should ideally be done on the browser (image editor code) not needing a call to python backend. Can just load the sam map once and operate on it continually

If you are referring to fill_mask_holes and add_mask_padding, these operations are transformations performed on the direct output of the SAM2 inference while constructing the masks. Without fill_mask_holes, the SAM2 masks tend to be splotchy/jagged. add_mask_padding allows the user to grow the mask to ensure it contains the entire subject.

The JavaScript simply constructs the points/bbox prompts and then renders the masks.
