Pull Request Overview
This PR integrates SAM2-based segmentation to compute and cache embeddings, and adds an interactive PyQt5 GUI for browsing images, viewing detections and masks, and filtering by embedding similarity.
- Introduces Sam2Runner to run SAM2 on image crops and pool features into embeddings.
- Updates DatasetManager and FeatureIndex to build or load a persistent embedding index.
- Implements a PyQt5 GUI (MainWindow and ImageViewer) for image/detection lists and similarity-based filtering.
Reviewed Changes
Copilot reviewed 13 out of 25 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| ssya/widgets/image_viewer.py | Added widget to render images with bounding boxes, masks, and similarity scores. |
| ssya/models/detection.py | Defined Detection dataclass and bounding-box normalization conversion. |
| ssya/main.py | Built main PyQt5 GUI logic: file/detection lists, sliders, and filter actions. |
| ssya/helpers/metrics.py | Added cosine_similarity utility. |
| ssya/controllers/sam2_wrapper.py | Wrapped SAM2 inference to produce masks and embeddings. |
| ssya/controllers/features_index.py | Created FeatureIndex for embedding persistence and lookup. |
| ssya/controllers/dataset_manager.py | Manages dataset loading, annotation parsing, index building/loading. |
| pyproject.toml | Declared sam2 dependency. |
| README.md | Updated project overview, installation, usage, and documentation (in Polish). |
| .vscode/launch.json | Updated VSCode debug configuration for the new entrypoint. |
Comments suppressed due to low confidence (1)
.vscode/launch.json:18
- Remove the stray comma on its own line to fix invalid JSON syntax in the launch configuration.
,
    for det in detections:
        if det.embedding is None:
            for e in self.entries:
                if e["image_idx"] == det.image_idx and e["det_idx"] == det.class_id:
This condition compares the stored det_idx against det.class_id, but det_idx represents the detection's position, not its class. Consider comparing e['det_idx'] to the detection's index or storing class_id instead to ensure correct embedding lookup.
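A minimal sketch of a position-based lookup, not the PR's actual implementation. It assumes each index entry is a dict with `image_idx`, `det_idx`, and an `embedding` key (the `embedding` key is an assumption), and that `detections` is the ordered detection list of one image, so the position from `enumerate` matches the stored `det_idx`. The `Detection` class here is a simplified stand-in and `attach_embeddings` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    # Simplified stand-in for the PR's Detection dataclass.
    class_id: int
    image_idx: int
    embedding: Optional[list] = None


def attach_embeddings(entries, detections):
    """Fill missing embeddings, matching on (image_idx, position in list)."""
    # Build the lookup once instead of scanning entries for every detection.
    lookup = {(e["image_idx"], e["det_idx"]): e["embedding"] for e in entries}
    for det_idx, det in enumerate(detections):
        if det.embedding is None:
            # Key on the detection's position, never on its class_id.
            det.embedding = lookup.get((det.image_idx, det_idx))
    return detections
```

With this keying, two detections of the same class in one image receive different embeddings, which the `class_id` comparison cannot guarantee.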
    self.detections[img_path] = [
        Detection(int(cls), (float(xc), float(yc), float(w), float(h)), img_idx) for cls, xc, yc, w, h in lines
    ]
    logger.info("Dataset: %d images (%d with annotations)", len(self.images), len(self.detections))
len(self.detections) returns the total number of images (keys in the dict), not only those with annotations. To accurately log images with annotations, count entries where the detection list is non-empty.
Suggested change:
    - logger.info("Dataset: %d images (%d with annotations)", len(self.images), len(self.detections))
    + logger.info("Dataset: %d images (%d with annotations)", len(self.images), sum(1 for dets in self.detections.values() if dets))
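The difference between the two counts can be illustrated with a small example. It assumes, as the comment above implies, that the dict holds a key for every image, including images whose label file yielded an empty list. The file names and tuples are made up.

```python
# Hypothetical detections dict: one key per image, possibly an empty list.
detections = {
    "a.jpg": [(0, 0.5, 0.5, 0.1, 0.1)],
    "b.jpg": [],  # image present, but no annotations
    "c.jpg": [(1, 0.2, 0.2, 0.1, 0.1), (2, 0.7, 0.7, 0.2, 0.2)],
}

# Counts every key, so the unannotated image is included.
total_keys = len(detections)

# Counts only images whose detection list is non-empty.
with_annotations = sum(1 for dets in detections.values() if dets)
```

Here `total_keys` is 3 while `with_annotations` is 2, so only the second expression matches the log message's wording.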
    self.lbl = QLabel(alignment=Qt.AlignCenter)
    self.lbl.setSizePolicy(QSizePolicy.Expanding, QSizePolicy.Expanding)
    self.lbl.setScaledContents(True)
    # Fixed with and height set, 1920 - 200 , 1080 - 200
Typo in comment: 'with' should be 'width'.
Suggested change:
    - # Fixed with and height set, 1920 - 200 , 1080 - 200
    + # Fixed width and height set, 1920 - 200 , 1080 - 200
    # Fixed with and height set, 1920 - 200 , 1080 - 200
    self.lbl.setFixedWidth(1620)
    self.lbl.setFixedHeight(880)
[nitpick] Using hardcoded width and height may limit flexibility; consider making these dimensions configurable or adapting to the parent layout dynamically.
Suggested change:
    - # Fixed with and height set, 1920 - 200 , 1080 - 200
    - self.lbl.setFixedWidth(1620)
    - self.lbl.setFixedHeight(880)
    + # Allow dynamic resizing based on parent layout
    + self.lbl.setSizePolicy(QSizePolicy.Expanding, QSizePolicy.Expanding)
    + self.lbl.setScaledContents(True)