Core classification models and utilities for multi-modal data.

.. currentmodule:: marvis.models

.. autoclass:: marvis.models.marvis_tsne.MarvisTsneClassifier
   :members:
   :undoc-members:
   :show-inheritance:
   :special-members: __init__

   .. rubric:: Methods

   .. autosummary::
      :toctree: generated/

      ~marvis.models.marvis_tsne.MarvisTsneClassifier.fit
      ~marvis.models.marvis_tsne.MarvisTsneClassifier.predict
      ~marvis.models.marvis_tsne.MarvisTsneClassifier.evaluate

.. automodule:: marvis.models.knn_utils
   :members:
   :undoc-members:

.. automodule:: marvis.models.vq
   :members:
   :undoc-members:

.. autoclass:: marvis.models.vq.vector_quantizer.VectorQuantizer
   :members:
   :undoc-members:
   :show-inheritance:

.. note::
   Vector quantization modules are available for advanced use cases.

The main classifier accepts the following key parameters:

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Type
     - Description
   * - ``modality``
     - str
     - Data modality: ``"tabular"``, ``"audio"``, or ``"vision"``
   * - ``vlm_model_id``
     - str
     - Vision Language Model identifier (e.g., ``"Qwen/Qwen2.5-VL-3B-Instruct"``)
   * - ``use_3d``
     - bool
     - Whether to use 3D visualizations (default: ``False``)
   * - ``use_knn_connections``
     - bool
     - Whether to show KNN connections in visualizations (default: ``False``)


.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Type
     - Description
   * - ``tsne_perplexity``
     - int
     - t-SNE perplexity parameter (default: 30)
   * - ``tsne_n_iter``
     - int
     - Number of t-SNE iterations (default: 1000)
   * - ``enable_multi_viz``
     - bool
     - Enable the multi-visualization framework (default: ``False``)
   * - ``visualization_methods``
     - List[str]
     - Visualization methods to use (e.g., ``["pca", "tsne", "umap"]``)

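
scikit-learn's t-SNE requires the perplexity to be strictly smaller than the number of samples, so the default of 30 can fail on very small datasets. The helper below is a minimal sketch of choosing a safe value before passing it as ``tsne_perplexity``. It assumes the value is forwarded to a scikit-learn style t-SNE. ``clamp_perplexity`` is a hypothetical name and not part of marvis, and whether the classifier adjusts the value internally is not stated here.

```python
def clamp_perplexity(n_samples: int, requested: int = 30) -> int:
    """Return a perplexity strictly below n_samples (hypothetical helper)."""
    if n_samples < 2:
        raise ValueError("t-SNE needs at least 2 samples")
    # Largest allowed integer value is n_samples - 1; never go below 1.
    return max(1, min(requested, n_samples - 1))


print(clamp_perplexity(100))      # 30
print(clamp_perplexity(20))       # 19
print(clamp_perplexity(100, 25))  # 25
```

The result could then be supplied as ``tsne_perplexity=clamp_perplexity(len(X))`` when constructing the classifier.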

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Type
     - Description
   * - ``api_model``
     - str
     - Generic API model (auto-detects provider)
   * - ``openai_model``
     - str
     - OpenAI model (e.g., ``"gpt-4o"``)
   * - ``gemini_model``
     - str
     - Google Gemini model (e.g., ``"gemini-2.0-flash-exp"``)
   * - ``enable_thinking``
     - bool
     - Enable thinking mode for API models (default: ``True``)


.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Type
     - Description
   * - ``max_vlm_image_size``
     - int
     - Maximum image size for the VLM (default: 2048)
   * - ``gpu_memory_utilization``
     - float
     - GPU memory utilization factor (default: 0.9)
   * - ``cache_dir``
     - str
     - Directory for caching embeddings
   * - ``max_tabpfn_samples``
     - int
     - Maximum samples for TabPFN (default: 3000)

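
To make the effect of an image size cap concrete, the sketch below computes the dimensions an image would have after being scaled down to fit a limit. It assumes ``max_vlm_image_size`` bounds the longer side in pixels, which is not stated above. ``fit_within`` is a hypothetical helper, not a marvis function.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_size: int = 2048) -> Tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height  # already small enough; never upscale
    scale = max_size / longest
    # Keep the aspect ratio and guard against a zero-sized side.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4096, 2048))        # (2048, 1024)
print(fit_within(800, 600))          # (800, 600)
print(fit_within(3000, 3000, 1024))  # (1024, 1024)
```

Under this assumption, lowering the cap from 2048 to 1024 reduces the pixel count of a large square plot by a factor of four.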
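
Modality-Specific Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For ``modality="audio"``:

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Type
     - Description
   * - ``embedding_model``
     - str
     - Audio embedding model: ``"whisper"`` or ``"clap"``
   * - ``whisper_model``
     - str
     - Whisper model variant (default: ``"large-v2"``)
   * - ``include_spectrogram``
     - bool
     - Include spectrogram in prompts (default: ``True``)

For ``modality="vision"``:

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Type
     - Description
   * - ``dinov2_model``
     - str
     - DINOv2 model variant (default: ``"dinov2_vitb14"``)
   * - ``use_pca_backend``
     - bool
     - Use PCA instead of t-SNE (default: ``False``)
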

.. code-block:: python

   from marvis.models.marvis_tsne import MarvisTsneClassifier
   from sklearn.datasets import make_classification

   # Create sample data. Three classes need more than the default two
   # informative features, so n_informative is set explicitly.
   X, y = make_classification(
       n_samples=100, n_features=10, n_informative=4, n_classes=3, random_state=0
   )

   # Basic classifier
   classifier = MarvisTsneClassifier(modality="tabular")
   classifier.fit(X, y)
   predictions = classifier.predict(X)

   # Advanced classifier with 3D visualization
   classifier = MarvisTsneClassifier(
       modality="tabular",
       vlm_model_id="Qwen/Qwen2.5-VL-3B-Instruct",
       use_3d=True,
       use_knn_connections=True,
       knn_k=5,
       tsne_perplexity=25,
       max_vlm_image_size=1024,
       cache_dir="./cache",
   )

   # Multi-visualization framework
   classifier = MarvisTsneClassifier(
       modality="tabular",
       enable_multi_viz=True,
       visualization_methods=["pca", "tsne", "spectral"],
       layout_strategy="adaptive_grid",
       reasoning_focus="comparison",
   )

   # OpenAI GPT-4o
   classifier = MarvisTsneClassifier(
       modality="vision",
       openai_model="gpt-4o",
       enable_thinking=True,
   )

   # Google Gemini
   classifier = MarvisTsneClassifier(
       modality="vision",
       gemini_model="gemini-2.0-flash-exp",
   )

   # Whisper embeddings
   classifier = MarvisTsneClassifier(
       modality="audio",
       embedding_model="whisper",
       whisper_model="large-v2",
       include_spectrogram=True,
   )

   # CLAP embeddings
   classifier = MarvisTsneClassifier(
       modality="audio",
       embedding_model="clap",
       clap_version="2023",
   )

   # DINOv2 embeddings
   classifier = MarvisTsneClassifier(
       modality="vision",
       dinov2_model="dinov2_vitb14",
       use_3d=False,
   )


Common exceptions and how to handle them:

.. code-block:: python

   import logging

   from marvis.models.marvis_tsne import MarvisTsneClassifier

   # X and y are the training features and labels prepared beforehand.
   try:
       classifier = MarvisTsneClassifier(
           modality="tabular",
           vlm_model_id="invalid-model",
       )
       classifier.fit(X, y)
   except ValueError as e:
       # Invalid or inconsistent configuration values
       logging.error(f"Configuration error: {e}")
   except RuntimeError as e:
       # Failures while loading or running the model
       logging.error(f"Runtime error: {e}")
   except Exception as e:
       # Anything not covered by the handlers above
       logging.error(f"Unexpected error: {e}")

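
Some configuration mistakes can be caught before any model is loaded. The sketch below checks the ``modality`` argument against the three values listed in the parameter table. ``check_modality`` and ``SUPPORTED_MODALITIES`` are hypothetical names introduced for illustration; the classifier's own validation may differ.

```python
# Values taken from the parameter table; the names here are hypothetical.
SUPPORTED_MODALITIES = ("tabular", "audio", "vision")


def check_modality(modality: str) -> str:
    """Return the modality unchanged, or raise ValueError if unsupported."""
    if modality not in SUPPORTED_MODALITIES:
        raise ValueError(
            f"Unsupported modality {modality!r}; "
            f"expected one of {SUPPORTED_MODALITIES}"
        )
    return modality


print(check_modality("audio"))  # audio

try:
    check_modality("video")
except ValueError as exc:
    print(f"Configuration error: {exc}")
```

Failing early like this keeps a typo in the configuration from surfacing only after a slow model download.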

1. **Start Simple**: Begin with a basic configuration and add complexity gradually.
2. **Cache Embeddings**: Use ``cache_dir`` to avoid recomputing embeddings.
3. **Monitor Resources**: Adjust ``gpu_memory_utilization`` based on your hardware.
4. **Use Appropriate Models**: Use smaller models for development and larger ones for production.
5. **Validate Data**: Ensure your data format matches the expected modality.
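
The idea behind caching embeddings can be illustrated with a small on-disk cache keyed by a hash of the model name and the input bytes. This is a generic sketch, not the scheme marvis uses for ``cache_dir``. ``cache_path``, ``load_or_compute``, and ``fake_embed`` are hypothetical names.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_path(cache_dir: Path, model_name: str, raw: bytes) -> Path:
    """Build a deterministic file name from the model name and input bytes."""
    digest = hashlib.sha256(model_name.encode("utf-8") + b"\0" + raw).hexdigest()
    return cache_dir / f"{digest[:16]}.pkl"


def load_or_compute(cache_dir: Path, model_name: str, raw: bytes, compute):
    """Return cached embeddings if present, otherwise compute and store them."""
    path = cache_path(cache_dir, model_name, raw)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = compute(raw)
    cache_dir.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


calls = []


def fake_embed(raw: bytes) -> list:
    """Stand-in for an expensive embedding model."""
    calls.append(1)
    return [float(b) for b in raw]


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_compute(Path(tmp), "demo-model", b"abc", fake_embed)
    second = load_or_compute(Path(tmp), "demo-model", b"abc", fake_embed)

print(first == second, len(calls))  # True 1
```

The second call reads the stored result instead of invoking the embedding function again, which is the saving a persistent cache directory is meant to provide across runs.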