This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
HyperTools is a Python library for visualizing and manipulating high-dimensional data. It provides a unified interface for dimensionality reduction, data alignment, clustering, and visualization, built on top of matplotlib, scikit-learn, and seaborn.
- pytest - Run all tests from the hypertools/ directory
- pytest tests/test_<module>.py - Run tests for a specific module
- pytest tests/test_<module>.py::test_<function> - Run a specific test function
- pip install -e . - Install in development mode
- pip install -r requirements.txt - Install dependencies
- pip install -r docs/doc_requirements.txt - Install documentation dependencies
- cd docs && make html - Build HTML documentation
- cd docs && make clean - Clean documentation build files
DataGeometry Class (hypertools/datageometry.py)
- Central data container that holds raw data, transformed data, and transformation parameters
- Stores matplotlib figure/axes handles and animation objects
- Contains normalization, reduction, and alignment model parameters
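As a rough illustration of what such a container holds, the sketch below models the three kinds of state listed above with a plain dataclass. The class name GeometrySketch and its field names are hypothetical and simplified; the authoritative attributes are the ones defined in hypertools/datageometry.py.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class GeometrySketch:
    """Hypothetical, simplified stand-in for a DataGeometry-style container."""

    # Raw input and the result of running the pipeline on it.
    data: list
    xform_data: Optional[list] = None

    # Parameters of the models that produced xform_data (None = step skipped).
    normalize: Optional[str] = None
    reduce: Optional[dict] = None
    align: Optional[dict] = None

    # Plot state: figure/axes handles and any animation object.
    fig: Any = None
    ax: Any = None
    line_ani: Any = None

    def applied_steps(self) -> list:
        """Names of the transformation steps that have stored parameters."""
        steps = {"normalize": self.normalize, "align": self.align, "reduce": self.reduce}
        return [name for name, params in steps.items() if params is not None]


geo = GeometrySketch(
    data=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    normalize="across",
    reduce={"model": "PCA", "params": {"n_components": 2}},
)
print(geo.applied_steps())  # ['normalize', 'reduce']
```

Keeping the model parameters next to the data is what would let a stored transformation be re-applied or inspected later without recomputing it.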
Main API Functions (hypertools/__init__.py)
- plot() - Primary visualization function
- analyze() - Data analysis and dimensionality reduction
- reduce() - Dimensionality reduction utilities
- align() - Data alignment across datasets
- normalize() - Data normalization
- describe() - Data description and summary
- cluster() - Clustering functionality
- load() - Data loading utilities
Tools Module (hypertools/tools/)
- align.py - Hyperalignment and Procrustes alignment
- reduce.py - Dimensionality reduction (PCA, t-SNE, UMAP, etc.)
- normalize.py - Data normalization methods
- cluster.py - K-means and other clustering algorithms
- format_data.py - Data preprocessing and formatting
- text2mat.py - Text-to-matrix conversion
- df2mat.py - DataFrame-to-matrix conversion
- load.py - Data loading from various sources
- missing_inds.py - Missing data handling
- procrustes.py - Procrustes analysis
Plot Module (hypertools/plot/)
- plot.py - Main plotting interface and logic
- backend.py - matplotlib backend configuration
- draw.py - Low-level drawing functions
External Dependencies (hypertools/_externals/)
- ppca.py - Probabilistic Principal Component Analysis
- srm.py - Shared Response Model
- Input Processing: Data is formatted and validated through format_data()
- Normalization: Optional data normalization via normalize()
- Alignment: Optional cross-dataset alignment via align()
- Dimensionality Reduction: Data is reduced via reduce()
- Clustering: Optional clustering via cluster()
- Visualization: Final plotting through plot()
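The ordering above, with its optional steps, can be sketched as a chain of stages in which any stage set to None is skipped. This is a minimal pure-Python illustration, not HyperTools code: run_pipeline, zscore_columns, and keep_first_dims are hypothetical stand-ins, and keep_first_dims simply truncates columns rather than performing a real reduction such as PCA.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Stand-in normalizer: z-score each column across all rows."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def keep_first_dims(ndims):
    """Stand-in reducer: keep the first ndims columns (NOT a real PCA)."""
    def reducer(rows):
        return [row[:ndims] for row in rows]
    return reducer


def run_pipeline(data, normalize=None, align=None, reduce=None, cluster=None):
    """Apply the stages in the order listed above, skipping any that are None."""
    # Input processing: coerce everything to rows of floats.
    rows = [[float(v) for v in row] for row in data]
    for stage in (normalize, align, reduce):
        if stage is not None:
            rows = stage(rows)
    # Clustering labels the resulting points; it does not change them.
    labels = cluster(rows) if cluster is not None else None
    return rows, labels


points, labels = run_pipeline(
    [[1, 10, 100], [3, 30, 300]],
    normalize=zscore_columns,
    reduce=keep_first_dims(2),
)
print(points)  # [[-1.0, -1.0], [1.0, 1.0]]
print(labels)  # None
```

Passing each stage as a callable keeps the optional steps explicit at the call site; visualization is left out because it only consumes the returned points and labels.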
- Modular Architecture: Each major operation (align, reduce, normalize, etc.) is in its own module
- Unified Interface: All functions accept similar input formats (lists of arrays, DataFrames, etc.)
- Flexible Data Types: Supports numpy arrays, pandas DataFrames, text data, and mixed inputs
- Matplotlib Integration: Deep integration with matplotlib for customizable visualizations
- Animation Support: Built-in support for animated visualizations
- The package follows a functional programming style with separate modules for each operation
- All major functions are designed to work with multiple input formats
- The DataGeometry class serves as the central data container and state manager
- Tests are located in the tests/ directory and follow pytest conventions
- Documentation is built with Sphinx and uses example galleries
- The codebase maintains compatibility with Python 3.9+
- Unit tests for individual tools and functions
- Integration tests for end-to-end workflows
- Example-based testing through documentation
- Visual regression testing for plot outputs
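A unit test in this style is a plain function whose name starts with test_, placed in a tests/test_<module>.py file so that pytest can collect it. The sketch below is illustrative only: center_columns is a hypothetical helper defined inline to keep the example self-contained, and it is not part of HyperTools.

```python
import math


def center_columns(rows):
    """Hypothetical helper under test: subtract each column's mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def test_center_columns_zero_mean():
    centered = center_columns([[1.0, 4.0], [3.0, 8.0]])
    # Every column of the result should average to zero.
    for col in zip(*centered):
        assert math.isclose(sum(col) / len(col), 0.0, abs_tol=1e-12)


def test_center_columns_preserves_shape():
    rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    centered = center_columns(rows)
    assert len(centered) == len(rows)
    assert all(len(r) == 3 for r in centered)
```

Saved under a hypothetical name such as tests/test_example.py, a single case could be selected with pytest tests/test_example.py::test_center_columns_zero_mean.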