HEDTools - Python

Python tools for validation, analysis, and transformation of HED (Hierarchical Event Descriptors) tagged datasets.

Overview

HED (Hierarchical Event Descriptors) is a framework for systematically describing both laboratory and real-world events as well as other experimental metadata. HED tags are comma-separated path strings that provide a standardized vocabulary for annotating events and experimental conditions.
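
To make the comma-separated structure concrete, here is a minimal sketch (not the HEDTools parser) that splits a HED string into its top-level elements. It assumes that parentheses group tags, as in the string used under Basic usage; the function name split_top_level is made up for this illustration.

```python
def split_top_level(hed_string: str) -> list[str]:
    """Split a HED string on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for char in hed_string:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("Unbalanced ')' in HED string")
        if char == "," and depth == 0:
            # A comma outside any group ends the current element
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    if depth != 0:
        raise ValueError("Unbalanced '(' in HED string")
    parts.append("".join(current).strip())
    # Drop empty elements caused by stray or trailing commas
    return [part for part in parts if part]


example = "Sensory-event, Visual-presentation, (Onset, (Red, Square))"
print(split_top_level(example))
# ['Sensory-event', 'Visual-presentation', '(Onset, (Red, Square))']
```

This only illustrates the string layout. For real annotations, use HedString with a loaded schema as shown under Basic usage.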

Key Features:

  • Validate HED annotations against schema specifications
  • Analyze and summarize HED-tagged datasets
  • Full HED support in BIDS (Brain Imaging Data Structure)
  • HED support in NWB (Neurodata Without Borders) when used with the ndx-hed extension
  • Platform-independent and data-neutral
  • Command-line tools and Python API

Note: Table remodeling tools have been moved to a separate package. See table-remodeler on PyPI or visit https://www.hedtags.org/table-remodeler for more information.

Quick start

Online tools (no installation required)

For simple validation or transformation tasks, use the online tools at https://hedtools.org/hed - no installation needed!

Browser-based validation (no data upload) is available at https://www.hedtags.org/hed-javascript

A development version of the online tools is available at: https://hedtools.org/hed_dev

Python installation

Requirements: Python 3.10 or higher

Install from PyPI:

pip install hedtools

Or install from GitHub (latest):

pip install git+https://github.com/hed-standard/hed-python/@main

Development installation

For development work or to access optional features, install from the cloned repository:

# Clone the repository
git clone https://github.com/hed-standard/hed-python.git
cd hed-python

# Install in editable mode with base dependencies
pip install -e .

# Install with optional dependency groups
pip install -e ".[dev]"       # Development tools (ruff, typos)
pip install -e ".[docs]"      # Documentation tools (sphinx, furo)
pip install -e ".[test]"      # Testing tools (coverage)
pip install -e ".[examples]"  # Jupyter notebook support

# Install all optional dependencies
pip install -e ".[dev,docs,test,examples]"

Optional dependency groups:

  • dev - Code quality tools: ruff (linter + formatter), typos, mdformat
  • docs - Documentation generation: sphinx, furo theme, myst-parser
  • test - Code coverage reporting: coverage
  • examples - Jupyter notebook support: jupyter, notebook, ipykernel

Basic usage

from hed import HedString, get_printable_issue_string, load_schema_version


# Load a specific HED schema version
schema = load_schema_version("8.4.0")

# Create and validate a HED string
hed_string = HedString("Sensory-event, Visual-presentation, (Onset, (Red, Square))", schema)
issues = hed_string.validate()

if issues:
    print(get_printable_issue_string(issues, title="Validation issues found"))
else:
    print("HED string is valid!")

Command-line tools

HEDTools provides a unified command-line interface with git-like subcommands:

# Main command (new unified interface)
hedpy --help

# Validate a BIDS dataset
hedpy validate-bids /path/to/bids/dataset

# Extract sidecar template from BIDS dataset
hedpy extract-sidecar /path/to/dataset --suffix events

# Validate HED schemas
hedpy schema validate /path/to/schema.xml

# Convert schema between formats (XML, MEDIAWIKI, TSV, JSON)
hedpy schema convert /path/to/schema.xml

Legacy commands (deprecated, use hedpy instead):

validate_bids /path/to/dataset
hed_validate_schemas /path/to/schema.xml

Note: The run_remodel command has been removed. Table remodeling functionality is now available in the separate table-remodeler package.

Note: The visualization tools, such as the word cloud visualization, have been moved to the separate hed-vis project.

For more examples, see the user guide.

Jupyter notebook examples

Note: Example notebooks are available in the GitHub repository only, not in the PyPI package.

The examples/ directory contains Jupyter notebooks demonstrating common HED workflows with BIDS datasets:

  • validate_bids_dataset.ipynb - Validate HED annotations in a BIDS dataset
  • summarize_events.ipynb - Summarize event file contents and value counts
  • sidecar_to_spreadsheet.ipynb - Convert JSON sidecars to spreadsheet format
  • merge_spreadsheet_into_sidecar.ipynb - Merge spreadsheet annotations into JSON sidecars
  • extract_json_template.ipynb - Generate JSON sidecar templates from event files
  • find_event_combinations.ipynb - Find unique combinations of event values
  • validate_bids_dataset_with_libraries.ipynb - Validate with HED library schemas

To use these notebooks:

# Clone the repository to get the examples
git clone https://github.com/hed-standard/hed-python.git
cd hed-python

# Install HEDTools with Jupyter support
pip install -e ".[examples]"

# Launch Jupyter
jupyter notebook examples/

See examples/README.md for more details.
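
For a feel of what a value-count summary involves, the following standard-library sketch tallies the values in each column of a tab-separated events file. It is not the code in summarize_events.ipynb, which uses HEDTools; the function name, the sample rows, and the choice to skip the onset and duration columns are assumptions made for illustration.

```python
import csv
import io
from collections import Counter, defaultdict


def count_column_values(tsv_text: str, skip_columns=("onset", "duration")) -> dict[str, Counter]:
    """Count how often each value occurs in every non-skipped column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = defaultdict(Counter)
    for row in reader:
        for column, value in row.items():
            if column not in skip_columns:
                counts[column][value] += 1
    return dict(counts)


# Illustrative events table (made-up rows)
sample = "onset\tduration\ttrial_type\n0.0\t1.0\tgo\n2.0\t1.0\tstop\n4.0\t1.0\tgo\n"
for column, counter in count_column_values(sample).items():
    print(column, dict(counter))
# trial_type {'go': 2, 'stop': 1}
```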

Documentation

📖 Full Documentation: https://www.hedtags.org/hed-python

Building docs locally

# Install documentation dependencies
pip install -e ".[docs]"

# Build the documentation
cd docs
sphinx-build -b html . _build/html

To view the built documentation, open docs/_build/html/index.html in your browser.

Code Formatting

This project uses ruff format for consistent code formatting.

# Check if code is properly formatted (without making changes)
ruff format --check .

# Check and show what would change
ruff format --check --diff .

# Format all Python code in the repository
ruff format .

# Format specific files or directories
ruff format hed/
ruff format tests/

Configuration: Formatter settings are in pyproject.toml under [tool.ruff.format] with a line length of 120 characters.

Exclusions: Ruff automatically excludes .venv/, __pycache__/, auto-generated files (hed/_version.py), and external repos (spec_tests/hed-examples/, spec_tests/hed-schemas/).

CI integration: All code is automatically checked for formatting in GitHub Actions. Run ruff format . before committing to ensure your code passes CI checks.

Related repositories

The HED ecosystem consists of several interconnected repositories:

Repository          Description
hed-python          Python validation and analysis tools (this repo)
hed-web             Web interface and deployable Docker services
hed-resources       Tutorials and other HED resources
hed-specification   Official HED specification documents
hed-schemas         Official HED schema repository
table-remodeler     Table transformation and remodeling tools
ndx-hed             HED support for NWB
hed-javascript      JavaScript HED validation tools

Contributing

We welcome contributions! Here's how you can help:

  1. Report issues: Use GitHub Issues for bug reports and feature requests
  2. Submit pull requests (PRs): PRs should come from a non-main branch of your fork and target the main branch
  3. Improve documentation: Help us make HED easier to use
  4. Share examples: Contribute example code and use cases

Development setup:

# Clone the repository
git clone https://github.com/hed-standard/hed-python.git
cd hed-python

# Install in development mode with all dependencies
pip install -e ".[dev,test,docs,examples]"

# Or just core + test dependencies
pip install -e ".[test]"

# Run tests
python -m unittest discover tests

# Run specific test file
python -m unittest tests/path/to/test_file.py

# Test notebooks (requires examples dependencies)
python -m unittest tests.test_notebooks
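
Test discovery collects files whose names match unittest's default test*.py pattern. A minimal sketch of such a module follows; the file name tests/test_example.py, the helper normalize_tag, and the test class are hypothetical and only illustrate the layout, not any existing test in this repository.

```python
# Hypothetical file: tests/test_example.py
import unittest


def normalize_tag(tag: str) -> str:
    """Hypothetical helper under test: trim whitespace around a tag."""
    return tag.strip()


class TestNormalizeTag(unittest.TestCase):
    def test_strips_whitespace(self):
        self.assertEqual(normalize_tag("  Red "), "Red")

    def test_leaves_clean_tag_unchanged(self):
        self.assertEqual(normalize_tag("Square"), "Square")
```

Running python -m unittest discover tests would then execute both test methods along with the rest of the suite.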

For detailed contribution guidelines, please see CONTRIBUTING.md.

Configuration

Schema caching

By default, HED schemas are cached in ~/.hedtools/ (location varies by OS).

# Change the cache directory
import hed
hed.schema.set_cache_directory('/custom/path/to/cache')

Starting with hedtools 0.2.0, local copies of recent schema versions are bundled within the package for offline access.
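
The set_cache_directory call above can be wrapped so that the cache location is chosen per machine. The sketch below is one possible arrangement under stated assumptions: the environment variable HED_CACHE_DIR is a made-up name that HEDTools itself does not read, and choose_cache_dir and apply_cache_dir are hypothetical helpers.

```python
import os
from pathlib import Path


def choose_cache_dir(env=None) -> Path:
    """Return the directory named by HED_CACHE_DIR (hypothetical variable),
    falling back to ~/.hedtools when it is unset or empty."""
    env = os.environ if env is None else env
    override = env.get("HED_CACHE_DIR")
    return Path(override).expanduser() if override else Path.home() / ".hedtools"


def apply_cache_dir(cache_dir: Path) -> bool:
    """Point HEDTools at cache_dir; return False if hedtools is not installed."""
    try:
        import hed
    except ImportError:
        return False
    hed.schema.set_cache_directory(str(cache_dir))
    return True
```

A script would call apply_cache_dir(choose_cache_dir()) once at startup, before loading any schema.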

Citation

If you use HEDTools in your research, please cite:

@software{hedtools,
  author = {Callanan, Ian and Robbins, Kay and others},
  title = {HEDTools: Python tools for HED},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/hed-standard/hed-python},
  doi = {10.5281/zenodo.8056010}
}

License

HEDTools is licensed under the MIT License. See LICENSE for details.

Support

Funding

Partial support for this project was provided by NIH 1R01MH126700-01A1.