Hugging Face Model integration in Superbench #803

Open
Aishwarya-Tonpe wants to merge 1 commit into main from hf-models-clean

@Aishwarya-Tonpe Aishwarya-Tonpe commented Apr 13, 2026

Adds support for loading and benchmarking models from the Hugging Face Hub in the inference micro-benchmarks (ORT and TensorRT). Users can run any compatible HF-hosted model through the existing benchmark harness using --model_source huggingface --model_identifier <org/model>.

SuperBench previously only supported in-house model definitions with hardcoded architectures. Adding new models required code changes. This PR allows benchmarking any compatible HuggingFace model with a CLI flag change, including gated models via HF_TOKEN.

Key Changes

New modules:

  • HuggingFaceModelLoader — Downloads, caches, and loads models from HF Hub. Estimates parameter count from model config (few KB) and checks GPU
    memory before downloading full weights to avoid failed multi-GB downloads.

  • ModelSourceConfig — Dataclass for model source configuration (in-house / huggingface), dtype, revision, auth token, and device mapping.
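
The config-only pre-check described for HuggingFaceModelLoader can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: `ConfigSketch`, `estimate_param_count`, and `fits_in_memory` are hypothetical names, the parameter formula is a coarse transformer approximation that ignores biases, layer norms, and position embeddings, and the 1.2x overhead factor is an assumed safety margin.

```python
from dataclasses import dataclass

# Bytes per parameter for a few common dtypes.
BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for the handful of fields read from a model config."""
    hidden_size: int
    num_hidden_layers: int
    vocab_size: int
    intermediate_size: int


def estimate_param_count(cfg: ConfigSketch) -> int:
    """Rough parameter estimate from config values only (no weights downloaded)."""
    h = cfg.hidden_size
    attention = 4 * h * h                      # Q, K, V and output projections
    mlp = 2 * h * cfg.intermediate_size        # up and down projections
    embeddings = cfg.vocab_size * h            # token embedding table
    return cfg.num_hidden_layers * (attention + mlp) + embeddings


def fits_in_memory(param_count: int, dtype: str, free_bytes: int, overhead: float = 1.2):
    """Return (fits, bytes_needed) for the given dtype and free device memory."""
    needed = int(param_count * BYTES_PER_DTYPE[dtype] * overhead)
    return needed <= free_bytes, needed


if __name__ == "__main__":
    # BERT-base-like dimensions, used purely as an example.
    cfg = ConfigSketch(hidden_size=768, num_hidden_layers=12,
                       vocab_size=30522, intermediate_size=3072)
    params = estimate_param_count(cfg)
    ok, needed = fits_in_memory(params, "float16", free_bytes=16 * 1024**3)
    print(f"~{params / 1e6:.0f}M params, needs ~{needed / 1024**2:.0f} MiB, fits={ok}")
```

Running the check before the download means a model that cannot fit fails after reading a few kilobytes of config instead of after fetching the full weights.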

Micro-benchmarks (inference):

  • ORT inference — Downloads HF model → exports to ONNX → runs ORT inference. Handles both vision (pixel_values) and NLP (input_ids) inputs
    automatically.

  • TensorRT inference — Same flow: download → ONNX export → trtexec engine build → inference. Includes dynamic input shape detection from the
    exported ONNX graph.

  • ONNX exporter — New export_huggingface_model() method with vision/NLP auto-detection, dynamic axes, and external data support for large models
    (>2GB).
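
One way the vision/NLP auto-detection and dynamic axes could fit together is sketched below. The heuristic (treating a config with `image_size` or `num_channels` as a vision model) and the function name `describe_export_inputs` are assumptions made for illustration; the PR's export_huggingface_model() may detect model type differently. The returned `dynamic_axes` mapping follows the shape accepted by the `dynamic_axes` argument of `torch.onnx.export`.

```python
def describe_export_inputs(config: dict) -> dict:
    """Pick ONNX input names and dynamic axes from a model config dictionary.

    Hypothetical heuristic: configs that carry image-related keys are treated
    as vision models, everything else as NLP models.
    """
    is_vision = "image_size" in config or "num_channels" in config
    if is_vision:
        input_names = ["pixel_values"]
        # Only the batch dimension varies; channels/height/width stay fixed.
        dynamic_axes = {"pixel_values": {0: "batch_size"}}
    else:
        input_names = ["input_ids", "attention_mask"]
        # Both batch size and sequence length may vary at inference time.
        dynamic_axes = {
            name: {0: "batch_size", 1: "sequence_length"} for name in input_names
        }
    return {
        "kind": "vision" if is_vision else "nlp",
        "input_names": input_names,
        "dynamic_axes": dynamic_axes,
    }


if __name__ == "__main__":
    print(describe_export_inputs({"image_size": 224, "num_channels": 3}))
    print(describe_export_inputs({"vocab_size": 30522, "hidden_size": 768}))
```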

Testing

  • test_model_source_config.py — Unit tests for validation, defaults, and edge cases.
  • test_huggingface_loader.py — Unit tests for dtype conversion, model size calculation, memory estimation, and param count estimation.
  • test_huggingface_e2e.py — End-to-end integration tests covering micro-benchmarks with real HF models.

Usage

Training benchmark

ORT inference
python examples/benchmarks/ort_inference_performance.py \
    --model_source huggingface --model_identifier bert-base-uncased

TensorRT inference
python examples/benchmarks/tensorrt_inference_performance.py \
    --model_source huggingface --model_identifier microsoft/resnet-50
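
For the TensorRT path, the engine-build step after ONNX export might assemble a trtexec command along these lines. This is a sketch under assumptions: `build_trtexec_command` is a hypothetical helper, the file paths are placeholders, and pinning min/opt/max shapes to the same value is a simplification. The flags used (`--onnx`, `--saveEngine`, `--fp16`, `--int8`, `--minShapes`, `--optShapes`, `--maxShapes`) are standard trtexec options.

```python
def build_trtexec_command(onnx_path, engine_path, input_shapes, precision="fp16"):
    """Assemble a trtexec argument list for a model with dynamic input shapes.

    input_shapes maps each ONNX input name to a concrete shape, for example
    {"pixel_values": (1, 3, 224, 224)}.
    """
    # trtexec expects shapes as name:AxBxC, joined by commas across inputs.
    shape_spec = ",".join(
        f"{name}:{'x'.join(str(dim) for dim in dims)}"
        for name, dims in input_shapes.items()
    )
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        # Simplification: one fixed optimization profile (min == opt == max).
        f"--minShapes={shape_spec}",
        f"--optShapes={shape_spec}",
        f"--maxShapes={shape_spec}",
    ]
    if precision in ("fp16", "int8"):
        cmd.append(f"--{precision}")
    return cmd


if __name__ == "__main__":
    command = build_trtexec_command(
        "model.onnx", "model.plan", {"pixel_values": (1, 3, 224, 224)}
    )
    print(" ".join(command))
```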

Gated models
export HF_TOKEN=hf_xxxxx
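
As an illustration of how the CLI flags and HF_TOKEN might come together in a source configuration, here is a minimal dataclass sketch. It is not the PR's ModelSourceConfig: the class name `ModelSourceSketch`, its field names, its defaults, and its validation rules are all assumptions based on the description above.

```python
import os
from dataclasses import dataclass
from typing import Optional

VALID_SOURCES = ("in-house", "huggingface")


@dataclass
class ModelSourceSketch:
    """Hypothetical model-source settings; field names are illustrative."""
    source: str = "in-house"
    identifier: Optional[str] = None   # e.g. "bert-base-uncased" or "org/model"
    dtype: str = "float32"
    revision: Optional[str] = None
    token: Optional[str] = None
    device_map: Optional[str] = None

    def __post_init__(self):
        if self.source not in VALID_SOURCES:
            raise ValueError(f"source must be one of {VALID_SOURCES}, got {self.source!r}")
        if self.source == "huggingface" and not self.identifier:
            raise ValueError("a model identifier is required when source is 'huggingface'")
        # Fall back to the environment so gated models work without a CLI flag.
        if self.token is None:
            self.token = os.environ.get("HF_TOKEN")


if __name__ == "__main__":
    cfg = ModelSourceSketch(source="huggingface", identifier="bert-base-uncased")
    print(cfg.source, cfg.identifier, "token set:", cfg.token is not None)
```

Reading the token from the environment keeps credentials out of shell history and benchmark config files.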

@Aishwarya-Tonpe Aishwarya-Tonpe requested a review from a team as a code owner April 13, 2026 17:36
Copilot AI review requested due to automatic review settings April 13, 2026 17:36

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds HuggingFace Hub as a first-class model source across SuperBench training benchmarks and ORT/TensorRT inference micro-benchmarks, enabling users to benchmark arbitrary HF models via CLI flags (including gated models via HF_TOKEN).

Changes:

  • Introduces ModelSourceConfig and HuggingFaceModelLoader for unified HF model configuration/loading and memory-fit checks.
  • Extends PyTorch model benchmarks to optionally load HF backbones and wrap them with task-specific heads.
  • Adds HF→ONNX export support and integrates HF flows into ORT and TensorRT inference micro-benchmarks, plus new tests and examples.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/benchmarks/micro_benchmarks/test_model_source_config.py | Adds unit tests for ModelSourceConfig validation/defaulting. |
| tests/benchmarks/micro_benchmarks/test_huggingface_loader.py | Adds unit tests for HF loader dtype handling, load flow, and size estimation. |
| tests/benchmarks/micro_benchmarks/test_huggingface_e2e.py | Adds integration tests that download real HF models and validate basic forward pass. |
| superbench/benchmarks/model_benchmarks/pytorch_mixtral_impl.py | Adds HF config customization + wrapper and HF-loading branch for Mixtral benchmark. |
| superbench/benchmarks/model_benchmarks/pytorch_lstm.py | Adds HF-loading path + wrapper and refactors in-house model creation. |
| superbench/benchmarks/model_benchmarks/pytorch_llama.py | Adds HF-loading path + wrapper and refactors in-house model creation. |
| superbench/benchmarks/model_benchmarks/pytorch_gpt2.py | Adds HF-loading path + wrapper and refactors in-house model creation. |
| superbench/benchmarks/model_benchmarks/pytorch_cnn.py | Adds HF-loading path + wrapper for HF vision backbones, keeps in-house torchvision path. |
| superbench/benchmarks/model_benchmarks/pytorch_bert.py | Adds HF-loading path + wrapper and refactors in-house model creation. |
| superbench/benchmarks/model_benchmarks/pytorch_base.py | Adds shared HF model loading flow, memory estimation, and CLI args for model source/identifier. |
| superbench/benchmarks/micro_benchmarks/tensorrt_inference_performance.py | Adds HF model preprocessing: config-only memory check, HF load, ONNX export, TRT build command. |
| superbench/benchmarks/micro_benchmarks/ort_inference_performance.py | Adds HF preprocessing (config memory check, HF load, ONNX export/quantize) + dynamic input handling. |
| superbench/benchmarks/micro_benchmarks/model_source_config.py | New dataclass encapsulating model source, identifier, dtype, token, and loader kwargs. |
| superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py | New loader for HF Hub with tokenizer support, size/memory estimation utilities, and pre-checks. |
| superbench/benchmarks/micro_benchmarks/_export_torch_to_onnx.py | Adds HF model ONNX export with vision/NLP detection, dynamic axes, and optional external data output. |
| examples/benchmarks/tensorrt_inference_performance.py | Updates example script to show in-house vs HF usage via CLI. |
| examples/benchmarks/pytorch_huggingface_models.py | New example demonstrating HF-backed training benchmarks, incl. distributed option. |
| examples/benchmarks/ort_inference_performance.py | Updates ORT example script to show in-house vs HF usage via CLI. |


Comment thread superbench/benchmarks/model_benchmarks/pytorch_base.py Outdated
Comment thread tests/benchmarks/micro_benchmarks/test_model_source_config.py
Comment thread superbench/benchmarks/micro_benchmarks/_export_torch_to_onnx.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/ort_inference_performance.py
Comment thread superbench/benchmarks/micro_benchmarks/model_source_config.py Outdated
Comment thread superbench/benchmarks/model_benchmarks/pytorch_base.py Outdated
@Aishwarya-Tonpe Aishwarya-Tonpe changed the title from "Hf models clean" to "Hugging Face Model integration in Superbench" Apr 14, 2026
Copilot AI review requested due to automatic review settings April 14, 2026 17:30

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 13 comments.



Comment thread superbench/benchmarks/model_benchmarks/pytorch_base.py Outdated
Comment thread superbench/benchmarks/model_benchmarks/pytorch_base.py Outdated
Comment thread superbench/benchmarks/model_benchmarks/pytorch_base.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread superbench/benchmarks/micro_benchmarks/_export_torch_to_onnx.py
Comment thread superbench/benchmarks/micro_benchmarks/ort_inference_performance.py
Comment thread superbench/benchmarks/micro_benchmarks/tensorrt_inference_performance.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/ort_inference_performance.py Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.



Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread superbench/benchmarks/micro_benchmarks/model_source_config.py
Comment thread superbench/benchmarks/micro_benchmarks/model_source_config.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/model_source_config.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/tensorrt_inference_performance.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/tensorrt_inference_performance.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/_export_torch_to_onnx.py
Comment thread tests/benchmarks/micro_benchmarks/test_huggingface_e2e.py Outdated
Copilot AI review requested due to automatic review settings April 14, 2026 20:51

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.



Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/_export_torch_to_onnx.py
Comment thread tests/benchmarks/micro_benchmarks/test_huggingface_loader.py Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.



Comment thread tests/benchmarks/micro_benchmarks/test_huggingface_e2e.py
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
Comment thread tests/benchmarks/micro_benchmarks/test_huggingface_e2e.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py Outdated
Comment thread superbench/benchmarks/micro_benchmarks/huggingface_model_loader.py
…e benchmarks

- Add HuggingFaceModelLoader for downloading and caching models from HF Hub
- Support both NLP (AutoModelForCausalLM) and vision (AutoModelForImageClassification) models
- Add model_source and model_identifier parameters to TensorRT/ORT benchmarks
- Add ONNX export pipeline for HuggingFace models with dynamic axes
- Derive vision input shapes from ONNX graph dims with HF config fallback
- Filter ONNX initializers from graph.input for correct NLP input handling
- Add PyTorch 2.8+ compatibility (external_data vs use_external_data_format)
- Add example script, unit tests, and config schema updates
- Support HF_TOKEN env var for gated model access
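
The initializer-filtering point in the commit message can be illustrated with a short sketch. Some exported ONNX graphs list weights in `graph.input` as well as in `graph.initializer`, so the real runtime inputs are the entries of `graph.input` whose names do not appear among the initializers. The helper name `real_graph_inputs` is hypothetical, and the demo uses `SimpleNamespace` stand-ins so that it runs without the onnx package installed.

```python
from types import SimpleNamespace


def real_graph_inputs(graph):
    """Return graph inputs that are not weights.

    Works on any object exposing `.input` and `.initializer` sequences whose
    items have a `.name`, which matches the layout of an ONNX GraphProto.
    """
    initializer_names = {init.name for init in graph.initializer}
    return [inp for inp in graph.input if inp.name not in initializer_names]


if __name__ == "__main__":
    # Stand-in graph: two true inputs plus one weight listed as an input.
    graph = SimpleNamespace(
        input=[
            SimpleNamespace(name="input_ids"),
            SimpleNamespace(name="attention_mask"),
            SimpleNamespace(name="embeddings.weight"),
        ],
        initializer=[SimpleNamespace(name="embeddings.weight")],
    )
    print([inp.name for inp in real_graph_inputs(graph)])
```

With the onnx package available, the same function should accept `onnx.load(path).graph` directly.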