feat: multi-GPU support, hardware compat template, Dockerfile.cuda by tbitcs · Pull Request #20 · BitConcepts/oea-framework-paper

tbitcs · 2026-05-19T17:20:17Z

Adds community GPU support, honest test-gap documentation, and a hardware bug-reporting path.

What's in this PR

New files:

`Dockerfile.cuda`: NVIDIA CUDA 12.1 image (`nvidia/cuda:12.1.1-runtime-ubuntu22.04`, torch 2.3.1+cu121)
`.github/ISSUE_TEMPLATE/hardware_compat.md`: hardware compatibility report template for ROCm/XPU/MPS community contributors

Experiment harness (`real_lm_experiment.py`):

Add `--device` flag: `cuda`, `rocm`, `xpu`, `mps`, `cpu`
Auto-detection chain: CUDA > ROCm (HIP) > Intel XPU > MPS > CPU
ROCm/XPU/MPS paths print a community-tested note and an issues link in the device output
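
The flag and detection order above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `real_lm_experiment.py`: the names `PRIORITY`, `probe_backends`, `select_backend`, `torch_device_string`, and `device_note` are hypothetical, the issues link is left as a parameter because the PR text does not give it, and raising an error when a requested backend is unavailable is an assumed policy. The sketch relies on ROCm builds of PyTorch exposing the GPU through the `torch.cuda` API with `torch.version.hip` set.

```python
import argparse
from typing import Dict, Optional

# Detection order from the PR description: CUDA > ROCm (HIP) > Intel XPU > MPS > CPU.
PRIORITY = ("cuda", "rocm", "xpu", "mps", "cpu")
# Backends the PR labels community-tested rather than verified.
COMMUNITY_TESTED = frozenset({"rocm", "xpu", "mps"})


def probe_backends() -> Dict[str, bool]:
    """Report which backends look usable; CPU-only if torch is not installed."""
    found = {name: False for name in PRIORITY}
    found["cpu"] = True
    try:
        import torch
    except ImportError:
        return found
    if torch.cuda.is_available():
        # ROCm builds reuse the torch.cuda API and set torch.version.hip.
        is_hip = getattr(torch.version, "hip", None) is not None
        found["rocm" if is_hip else "cuda"] = True
    xpu = getattr(torch, "xpu", None)  # absent in older torch releases
    if xpu is not None and xpu.is_available():
        found["xpu"] = True
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        found["mps"] = True
    return found


def select_backend(requested: Optional[str] = None,
                   available: Optional[Dict[str, bool]] = None) -> str:
    """Honor an explicit request, otherwise walk PRIORITY in order."""
    if available is None:
        available = probe_backends()
    if requested is not None:
        if requested not in PRIORITY:
            raise ValueError(f"unknown device: {requested!r}")
        if not available.get(requested, False):
            # Assumed policy: fail loudly instead of silently falling back.
            raise RuntimeError(f"requested device {requested!r} is not available")
        return requested
    for name in PRIORITY:
        if available.get(name, False):
            return name
    return "cpu"


def torch_device_string(backend: str) -> str:
    """Map a backend label to a torch device string; ROCm uses 'cuda'."""
    return "cuda" if backend in ("cuda", "rocm") else backend


def device_note(backend: str, issues_url: str) -> str:
    """One-line device report; community-tested backends get the issues link."""
    if backend in COMMUNITY_TESTED:
        return f"device: {backend} (community-tested; report problems at {issues_url})"
    return f"device: {backend}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", choices=PRIORITY, default=None,
                        help="force a backend; omit for auto-detection")
    return parser
```

With these helpers, a harness could call `select_backend(build_parser().parse_args().device)` and pass the result through `torch_device_string` before creating tensors. Keeping the probe separate from the selection logic means the priority order can be tested without any GPU present.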

Requirements / Docker:

`requirements-lock.txt`: install instructions for all backends (ROCm 6.x, Intel XPU, CUDA 12.4+, MPS) with verified/community labels; fix incorrect ABI comment from dependabot bump
`Dockerfile`: update to current pinned versions, clarify CPU-only scope

Docs:

`README.md`: GPU table with test-status column, Docker sub-section, CI hardware gap note
`REPRODUCE.md`: hardware test matrix, untested hardware / help-wanted section
`CHANGELOG.md`: all changes documented under `[Unreleased]`

Co-Authored-By: Oz <oz-agent@warp.dev>

- real_lm_experiment.py: add --device flag (cuda/rocm/xpu/mps/cpu); auto-detection extended to AMD ROCm (HIP) and Intel XPU/Arc; community-tested backends print issue link in device output
- requirements-lock.txt: add install commands for ROCm 6.x, Intel XPU, CUDA 12.4+, MPS with per-backend verified/community-tested labels; fix incorrect ABI comment left by dependabot bump
- Dockerfile: update to current pinned versions (numpy 2.4.5 etc.); clarify CPU-only scope and point to Dockerfile.cuda for GPU
- Dockerfile.cuda: new NVIDIA CUDA 12.1 image (nvidia/cuda:12.1.1-runtime-ubuntu22.04); verified on RTX 4070 SUPER with torch 2.3.1+cu121
- .github/ISSUE_TEMPLATE/hardware_compat.md: hardware compatibility report template for ROCm/XPU/MPS community contributors
- README.md: GPU table with test-status column; Docker sub-section; CI hardware gap note; link to hardware compat template
- REPRODUCE.md: hardware test matrix; untested hardware help-wanted section
- CHANGELOG.md: document all changes under [Unreleased]

Co-Authored-By: Oz <oz-agent@warp.dev>

tbitcs and others added 2 commits May 19, 2026 13:17

fix: MD040 missing code fence language in hardware_compat.md

b2c85ec

Co-Authored-By: Oz <oz-agent@warp.dev>

tbitcs merged commit 3e85da7 into main May 19, 2026
3 checks passed

tbitcs merged 2 commits into main from develop