Wave 2 polish: fix shader build, remove hardcoded paths, add CI#1
Merged
Conversation
- compile.sh: target Vulkan 1.1 (SPIR-V 1.3) so the 12 subgroup-using
quantized kernels (matmul_q*k*, matmul_gpuq*) actually compile; the default
`glslangValidator -V` emits SPIR-V 1.0 and failed on them. Added set -euo
pipefail and a non-zero exit on any failure.
- vulkan_engine.cpp: drop the hardcoded /home/raz/... shader path (leaked a
developer path and broke on every other machine). Fall back to the
TORCH_VULKAN_SHADER_DIR env var instead.
- __init__.py: export the resolved bundled-shader dir into
TORCH_VULKAN_SHADER_DIR so the lazily-constructed VulkanEngine resolves the
same shaders. Aligned the module docstring with the README (.to("vulkan")
is the supported path; torch.randn(device=) / .vulkan() are partial).
- tests: removed hardcoded /home/raz/builds/pytorch-gfx1150 sys.path inserts;
honour TORCH_VULKAN_PYTORCH_PATH instead. Fixed the misleading "relu isn't
implemented" fallback test (relu IS wired) and split it into a real relu
correctness test plus a genuine CPU-fallback test using an unimplemented op.
- Removed dead imports (numpy, time, sys) and unused locals flagged by ruff.
- Added pyproject.toml (ruff config) and .github/workflows/ci.yml that
compiles all shaders and runs ruff + py_compile. The GPU build/test suite is
not run in CI (no GPU on hosted runners).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Owner
Author
Independent verification — verdict: solid (mergeable)Re-checked every concrete claim on a separate machine (glslangValidator + ruff present; cmake/Vulkan GPU absent, so C++ build/pytest remain unverified — matching the PR's own UNVERIFIED section). Shader fix — confirmed real and correct
Paths — confirmed removed
Tests — correct
Lint/CI — confirmed
Overclaims: none. The body is unusually disciplined about the build/pytest gap — independently confirmed cmake is unavailable here, so those remain legitimately unverified rather than overclaimed. Recommend merge. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Polish pass on the Vulkan-compute PyTorch backend. Focus: a real reproducible-build bug in the shader toolchain, removing hardcoded/leaked developer paths, and adding CI for what can be checked without a GPU.
Changes
Shader build (real bug, fixed + verified)
csrc/shaders/compile.shusedglslangValidator -V, which emits SPIR-V 1.0. The 12 subgroup-using quantized kernels (matmul_q4k/q5k/q6k, their_batchvariants, andmatmul_gpuq4/5/6, plusargmax) useGL_KHR_shader_subgroupreductions that require SPIR-V >= 1.3, so the script could not rebuild them. Switched to--target-env vulkan1.1(SPIR-V 1.3). Addedset -euo pipefailand a non-zero exit if any shader fails.glslangValidator: all 38 shaders compile (12 of which failed before).Hardcoded / leaked paths removed (portability)
csrc/vulkan_engine.cpphardcoded/home/raz/projects/torch-vulkan/csrc/shaders/. Removed it; the engine now falls back to theTORCH_VULKAN_SHADER_DIRenv var.torch_vulkan/__init__.pynow exports the resolved bundled-shader directory intoTORCH_VULKAN_SHADER_DIRso the lazily-constructedVulkanEnginefinds the same shaders the Kompute context uses.tests/test_algo_cache.pyandtests/bench_layer.pyhardcoded/home/raz/builds/pytorch-gfx1150onsys.path. Replaced with an optionalTORCH_VULKAN_PYTORCH_PATHenv var.Tests
tests/test_mm.py: the oldtest_cpu_fallbackclaimed "relu isn't implemented" — butreluis wired to the Vulkan backend. Split intotest_relu(real correctness check vs CPU) andtest_unimplemented_op_falls_back_to_cpu(usessign, which has no Vulkan impl, to actually exercise the boxed CPU fallback).Lint / cleanup
numpy,timeinpersistent_pipeline.py;sysinsetup.py) and unused benchmark locals.__init__.pymodule docstring with the README:.to("vulkan")is the supported tensor-creation path;torch.randn(..., device="vulkan")and.vulkan()are only partially wired.cmake -S . -B build(the oldcd buildreferenced a gitignored, non-existent dir).CI (new)
.github/workflows/ci.yml: compiles every shader with--target-env vulkan1.1(guards the regression above) and runsruff+py_compile.pyproject.toml: ruff config (E,F,I;E402ignored because the package intentionally importstorch, then loads_C, then aliases intotorch.vulkan).Verified
.compcompile with the new flag (ran locally).ruff 0.15.15, ran locally on the full tree).py_compile: passes for all Python files..spvbinaries were intentionally left untouched — recompiling them locally produced different bytes (different glslang build) that I can't validate against the target GPU, and the committed ones are the verified artifacts per the README.UNVERIFIED (no toolchain on this machine)
cmakeis not installed, the Kompute dependency is absent, and there is no Vulkan dev headers / GPU here. The C++ edits are minimal and mechanical (one#include <cstdlib>, astd::getenvfallback block, a comment-only change intorch_vulkan.cpp) but were not compiled.pytest tests/) NOT run — requires a Vulkan GPU + Kompute + the custom PyTorch build. No correctness/perf numbers were re-measured; none are claimed here.TODOs left (not addressed — intent unclear / out of scope)
mm_raw(rawVulkanEnginepath) is implemented but still not registered to any aten op.🤖 Generated with Claude Code