…ry/Raycore.jl into sd/gpu-instanced-bvh
…aycore.jl into sd/multitype-vec
…12)

SetKey.type_idx was changed from UInt8 to UInt32 for LLVM/SPIR-V compatibility, but the @generated with_index function still compared against UInt8 literals. Since Julia's === checks both value and type, UInt32(1) === UInt8(1) is always false, causing all branches to fall through to the default (first material). This made every object render with the same material.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
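The type-sensitivity of `===` described above can be reproduced in a few lines. This is a minimal sketch: `pick_material_buggy` and `pick_material_fixed` are hypothetical stand-ins for a generated branch chain, not the actual `with_index` code.

```julia
# `===` compares type as well as value; `==` compares numeric value only.
idx = UInt32(1)
@assert (idx === UInt8(1)) == false   # same value, different type
@assert (idx == UInt8(1)) == true     # numeric comparison promotes first
@assert (idx === UInt32(1)) == true   # same value, same type

# Hypothetical branch chain that still compares against UInt8 literals.
# With a UInt32 index no branch matches, so the default is always returned.
function pick_material_buggy(materials::Tuple, type_idx::UInt32)
    type_idx === UInt8(1) && return materials[1]
    type_idx === UInt8(2) && return materials[2]
    return materials[1]   # default: first material
end

# Same chain with literals of the matching type.
function pick_material_fixed(materials::Tuple, type_idx::UInt32)
    type_idx === UInt32(1) && return materials[1]
    type_idx === UInt32(2) && return materials[2]
    return materials[1]
end
```

With `materials = (:red, :blue)`, the buggy version returns `:red` for every index, which matches the "every object renders with the same material" symptom.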
On Metal, device pointers (Core.LLVMPtr) stored inside GPU buffers cannot be reliably dereferenced by kernels. The inline data (root_aabb) reads correctly, but following embedded pointers to per-BLAS node/primitive arrays returns zeros.

Replace the pointer-based BLAS architecture in StaticTLAS with:
- BLASDescriptor: lightweight struct with nodes_offset, primitives_offset, root_aabb
- Flat concatenated arrays (all_blas_nodes, all_blas_prims) built from per-BLAS GPU arrays
- Offset-based indexing in closest_hit/any_hit traversal

Management kernels (update_tlas_leaf_aabbs_kernel!, etc.) still use blas_array but only read root_aabb (inline, unaffected).

Verified: CPU and Metal produce identical results (mean pixel ~0.327 on 3-sphere test scene).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
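The offset-based layout can be illustrated on the CPU with plain arrays. This is a sketch under assumptions: the field names come from the commit message, but the field types, the 0-based offset convention, and the helpers `flatten_blas` and `blas_node` are hypothetical, and `root_aabb` is left out for brevity.

```julia
# Hypothetical descriptor: offsets replace embedded device pointers.
struct BLASDescriptor
    nodes_offset::UInt32        # 0-based start of this BLAS in the flat node array
    primitives_offset::UInt32   # 0-based start in the flat primitive array
end

# Concatenate per-BLAS arrays and record where each BLAS starts.
function flatten_blas(per_blas_nodes::Vector{Vector{T}},
                      per_blas_prims::Vector{Vector{P}}) where {T,P}
    all_nodes = T[]
    all_prims = P[]
    descriptors = BLASDescriptor[]
    for (nodes, prims) in zip(per_blas_nodes, per_blas_prims)
        # Offsets are taken before appending, so they point at the first element.
        push!(descriptors, BLASDescriptor(length(all_nodes), length(all_prims)))
        append!(all_nodes, nodes)
        append!(all_prims, prims)
    end
    return descriptors, all_nodes, all_prims
end

# Local (1-based) index within a BLAS -> index into the flat array.
blas_node(all_nodes, d::BLASDescriptor, local_idx::Integer) =
    all_nodes[d.nodes_offset + local_idx]

descriptors, all_nodes, all_prims =
    flatten_blas([[10, 11], [20, 21, 22]], [[1], [2, 3]])
```

Because a descriptor holds only integers (plus inline data), a kernel never has to follow a pointer that was stored in a buffer; it indexes the flat arrays it received as arguments.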
Pkg.test() defaults to --check-bounds=yes, which injects error paths that can't compile to SPIR-V. GPU tests now auto-skip with @test_broken when bounds checking is forced.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
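One way such a skip could be written is sketched below. It assumes the check is made through `Base.JLOptions().check_bounds` (an internal field where 1 corresponds to `--check-bounds=yes`); `bounds_checks_forced` and `run_gpu_smoke_test` are hypothetical names, and the actual test suite may detect this differently.

```julia
using Test

# Base.JLOptions().check_bounds: 0 = default, 1 = --check-bounds=yes, 2 = --check-bounds=no
bounds_checks_forced() = Base.JLOptions().check_bounds == 1

# Hypothetical placeholder standing in for a real GPU kernel test.
run_gpu_smoke_test() = true

@testset "GPU tests (sketch)" begin
    if bounds_checks_forced()
        # Recorded as Broken rather than failing the whole run.
        @test_broken false
    else
        @test run_gpu_smoke_test()
    end
end
```

Running the suite with `Pkg.test(; julia_args=["--check-bounds=auto"])` is another way to avoid the forced setting, at the cost of a non-default invocation.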
- viewfactors_content.md: Build visualization mesh from TLAS primitives instead of raw merged mesh. The TLAS removes degenerate triangles, so face counts differed (202 vs 250), causing FaceView verification error.
- gpu_raytracing_tutorial.md: bvh.primitives → bvh.all_blas_prims (StaticTLAS field was renamed in TLAS/BLAS refactor)
- bvh_hit_tests_content.md: Fix swapped benchmark labels (closest_hit was showing any_hit timing and vice versa), remove empty section header, fix test numbering (3 not 4)
- instanced-bvh-architecture.md: Replace broken example using TriangleMesh/inv_translate with working high-level TLAS API
- raytracing_tutorial_content.md: Fix "Analougus" → "Analogous"
- README.md: Add MultiTypeSet and GPU TLAS to features list
Results (400×720, 4spp, 6014 triangles):
- Wavefront GPU: 2.7 ms (winner, 223x vs CPU baseline)
- Tiled (32×16): 7.5 ms
- Tiled (32×32): 8.3 ms
- Unrolled: 14.2 ms
- Baseline GPU: 16.2 ms
- Tiled (8×8): 16.7 ms
- Wavefront CPU: 97.0 ms
- Baseline CPU: 602.7 ms
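The 223x figure follows directly from the listed timings. The snippet below only redoes that arithmetic using the numbers from the results above; the variable names are illustrative.

```julia
# Timings in milliseconds, copied from the results list.
timings_ms = Dict(
    "Wavefront GPU" => 2.7,
    "Tiled (32×16)" => 7.5,
    "Tiled (32×32)" => 8.3,
    "Unrolled"      => 14.2,
    "Baseline GPU"  => 16.2,
    "Tiled (8×8)"   => 16.7,
    "Wavefront CPU" => 97.0,
    "Baseline CPU"  => 602.7,
)

# Speedup of each variant relative to the CPU baseline.
baseline = timings_ms["Baseline CPU"]
speedup_vs_cpu = Dict(name => baseline / t for (name, t) in timings_ms)

# Print variants from fastest to slowest.
for (name, s) in sort(collect(speedup_vs_cpu); by = last, rev = true)
    println(rpad(name, 16), round(s; digits = 1), "x")
end
```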
2d3d32f to dc3c159
The example scene uses Y-up geometry (floor at y=-1.5), but the wavefront renderer defaulted to camera_up=Vec3f(0,0,1) (Z-up), producing an upside-down/rotated view. Also fix camera_lookat to look along +Z, matching the simple camera used by other kernels.
Shows how to enable hw_accel=true with Lava backend, explains the architecture (extract-trace-shade pipeline), includes benchmark comparison between SW BVH and HW RT on materials scene (20 spheres, AMD RX 7900 XTX). Honest results: parity on simple scenes, HW RT advantage on complex geometry (3.5M+ triangles).
AMD RX 7900 XTX via AMDGPU.jl, dragon mesh (249K tris) + procedural. Key findings: Raycore 3.5-20x faster for ray tracing (single-pass closest-hit with early termination vs two-pass BV candidate list). ImplicitBVH 2-5x faster for BVH build (simpler construction).
This adds: