[Release SM 6.9] Cherry-Pick Fix GVN and SROA miscompilation of min precision vector element access#8370
Draft
alsepkow wants to merge 1 commit intomicrosoft:release-1.9.2602from
Draft
Conversation
microsoft#8269) ## Summary Fixes three related optimizer bugs that cause miscompilation of min precision vector element access (`[]` operator) on `min16float`, `min16int`, and `min16uint` types at optimization levels O1+. Resolves microsoft#8268 ## Root Cause DXC's data layout pads min precision types (`i16:32`, `f16:32`). The HLSL change in `DataLayout::getTypeSizeInBits` makes vector sizes use alloc size per element (e.g., `<3 x i16>` = 96 bits), but scalar `getTypeSizeInBits(i16)` returns the primitive width (16 bits). This inconsistency causes three bugs: 1. **GVN ICE**: `CanCoerceMustAliasedValueToLoad` creates a padded-width integer (i96) then attempts a bitcast from the 48-bit LLVM vector type — assert fires. 2. **GVN incorrect store forwarding**: `processLoad` forwards a zeroinitializer store past partial element stores because MemoryDependence uses padded sizes for aliasing. 3. **SROA element misindexing** (primary bug): `getNaturalGEPRecursively` uses `getTypeSizeInBits(i16)/8 = 2` for element offsets, while GEP uses `getTypeAllocSize(i16) = 4`. Byte offset 4 (element 1) maps to index `4/2 = 2` instead of `4/4 = 1`, causing SROA to misplace or eliminate element stores. ## Changes **`lib/Transforms/Scalar/GVN.cpp`** - `CanCoerceMustAliasedValueToLoad`: Reject coercion when type sizes include padding (`getTypeSizeInBits != getPrimitiveSizeInBits`) - `processLoad` StoreInst handler: Skip store-to-load forwarding for padded types **`lib/Transforms/Scalar/SROA.cpp`** - `getNaturalGEPRecursively`: Use `getTypeAllocSizeInBits` for vector element size to match GEP offset stride - `isVectorPromotionViable`: Same fix for element size calculation - `AllocaSliceRewriter` constructor: Same fix for `ElementSize` ## Testing - All 6 min precision ArrayOperator tests (StaticAccess + DynamicAccess × 3 types) pass with the fix - Verified optimized DXIL output retains all 3 element stores with correct indices ## Co-authored This fix was investigated and implemented with the assistance of GitHub Copilot (AI pair programming). The root cause analysis — tracing the bug through `-print-after-all` pass dumps, identifying SROA as the culprit, and understanding the `getTypeSizeInBits` vs `getTypeAllocSize` mismatch — was a collaborative effort. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Cherry-pick PR (#8269)
Assisted by gh copilot.
Depends on: #8369
SHA ccba9f8