Skip to content

[Release SM 6.9] Cherry-Pick Fix GVN and SROA miscompilation of min precision vector element access#8370

Draft
alsepkow wants to merge 1 commit intomicrosoft:release-1.9.2602from
alsepkow:user/alsepkow/cherry-pick-ccba9f8
Draft

[Release SM 6.9] Cherry-Pick Fix GVN and SROA miscompilation of min precision vector element access#8370
alsepkow wants to merge 1 commit intomicrosoft:release-1.9.2602from
alsepkow:user/alsepkow/cherry-pick-ccba9f8

Conversation

@alsepkow
Copy link
Copy Markdown
Contributor

Cherry-pick PR (#8269)

Assisted by gh copilot.

Depends on: #8369

SHA ccba9f8

microsoft#8269)

## Summary

Fixes three related optimizer bugs that cause miscompilation of min
precision vector element access (`[]` operator) on `min16float`,
`min16int`, and `min16uint` types at optimization levels O1+.

Resolves microsoft#8268

## Root Cause

DXC's data layout pads min precision types (`i16:32`, `f16:32`). The
HLSL change in `DataLayout::getTypeSizeInBits` makes vector sizes use
alloc size per element (e.g., `<3 x i16>` = 96 bits), but scalar
`getTypeSizeInBits(i16)` returns the primitive width (16 bits). This
inconsistency causes three bugs:

1. **GVN ICE**: `CanCoerceMustAliasedValueToLoad` creates a padded-width
integer (i96) then attempts a bitcast from the 48-bit LLVM vector type —
assert fires.
2. **GVN incorrect store forwarding**: `processLoad` forwards a
zeroinitializer store past partial element stores because
MemoryDependence uses padded sizes for aliasing.
3. **SROA element misindexing** (primary bug):
`getNaturalGEPRecursively` uses `getTypeSizeInBits(i16)/8 = 2` for
element offsets, while GEP uses `getTypeAllocSize(i16) = 4`. Byte offset
4 (element 1) maps to index `4/2 = 2` instead of `4/4 = 1`, causing SROA
to misplace or eliminate element stores.

## Changes

**`lib/Transforms/Scalar/GVN.cpp`**
- `CanCoerceMustAliasedValueToLoad`: Reject coercion when type sizes
include padding (`getTypeSizeInBits != getPrimitiveSizeInBits`)
- `processLoad` StoreInst handler: Skip store-to-load forwarding for
padded types

**`lib/Transforms/Scalar/SROA.cpp`**
- `getNaturalGEPRecursively`: Use `getTypeAllocSizeInBits` for vector
element size to match GEP offset stride
- `isVectorPromotionViable`: Same fix for element size calculation
- `AllocaSliceRewriter` constructor: Same fix for `ElementSize`

## Testing

- All 6 min precision ArrayOperator tests (StaticAccess + DynamicAccess
× 3 types) pass with the fix
- Verified optimized DXIL output retains all 3 element stores with
correct indices

## Co-authored

This fix was investigated and implemented with the assistance of GitHub
Copilot (AI pair programming). The root cause analysis — tracing the bug
through `-print-after-all` pass dumps, identifying SROA as the culprit,
and understanding the `getTypeSizeInBits` vs `getTypeAllocSize` mismatch
— was a collaborative effort.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: New

Development

Successfully merging this pull request may close these issues.

1 participant