Summary
vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR access-violates inside amdvlk64.dll when called against the Radeon 860M integrated GPU on Strix Point under Windows 11 with the current AMD Adrenalin driver (driverInfo 26.3.1). The driver advertises both the VK_KHR_cooperative_matrix extension and the cooperativeMatrix feature flag, so callers reasonably proceed to query properties, but the property query itself crashes the host process.
Environment
- Hardware: AMD Ryzen AI 9 HX 370 (Strix Point) with Radeon 860M (RDNA 3.5) integrated GPU. Hybrid laptop, also has an NVIDIA RTX 5050 Laptop dGPU.
- OS: Windows 11 Home Single Language 10.0.26200.
- Driver: AMD Adrenalin, driverInfo: 26.3.1 (LLPC), driverVersion: 2.0.388. The ICD reports as VK_DRIVER_ID_AMD_PROPRIETARY (driverID = 1), but the loaded binary is amdvlk64.dll. The (LLPC) suffix confirms the AMDVLK lineage.
- Device: vendorID 0x1002, deviceID 0x1114, deviceType PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU.
- Vulkan SDK: 1.4.341.1 (LunarG).
- Device API: 1.4.344. Conformance: 1.4.3.3.
Crash
Captured a full memory dump via Sysinternals ProcDump on a RelWithDebInfo build of ggml-org/llama.cpp tag b9016 (which calls the relevant Vulkan APIs). Visual Studio 2022 resolved the call stack with PDBs:
amdvlk64.dll!0x00007ffc4c415672 ← crash site (no symbols)
ggml-vulkan.dll!ggml_vk_get_device(unsigned __int64 idx) Line 5476 ← caller in ggml-vulkan.cpp
ggml-vulkan.dll!ggml_backend_vk_host_buffer_type() Line 13909
... (consumer code)
Exception: 0xC0000005 Access violation writing location 0x00007FFD2FBE79C0 (an address inside vulkan-1.dll's mapped region).
The crash sits between the count-only call (pProperties=nullptr, line 5474 in ggml-vulkan.cpp) and the immediately-following cm_props.resize(cm_props_num) (line 5476). The first call returns successfully with cm_props_num=4. The access violation surfaces immediately after, before the second (fill) call at line 5482.
Reproducer
The minimal reproducer is the standard two-call pattern documented in the spec:
// Vulkan 1.4 instance, AMD physical device for the Radeon 860M iGPU.
// Device advertises VK_KHR_cooperative_matrix in extensions; vkGetPhysicalDeviceFeatures2
// reports cooperativeMatrix == VK_TRUE on the device feature struct.
PFN_vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR fn =
    (PFN_vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR)
        vkGetInstanceProcAddr(instance, "vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR");
uint32_t count = 0;
fn(physicalDevice, &count, nullptr); // returns count=4 successfully
std::vector<VkCooperativeMatrixPropertiesKHR> props(count);
// access violation occurs at or before this point, inside amdvlk64.dll.
A self-contained C++ reproducer can be built from the source above plus a minimal Vulkan 1.4 instance + physical-device selection. The b9016 build of ggml-org/llama.cpp exhibits the crash through ggml_vk_get_device. Crash dump available on request.
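For anyone building that standalone reproducer, the two-call pattern can be factored into a small helper. The following is only a sketch under stated assumptions: `enumerate_two_call` and `QueryResult` are hypothetical names that do not come from ggml-vulkan or the Vulkan headers, and the result enum is a local stand-in so the snippet compiles without the SDK.

```cpp
#include <cstdint>
#include <vector>

// Local stand-in for VkResult so the sketch builds without the Vulkan SDK.
// The numeric values match VK_SUCCESS (0) and VK_INCOMPLETE (5).
enum class QueryResult : std::int32_t { Success = 0, Incomplete = 5 };

// Hypothetical helper: generic two-call enumeration.
//   query(count, nullptr) -> writes the number of available items
//   query(count, items)   -> fills up to *count items
// `init` is copied into every element before the fill call, because Vulkan
// output structs need sType/pNext set by the caller.
template <typename T, typename Query>
bool enumerate_two_call(Query&& query, const T& init, std::vector<T>& out) {
    out.clear();

    std::uint32_t count = 0;
    if (query(&count, nullptr) != QueryResult::Success) {
        return false;  // count-only call failed
    }
    if (count == 0) {
        return true;   // nothing to fill
    }

    out.assign(count, init);  // size the buffer and pre-initialize each element
    if (query(&count, out.data()) != QueryResult::Success) {
        out.clear();
        return false;  // fill call failed or was truncated
    }
    out.resize(count);  // the fill call may report fewer items than requested
    return true;
}
```

In a real reproducer the callable would wrap `fn(physicalDevice, count, items)` and map its VkResult, and `init` would be a VkCooperativeMatrixPropertiesKHR with sType set to VK_STRUCTURE_TYPE_COOPERATIVE_MATRIX_PROPERTIES_KHR. Logging after each step of the helper would also show exactly which call the access violation follows.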
Notes for triage
- The bug has so far only been observed on an integrated AMD GPU reporting cooperative matrix (cm) support on this driver line. Whether discrete AMD GPUs on the same driver are affected is unknown: none have been tested, because the Radeon 860M is the only AMD device in the reporter's hybrid system.
- vkGetPhysicalDeviceFeatures2 reports cooperativeMatrix == VK_TRUE, which is what causes consumers (including ggml-vulkan in llama.cpp) to proceed to query properties. If the driver does not actually support cm on this hardware, the feature flag should be VK_FALSE. If it does support cm, the property query should not access-violate.
- A workaround patch has been proposed upstream in ggml-org/llama.cpp (skip cm on integrated AMD GPUs regardless of advertised support). Link will be added once the PR is open.
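The kind of gate that workaround describes might look roughly like the sketch below. This is an illustration only and not the upstream patch: `DeviceSummary`, `GpuType`, and `should_query_coopmat_properties` are hypothetical names, and the types are local stand-ins for the Vulkan ones so the snippet builds without the SDK.

```cpp
#include <cstdint>

// PCI vendor ID for AMD, as reported in the Environment section (0x1002).
constexpr std::uint32_t kVendorIdAmd = 0x1002;

// Local stand-in for VkPhysicalDeviceType (assumption, not the real enum).
enum class GpuType { Other, Integrated, Discrete };

// Hypothetical summary of what the application already knows about a device.
struct DeviceSummary {
    std::uint32_t vendor_id;
    GpuType type;
    bool has_coopmat_extension;  // VK_KHR_cooperative_matrix is listed
    bool coopmat_feature;        // cooperativeMatrix == VK_TRUE
};

// Returns true only when it looks safe to call the cooperative matrix
// property query. Integrated AMD GPUs are skipped even if they advertise
// support, mirroring the workaround described above.
bool should_query_coopmat_properties(const DeviceSummary& d) {
    if (!d.has_coopmat_extension || !d.coopmat_feature) {
        return false;  // not advertised, nothing to query
    }
    const bool amd_integrated =
        d.vendor_id == kVendorIdAmd && d.type == GpuType::Integrated;
    return !amd_integrated;
}
```

A gate like this is deliberately broad: it also disables cm on integrated AMD parts where the query might work, which is why a driver-side fix (or a corrected feature flag) would be preferable.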
Related
- A report of crashes in vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR and vkGetPhysicalDeviceCooperativeMatrixFlexibleDimensionsPropertiesNV, on the same hybrid laptop's RTX 5050 Laptop dGPU, has been filed with NVIDIA: https://forums.developer.nvidia.com/t/blackwell-rtx-5050-laptop-sm-120-vulkan-driver-crashes-in-cooperative-matrix-property-queries/369162/1