
ggml : use CL_DEVICE_GLOBAL_MEM_SIZE as estimate for OpenCL --fit#22688

Open
fl0rianr wants to merge 2 commits into ggml-org:master from fl0rianr:fix/openCL_reporting_zero_memory

Conversation


@fl0rianr fl0rianr commented May 4, 2026

Overview

This is a follow-up to PR #22614.

Since OpenCL does not provide a reliable way to report free memory, using total device memory is a non-ideal but preferable fallback. To keep common/fit.cpp from fitting too much data, an additional safety margin is reserved.

The simplest and least intrusive way to add this margin was to implement it
inside the OpenCL backend.

Additional information

For the full discussion regarding this please refer to the linked PR above.

Requirements

Signed-off-by: Florian Reinle <f.reinle@otec.de>
@fl0rianr fl0rianr requested a review from a team as a code owner May 4, 2026 18:25

ggml-gh-bot Bot commented May 4, 2026

Hi @fl0rianr, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

  • Multiple open PRs from a new contributor: We limit new contributors (those without a previously merged PR) to 1 open PR at a time. You currently have 2 open PRs.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

Comment thread ggml/src/ggml-opencl/ggml-opencl.cpp Outdated
// no memory to report
*free = 0;
*total = 0;
const ggml_backend_opencl_device_context * dev_ctx =
Contributor


Let's put the global memory size in ggml_backend_opencl_context and use it in get_memory.

Something like this,

index 74948c27e..0ee7d490d 100644
--- a/ggml/src/ggml-opencl/ggml-opencl.cpp
+++ b/ggml/src/ggml-opencl/ggml-opencl.cpp
@@ -389,6 +389,7 @@ struct ggml_backend_opencl_context {
     ADRENO_GPU_GEN adreno_gen;

     cl_int alignment;
+    size_t global_mem_size;
     size_t max_alloc_size;
     size_t max_workgroup_size;
     bool fp16_support;
@@ -3385,6 +3386,9 @@ static ggml_backend_opencl_context * ggml_cl2_init(ggml_backend_dev_t dev) {
     backend_ctx->alignment = base_align_in_bits / 8u;
     GGML_LOG_INFO("ggml_opencl: mem base addr align: %u\n", backend_ctx->alignment);

+    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(size_t), &backend_ctx->global_mem_size, NULL);
+    GGML_LOG_INFO("ggml_opencl: global mem size: %zu MB\n", backend_ctx->global_mem_size/1024/1024);
+
     clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(size_t), &backend_ctx->max_alloc_size, NULL);
     GGML_LOG_INFO("ggml_opencl: max mem alloc size: %zu MB\n", backend_ctx->max_alloc_size/1024/1024);

@@ -6547,11 +6551,10 @@ static const char * ggml_backend_opencl_device_get_description(ggml_backend_dev_
 }

 static void ggml_backend_opencl_device_get_memory(ggml_backend_dev_t dev, size_t * free, size_t * total) {
-    // no memory to report
-    *free  = 0;
-    *total = 0;
-
-    GGML_UNUSED(dev);
+    ggml_backend_opencl_device_context *dev_ctx = (ggml_backend_opencl_device_context *) dev->context;
+    ggml_backend_opencl_context * backend_ctx = (ggml_backend_opencl_context *) dev_ctx->backend_ctx;
+    *free  = backend_ctx->global_mem_size - 1024*1024*1024; // leave 1 GB
+    *total = backend_ctx->global_mem_size;
 }

 static enum ggml_backend_dev_type ggml_backend_opencl_device_get_type(ggml_backend_dev_t dev) {

Author


Thanks for the brief and detailed review! Implemented.

CL_DEVICE_GLOBAL_MEM_SIZE is now cached in ggml_backend_opencl_context during init and reused in get_memory, so we no longer query it there.

I kept the existing guard around the reserved memory margin to avoid underflow on devices reporting less memory than the margin.

Sorry for the extra iteration.

Signed-off-by: Florian Reinle <f.reinle@otec.de>
@github-actions github-actions Bot added ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend labels May 4, 2026
Contributor

@lhez lhez left a comment


Looks good, thank you!


Labels

ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend


3 participants