
ggml : use CL_DEVICE_GLOBAL_MEM_SIZE as estimate for OpenCL --fit#22688

Open
fl0rianr wants to merge 2 commits into ggml-org:master from fl0rianr:fix/openCL_reporting_zero_memory

Conversation


@fl0rianr fl0rianr commented May 4, 2026

Overview

This is a follow-up to PR #22614.

Since OpenCL does not provide a reliable way to report free memory, using total device memory is a non-ideal but preferable fallback. To keep common/fit.cpp from fitting too much data, an additional safety margin is reserved.

The simplest and least intrusive way to add this margin was to implement it
inside the OpenCL backend.

Additional information

For the full discussion regarding this please refer to the linked PR above.

Requirements

Signed-off-by: Florian Reinle <f.reinle@otec.de>
@fl0rianr fl0rianr requested a review from a team as a code owner May 4, 2026 18:25

ggml-gh-bot Bot commented May 4, 2026

Hi @fl0rianr, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

  • Multiple open PRs from a new contributor: We limit new contributors (those without a previously merged PR) to 1 open PR at a time. You currently have 2 open PRs.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

Comment thread ggml/src/ggml-opencl/ggml-opencl.cpp Outdated
// no memory to report
*free = 0;
*total = 0;
const ggml_backend_opencl_device_context * dev_ctx =
Contributor


Let's put the global memory size in ggml_backend_opencl_context and use it in get_memory.

Something like this,

index 74948c27e..0ee7d490d 100644
--- a/ggml/src/ggml-opencl/ggml-opencl.cpp
+++ b/ggml/src/ggml-opencl/ggml-opencl.cpp
@@ -389,6 +389,7 @@ struct ggml_backend_opencl_context {
     ADRENO_GPU_GEN adreno_gen;

     cl_int alignment;
+    size_t global_mem_size;
     size_t max_alloc_size;
     size_t max_workgroup_size;
     bool fp16_support;
@@ -3385,6 +3386,9 @@ static ggml_backend_opencl_context * ggml_cl2_init(ggml_backend_dev_t dev) {
     backend_ctx->alignment = base_align_in_bits / 8u;
     GGML_LOG_INFO("ggml_opencl: mem base addr align: %u\n", backend_ctx->alignment);

+    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(size_t), &backend_ctx->global_mem_size, NULL);
+    GGML_LOG_INFO("ggml_opencl: global mem size: %zu MB\n", backend_ctx->global_mem_size/1024/1024);
+
     clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(size_t), &backend_ctx->max_alloc_size, NULL);
     GGML_LOG_INFO("ggml_opencl: max mem alloc size: %zu MB\n", backend_ctx->max_alloc_size/1024/1024);

@@ -6547,11 +6551,10 @@ static const char * ggml_backend_opencl_device_get_description(ggml_backend_dev_
 }

 static void ggml_backend_opencl_device_get_memory(ggml_backend_dev_t dev, size_t * free, size_t * total) {
-    // no memory to report
-    *free  = 0;
-    *total = 0;
-
-    GGML_UNUSED(dev);
+    ggml_backend_opencl_device_context *dev_ctx = (ggml_backend_opencl_device_context *) dev->context;
+    ggml_backend_opencl_context * backend_ctx = (ggml_backend_opencl_context *) dev_ctx->backend_ctx;
+    *free  = backend_ctx->global_mem_size - 1024*1024*1024; // leave 1 GB
+    *total = backend_ctx->global_mem_size;
 }

 static enum ggml_backend_dev_type ggml_backend_opencl_device_get_type(ggml_backend_dev_t dev) {

Author


Thanks for the brief and detailed review! Implemented.

CL_DEVICE_GLOBAL_MEM_SIZE is now cached in ggml_backend_opencl_context during init and reused in get_memory, so we no longer query it there.

I kept the existing guard around the reserved memory margin to avoid underflow on devices reporting less memory than the margin.

Sorry for the extra iteration.

Signed-off-by: Florian Reinle <f.reinle@otec.de>
@github-actions github-actions Bot added ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend labels May 4, 2026
Contributor

@lhez lhez left a comment


Looks good, thank you!


Labels

ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend


3 participants