ggml : use CL_DEVICE_GLOBAL_MEM_SIZE as estimate for OpenCL --fit #22688
fl0rianr wants to merge 2 commits into ggml-org:master
Signed-off-by: Florian Reinle <f.reinle@otec.de>
Hi @fl0rianr, thanks for your contribution! Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:
Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.
// no memory to report
*free = 0;
*total = 0;
const ggml_backend_opencl_device_context * dev_ctx =
Let's put the global memory size in ggml_backend_opencl_context and use it in get_memory.
Something like this,
index 74948c27e..0ee7d490d 100644
--- a/ggml/src/ggml-opencl/ggml-opencl.cpp
+++ b/ggml/src/ggml-opencl/ggml-opencl.cpp
@@ -389,6 +389,7 @@ struct ggml_backend_opencl_context {
ADRENO_GPU_GEN adreno_gen;
cl_int alignment;
+ size_t global_mem_size;
size_t max_alloc_size;
size_t max_workgroup_size;
bool fp16_support;
@@ -3385,6 +3386,9 @@ static ggml_backend_opencl_context * ggml_cl2_init(ggml_backend_dev_t dev) {
backend_ctx->alignment = base_align_in_bits / 8u;
GGML_LOG_INFO("ggml_opencl: mem base addr align: %u\n", backend_ctx->alignment);
+ clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(size_t), &backend_ctx->global_mem_size, NULL);
+ GGML_LOG_INFO("ggml_opencl: global mem size: %zu MB\n", backend_ctx->global_mem_size/1024/1024);
+
clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(size_t), &backend_ctx->max_alloc_size, NULL);
GGML_LOG_INFO("ggml_opencl: max mem alloc size: %zu MB\n", backend_ctx->max_alloc_size/1024/1024);
@@ -6547,11 +6551,10 @@ static const char * ggml_backend_opencl_device_get_description(ggml_backend_dev_
}
static void ggml_backend_opencl_device_get_memory(ggml_backend_dev_t dev, size_t * free, size_t * total) {
- // no memory to report
- *free = 0;
- *total = 0;
-
- GGML_UNUSED(dev);
+ ggml_backend_opencl_device_context *dev_ctx = (ggml_backend_opencl_device_context *) dev->context;
+ ggml_backend_opencl_context * backend_ctx = (ggml_backend_opencl_context *) dev_ctx->backend_ctx;
+ *free = backend_ctx->global_mem_size - 1024*1024*1024; // leave 1 GiB
+ *total = backend_ctx->global_mem_size;
}
static enum ggml_backend_dev_type ggml_backend_opencl_device_get_type(ggml_backend_dev_t dev) {
Thanks for the brief and detailed review! Implemented.
CL_DEVICE_GLOBAL_MEM_SIZE is now cached in ggml_backend_opencl_context during init and reused in get_memory, so we no longer query it there.
I kept the existing guard around the reserved memory margin to avoid underflow on devices reporting less memory than the margin.
Sorry for the extra iteration.
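For readers following along, the guard mentioned above could look roughly like the sketch below. This is not the code from the PR: the helper name `estimate_free_memory`, the constant `kReservedMargin`, its 1 GiB value, and the choice to report 0 when the device has less memory than the margin are all assumptions made for illustration.

```cpp
#include <cstddef>

// Hypothetical margin (assumption): memory held back so that --fit does not
// plan to use the whole device. The value actually used in the PR may differ.
constexpr std::size_t kReservedMargin = 1024ull * 1024ull * 1024ull; // 1 GiB

// Hypothetical helper: derive a "free" estimate from the total reported by
// CL_DEVICE_GLOBAL_MEM_SIZE. An unguarded `total - margin` wraps around to a
// huge value when total < margin, because std::size_t is unsigned, so the
// subtraction only happens when it cannot underflow.
constexpr std::size_t estimate_free_memory(std::size_t total,
                                           std::size_t margin = kReservedMargin) {
    return total > margin ? total - margin : 0; // assumption: report 0 otherwise
}
```

In `get_memory`, `*free` would then be set from the cached `global_mem_size` through such a helper, while `*total` stays the unmodified value.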
Signed-off-by: Florian Reinle <f.reinle@otec.de>
Overview
This is a follow-up to PR #22614.
Since OpenCL cannot reliably report free memory, using the total device memory (CL_DEVICE_GLOBAL_MEM_SIZE) as an estimate is a non-ideal but preferred way to handle this.
To keep common/fit.cpp from fitting too much data, an additional margin is reserved. The simplest and least intrusive way to add this margin was to implement it inside the OpenCL backend.
Additional information
For the full discussion regarding this please refer to the linked PR above.
Requirements
NO