From 2d434675e5d5e2d111d7ca7ee53d9a4d367189dc Mon Sep 17 00:00:00 2001
From: Felix Weibach
Date: Tue, 30 Dec 2025 07:43:29 -0800
Subject: [PATCH] Mitigate deadlock on DLL unload (#16416)

Summary:
ThreadPool is stored in a static variable at extension/threadpool/threadpool.cpp:146. This means the ThreadPool destructor runs when the process exits or when a DLL containing this code unloads.

While working with ExecuTorch I experienced a deadlock when unloading our DLL (which contains ExecuTorch) at runtime. It was caused by the pthreadpool_destroy function (pthreadpool/src/windows.c:366) waiting forever on the worker threads.

Exactly why this happens is unclear to me. It is likely a race condition inside the Windows Parallel Loader (https://blogs.blackberry.com/en/2017/10/windows-10-parallel-loading-breakdown), as I could see its functions in the stack traces of the stuck worker threads after they had returned from their main function.

I mitigated the issue on my side by calling `executorch::extension::threadpool::get_threadpool()->_unsafe_reset_threadpool(0);` before unloading the DLL.

This is just a workaround. I think a proper fix would be to rework the ThreadPool singleton and allow it to be terminated explicitly.

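For illustration, here is a minimal, self-contained sketch of the teardown order described above: release the worker threads explicitly while the host still controls timing, instead of leaving the join to a static destructor. All names in it (`DemoWorkers`, `DemoThreadPool`, `get_demo_threadpool`, `prepare_for_unload`) are hypothetical stand-ins, not ExecuTorch or pthreadpool APIs, and the workers are trivial threads that exit immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the native pool: owns worker threads and joins
// them in its destructor, which is the step that can hang during DLL unload.
class DemoWorkers {
 public:
  explicit DemoWorkers(std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
      threads_.emplace_back([] { /* a real worker would wait for tasks */ });
    }
  }
  ~DemoWorkers() {
    for (std::thread& t : threads_) {
      t.join();
    }
  }
  std::size_t size() const { return threads_.size(); }

 private:
  std::vector<std::thread> threads_;
};

// Hypothetical stand-in for the ThreadPool singleton wrapper.
class DemoThreadPool {
 public:
  explicit DemoThreadPool(std::size_t count)
      : workers_(std::make_unique<DemoWorkers>(count)) {}

  // Same shape as _unsafe_destroy_threadpool() in this patch: take the lock,
  // then release the owned pool. Calling it a second time is a no-op.
  void destroy() {
    std::lock_guard<std::mutex> lock{mutex_};
    workers_.reset();
  }

  std::size_t thread_count() const {
    std::lock_guard<std::mutex> lock{mutex_};
    return workers_ ? workers_->size() : 0;
  }

 private:
  mutable std::mutex mutex_;
  std::unique_ptr<DemoWorkers> workers_;
};

// Function-local static: without an explicit destroy(), the workers would be
// joined only when static destructors run (process exit or DLL unload).
DemoThreadPool* get_demo_threadpool() {
  static DemoThreadPool pool(2);
  return &pool;
}

// Intended to be called by host code before unloading the library, at a
// point where no work is running on the pool.
void prepare_for_unload() {
  get_demo_threadpool()->destroy();
  assert(get_demo_threadpool()->thread_count() == 0);
}
```

With this patch applied, the corresponding call in real code would presumably be `executorch::extension::threadpool::get_threadpool()->_unsafe_destroy_threadpool();`, made before the DLL is unloaded. Because the method is marked `[[deprecated]]`, callers should expect a compiler warning at that call site.
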
Differential Revision: D89889628
---
 extension/threadpool/threadpool.cpp | 6 ++++++
 extension/threadpool/threadpool.h   | 9 +++++++++
 2 files changed, 15 insertions(+)

diff --git a/extension/threadpool/threadpool.cpp b/extension/threadpool/threadpool.cpp
index e244954de6e..26acc9fb81b 100644
--- a/extension/threadpool/threadpool.cpp
+++ b/extension/threadpool/threadpool.cpp
@@ -75,6 +75,12 @@ bool ThreadPool::_unsafe_reset_threadpool(uint32_t new_thread_count) {
   return true;
 }
 
+void ThreadPool::_unsafe_destroy_threadpool() {
+  std::lock_guard lock{mutex_};
+  ET_LOG(Info, "Destroying threadpool.");
+  threadpool_.reset();
+}
+
 void ThreadPool::run(
     runtime::FunctionRef fn,
     const size_t range) {
diff --git a/extension/threadpool/threadpool.h b/extension/threadpool/threadpool.h
index b4fe7e4be96..43957473a26 100644
--- a/extension/threadpool/threadpool.h
+++ b/extension/threadpool/threadpool.h
@@ -69,6 +69,15 @@ class ThreadPool final {
       "This API is experimental and may change without notice. Consider using UseNThreadsThreadPoolGuard")]]
   bool _unsafe_reset_threadpool(uint32_t num_threads);
 
+  /**
+   * INTERNAL: Destroys the threadpool. This is not a thread safe call. When calling this method,
+   * threads of the threadpool might be doing some work. Some other code may
+   * also be holding on to the threadpool pointer, that is no longer valid.
+   */
+  [[deprecated(
+      "This API is experimental and may change without notice.")]]
+  void _unsafe_destroy_threadpool();
+
   /**
    * Run, in parallel, function fn(task_id) over task_id in range [0, range).
    * This function is blocking. All input is processed by the time it returns.