[MIOpen] Configurable problem size threshold for direct solver #4212
Conversation
Could MIOPEN_SEARCH_CUTOFF be used instead of introducing this new environment variable? See rocm-libraries/projects/miopen/src/conv/solver_finders.cpp, lines 268 to 274 in ca9405e. Or, if that does not cut off the search suitably, perhaps we could tweak that existing code? Maybe it could use problem size in addition to time, if needed.
It might be nice if MIOPEN_SEARCH_CUTOFF set a default value for MIOPEN_CONV_DIRECT_MAX_SIZE, though the new environment variable may still need to exist to allow an optional level.
I think this makes sense. It should be very easy to implement as a follow-up.
Motivation
The direct solver is problematic when the problem size is very large. Tuning becomes extremely slow because the direct solver takes an infeasible amount of time to run (up to several hours with 3D convs in video diffusion). Since the direct solver is a good fallback for small problems, setting `MIOPEN_DEBUG_CONV_DIRECT=0` is not always a good solution; instead, the solver should be selectively disabled for large convolutions.
This PR introduces a new environment variable, `MIOPEN_CONV_DIRECT_MAX_SIZE`, which disables the direct solver when the max size is exceeded. The problem size is determined by the number of elements in the result tensor (`output` for fwd, `input` for bwd, `weight` for wrw).
The default value of the environment variable is `0`, which disables the limit. This PR introduces the functionality but leaves the out-of-the-box (OOTB) behavior unchanged.
Technical Details
Run time is not determined solely by the result tensor element count, but that count was chosen as a proxy measure because it is easier to reason about than the total number of operations performed. For a user, it is intuitive to use the size of a single tensor to gauge the total problem size.
As long as the measure correlates well with the computational load, it is useful.
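The size check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: `ConvProblem`, `Direction`, `ResultTensorElements`, `DirectSolverAllowed`, and `ReadMaxSizeFromEnv` are hypothetical names. Treating a size equal to the limit as still allowed is an assumption based on the word "exceeded", and mapping unset or unparsable values to `0` is likewise an assumption.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical types and names; this is a sketch, not MIOpen's code.
enum class Direction { Forward, BackwardData, BackwardWeights };

struct ConvProblem
{
    std::uint64_t input_elems;  // number of elements in the input tensor
    std::uint64_t weight_elems; // number of elements in the weight tensor
    std::uint64_t output_elems; // number of elements in the output tensor
};

// Problem size as the PR describes it: the element count of the result
// tensor (output for fwd, input for bwd, weight for wrw).
constexpr std::uint64_t ResultTensorElements(const ConvProblem& p, Direction d)
{
    return d == Direction::Forward        ? p.output_elems
           : d == Direction::BackwardData ? p.input_elems
                                          : p.weight_elems;
}

// A max_size of 0 disables the limit (the default described in the PR).
// Assumption: a size exactly equal to max_size does not "exceed" it.
constexpr bool DirectSolverAllowed(const ConvProblem& p, Direction d, std::uint64_t max_size)
{
    return max_size == 0 || ResultTensorElements(p, d) <= max_size;
}

// Reads the limit from the environment variable named in the PR.
// Assumption: an unset or unparsable value yields 0, i.e. no limit.
inline std::uint64_t ReadMaxSizeFromEnv()
{
    const char* raw = std::getenv("MIOPEN_CONV_DIRECT_MAX_SIZE");
    return raw != nullptr ? std::strtoull(raw, nullptr, 10) : 0;
}
```

Under these assumptions, a forward problem with 50 output elements would skip the direct solver when the limit is 10, while a limit of 0 leaves the direct solver enabled regardless of size.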
Test Plan
The tests were carried out with a "normal" shape that has non-direct solvers available.
Test Result
| Scenario | Result |
| --- | --- |
| Just MIOpenDriver command | non-direct solver chosen |
| Only direct enabled | direct solver chosen |
| Only direct enabled + Max size 10 | Failed to find solver |
The tests were repeated with both clean and populated databases. Even if manual tuning has recorded that the direct solver should be used, that record is ignored when the max size is exceeded.
Submission Checklist
- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.