Commit 6009626
CANN: Add support for async operator submission
Submit operators using asynchronous threads to improve performance.
Use the environment variable GGML_CANN_ASYNC_MODE to control whether
asynchronous submission is enabled. It is disabled by default.
Testing shows a 10%–20% performance improvement in scenarios with
small parameter sizes, especially in quantized models.

1 parent 12b1750 commit 6009626
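
The idea described in the commit message, an environment variable gating asynchronous operator submission, can be sketched as follows. This is a minimal illustration written in Python for brevity, not the code from this commit: `async_mode_enabled`, `OperatorSubmitter`, and the accepted spellings of the variable's value are all assumptions made for the example. Only the name `GGML_CANN_ASYNC_MODE` and its disabled-by-default behavior come from the message.

```python
import os
import queue
import threading


def async_mode_enabled(env=None):
    """Return True when GGML_CANN_ASYNC_MODE requests async submission.

    Assumption: the accepted spellings below are illustrative. The commit
    message only says the variable exists and that the mode is off by default.
    """
    env = os.environ if env is None else env
    value = env.get("GGML_CANN_ASYNC_MODE", "")
    return value.strip().lower() in {"1", "on", "yes", "true"}


class OperatorSubmitter:
    """Hypothetical submitter: runs operators inline or on a worker thread."""

    def __init__(self, async_mode):
        self.async_mode = async_mode
        self._tasks = queue.Queue()
        self._worker = None
        if async_mode:
            # A single worker keeps operators in submission (FIFO) order.
            self._worker = threading.Thread(target=self._drain, daemon=True)
            self._worker.start()

    def _drain(self):
        while True:
            task = self._tasks.get()
            try:
                if task is None:  # sentinel pushed by close()
                    return
                task()
            finally:
                self._tasks.task_done()

    def submit(self, task):
        """Queue the operator in async mode, otherwise run it immediately."""
        if self.async_mode:
            self._tasks.put(task)
        else:
            task()

    def synchronize(self):
        """Block until every queued operator has finished."""
        if self.async_mode:
            self._tasks.join()

    def close(self):
        """Stop the worker thread, if one was started."""
        if self._worker is not None:
            self._tasks.put(None)
            self._worker.join()
            self._worker = None
```

In this sketch the caller returns from `submit` as soon as the operator is queued and calls `synchronize` only when results are needed. Under that assumption, the gain would come from overlapping host-side submission with device work, which fits the message's note that small models benefit most.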
4 files changed
Lines changed: 604 additions & 356 deletions