Description
This issue proposes integrating RDMA support with the MLX backend in LocalAI to enable high-performance distributed inference on Apple Silicon machines.

The mlx-jaccl-cluster repository (https://github.com/alexziskind1/mlx-jaccl-cluster) demonstrates a promising approach to RDMA integration with MLX that could be adapted for LocalAI.

RDMA would allow running large models such as Kimi locally across a network of Mac machines with sufficient combined memory, significantly improving inference performance for distributed setups.

The integration could leverage the cluster-management and RDMA communication patterns demonstrated in mlx-jaccl-cluster, enabling LocalAI to distribute model inference across multiple Apple Silicon devices over a high-speed network.
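As a rough illustration of one piece of such a setup, here is a hypothetical sketch of how a distributed backend might partition a model's layers across cluster nodes in proportion to each machine's unified memory. This is not an existing LocalAI or MLX API; all names (`Node`, `partition_layers`, the hostnames) are illustrative assumptions only.

```python
from dataclasses import dataclass

@dataclass
class Node:
    host: str
    memory_gb: int  # unified memory available for model weights (illustrative)

def partition_layers(num_layers: int, nodes: list[Node]) -> dict[str, range]:
    """Assign contiguous layer ranges to nodes, proportional to memory.

    Hypothetical helper: a real implementation would also account for
    activation memory, KV-cache size, and RDMA link topology.
    """
    total_mem = sum(n.memory_gb for n in nodes)
    assignment: dict[str, range] = {}
    start = 0
    for i, node in enumerate(nodes):
        if i == len(nodes) - 1:
            count = num_layers - start  # last node takes the remainder
        else:
            count = round(num_layers * node.memory_gb / total_mem)
        assignment[node.host] = range(start, start + count)
        start += count
    return assignment

# Example: three Macs sharing a 60-layer model.
nodes = [Node("mac-studio-1", 192), Node("mac-studio-2", 192), Node("macbook", 96)]
plan = partition_layers(60, nodes)
```

Each node would then load only its layer range and pass activations to the next node over the RDMA fabric, which is where the communication patterns from mlx-jaccl-cluster would come in.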