realtime quantization to workers #63

@evilsocket

Description

In theory, if the master holds the full-precision model, it could quantize the weights on the fly while streaming them to the workers, so that each worker hosts a quantized version of the same model the master is using and orchestrating.
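
To make the idea concrete, here is a minimal sketch of what quantizing during the stream might look like. It is not taken from the project: the function names, the frame layout (a float32 scale plus a uint16 count as header) and the choice of symmetric per-block int8 quantization are all assumptions made for illustration. The point is only that the master can quantize block by block as it sends, without ever building a full quantized copy in memory.

```python
import struct
from typing import Iterable, Iterator, List

BLOCK = 32  # hypothetical block size: number of weights sharing one scale


def encode_frame(values: List[float]) -> bytes:
    """Quantize one block to int8 and pack it as: float32 scale, uint16 count, int8 payload."""
    peak = max(abs(v) for v in values)
    scale = peak / 127.0 if peak > 0.0 else 1.0
    # Symmetric quantization: q = round(v / scale), clamped to the int8 range.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return struct.pack("<fH", scale, len(q)) + struct.pack(f"{len(q)}b", *q)


def decode_frame(frame: bytes) -> List[float]:
    """Worker side: recover approximate weights from one frame."""
    scale, count = struct.unpack_from("<fH", frame, 0)
    q = struct.unpack_from(f"{count}b", frame, 6)  # header is 4 + 2 bytes
    return [scale * x for x in q]


def stream_quantized(weights: Iterable[float], block: int = BLOCK) -> Iterator[bytes]:
    """Master side: yield one frame per block while iterating the full-precision weights."""
    buf: List[float] = []
    for w in weights:
        buf.append(w)
        if len(buf) == block:
            yield encode_frame(buf)
            buf = []
    if buf:  # trailing partial block
        yield encode_frame(buf)


if __name__ == "__main__":
    full_precision = [i / 10 - 3.2 for i in range(70)]
    frames = list(stream_quantized(full_precision))
    restored = [v for f in frames for v in decode_frame(f)]
    print(len(frames), sum(len(f) for f in frames), len(restored))
```

In a real implementation the quantization format would presumably have to match whatever the worker's inference backend can load, and the per-block scale overhead and accuracy loss would need measuring; this sketch ignores both.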

Labels: enhancement (New feature or request)