Significant bottleneck with text encoder outputs and latent caching to disk with .npz vs .safetensors

Hello, I wanted to ask whether there are plans to officially support .safetensors for latent and text encoder output caching (as discussed in #1750), or whether it's already implemented and I'm just not aware of it. Skimming through the newer [Musubi Tuner code](https://github.com/kohya-ss/musubi-tuner/blob/main/src/musubi_tuner/cache_latents.py), it doesn't look like it is.

I've been testing on very large datasets (hundreds of thousands to millions of images) and found that the current .npz implementation is a significant bottleneck: the cache files are much larger, and both reads and writes are slower. In many cases, reading the cache from disk is 40-50 times slower than with .safetensors.
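
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a round-trip timing for one cache entry. It is not taken from the actual caching code: the array shapes, the key names (`latents`, `text_encoder_output`), the file names, and the use of uncompressed `np.savez` are all assumptions for illustration, and the numbers will depend heavily on disk, dataset, and tensor sizes. The safetensors part is skipped if the package is not installed.

```python
import os
import tempfile
import time

import numpy as np

try:
    from safetensors.numpy import load_file, save_file
except ImportError:  # safetensors is optional for this sketch
    load_file = save_file = None


def make_fake_cache(rng):
    # Hypothetical shapes and key names, chosen only for illustration.
    return {
        "latents": rng.standard_normal((4, 128, 128)).astype(np.float32),
        "text_encoder_output": rng.standard_normal((77, 768)).astype(np.float32),
    }


def save_npz(path, tensors):
    # Assumption: uncompressed .npz; swap in np.savez_compressed to compare.
    np.savez(path, **tensors)


def load_npz(path):
    # Read every array out so the file can be closed before returning.
    with np.load(path) as data:
        return {key: data[key] for key in data.files}


def save_safetensors(path, tensors):
    save_file(tensors, path)


def time_round_trip(save, load, path, tensors, repeats):
    # Average wall-clock time per write, then per read, of a single file.
    start = time.perf_counter()
    for _ in range(repeats):
        save(path, tensors)
    write_s = (time.perf_counter() - start) / repeats

    start = time.perf_counter()
    for _ in range(repeats):
        loaded = load(path)
    read_s = (time.perf_counter() - start) / repeats

    return write_s, read_s, os.path.getsize(path), loaded


def run_benchmark(repeats=20):
    tensors = make_fake_cache(np.random.default_rng(0))
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        results["npz"] = time_round_trip(
            save_npz, load_npz, os.path.join(tmp, "cache.npz"), tensors, repeats
        )
        if save_file is not None:
            results["safetensors"] = time_round_trip(
                save_safetensors,
                load_file,
                os.path.join(tmp, "cache.safetensors"),
                tensors,
                repeats,
            )
    return tensors, results


if __name__ == "__main__":
    _, results = run_benchmark()
    for name, (write_s, read_s, size, _) in results.items():
        print(f"{name}: write {write_s * 1e3:.3f} ms, "
              f"read {read_s * 1e3:.3f} ms, size {size} bytes")
```

Note that repeated reads of the same file will mostly hit the OS page cache, so this only measures format overhead; a run over many distinct files on a cold cache would be closer to real training conditions.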

Significant bottleneck with text encoder outputs and latent caching to disk with .npz vs .safetensors #2266
