How to compress t5-v1_1-xxl and Gemma-2-2B? #28
I managed to successfully compress Cosmos-Predict2 and Chroma, but when I tried to compress the T5 text encoder model used by Flux, I got the following error instead:
Traceback (most recent call last):
  File "F:\AI setups\Diffusers\models\compress t5.py", line 42, in <module>
    compress_model(
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\dfloat11\dfloat11.py", line 622, in compress_model
    save_file(model.state_dict(), os.path.join(save_path, 'model.safetensors'))
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\safetensors\torch.py", line 352, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
                   ^^^^^^^^^^^^^^^^^
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\safetensors\torch.py", line 577, in _flatten
    raise RuntimeError(
RuntimeError: Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'shared.weight', 'encoder.embed_tokens.weight'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
Is this due to an error with my compression code, or is what I am trying to do not supported? The complete code I used, including the pattern_dict, is below:
import torch
from dfloat11 import compress_model
from transformers import T5EncoderModel

save_path = r".\t5-v1_1-xxl-DF11"
save_single_file = True
check_correctness = True
block_range = (0, 100)

text_encoder_2 = T5EncoderModel.from_pretrained(
    r"..\models\FLUX.1-dev",
    subfolder="text_encoder_2",
    torch_dtype=torch.bfloat16,
    local_files_only=True,
)

pattern_dict = {
    r"block\.\d+": (
        "layer.0.SelfAttention.q",
        "layer.0.SelfAttention.k",
        "layer.0.SelfAttention.v",
        "layer.0.SelfAttention.o",
        "layer.1.DenseReluDense.wi_0",
        "layer.1.DenseReluDense.wi_1",
        "layer.1.DenseReluDense.wo",
    )
}

# Compress the model using DFloat11 compression
compress_model(
    model=text_encoder_2,
    pattern_dict=pattern_dict,
    save_path=save_path,
    save_single_file=save_single_file,
    check_correctness=check_correctness,
    block_range=block_range,
)

Edit: Found an issue with the pattern_dict: `block` should be replaced with `encoder.block`, but the shared-tensors issue will still stop the file from being saved after the compression process finishes.