How to compress t5-v1_1-xxl and Gemma-2-2B?

I successfully compressed Cosmos-Predict2 and Chroma, but when I tried to compress the T5 text encoder used by Flux, I got the following error:
```
Traceback (most recent call last):
  File "F:\AI setups\Diffusers\models\compress t5.py", line 42, in <module>
    compress_model(
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\dfloat11\dfloat11.py", line 622, in compress_model
    save_file(model.state_dict(), os.path.join(save_path, 'model.safetensors'))
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\safetensors\torch.py", line 352, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
                   ^^^^^^^^^^^^^^^^^
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\safetensors\torch.py", line 577, in _flatten
    raise RuntimeError(
RuntimeError:
            Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'shared.weight', 'encoder.embed_tokens.weight'}].
            A potential way to correctly save your model is to use `save_model`.
            More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
```
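For reference, the condition the error complains about can be reproduced on a toy module. This is only a sketch: `TinyModel`, `TinyEncoder`, `find_shared_keys` and `drop_duplicates` are hypothetical names made up for illustration, and the toy module only imitates how `shared` and `encoder.embed_tokens` refer to the same embedding. It groups `state_dict()` entries by storage pointer to show which keys are tied, then keeps one key per group, which is the kind of dictionary `save_file` would accept.

```python
from collections import defaultdict

import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the encoder stack; it reuses the embedding it is given."""

    def __init__(self, embed_tokens: nn.Embedding):
        super().__init__()
        self.embed_tokens = embed_tokens  # same module object as TinyModel.shared
        self.proj = nn.Linear(4, 4)


class TinyModel(nn.Module):
    """Hypothetical stand-in that registers one embedding under two names."""

    def __init__(self):
        super().__init__()
        self.shared = nn.Embedding(8, 4)
        self.encoder = TinyEncoder(self.shared)


def find_shared_keys(state_dict):
    """Return groups of state-dict keys whose tensors start at the same memory address."""
    groups = defaultdict(list)
    for name, tensor in state_dict.items():
        groups[tensor.data_ptr()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def drop_duplicates(state_dict):
    """Keep only the first key for each distinct tensor storage."""
    seen = set()
    kept = {}
    for name, tensor in state_dict.items():
        if tensor.data_ptr() in seen:
            continue
        seen.add(tensor.data_ptr())
        kept[name] = tensor
    return kept


model = TinyModel()
state = model.state_dict()
shared_groups = find_shared_keys(state)
deduped = drop_duplicates(state)

print(shared_groups)
print(sorted(deduped))
```

Whether a file saved without the duplicate key loads back with the two names tied again depends on the loader, so that part would need to be verified. Since `compress_model` calls `save_file` itself (see the traceback), a filter like this would have to be applied at that save step.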

Is this caused by a mistake in my compression code, or is compressing this model not supported? The complete code I used, including the `pattern_dict`, is below:

```python
import torch
from dfloat11 import compress_model
from transformers import T5EncoderModel

save_path = r".\t5-v1_1-xxl-DF11"
save_single_file = True
check_correctness = True
block_range = (0, 100)

# Load the T5 text encoder from a local FLUX.1-dev folder in bfloat16
text_encoder_2 = T5EncoderModel.from_pretrained(
    r"..\models\FLUX.1-dev",
    subfolder="text_encoder_2",
    torch_dtype=torch.bfloat16,
    local_files_only=True,
)

# Regex for the modules to compress, mapped to the submodules inside each match
pattern_dict = {
    r"block\.\d+": (
        "layer.0.SelfAttention.q",
        "layer.0.SelfAttention.k",
        "layer.0.SelfAttention.v",
        "layer.0.SelfAttention.o",
        "layer.1.DenseReluDense.wi_0",
        "layer.1.DenseReluDense.wi_1",
        "layer.1.DenseReluDense.wo",
    )
}

# Compress the model using DFloat11 compression
compress_model(
    model=text_encoder_2,
    pattern_dict=pattern_dict,
    save_path=save_path,
    save_single_file=save_single_file,
    check_correctness=check_correctness,
    block_range=block_range,
)
```
Edit: I found an issue with the `pattern_dict`: `block` should be replaced with `encoder.block`. Even with that fix, the shared tensors error still stops the file from being saved after the compression process finishes.
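
To illustrate the `pattern_dict` fix from the edit, here is a small check using only the standard library. It assumes that the pattern has to match the full module name as reported by `named_modules()`, which I have not confirmed against the DFloat11 source. `matching_modules` is a hypothetical helper, and the name list is a hand-written sample of `T5EncoderModel` module names rather than output from the real model.

```python
import re

# Hand-written sample of module names (an assumption, not dumped from the model)
module_names = [
    "shared",
    "encoder.embed_tokens",
    "encoder.block.0",
    "encoder.block.23",
    "encoder.block.0.layer.0.SelfAttention.q",
    "encoder.final_layer_norm",
]


def matching_modules(pattern, names):
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]


# Original key: never matches, because every block name starts with "encoder."
old_matches = matching_modules(r"block\.\d+", module_names)

# Corrected key: matches each transformer block and nothing else
new_matches = matching_modules(r"encoder\.block\.\d+", module_names)

print(old_matches)  # []
print(new_matches)  # ['encoder.block.0', 'encoder.block.23']
```

Under that full-match assumption, the original key would select no modules at all, which is consistent with needing the `encoder.` prefix.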
