Adding support for Bria's FIBO model #732
Conversation
Add support for Bria AI's FIBO model, including:

- FiboModel implementation with DimFusion text encoding
- Shifted logit-normal timestep sampling for training
- Model-specified default timestep type and content_or_style
- Fix dtype and broadcasting in flowmatch scheduler add_noise
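The "shifted logit-normal timestep sampling" item can be sketched roughly like this. This is a minimal stdlib-only illustration of the common flow-matching convention (sigmoid of a Gaussian, then a resolution shift), not the PR's actual implementation; the function name and parameter defaults are assumptions.

```python
import math
import random

def sample_shifted_logit_normal(n, mean=0.0, std=1.0, shift=3.0, rng=None):
    """Sample n training timesteps in (0, 1).

    Hypothetical sketch: logit-normal sampling followed by the shift
    mapping used in flow-matching schedulers; not FIBO's exact code.
    """
    rng = rng or random.Random()
    ts = []
    for _ in range(n):
        # logit-normal: sigmoid of a Gaussian concentrates mass mid-schedule
        u = rng.gauss(mean, std)
        t = 1.0 / (1.0 + math.exp(-u))
        # resolution shift: pushes timesteps toward the noisier end for shift > 1
        t = shift * t / (1.0 + (shift - 1.0) * t)
        ts.append(t)
    return ts
```

With `shift=1.0` this reduces to plain logit-normal sampling; larger shifts bias training toward higher noise levels.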
Sorry it took so long, I am taking a look at this today.
I am running into a few potential challenges with this model since it is unique in the fact that it is JSON-native and trained on long structured captions. That's awesome, but AI Toolkit is not really set up to handle that currently. I am curious how you recommend handling this.
A general note on FIBO - it's trained on long, structured JSON prompts and this is what gives the model its strong disentanglement and controllability.
2+3. To generate captions for a dataset we suggest using either Gemini or a local VLM we've finetuned for the FIBO JSON schema, and in order to support "trigger words" you can just add them to the JSON caption.
```python
def get_default_negative_prompt(existing_json: dict) -> str:
    negative_prompt = ""
    style_medium = existing_json.get("style_medium", "").lower()
    if style_medium in ["photograph", "photography", "photo"]:
        negative_prompt = """{'style_medium':'digital illustration','artistic_style':'non-realistic'}"""
    return negative_prompt
```

so it's possible to use short, unstructured prompts or JSONs with only some of the fields.
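To illustrate the trigger-word suggestion, here is a minimal sketch of injecting a trigger word into a structured caption. The field names below are illustrative assumptions, not the official FIBO JSON schema.

```python
import json

# Hypothetical FIBO-style structured caption (field names are illustrative).
# The trigger word "sks_bike" is injected directly into the subject field.
caption = {
    "subject": "a red vintage bicycle, sks_bike, leaning against a brick wall",
    "style_medium": "photograph",
    "lighting": "soft morning light",
}

# serialize the structured caption to the JSON string used as the prompt
prompt = json.dumps(caption)
```

Fields that are omitted are simply absent from the JSON, which is what makes partially-specified prompts workable.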
Adding support for Bria's FIBO model, including training (with low VRAM flags) and inference.
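The flowmatch `add_noise` dtype/broadcast fix mentioned in the description can be sketched roughly as follows. This is a minimal numpy illustration of the general pattern (reshape per-sample timesteps to broadcast over latents and cast to the latents' dtype), not the toolkit's actual scheduler code; the function signature is an assumption.

```python
import numpy as np

def add_noise(x0, noise, t):
    """Blend clean latents with noise at timestep t in [0, 1].

    Hypothetical sketch of the kind of fix described: cast the per-sample
    timesteps to the latents' dtype (e.g. float16) and reshape them so they
    broadcast over (B, C, H, W) before mixing.
    """
    t = np.asarray(t, dtype=x0.dtype)          # match latent dtype
    t = t.reshape(-1, *([1] * (x0.ndim - 1)))  # (B,) -> (B, 1, 1, 1)
    # linear flow-matching interpolation between clean latents and noise
    return (1.0 - t) * x0 + t * noise
```

Without the reshape, a `(B,)` timestep tensor would broadcast against the last axis of `(B, C, H, W)` latents (or fail outright), and mixing in a wider dtype would silently upcast half-precision latents.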