Flux Dev F8 Diffusers

Transformer support in float8_e4m3fn precision. Compatible with the native FluxPipeline.

Requires an RTX 3000-series or newer card. Expect a 2-3x speedup in inference time.

You will still need the full-weight Flux model for the other components; only the transformer directory here is stored in float8_e4m3fn.

Also compatible with:

Make sure your torch version is 2.4 or newer.
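
As a quick sanity check, the snippet below verifies the environment; it is not part of this repository, and the compute-capability threshold is an assumption based on the RTX 3000 requirement.

import torch

# torch 2.4+ ships the float8_e4m3fn dtype used by this checkpoint.
assert hasattr(torch, 'float8_e4m3fn'), 'torch 2.4 or newer is required'
# RTX 3000-series (Ampere) cards report compute capability 8.x.
major, minor = torch.cuda.get_device_capability(0)
assert (major, minor) >= (8, 0), 'an RTX 3000-series or newer GPU is required'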

Inference

The replace_regular_linears helper replaces the float8 model's linear layers with wrappers that upcast their weights to bfloat16 on the fly, using about 2x less VRAM than loading the transformer in bfloat16.
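
The exact logic lives in linear_8.py; the sketch below is only an illustration of the idea, with a hypothetical class name, not the shipped implementation.

# Illustration only (not the code in linear_8.py): keep the weight stored in
# float8_e4m3fn and upcast it to bfloat16 just for the matmul.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpcastLinear(nn.Module):  # hypothetical name
    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.weight = linear.weight  # stays in float8_e4m3fn
        self.bias = linear.bias

    def forward(self, x):
        weight = self.weight.to(torch.bfloat16)
        bias = self.bias.to(torch.bfloat16) if self.bias is not None else None
        return F.linear(x, weight, bias)

Loading the pipeline with the helper applied: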

from diffusers import AutoencoderKL, FluxTransformer2DModel, FluxPipeline
from linear_8 import replace_regular_linears
import torch

# Load the float8_e4m3fn transformer from this repository.
transformer = FluxTransformer2DModel.from_pretrained('twodgirl/flux-dev-fp8-e4m3fn-diffusers',
                                                     subfolder='transformer',
                                                     torch_dtype=torch.float8_e4m3fn)
# Swap its linear layers for wrappers that upcast to bfloat16 at runtime.
replace_regular_linears(transformer)
vae = AutoencoderKL.from_pretrained('black-forest-labs/FLUX.1-dev', subfolder='vae').to(torch.bfloat16)
pipe = FluxPipeline.from_pretrained('black-forest-labs/FLUX.1-dev',
                                    transformer=transformer,
                                    vae=vae)
# Keep modules on the CPU and move each one to the GPU only while it runs.
pipe.enable_model_cpu_offload()
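
With the pipeline built, generation is a standard FluxPipeline call; the prompt and settings below are only example values.

prompt = 'a photo of a forest at dawn'
image = pipe(prompt, num_inference_steps=20, guidance_scale=3.5).images[0]
image.save('forest.png')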

Inference with PuLID

Tested with the diffusers library v0.30.2; it will likely need small modifications for future versions.

pip install -r requirements.txt

Download the T5 encoder and the contents of PuLID (without its requirements file).
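
If you do not already have a quantized T5 checkpoint in ./flux-t5, one possible way to create it with optimum-quanto is sketched below; this is an assumption about how that folder can be produced, not a script from this repository.

# Assumed one-off export of a qfloat8 T5 encoder to ./flux-t5 using optimum-quanto.
import torch
from optimum.quanto import qfloat8
from optimum.quanto.models import QuantizedTransformersModel
from transformers import T5EncoderModel

class T5Model(QuantizedTransformersModel):
    auto_class = T5EncoderModel

t5 = T5EncoderModel.from_pretrained('black-forest-labs/FLUX.1-dev',
                                    subfolder='text_encoder_2',
                                    torch_dtype=torch.bfloat16)
T5Model.quantize(t5, weights=qfloat8).save_pretrained('./flux-t5')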

You should have:

eva_clip
pulid
flux-t5
flux_model.py
linear_8.py
the-file-below.py

The demo file, placed in the same directory:

from diffusers import AutoencoderKL, FluxPipeline
from flux_model import FluxTransformer2DModel
from linear_8 import replace_regular_linears
import torch

from optimum.quanto.models import QuantizedTransformersModel
import numpy as np
from PIL import Image
from pulid.pipeline_flux import PuLIDPipeline
from transformers import T5EncoderModel
from torchvision import transforms

class T5Model(QuantizedTransformersModel):
    auto_class = T5EncoderModel

class FluxGenerator:
    def __init__(self, pipe):
        self.pipe = pipe
        # Wrap the Flux transformer with PuLID and load its pretrained ID-encoder weights.
        self.pulid_model = PuLIDPipeline(pipe.transformer, 'cuda', weight_dtype=torch.bfloat16)
        self.pulid_model.load_pretrain()

    def clear_id(self):
        # Drop the identity conditioning from the transformer.
        self.pipe.transformer.pul_id = None
        self.pipe.transformer.pul_id_weight = 1.0

    def set_id(self, id_image, id_weight=1.0, true_cfg=1.0):
        # Variable use_true_cfg is False by default.
        use_true_cfg = abs(true_cfg - 1.0) > 1e-2
        if id_image is not None:
            id_embeddings, uncond_id_embeddings = self.pulid_model.get_id_embedding(id_image, cal_uncond=use_true_cfg)
        else:
            id_embeddings = None
            uncond_id_embeddings = None
        # The pipe call cannot take these as extra parameters,
        # so store them on the transformer module instead.
        self.pipe.transformer.pul_id = uncond_id_embeddings if use_true_cfg else id_embeddings
        self.pipe.transformer.pul_id_weight = id_weight

# Rebuild the quantized T5 encoder in bfloat16 when it is loaded from disk.
T5EncoderModel.from_config = lambda c: T5EncoderModel(c).to(dtype=torch.bfloat16)
t5 = T5Model.from_pretrained('./flux-t5')._wrapped
transformer = FluxTransformer2DModel.from_pretrained('twodgirl/flux-dev-fp8-e4m3fn-diffusers',
                                                     subfolder='transformer',
                                                     torch_dtype=torch.float8_e4m3fn)
replace_regular_linears(transformer)
vae = AutoencoderKL.from_pretrained('black-forest-labs/FLUX.1-dev', subfolder='vae').to(torch.bfloat16)
pipe = FluxPipeline.from_pretrained('black-forest-labs/FLUX.1-dev',
                                    text_encoder_2=t5,
                                    transformer=transformer,
                                    vae=vae)
pipe.enable_model_cpu_offload()
# Prepare the reference face and attach its identity embedding to the transformer.
face = transforms.Resize(1024)(Image.open('reference.png').convert('RGB'))
gen = FluxGenerator(pipe)
gen.set_id(np.array(face))
image = pipe('portrait, color, cinematic', num_inference_steps=10).images[0]
image.save('portrait.png')
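
To render without the reference identity afterwards, or before switching to a different face, clear the stored embedding first:

gen.clear_id()
image = pipe('portrait, color, cinematic', num_inference_steps=10).images[0]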

Disclaimer

Use of this code requires citation and attribution to the author via a link to their Hugging Face profile in all resulting work.

The model weights have a custom license.
