by GuangyuanSD
Open source · 27k downloads · 31 likes
FLUX.2 klein 9B Blitz ComfyUI is a model specialized in face swapping, optimized for ultra-realistic, natural results. It integrates BFS (Best Face Swap) technology, which eliminates the rigid artifacts of older methods while faithfully preserving facial identity, expressions, and lighting. Built for exceptional inference speed, it generates in just 4-5 steps with a fixed CFG, even on consumer hardware. Ideal for applications requiring precise, seamless face replacement, such as photo editing, visual effects, or content creation, it stands out for delivering flawless integrations without compromising quality. Its approach combines the power of the accelerated FLUX.2 klein 9B with targeted optimizations, ensuring performance that is both fast and realistic.
This is the next-level face-swap specialized evolution of the Dark Beast lineage, built on the lightning-fast FLUX.2 Klein 9B accelerated model from Black Forest Labs.
Engineered with targeted optimizations for face-swapping workflows, it integrates BFS (Best Face Swap) technology to completely eliminate the rigid, unnatural look that plagued earlier face replacements, delivering seamless, lifelike integrations with preserved identity, expression, and lighting.
It also fully fixes the portrait-reference issue from the previous DB BlitZ versions, ensuring correct reference adherence every time.
Special thanks to the workflow provider, https://github.com/alisson-anjos, for the powerful BFS foundation that powers this breakthrough. 🟦

Important notes:
This version is exclusively designed around the Klein 9B accelerated edition — no base model exists.
Usage is identical to Black Forest Labs' official FLUX.2 Klein 9B accelerated release: ultra-low steps (e.g., 4-5), CFG=1 fixed, blazing inference speed on consumer hardware.
In one sentence: Dark Beast's ferocious soul meets BFS (Best Face Swap) technology — more natural, and truly unstoppable! 🟦
For more information about BFS (Best Face Swap), see:
https://huggingface.co/Alissonerdx
Alternatively, it can be applied directly to Klein 9B / Qwen Edit base and fine-tuned models through LoRA adapter parameter injection.

DarkBeast5steps_extracted_lora_r256 has been uploaded;
it works fine with FLUX.2 Klein 9B models.
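As a sketch of the LoRA-injection route mentioned above: the helper below assumes the extracted LoRA is published as a `.safetensors` file in the Diffusers repo used in the Usage section; `load_blitz_lora` and the exact `weight_name` are illustrative assumptions, not confirmed file names.

```python
def load_blitz_lora(
    pipe,
    repo_id="GuangyuanSD/FLUX.2-klein-9B-Blitz-Diffusers",
    weight_name="DarkBeast5steps_extracted_lora_r256.safetensors",  # assumed file name
):
    """Inject the extracted rank-256 LoRA into a Klein 9B base or fine-tuned pipeline.

    Works with any diffusers pipeline that exposes ``load_lora_weights``
    (e.g. the Flux2KleinPipeline from the Usage section below).
    """
    pipe.load_lora_weights(repo_id, weight_name=weight_name)
    return pipe
```

After loading, generation proceeds exactly as in the Usage example; `pipe.fuse_lora()` can optionally merge the adapter weights for slightly faster inference.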
Fine-tuning of black-forest-labs/FLUX.2-klein-9B with BF16 / FP8e4m3fn / NVFP4 quantization,
merged with @alcaitiff's klein-9b-unchained-xxx.
This is the ultimate speed-optimized Dark Beast V1 evolution, based on FLUX.2 Klein 9B,
engineered specifically for lightning-fast low-step + CFG=1 workflows (5 steps).
Also available in NVFP4 quantized format, optimized for acceleration on Blackwell-architecture GPUs
(such as RTX 50xx, RTX PRO 6000, B200, and others).
Non-Blackwell GPUs are also supported (automatic fallback to 16-bit operation); verified in my ComfyUI 0.11 environment.
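The GPU-dependent checkpoint choice described above can be sketched as a small helper. `pick_checkpoint_variant` is a hypothetical name, and the compute-capability threshold for Blackwell is an assumption based on published SM versions (B200 is SM 10.0, RTX 50xx is SM 12.0):

```python
def pick_checkpoint_variant(capability=None):
    """Choose which checkpoint variant to load for the current GPU.

    Blackwell-class GPUs (compute capability 10.x and up: RTX 50xx,
    RTX PRO 6000, B200, ...) can use the NVFP4 weights; older GPUs
    fall back to the 16-bit weights, matching the note above.
    """
    if capability is None:
        # Lazy import so the pure-logic path needs no GPU stack installed.
        import torch
        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability()
    if capability is not None and capability >= (10, 0):
        return "nvfp4"
    return "bf16"
```

In ComfyUI this selection happens automatically; the helper only illustrates the decision rule.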
Fully preserves the signature Dark Beast style from the standard lineage: rich details and its intense aesthetic
Refined through advanced targeted distillation & fine-tuning, now perfectly dialed in for zero-CFG guidance at minimal steps
BlitZ-level inference speed — breathtaking high-quality images in just 5 steps ⚡
Recommended settings: 5 steps, CFG=1 (fixed), any seed you want
In one sentence: Taking Klein’s already blazing speed and cranking it to absolute BlitZ velocity while keeping every drop of that ferocious Dark Beast soul! 🟦
Lightning-fast generation awaits — unleash it now! 🚀
Usage:

```shell
pip install sdnq
```

```python
import torch
import diffusers
from sdnq import SDNQConfig  # import sdnq to register it into diffusers and transformers
from sdnq.common import use_torch_compile as triton_is_available
from sdnq.loader import apply_sdnq_options_to_model

pipe = diffusers.Flux2KleinPipeline.from_pretrained(
    "GuangyuanSD/FLUX.2-klein-9B-Blitz-Diffusers", torch_dtype=torch.bfloat16
)

# Enable INT8 MatMul for AMD, Intel ARC, and Nvidia GPUs:
if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
    pipe.text_encoder = apply_sdnq_options_to_model(pipe.text_encoder, use_quantized_matmul=True)

# pipe.transformer = torch.compile(pipe.transformer)  # optional, for faster speeds
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    guidance_scale=1.0,
    num_inference_steps=4,
    generator=torch.manual_seed(0),
).images[0]
image.save("flux-klein-Blitz.png")
```
Original BF16 vs Blitz fine-tune comparison:

| Quantization | Model Size | Visualization |
|---|---|---|
| Original BF16 | 18.2 GB | (image) |
| Blitz fine-tune | 18.2 GB | (image) |
Big thanks to @alcaitiff for the awesome work and killer contributions to training Z-Image and Klein models! Seriously impressive stuff! 🚀