
Z Image Turbo

by Tongyi-MAI

Open source · 1M downloads · 4443 likes

Rating: 4.6 (4443 reviews) · Image · API & Local

About

Z Image Turbo is an AI model specialized in high-quality image generation, producing realistic or artistic visuals from plain text descriptions. It stands out for its speed, delivering results in under a second on high-end hardware, while remaining usable on consumer devices thanks to its low memory footprint. The model is particularly strong at rendering bilingual text (English and Chinese) naturally within images and at following instructions precisely across varied, creative outputs. Its use cases include professional image creation, illustration, design, and visual editing, with high fidelity to the prompts provided. Its balance of performance, versatility, and accessibility makes it a strong tool for creators and developers alike.

Documentation

⚡️ Z-Image
An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer


Welcome to the official repository for the Z-Image(造相)project!

✨ Z-Image

Z-Image is a powerful and highly efficient image generation model family with 6B parameters. Currently there are four variants:

  • 🚀 Z-Image-Turbo – A distilled version of Z-Image that matches or exceeds leading competitors with only 8 NFEs (Number of Function Evaluations). It offers ⚡️sub-second inference latency⚡️ on enterprise-grade H800 GPUs and fits comfortably within 16G VRAM consumer devices. It excels in photorealistic image generation, bilingual text rendering (English & Chinese), and robust instruction adherence.

  • 🎨 Z-Image – The foundation model behind Z-Image-Turbo. Z-Image focuses on high-quality generation, rich aesthetics, strong diversity, and controllability, well-suited for creative generation, fine-tuning, and downstream development. It supports a wide range of artistic styles, effective negative prompting, and high diversity across identities, poses, compositions, and layouts.

  • 🧱 Z-Image-Omni-Base – The versatile foundation model capable of both generation and editing tasks. By releasing this checkpoint, we aim to unlock the full potential for community-driven fine-tuning and custom development, providing the most "raw" and diverse starting point for the open-source community.

  • ✍️ Z-Image-Edit – A variant fine-tuned on Z-Image specifically for image editing tasks. It supports creative image-to-image generation with impressive instruction-following capabilities, allowing for precise edits based on natural language prompts.

📥 Model Zoo

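| Model | Pre-Training | SFT | RL | Step | CFG | Task | Visual Quality | Diversity | Fine-Tunability | Hugging Face | ModelScope |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Z-Image-Omni-Base | ✅ | ❌ | ❌ | 50 | ✅ | Gen. / Editing | Medium | High | Easy | To be released | To be released |
| Z-Image | ✅ | ✅ | ❌ | 50 | ✅ | Gen. | High | Medium | Easy | Hugging Face, Hugging Face Space | ModelScope Model, ModelScope Space |
| Z-Image-Turbo | ✅ | ✅ | ✅ | 8 | ❌ | Gen. | Very High | Low | N/A | Hugging Face, Hugging Face Space | ModelScope Model, ModelScope Space |
| Z-Image-Edit | ✅ | ✅ | ❌ | 50 | ✅ | Editing | High | Medium | Easy | To be released | To be released |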
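
The Step, CFG, and Task columns of the Model Zoo table can be encoded as data so that a script can pick a variant by task. The sketch below is only an illustration: `VariantSpec`, `MODEL_ZOO`, and `pick_variants` are hypothetical names and not part of any Z-Image or diffusers API, and the `released` flags simply mirror the "To be released" cells as listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantSpec:
    """One row of the Model Zoo table (hypothetical helper, not an official API)."""
    name: str
    steps: int       # "Step" column
    uses_cfg: bool   # "CFG" column
    tasks: tuple     # "Task" column
    released: bool   # False where the table says "To be released"


MODEL_ZOO = (
    VariantSpec("Z-Image-Omni-Base", 50, True, ("generation", "editing"), False),
    VariantSpec("Z-Image", 50, True, ("generation",), True),
    VariantSpec("Z-Image-Turbo", 8, False, ("generation",), True),
    VariantSpec("Z-Image-Edit", 50, True, ("editing",), False),
)


def pick_variants(task, released_only=True):
    """Return the names of variants that support `task`, fewest steps first."""
    matches = [
        v for v in MODEL_ZOO
        if task in v.tasks and (v.released or not released_only)
    ]
    return [v.name for v in sorted(matches, key=lambda v: v.steps)]


print(pick_variants("generation"))
print(pick_variants("editing", released_only=False))
```

With the table as listed, only Z-Image and Z-Image-Turbo are returned for generation by default, because the two editing-capable checkpoints are not yet released.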

🖼️ Showcase

📸 Photorealistic Quality: Z-Image-Turbo delivers strong photorealistic image generation while maintaining excellent aesthetic quality.

Showcase of Z-Image on Photo-realistic image Generation

📖 Accurate Bilingual Text Rendering: Z-Image-Turbo excels at accurately rendering complex Chinese and English text.

Showcase of Z-Image on Bilingual Text Rendering

💡 Prompt Enhancing & Reasoning: Prompt Enhancer empowers the model with reasoning capabilities, enabling it to transcend surface-level descriptions and tap into underlying world knowledge.


🧠 Creative Image Editing: Z-Image-Edit shows a strong understanding of bilingual editing instructions, enabling imaginative and flexible image transformations.

Showcase of Z-Image-Edit on Image Editing

🏗️ Model Architecture

We adopt a Scalable Single-Stream DiT (S3-DiT) architecture. In this setup, text, visual semantic tokens, and image VAE tokens are concatenated at the sequence level to serve as a unified input stream, maximizing parameter efficiency compared to dual-stream approaches.
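
To make "concatenated at the sequence level" concrete, here is a toy sketch in plain Python. It is not the model's code: `build_single_stream` is a hypothetical helper, the token counts and width are arbitrary, and it assumes every modality has already been projected to one shared hidden width. The point is only that all three token types end up in one sequence that a single stack of transformer blocks could process.

```python
def build_single_stream(text_tokens, semantic_tokens, vae_tokens):
    """Concatenate three token sequences along the sequence axis.

    Each argument is a list of equal-width feature vectors (lists of floats).
    Returns the joined sequence plus the (start, end) span of each modality.
    """
    stream, spans, offset = [], {}, 0
    for name, tokens in (
        ("text", text_tokens),
        ("semantic", semantic_tokens),
        ("vae", vae_tokens),
    ):
        stream.extend(tokens)
        spans[name] = (offset, offset + len(tokens))
        offset += len(tokens)
    # A single stream only works if every token has the same hidden width.
    widths = {len(tok) for tok in stream}
    if len(widths) > 1:
        raise ValueError("all tokens must share one hidden width")
    return stream, spans


# Toy shapes: 3 text tokens, 2 semantic tokens, 4 image-latent tokens, width 2.
text = [[0.1, 0.2]] * 3
semantic = [[0.3, 0.4]] * 2
vae = [[0.5, 0.6]] * 4

stream, spans = build_single_stream(text, semantic, vae)
print(len(stream), spans)
```

The recorded spans show how the image-latent positions could be sliced back out of the shared sequence after the transformer blocks have run.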

Architecture of Z-Image and Z-Image-Edit

📈 Performance

According to the Elo-based Human Preference Evaluation (on Alibaba AI Arena), Z-Image-Turbo shows highly competitive performance against other leading models, while achieving state-of-the-art results among open-source models.
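
For readers unfamiliar with Elo-based preference rankings, the sketch below shows the textbook Elo update applied to a single pairwise human vote. The K-factor of 32 and the 400-point scale are conventional defaults assumed here for illustration; the exact parameters used by Alibaba AI Arena are not stated in this document.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, a_wins, k=32.0):
    """Return both ratings after one pairwise human vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta


# Two models start level; a rater prefers the image from model A.
new_a, new_b = elo_update(1000.0, 1000.0, a_wins=True)
print(new_a, new_b)  # 1016.0 984.0
```

A win against a higher-rated model moves the rating more than a win against a lower-rated one, which is why a leaderboard built this way reflects who was beaten and not just how often.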

Z-Image Elo Rating on AI Arena

🚀 Quick Start

Install the latest version of diffusers from source with the command below.

Why install from source: we submitted two pull requests (#12703 and #12715) to the 🤗 diffusers repository to add support for Z-Image, and both have been merged. If your installed diffusers release does not yet include them, install from source to get the latest features and Z-Image support.

Bash
pip install git+https://github.com/huggingface/diffusers
Python
import torch
from diffusers import ZImagePipeline

# 1. Load the pipeline
# Use bfloat16 for optimal performance on supported GPUs
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

# [Optional] Attention Backend
# Diffusers uses SDPA by default. Switch to Flash Attention for better efficiency if supported:
# pipe.transformer.set_attention_backend("flash")    # Enable Flash-Attention-2
# pipe.transformer.set_attention_backend("_flash_3") # Enable Flash-Attention-3

# [Optional] Model Compilation
# Compiling the DiT model accelerates inference, but the first run will take longer to compile.
# pipe.transformer.compile()

# [Optional] CPU Offloading
# Enable CPU offloading for memory-constrained devices.
# pipe.enable_model_cpu_offload()

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."

# 2. Generate Image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,     # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")
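
The two Turbo-specific settings in the call above (one more scheduler step than DiT forwards, and guidance disabled) are easy to get wrong when copied by hand. The helper below is a small sketch that bundles them. `turbo_sampling_kwargs` is a hypothetical name, and the "steps = forwards + 1" rule is an assumption generalized from the single 9-to-8 case in the comment above; it is not documented pipeline behavior.

```python
def turbo_sampling_kwargs(dit_forwards=8, size=1024):
    """Build keyword arguments for a Z-Image-Turbo pipeline call.

    Assumption: the pipeline runs one fewer DiT forward than
    `num_inference_steps`, generalized from the 9 -> 8 case in the Quick Start.
    """
    if dit_forwards < 1:
        raise ValueError("dit_forwards must be at least 1")
    return {
        "height": size,
        "width": size,
        "num_inference_steps": dit_forwards + 1,
        "guidance_scale": 0.0,  # guidance should be 0 for the Turbo models
    }


kwargs = turbo_sampling_kwargs()
print(kwargs)
# With the pipeline from the Quick Start loaded as `pipe`, usage would be:
# image = pipe(prompt=prompt, **kwargs).images[0]
```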

🔬 Decoupled-DMD: The Acceleration Magic Behind Z-Image


Decoupled-DMD is the core few-step distillation algorithm that empowers the 8-step Z-Image model.

Our core insight in Decoupled-DMD is that the success of existing DMD (Distribution Matching Distillation) methods is the result of two independent, collaborating mechanisms:

  • CFG Augmentation (CA): The primary engine 🚀 driving the distillation process, a factor largely overlooked in previous work.
  • Distribution Matching (DM): Acts more as a regularizer ⚖️, ensuring the stability and quality of the generated output.

By recognizing and decoupling these two mechanisms, we were able to study and optimize them in isolation. This ultimately motivated us to develop an improved distillation process that significantly enhances the performance of few-step generation.
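
One way to see how the two mechanisms can be separated is the algebra of classifier-free guidance. The sketch below is our own illustration and not the authors' implementation: it assumes standard CFG applied to the teacher ("real") score, uses toy three-element vectors in place of score fields, and ignores sign and weighting conventions, for which the Decoupled-DMD paper is the reference. Under those assumptions, the difference between the guided real score and the student's "fake" score splits exactly into a distribution-matching part and a CFG-augmentation part.

```python
def cfg_score(cond, uncond, w):
    """Standard classifier-free guidance: uncond + w * (cond - uncond)."""
    return [u + w * (c - u) for c, u in zip(cond, uncond)]


def decoupled_terms(real_cond, real_uncond, fake, w):
    """Split (guided real score - fake score) into DM and CA parts."""
    # Distribution Matching: conditional real score minus fake score.
    dm = [c - f for c, f in zip(real_cond, fake)]
    # CFG Augmentation: the extra push contributed by guidance alone.
    ca = [(w - 1.0) * (c - u) for c, u in zip(real_cond, real_uncond)]
    return dm, ca


# Toy 3-dimensional "scores"; the numbers are arbitrary.
real_cond = [0.5, -1.0, 2.0]
real_uncond = [0.25, -0.5, 1.0]
fake = [0.0, 0.5, 1.5]
w = 3.0

guided = cfg_score(real_cond, real_uncond, w)
full = [g - f for g, f in zip(guided, fake)]
dm, ca = decoupled_terms(real_cond, real_uncond, fake, w)
recombined = [d + a for d, a in zip(dm, ca)]
print(full == recombined)  # True
```

Because the two terms add up to the original signal, each can be weighted or scheduled on its own, which is the kind of isolated study the paragraph above describes.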

Diagram of Decoupled-DMD

🤖 DMDR: Fusing DMD with Reinforcement Learning


Building upon the strong foundation of Decoupled-DMD, our 8-step Z-Image model has already demonstrated exceptional capabilities. To achieve further improvements in terms of semantic alignment, aesthetic quality, and structural coherence—while producing images with richer high-frequency details—we present DMDR.

Our core insight behind DMDR is that Reinforcement Learning (RL) and Distribution Matching Distillation (DMD) can be synergistically integrated during the post-training of few-step models. We demonstrate that:

  • RL Unlocks the Performance of DMD 🚀
  • DMD Effectively Regularizes RL ⚖️

Diagram of DMDR

⏬ Download

Bash
pip install -U huggingface_hub
HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image-Turbo

📜 Citation

If you find our work useful in your research, please consider citing:

Bibtex
@article{team2025zimage,
  title={Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer},
  author={Z-Image Team},
  journal={arXiv preprint arXiv:2511.22699},
  year={2025}
}

@article{liu2025decoupled,
  title={Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield},
  author={Dongyang Liu and Peng Gao and David Liu and Ruoyi Du and Zhen Li and Qilong Wu and Xin Jin and Sihan Cao and Shifeng Zhang and Hongsheng Li and Steven Hoi},
  journal={arXiv preprint arXiv:2511.22677},
  year={2025}
}

@article{jiang2025distribution,
  title={Distribution Matching Distillation Meets Reinforcement Learning},
  author={Jiang, Dengyang and Liu, Dongyang and Wang, Zanyi and Wu, Qilong and Jin, Xin and Liu, David and Li, Zhen and Wang, Mengmeng and Gao, Peng and Yang, Harry},
  journal={arXiv preprint arXiv:2511.13649},
  year={2025}
}
Specifications
Category: Image
Access: API & Local
License: Open Source
Pricing: Open Source