AI Explorer

© 2026 AI Explorer · All rights reserved.


Z Image Turbo SDNQ int8

by Disty0

Open source · 5k downloads · 19 likes

Rating: 1.6 (19 reviews) · Image · API & Local
About

Z Image Turbo SDNQ int8 is an optimized and 8-bit quantized version of the Z-Image-Turbo model, designed to enhance efficiency while maintaining performance. Using the SDNQ technique, it accelerates matrix multiplication operations in INT8 while remaining compatible with BF16 calculations for flexible use. This model stands out for its ability to run faster on compatible hardware without compromising the quality of the generated results. It is particularly well-suited for applications requiring rapid inference, such as real-time image generation or processing large volumes of visual data. Its main advantage lies in its balance between performance and lightweight design, making it ideal for environments with limited resources.
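As a back-of-the-envelope illustration of where the lightweight footprint comes from: BF16 stores two bytes per weight, INT8 one, so quantization roughly halves the checkpoint size. The parameter count below is a hypothetical figure chosen to line up with the sizes in the comparison table, not an official count for Z-Image-Turbo:

```python
# Rough model-size arithmetic for different weight precisions.
# The parameter count is an assumption for illustration only.
def model_size_gb(num_params, bytes_per_param):
    """Approximate checkpoint size in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

params = 6_200_000_000  # assumed weight count (hypothetical)
print(f"BF16: {model_size_gb(params, 2):.1f} GB")  # two bytes per weight
print(f"INT8: {model_size_gb(params, 1):.1f} GB")  # one byte per weight
```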

Documentation

8 bit quantization of Tongyi-MAI/Z-Image-Turbo using SDNQ.
This model is quantized with group sizes disabled for faster INT8 MatMul.
Example code to enable INT8 MatMul is provided in the Usage.
INT8 MatMul is optional and disabled by default.
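For intuition, "group sizes disabled" means a single scale per tensor rather than one scale per small group of weights, which keeps the INT8 MatMul simple and fast at a small cost in accuracy. A minimal pure-Python sketch of per-tensor symmetric INT8 quantization (an illustration of the general idea, not SDNQ's actual implementation):

```python
def quantize_int8(weights):
    """Per-tensor symmetric quantization: w ≈ q * scale, with q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the int8 codes back to floats."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.01, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# each restored value is within half a quantization step (scale / 2) of the original
```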

Usage:

Shell:

pip install sdnq

Python:
import torch
import diffusers
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers
from sdnq.common import use_torch_compile as triton_is_available
from sdnq.loader import apply_sdnq_options_to_model

pipe = diffusers.ZImagePipeline.from_pretrained("Disty0/Z-Image-Turbo-SDNQ-int8", torch_dtype=torch.bfloat16)

# Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs:
if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
    pipe.text_encoder = apply_sdnq_options_to_model(pipe.text_encoder, use_quantized_matmul=True)
    pipe.transformer = torch.compile(pipe.transformer) # optional for faster speeds

pipe.enable_model_cpu_offload()

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,
    guidance_scale=0.0,
    generator=torch.manual_seed(42),
).images[0]
image.save("z-image-turbo-sdnq-int8.png")  # extension added so PIL can infer the format
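To check whether enabling INT8 MatMul actually speeds things up on a given machine, one can time a generation with and without the option applied (a sketch; `pipe` is the pipeline loaded above, and simple wall-clock timing is approximate, so run a warmup call first so `torch.compile` overhead is excluded):

```python
import time

def time_generation(pipe, prompt, steps=9):
    """Run one generation and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=0.0)
    return time.perf_counter() - start
```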

Original BF16 vs SDNQ quantization comparison:

Quantization     | Model Size
Original BF16    | 12.3 GB
SDNQ INT8        | 6.2 GB
SDNQ INT8 MatMul | 6.2 GB

(The original table's Visualization column held sample output images, omitted here.)
Capabilities & Tags
diffusers · safetensors · sdnq · z_image · 8-bit
Specifications
Category: Image
Access: API & Local
License: Open Source
Pricing: Open Source
Rating: 1.6
