VibeVoice Large Q8

by FabioSarracino

Open source · 868 downloads · 95 likes

About

VibeVoice Large Q8 is a speech-generation model quantized to 8 bits that, according to its author, keeps audio quality identical to the original, whereas other quantized VibeVoice models often produce noise. Using selective quantization, it cuts model size by 38% (11.6 GB instead of 18.7 GB) and VRAM use to about 12 GB instead of 20 GB, which puts it within reach of GPUs such as the RTX 3060 or 4070 Ti. It suits workloads that need a balance between footprint and quality, such as professional audio production or resource-constrained environments, and it remains compatible with tools like ComfyUI. It is positioned as a dependable option for users who want the capability of large voice models without sacrificing output clarity.

Documentation

VibeVoice-Large-Q8 - Selective 8bit Quantization

The first 8-bit VibeVoice model that actually works


🤗 Model • 💻 ComfyUI • 📖 Docs


🎯 Why This Model is Different

If you've tried other 8-bit quantized VibeVoice models, you probably got nothing but static noise. This one actually works.

The secret? Selective quantization: I only quantized the language model (the most robust part), while keeping audio-critical components (diffusion head, VAE, connectors) at full precision.

Results

  • ✅ Perfect audio, identical to the original model
  • ✅ 11.6 GB instead of 18.7 GB (-38%)
  • ✅ Uses ~12 GB VRAM instead of 20 GB
  • ✅ Works on 12 GB GPUs (RTX 3060, 4070 Ti, etc.)

🚨 The Problem with Other 8-bit Models

Most 8-bit models you'll find online quantize everything aggressively, including the audio components. Result: audio components get quantized → numerical errors propagate → audio = pure noise.
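To make that failure mode concrete, here is a minimal sketch of generic absmax 8-bit quantization applied to a handful of made-up values. It is not the bitsandbytes algorithm and not this model's code; `quantize_int8` and `dequantize` are hypothetical helpers that only illustrate how one large value coarsens the grid for every small one.

```python
def quantize_int8(values):
    """Absmax quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate floats."""
    return [c * scale for c in codes]


# Made-up weights (an assumption): one outlier next to several small values.
weights = [0.0021, -0.0137, 0.0450, -0.0008, 3.2000]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

for w, r in zip(weights, restored):
    print(f"{w:+.4f} -> {r:+.4f}  (abs error {abs(w - r):.4f})")
```

The small values collapse to zero or to a single quantization step. In a language model such errors tend to average out, but in components that produce the waveform directly they can end up in the audio itself.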


✅ The Solution: Selective Quantization

I only quantized what can be safely quantized without losing quality.

Result: 52% of parameters quantized, 48% at full precision = perfect audio quality.
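The selection rule can be sketched as a name filter over the model's submodules. The module names and the `KEEP_FULL_PRECISION` keywords below are hypothetical, chosen only to mirror the diffusion head, VAE, and connectors; the real checkpoint's names may differ. In Transformers, a skip list of this kind can be passed to `BitsAndBytesConfig(load_in_8bit=True, llm_int8_skip_modules=[...])`, though whether the author used that exact route is an assumption.

```python
# Hypothetical keywords marking audio-critical submodules to leave untouched.
KEEP_FULL_PRECISION = ("diffusion_head", "vae", "connector")


def split_modules(module_names):
    """Partition module names into (quantize, keep) lists by keyword match."""
    quantize, keep = [], []
    for name in module_names:
        if any(key in name.lower() for key in KEEP_FULL_PRECISION):
            keep.append(name)
        else:
            quantize.append(name)
    return quantize, keep


# Illustrative names only; not read from the actual checkpoint.
names = [
    "language_model.layers.0.self_attn.q_proj",
    "language_model.layers.0.mlp.up_proj",
    "diffusion_head.layers.0.linear",
    "acoustic_vae.decoder.conv1",
    "acoustic_connector.fc1",
]
to_quantize, to_keep = split_modules(names)
print("quantize to 8-bit:", to_quantize)
print("keep full precision:", to_keep)
```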


📊 Quick Comparison

| Model | Size | Audio Quality | Status |
|---|---|---|---|
| Original VibeVoice | 18.7 GB | ⭐⭐⭐⭐⭐ | Full precision |
| Other 8-bit models | 10.6 GB | 💥 NOISE | ❌ Don't work |
| This model | 11.6 GB | ⭐⭐⭐⭐⭐ | ✅ Perfect |

+1.0 GB vs other 8-bit models = perfect audio instead of noise. Worth it.
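The two headline figures follow directly from the sizes in the comparison. A quick check of the arithmetic, using only the numbers quoted in this document:

```python
original_gb = 18.7    # full-precision checkpoint
selective_gb = 11.6   # this model
other_8bit_gb = 10.6  # fully quantized 8-bit models

reduction = (original_gb - selective_gb) / original_gb
overhead = selective_gb - other_8bit_gb

print(f"Size reduction vs original: {reduction:.0%}")              # 38%
print(f"Extra size vs other 8-bit models: {overhead:.1f} GB")      # 1.0 GB
```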


💻 How to Use It

With Transformers

Python
from transformers import AutoModelForCausalLM, AutoProcessor
import torch
import scipy.io.wavfile as wavfile

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "FabioSarracino/VibeVoice-Large-Q8",
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

processor = AutoProcessor.from_pretrained(
    "FabioSarracino/VibeVoice-Large-Q8",
    trust_remote_code=True
)

# Generate audio
text = "Hello, this is VibeVoice speaking."
inputs = processor(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=None)

# Save
# Cast to float32 first: NumPy has no bfloat16 dtype, so .numpy() fails on bf16 tensors
audio = output.speech_outputs[0].float().cpu().numpy()
wavfile.write("output.wav", 24000, audio)

With ComfyUI (recommended)

  1. Install the custom node:

    Bash
    cd ComfyUI/custom_nodes
    git clone https://github.com/Enemyx-net/VibeVoice-ComfyUI
    
  2. Download this model to ComfyUI/models/vibevoice/

  3. Restart ComfyUI and use it normally!
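For step 2, one way to fetch the files is `snapshot_download` from the `huggingface_hub` package. This is a sketch under assumptions: the per-model subfolder name is a guess (the instructions only give `ComfyUI/models/vibevoice/`), and `target_dir` and `download_model` are hypothetical helper names.

```python
from pathlib import Path

REPO_ID = "FabioSarracino/VibeVoice-Large-Q8"


def target_dir(comfy_root):
    """Return the destination folder (the per-model subfolder is an assumption)."""
    return Path(comfy_root) / "models" / "vibevoice" / REPO_ID.split("/")[-1]


def download_model(comfy_root="ComfyUI"):
    """Fetch the checkpoint from the Hugging Face Hub (about 11.6 GB)."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    dest = target_dir(comfy_root)
    dest.mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=REPO_ID, local_dir=str(dest))


print("Model files would be placed in:", target_dir("ComfyUI"))
```

Calling `download_model("path/to/ComfyUI")` starts the actual transfer. Check the custom node's documentation for the folder layout it expects before relying on the subfolder used here.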


💾 System Requirements

Minimum

  • VRAM: 12 GB
  • RAM: 16 GB
  • GPU: NVIDIA with CUDA (required)
  • Storage: 11 GB

Recommended

  • VRAM: 16+ GB
  • RAM: 32 GB
  • GPU: RTX 3090/4090, A5000 or better

⚠️ Not supported: CPU, Apple Silicon (MPS), AMD GPUs


⚠️ Limitations

  1. Requires NVIDIA GPU with CUDA - won't work on CPU or Apple Silicon
  2. Inference only - don't use for fine-tuning
  3. Requires:
    • transformers>=4.51.3
    • bitsandbytes>=0.43.0
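The version requirements can be checked before loading the model. The sketch below uses only the standard library; `parse_version` and `check_requirements` are hypothetical helpers, and the simple parser ignores pre-release suffixes, so treat its verdict as a rough guide.

```python
from importlib import metadata

# Minimum versions listed in the limitations above.
REQUIRED = {"transformers": "4.51.3", "bitsandbytes": "0.43.0"}


def parse_version(text):
    """Turn '4.51.3' into (4, 51, 3); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return a dict mapping package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else "too old"
    return report


print(check_requirements())
```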

🆚 When to Use This Model

✅ Use this 8-bit if:

  • You have 12-16 GB VRAM
  • You want maximum quality with reduced size
  • You need a production-ready model
  • You want the best size/quality balance

Use full precision (18.7 GB) if:

  • You have plenty of VRAM (24+ GB)
  • You're doing research requiring absolute precision

Use 4-bit NF4 (~6.6 GB) if:

  • You only have 8-10 GB VRAM
  • You can accept a small quality trade-off
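The guidance above can be condensed into a small lookup. The thresholds for the gaps the lists leave open (10 to 12 GB and 16 to 24 GB) are assumptions, and `recommend_variant` is a hypothetical helper, not part of any library.

```python
def recommend_variant(vram_gb):
    """Suggest a VibeVoice-Large variant for a given amount of VRAM in GB."""
    if vram_gb >= 24:
        return "full precision (18.7 GB)"
    if vram_gb >= 12:
        # Assumption: cards between 16 and 24 GB also default to 8-bit.
        return "8-bit selective (11.6 GB)"
    if vram_gb >= 8:
        # Assumption: cards between 10 and 12 GB fall back to 4-bit.
        return "4-bit NF4 (~6.6 GB)"
    return "none of the listed variants"


for vram in (8, 12, 16, 24):
    print(f"{vram} GB VRAM -> {recommend_variant(vram)}")
```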

🔧 Troubleshooting

"OutOfMemoryError" during loading

  • Close other GPU applications
  • Use device_map="auto"
  • Reduce batch size to 1

"BitsAndBytes not found"

Bash
pip install "bitsandbytes>=0.43.0"

Audio sounds distorted

This shouldn't happen! If it does:

  1. Verify you downloaded the correct model
  2. Update transformers: pip install --upgrade transformers
  3. Check CUDA: torch.cuda.is_available() should return True
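The CUDA check in step 3 can be wrapped so that it reports a readable status instead of raising when PyTorch is missing. `cuda_status` is a hypothetical helper name; the `torch.cuda` calls it uses are standard PyTorch.

```python
def cuda_status():
    """Describe whether PyTorch can see a CUDA device, without raising."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"
    props = torch.cuda.get_device_properties(0)
    total_gb = props.total_memory / 1024**3
    return f"CUDA OK: {props.name}, {total_gb:.1f} GB VRAM"


print(cuda_status())
```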

📚 Citation

Bibtex
@misc{vibevoice-q8-2025,
  title={VibeVoice-Large-Q8: Selective 8-bit Quantization for Audio Quality},
  author={Fabio Sarracino},
  year={2025},
  url={https://huggingface.co/FabioSarracino/VibeVoice-Large-Q8}
}

Original Model

Bibtex
@misc{vibevoice2024,
  title={VibeVoice: High-Quality Text-to-Speech with Large Language Models},
  author={Microsoft Research},
  year={2024},
  url={https://github.com/microsoft/VibeVoice}
}

🔗 Related Resources

  • Original Model - Full precision base
  • ComfyUI Node - ComfyUI integration

📜 License

MIT License.


🤝 Support

  • Issues: GitHub Issues
  • Questions: HuggingFace Discussions

If this model helped you, leave a ⭐ on GitHub!


Created by Fabio Sarracino

