
musicgen small stereo onnx

by chinedudave06

Open source · 270 downloads · 0 likes

Rating: 0.0 (0 reviews) · Audio · API & Local
About

MusicGen Small Stereo ONNX is a model that generates stereo music from text descriptions. It is an ONNX export aimed at mobile devices, and its decoder uses a key-value cache (KV-cache) to speed up autoregressive generation. Stereo output is produced with 8 codebooks (4 per audio channel). The model is intended for mobile applications such as DJNed, where it generates music from text prompts. Because it is exported in ONNX format, it can run locally on the device with ONNX Runtime.

Documentation

MusicGen Small Stereo — ONNX (KV-Cache)

An ONNX export of facebook/musicgen-stereo-small with a KV-cache decoder for efficient on-device autoregressive generation.

Model Details

| Property | Value |
|---|---|
| Base Model | facebook/musicgen-stereo-small |
| Precision | FP32 |
| Audio | Stereo (2 channels) |
| Codebooks | 8 (4 per channel) |
| Hidden Size | 1024 |
| Sample Rate | 32 kHz |
| Max Length | 1500 steps (~30 s) |
| Total Size | ~3.7 GB |
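
The Max Length and Sample Rate rows imply a fixed relation between decoding steps and audio duration. The sketch below works through that arithmetic. It assumes the "~30s" figure is close to exact, which gives about 50 steps per second; the function names are illustrative and are not part of the model files.

```python
# Figures taken from the Model Details table.
MAX_STEPS = 1500
APPROX_SECONDS = 30      # the table says "~30s", so treat this as approximate
SAMPLE_RATE_HZ = 32_000
CHANNELS = 2

# Implied decoder frame rate: steps generated per second of audio.
frames_per_second = MAX_STEPS / APPROX_SECONDS


def steps_for_duration(seconds: float) -> int:
    """Decoding steps needed for `seconds` of audio, capped at MAX_STEPS."""
    return min(MAX_STEPS, round(seconds * frames_per_second))


def pcm_samples(steps: int) -> int:
    """Total PCM samples across both channels for `steps` decoding steps."""
    samples_per_step = SAMPLE_RATE_HZ / frames_per_second
    return int(steps * samples_per_step) * CHANNELS
```

Under these assumptions a 10-second clip needs about 500 steps, and any request longer than roughly 30 seconds is clamped to the 1500-step maximum.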

Files

| File | Description | Size |
|---|---|---|
| decoder_model.onnx | Step-0 decoder (no KV-cache) | 1.7 GB |
| decoder_with_past_model.onnx | Steps 1+ decoder (with KV-cache) | 1.5 GB |
| text_encoder.onnx | T5 text encoder | 419 MB |
| encodec_decode.onnx | EnCodec audio decoder | 113 MB |
| tokenizer.json | T5 tokenizer vocabulary | 2.4 MB |
| config.json | Model architecture config | <1 KB |
| generation_config.json | Generation parameters | <1 KB |
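
Since the pipeline depends on all seven files, a quick check after download can catch an incomplete copy. This is a minimal sketch: the file names come from the table above, while `missing_files` and the directory layout (all files in one folder) are assumptions made for illustration.

```python
from pathlib import Path
from typing import List

# File names as listed in the Files table.
REQUIRED_FILES = [
    "decoder_model.onnx",
    "decoder_with_past_model.onnx",
    "text_encoder.onnx",
    "encodec_decode.onnx",
    "tokenizer.json",
    "config.json",
    "generation_config.json",
]


def missing_files(model_dir: str) -> List[str]:
    """Return the required files that are not present in `model_dir`."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty result means every listed file is present; it does not verify file sizes or contents.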

Stereo Export Notes

The stereo model uses 8 codebooks (4 per audio channel). During export, the EnCodec quantizer's decode method was monkeypatched to handle the codebook index mismatch (EnCodec has 4 physical layers, but stereo needs 8 codebook indices). The exported EnCodec ONNX is replaced with the mono version, which handles both mono and stereo decoding.
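
To illustrate how 8 codebook streams can map onto a 4-codebook decoder, the sketch below splits them into a left and a right group of 4. The ordering is an assumption: the Hugging Face Transformers implementation of MusicGen stereo interleaves the channels (even indices left, odd indices right), but this export may arrange them differently, so a non-interleaved option is included. `split_stereo_codes` is a hypothetical helper, not part of the exported files.

```python
from typing import List, Tuple

Codes = List[List[int]]  # indexed as [codebook][time step]


def split_stereo_codes(codes: Codes, interleaved: bool = True) -> Tuple[Codes, Codes]:
    """Split 8 codebook streams into (left, right) groups of 4.

    interleaved=True assumes L, R, L, R, ... ordering (an assumption
    about this export); False assumes the first 4 are left, last 4 right.
    """
    if len(codes) != 8:
        raise ValueError(f"expected 8 codebooks, got {len(codes)}")
    if interleaved:
        return codes[0::2], codes[1::2]
    return codes[:4], codes[4:]
```

Each group of 4 then matches the 4 physical quantizer layers that EnCodec expects for a single channel.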

Usage

These models are designed for the DJNed Android app using ONNX Runtime.

Pipeline

  1. Text encoding: text_encoder.onnx encodes the text prompt
  2. Step 0: decoder_model.onnx generates the first token + initial KV-cache
  3. Steps 1+: decoder_with_past_model.onnx generates subsequent tokens using KV-cache
  4. Audio decode: encodec_decode.onnx converts 8 codebook streams (4 per channel) to stereo audio
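
The control flow of the four steps above can be sketched as follows. The source does not give the input and output tensor names of the ONNX graphs, so each model is represented by a caller-supplied function; in practice each would wrap an ONNX Runtime session for the corresponding file. All names here are hypothetical, and token sampling and any codebook post-processing are left out.

```python
from typing import Any, Callable, List, Sequence, Tuple

Token = Sequence[int]  # one code per codebook for a single step
Cache = Any            # opaque KV-cache passed between decoder calls

MAX_STEPS = 1500       # from the Model Details table


def generate(
    encode_text: Callable[[str], Any],
    first_step: Callable[[Any], Tuple[Token, Cache]],
    next_step: Callable[[Any, Token, Cache], Tuple[Token, Cache]],
    decode_audio: Callable[[List[Token]], Any],
    prompt: str,
    num_steps: int,
) -> Any:
    """Run the four-stage pipeline with hypothetical model wrappers."""
    if not 1 <= num_steps <= MAX_STEPS:
        raise ValueError(f"num_steps must be between 1 and {MAX_STEPS}")

    hidden = encode_text(prompt)        # 1. text_encoder.onnx
    token, cache = first_step(hidden)   # 2. decoder_model.onnx (builds the cache)
    tokens = [token]

    for _ in range(num_steps - 1):      # 3. decoder_with_past_model.onnx
        token, cache = next_step(hidden, token, cache)
        tokens.append(token)

    return decode_audio(tokens)         # 4. encodec_decode.onnx
```

Splitting step 0 from the later steps mirrors the two decoder files: the first call has no cache to read, while every later call reuses and extends the cache instead of reprocessing earlier tokens.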

License

This model is derived from Meta's MusicGen under the CC-BY-NC-4.0 license.

Capabilities & Tags

onnxruntime, onnx, musicgen, music-generation, kv-cache, text-to-audio, stereo, on-device, android, en
Specifications

  • Category: Audio
  • Access: API & Local
  • License: Open Source
  • Pricing: Open Source