stable audio open models

by AEmotionStudio

Open source · 155 downloads · 0 likes

About

Stable Audio Open is a text-to-audio model that generates stereo sound effects and ambient textures from text descriptions, up to 47 seconds long at 44.1 kHz. It is strongest at realistic sounds such as footsteps, impacts, ambiences (rain, wind), and layered soundscapes, and it can also produce atmospheric musical textures like pads or drones. It is not suited to full songs with vocals, high-fidelity musical instruments, or speech synthesis. The weights are available under a community license, which makes the model a fit for artists, developers, and content creators who need sound effects and ambiences for their projects. Tools like Mæstræa integrate it directly, so it can be used without manual setup.

Documentation

Stable Audio Open 1.0 (Mæstræa Mirror)

Text-to-Audio SFX & Ambient Textures — Up to 47s Stereo @ 44.1kHz

Original Model by Stability AI · Stability AI Community License

This is an ungated mirror of the Stable Audio Open 1.0 model weights for use with Mæstræa AI Workstation. Only safetensors-format weights are included (legacy .ckpt files stripped). All credits go to the original authors.

What's in This Repo

| Path | Description | Size |
|------|-------------|------|
| model.safetensors | Main model checkpoint | ~3 GB |
| transformer/diffusion_pytorch_model.safetensors | DiT transformer | ~1.5 GB |
| text_encoder/model.safetensors | T5 text encoder | ~1.2 GB |
| vae/diffusion_pytorch_model.safetensors | VAE decoder | ~150 MB |
| projection_model/diffusion_pytorch_model.safetensors | Projection model | ~50 MB |
| tokenizer/ | T5 tokenizer files | < 10 MB |
| model_config.json | Model architecture config | < 1 KB |
| model_index.json | Diffusers pipeline index | < 1 KB |
| scheduler/ | Scheduler config | < 1 KB |

What Stable Audio Open Does

Stable Audio Open generates stereo audio at 44.1kHz from text prompts. It excels at:

  • Sound effects — Foley, impacts, transitions
  • Ambient textures — Rain, wind, crowds, environments
  • Musical textures — Pads, drones, atmospheric sounds
  • Audio scenes — Complex layered soundscapes

Up to 47 seconds of stereo audio per generation.
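To make the duration and sample-rate figures concrete, here is a small arithmetic sketch of how many sample frames a clip contains and how large it would be as uncompressed audio. The 16-bit PCM storage format is an assumption chosen for illustration; the helper name `clip_size` is hypothetical and not part of any library.

```python
SAMPLE_RATE_HZ = 44_100  # output sample rate stated above
CHANNELS = 2             # stereo
MAX_SECONDS = 47         # maximum clip length stated above


def clip_size(seconds, bytes_per_sample=2):
    """Return (frames, size_in_bytes) for a stereo clip of the given length.

    bytes_per_sample=2 assumes 16-bit PCM storage (an illustrative choice).
    """
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be in (0, {MAX_SECONDS}]")
    frames = round(seconds * SAMPLE_RATE_HZ)
    return frames, frames * CHANNELS * bytes_per_sample


frames, size_bytes = clip_size(MAX_SECONDS)
print(frames, size_bytes)  # 2072700 8290800
```

A full-length clip is therefore about 2.07 million frames per channel, or roughly 8.3 MB as 16-bit stereo PCM.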

What It's NOT Good At

  • Full songs with vocals
  • High-fidelity musical instruments (use Foundation-1 for that)
  • Speech synthesis

VRAM Requirements

  • Minimum: ~4 GB (FP16)
  • Recommended: ~7 GB (FP16, longer durations)
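As a rough pre-flight check, the two thresholds above can be turned into a small helper that classifies a GPU by its memory. This is only a sketch: the figures are the approximate values listed above, and the function name `vram_tier` and its return labels are hypothetical.

```python
MIN_VRAM_GB = 4.0          # approximate FP16 minimum listed above
RECOMMENDED_VRAM_GB = 7.0  # approximate FP16 figure for longer durations


def vram_tier(total_vram_gb):
    """Classify a GPU by its memory against the approximate FP16 figures."""
    if total_vram_gb < MIN_VRAM_GB:
        return "insufficient"
    if total_vram_gb < RECOMMENDED_VRAM_GB:
        return "minimum"
    return "recommended"


# With PyTorch installed, the total memory of the first GPU can be read as:
#   torch.cuda.get_device_properties(0).total_memory / 1024**3
print(vram_tier(6.0))  # minimum
```

Because the listed requirements are approximate, a result near either threshold is best treated as a hint rather than a guarantee.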

Usage with Mæstræa

These models are automatically downloaded by the Mæstræa AI Workstation backend.

Direct Usage (diffusers)

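Python
import torch
from diffusers import StableAudioPipeline

# Load the mirrored weights in half precision and move the pipeline to the GPU.
pipe = StableAudioPipeline.from_pretrained(
    "AEmotionStudio/stable-audio-open-models",
    torch_dtype=torch.float16,
).to("cuda")

# Generate a 10-second clip; audio_end_in_s sets the clip length in seconds.
audio = pipe(
    prompt="Thunderstorm with heavy rain and distant rolling thunder",
    negative_prompt="low quality, distorted",
    audio_end_in_s=10.0,
    num_inference_steps=100,
).audios[0]  # first clip in the returned batch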
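The snippet above stops at the generated audio and does not save it. One way to write the result to disk, sketched here with only the Python standard library, is to convert the samples to 16-bit PCM and use the `wave` module. The helper `write_stereo_wav` is hypothetical, and the sketch assumes the clip is available as two equal-length sequences of floats in [-1, 1] (left and right channel); an audio library such as soundfile is the more common choice in practice.

```python
import struct
import wave


def write_stereo_wav(path, left, right, sample_rate=44_100):
    """Write two equal-length float sequences in [-1, 1] as 16-bit stereo PCM."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    frames = bytearray()
    for l_sample, r_sample in zip(left, right):
        for sample in (l_sample, r_sample):  # interleave L, R per frame
            clipped = max(-1.0, min(1.0, float(sample)))
            frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Assumed usage with the pipeline output, if `audio` is a tensor of shape
# (2, num_samples):
#   left, right = audio.float().cpu().tolist()
#   write_stereo_wav("thunderstorm.wav", left, right)
```

The pure-Python loop is slow for full-length clips, so this is mainly useful where adding another dependency is undesirable.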

Using stable-audio-tools

Python
from stable_audio_tools import get_pretrained_model

# Returns the loaded model together with its configuration dictionary.
model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
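Loading the model is only the first step; generation with stable-audio-tools also needs a conditioning list that describes the prompt and timing. The sketch below builds such a list and checks it against the 47-second limit. The helper `make_conditioning` is hypothetical, and the key names (`prompt`, `seconds_start`, `seconds_total`) are an assumption based on the upstream Stable Audio Open 1.0 example, so verify them against the stable-audio-tools version you install.

```python
MAX_SECONDS = 47  # maximum clip length stated for this model


def make_conditioning(prompt, seconds_total, seconds_start=0):
    """Build a one-item conditioning list for a single text prompt."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0 < seconds_total <= MAX_SECONDS:
        raise ValueError(f"seconds_total must be in (0, {MAX_SECONDS}]")
    return [{
        "prompt": prompt,
        "seconds_start": seconds_start,
        "seconds_total": seconds_total,
    }]


conditioning = make_conditioning(
    "Thunderstorm with heavy rain and distant rolling thunder", 10
)

# Assumed follow-up, mirroring the upstream example (not run here):
#   from stable_audio_tools.inference.generation import generate_diffusion_cond
#   output = generate_diffusion_cond(
#       model, steps=100, cfg_scale=7, conditioning=conditioning,
#       sample_size=model_config["sample_size"], device="cuda",
#   )
```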

License

Stability AI Community License — see LICENSE.md for full terms.

Key points:

  • Free for research and non-commercial use
  • Commercial use is allowed for organizations with annual revenue under $1M; above that, a separate license from Stability AI is required
  • Model outputs cannot be used to train competing models

Credits

  • Model: Stability AI
  • Paper: Stable Audio Open
  • Training Data: FreeSound + Free Music Archive (see attribution CSVs)
  • Mirror by: AEmotionStudio

Tags

diffusers, safetensors, audio, text-to-audio, sound-effects, ambient, diffusion, stable-audio, maestraea