by ACE-Step
ACE-Step 1.5 XL Base is a model for audio generation and editing, designed to produce music from text or other inputs. At roughly 4 billion parameters, it targets higher audio quality than the lighter ACE-Step 1.5 variants. It supports several tasks, including creating music from text descriptions, remixing existing tracks, modifying audio tracks, and extracting specific sound elements. The model is trained on legally compliant data, with the aim of making commercial use of generated music safe. Its combination of quality, versatility, and copyright compliance makes it suitable for both professionals and independent creators.
This is the XL (4B) Base variant of ACE-Step 1.5 — a larger DiT decoder with ~4B parameters for higher audio quality. It is the foundation model supporting all tasks: text-to-music, cover, repaint, extract, lego, and complete.
| Parameter | Value |
|---|---|
| DiT Decoder hidden_size | 2560 |
| DiT Decoder layers | 32 |
| DiT Decoder attention heads | 32 |
| Encoder hidden_size | 2048 |
| Encoder layers | 8 |
| Total params | ~4B |
| Weights size (bf16) | ~18.8 GB |
| Inference steps | 50 (with CFG) |
| VRAM | Support |
|---|---|
| ≥12 GB | With CPU offload + INT8 quantization |
| ≥16 GB | With CPU offload |
| ≥20 GB | Without offload |
| ≥24 GB | Full quality (XL + 4B LM) |
All LM models (0.6B / 1.7B / 4B) are fully compatible with XL.
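The VRAM table above can be read as a threshold lookup. The following is a minimal sketch of that reading, not part of the ACE-Step codebase: `VRAM_TIERS` and `recommend_setup` are hypothetical names, and the thresholds and labels are copied from the table.

```python
# Hypothetical helper (not an ACE-Step API): maps available GPU memory to the
# setup listed in the VRAM table. Thresholds and labels are copied from it.
VRAM_TIERS = [
    (24, "Full quality (XL + 4B LM)"),
    (20, "Without offload"),
    (16, "With CPU offload"),
    (12, "With CPU offload + INT8 quantization"),
]


def recommend_setup(vram_gb: float) -> str:
    """Return the label of the highest tier whose VRAM threshold is met."""
    for min_gb, setup in VRAM_TIERS:  # ordered from largest to smallest
        if vram_gb >= min_gb:
            return setup
    return "Below the listed minimum (12 GB)"


if __name__ == "__main__":
    for gb in (8, 12, 16, 22, 24):
        print(f"{gb} GB -> {recommend_setup(gb)}")
```

The tiers are checked from largest to smallest so that a 24 GB card gets the full-quality setup rather than the first threshold it happens to satisfy.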
```bash
# Install ACE-Step
git clone https://github.com/ace-step/ACE-Step-1.5.git
cd ACE-Step-1.5
pip install -e .

# Download this model
huggingface-cli download ACE-Step/acestep-v15-xl-base --local-dir ./checkpoints/acestep-v15-xl-base

# Run with Gradio UI
python acestep --config-path acestep-v15-xl-base
```
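The download step can also be done from Python. The sketch below assumes the `huggingface_hub` package is installed (it is the package that provides `huggingface-cli`) and uses its `snapshot_download` function. The helper names `local_checkpoint_dir` and `download_checkpoint` are hypothetical and are not part of ACE-Step.

```python
from pathlib import Path

REPO_ID = "ACE-Step/acestep-v15-xl-base"  # repo id from the CLI command above
CHECKPOINT_ROOT = Path("./checkpoints")   # same root directory as the CLI command


def local_checkpoint_dir(repo_id: str = REPO_ID, root: Path = CHECKPOINT_ROOT) -> Path:
    """Mirror the CLI's --local-dir layout: <root>/<model name>."""
    return root / repo_id.split("/")[-1]


def download_checkpoint(repo_id: str = REPO_ID) -> Path:
    """Download the repo snapshot. Needs network access and huggingface_hub."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    target = local_checkpoint_dir(repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_checkpoint()` fetches the full weights, so it is left out of any automatic entry point here; `local_checkpoint_dir()` only computes the target path.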

| DiT Model | CFG | Steps | Quality | Diversity | Tasks | Hugging Face | ModelScope |
|---|---|---|---|---|---|---|---|
| acestep-v15-xl-base | ✅ | 50 | High | High | All (extract, lego, complete) | This repo | Link |
| acestep-v15-xl-sft | ✅ | 50 | Very High | Medium | Standard | Link | Link |
| acestep-v15-xl-turbo | ❌ | 8 | Very High | Medium | Standard | Link | Link |
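One way to use the XL table above is to filter variants by task and step budget. The sketch below is illustrative only: `Variant`, `XL_VARIANTS`, and `candidates` are hypothetical names, and it assumes that "Standard" in the Tasks column excludes extract, lego, and complete, which the table lists only for the base model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    uses_cfg: bool
    steps: int
    all_tasks: bool  # True only for the base model ("All" in the Tasks column)


# Values copied from the XL table; the structure itself is hypothetical.
XL_VARIANTS = [
    Variant("acestep-v15-xl-base", uses_cfg=True, steps=50, all_tasks=True),
    Variant("acestep-v15-xl-sft", uses_cfg=True, steps=50, all_tasks=False),
    Variant("acestep-v15-xl-turbo", uses_cfg=False, steps=8, all_tasks=False),
]

# Assumption: these tasks are available only on the base model.
BASE_ONLY_TASKS = {"extract", "lego", "complete"}


def candidates(task: str, max_steps: int = 50) -> list[str]:
    """Names of XL variants that cover `task` within the given step budget."""
    needs_base = task in BASE_ONLY_TASKS
    return [
        v.name
        for v in XL_VARIANTS
        if v.steps <= max_steps and (v.all_tasks or not needs_base)
    ]


if __name__ == "__main__":
    print(candidates("extract"))
    print(candidates("text-to-music", max_steps=8))
```

Under these assumptions, a base-only task with an 8-step budget yields no candidates, since the base model runs 50 steps.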

| DiT Model | CFG | Steps | Hugging Face | ModelScope |
|---|---|---|---|---|
| acestep-v15-turbo (default) | ❌ | 8 | Link | Link |
| acestep-v15-sft | ✅ | 50 | Link | Link |
| acestep-v15-base | ✅ | 50 | Link | Link |

| LM Model | Params | Audio Understanding | Composition | Hugging Face | ModelScope |
|---|---|---|---|---|---|
| acestep-5Hz-lm-0.6B | 0.6B | Medium | Medium | Link | Link |
| acestep-5Hz-lm-1.7B | 1.7B | Medium | Medium | Included in main | Included in main |
| acestep-5Hz-lm-4B | 4B | Strong | Strong | Link | Link |
This project is co-led by ACE Studio and StepFun.
```bibtex
@misc{gong2026acestep,
  title={ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation},
  author={Junmin Gong and Yulin Song and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
  howpublished={\url{https://github.com/ace-step/ACE-Step-1.5}},
  year={2026},
  note={GitHub repository}
}
```