par ACE-Step
Open source · 2k downloads · 46 likes
L’ACE-Step Captioner est un modèle d’IA spécialisé dans la génération de descriptions détaillées et structurées de contenus musicaux. Il excelle dans l’analyse des styles, des instruments, des structures et des caractéristiques sonores, offrant une précision supérieure à des solutions comme Gemini Pro 2.5. Grâce à son vocabulaire riche et sa capacité à identifier plus de 1000 instruments et termes descriptifs, il produit des annotations professionnelles adaptées à divers besoins. Ce modèle est particulièrement utile pour l’entraînement de systèmes d’IA musicale, la création de métadonnées pour des bases de données audio ou encore l’éducation musicale. Son approche holistique en fait un outil polyvalent pour documenter, analyser et catégoriser la musique avec une grande finesse.
ACE-Step Captioner is the annotation model used by ACE-Step v1.5 for training data labeling. It is a professional-grade music captioning model that generates detailed, structured descriptions of audio content.
🏆 Accuracy surpasses Gemini Pro 2.5 in music description tasks
The usage is the same as Qwen2.5 Omni-7B.
Use the following prompt to caption audio:
*Task* Describe this audio in detail
<audio>
The model generates natural language descriptions covering multiple aspects of the music.
A melancholic indie folk track featuring fingerpicked acoustic guitar
as the primary instrument. The song opens with a sparse, contemplative
intro before the vocals enter with a breathy, intimate delivery.
The arrangement gradually builds through the verse, adding subtle
string pads and a gentle kick drum. The chorus lifts with layered
harmonies and a warmer, fuller texture. The bridge introduces a
key change and emotional climax before returning to the stripped-down
acoustic arrangement for the outro.
| Category | Styles |
|---|---|
| Electronic | Ambient, Techno, House, Drum & Bass, Synthwave, IDM, Downtempo |
| Rock | Alternative, Indie, Post-Rock, Progressive, Psychedelic, Grunge |
| Pop | Synth-pop, Electropop, Dream Pop, Art Pop, Indie Pop |
| Classical | Orchestral, Chamber, Minimalist, Neo-Classical, Cinematic |
| World | Latin, African, Middle Eastern, Asian Traditional, Celtic |
| Jazz | Fusion, Smooth, Bebop, Modal, Free Jazz |
| Hip-Hop | Trap, Boom Bap, Lo-fi, Instrumental, Cloud Rap |
| Category | Examples |
|---|---|
| Strings | Acoustic Guitar, Electric Guitar, Violin, Cello, Bass, Harp, Mandolin |
| Keys | Piano, Synthesizer, Organ, Rhodes, Wurlitzer, Mellotron |
| Percussion | Drums, Electronic Drums, Congas, Bongos, Timpani, Vibraphone |
| Wind | Saxophone, Trumpet, Flute, Clarinet, Oboe, French Horn |
| Electronic | Synth Bass, Pad, Lead, Arpeggiator, Sampler, 808, 303 |
| Dimension | Descriptors |
|---|---|
| Texture | Warm, Bright, Dark, Crisp, Muddy, Clean, Distorted, Saturated |
| Space | Reverberant, Dry, Spacious, Intimate, Cavernous, Tight |
| Dynamics | Punchy, Soft, Aggressive, Gentle, Compressed, Dynamic |
| Character | Ethereal, Gritty, Smooth, Raw, Polished, Organic, Synthetic |