musicgen songstarter v0.2

by nateraw

Open source · 727 downloads · 170 likes

Rating: 2.8 (170 reviews) · Category: Audio · Access: API & Local

About

MusicGen SongStarter v0.2 is an AI model specialized in generating melodies and musical ideas for producers. It produces stereo audio at 32 kHz from text prompts that describe styles, instruments, or moods, giving producers creative starting points for new tracks. Compared with the previous version, it was trained on a dataset three times larger and more carefully curated, and it uses a transformer twice the size for richer, more nuanced results. It suits musicians who want to explore musical concepts quickly or get past a creative block, and it is positioned as a tool for inspiration rather than final production. The training data consists of manually selected samples, which is intended to keep the output stylistically consistent.

Documentation

Model Card for musicgen-songstarter-v0.2


musicgen-songstarter-v0.2 is a fine-tune of musicgen-stereo-melody-large, trained on a dataset of melody loops from my Splice sample library. It's intended to generate song ideas that are useful for music producers. It generates stereo audio at 32 kHz.

👀 Update: I wrote a blog post detailing how and why I trained this model, including training details, the dataset, Weights and Biases logs, etc.

Compared to musicgen-songstarter-v0.1, this new version:

  • Was trained on 3x more unique, manually curated samples that I painstakingly purchased on Splice
  • Is twice the size, bumped up from a medium to a large transformer LM

If you find this model interesting, please consider:

  • following me on GitHub
  • following me on Twitter

Usage

Install audiocraft:

Shell
pip install -U git+https://github.com/facebookresearch/audiocraft#egg=audiocraft

Then, you should be able to load this model just like any other musicgen checkpoint here on the Hub:

Python
import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('nateraw/musicgen-songstarter-v0.2')
model.set_generation_params(duration=8)  # generate 8 seconds of audio per sample

# Each call below reassigns `wav`; only the last result is saved at the end.
wav = model.generate_unconditional(4)    # 4 samples with no text conditioning
descriptions = ['acoustic, guitar, melody, trap, d minor, 90 bpm'] * 3
wav = model.generate(descriptions)       # 3 samples conditioned on the text prompts

# Load a reference melody; replace the path with your own audio file.
melody, sr = torchaudio.load('./assets/bach.mp3')
# Generate 3 samples that follow the melody of the reference audio and the text descriptions.
wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)

for idx, one_wav in enumerate(wav):
    # Saves to {idx}.wav with loudness normalization at -14 dB LUFS.
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)
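
As a rough sanity check on what the snippet above returns, the sketch below estimates the shape of the generated tensor. It assumes audiocraft's usual `[batch, channels, samples]` layout, two channels for this stereo checkpoint, and a sample count equal to duration times sample rate. The helper name `expected_output_shape` is hypothetical and not part of audiocraft.

```python
def expected_output_shape(batch_size, duration_s, sample_rate=32_000, channels=2):
    """Estimate the (batch, channels, samples) shape of a generated batch.

    Assumption: samples = duration * sample_rate, with 32 kHz stereo output
    as stated in the model card. This is an estimate, not an audiocraft API.
    """
    num_samples = int(round(duration_s * sample_rate))
    return (batch_size, channels, num_samples)


# Three 8-second prompts, as in the usage example.
shape = expected_output_shape(batch_size=3, duration_s=8)
print(shape)
```

If the real tensor shape differs from this estimate, `wav.shape` and `model.sample_rate` are the values to trust.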

Prompt Format

Use the following prompt format:

Text
{tag_1}, {tag_2}, ..., {tag_n}, {key}, {bpm} bpm

For example:

Text
hip hop, soul, piano, chords, jazz, neo jazz, G# minor, 140 bpm

For some example tags, see the prompt format section of musicgen-songstarter-v0.1's README. The tags there are for the smaller v0.1 dataset, but should give you an idea of what the model saw.
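
A prompt in this format can be assembled programmatically. The sketch below is one minimal way to do it; the function name `build_prompt` and its validation rules are illustrative assumptions, not part of the model or of audiocraft.

```python
def build_prompt(tags, key, bpm):
    """Join tags, key, and tempo as '{tag_1}, ..., {tag_n}, {key}, {bpm} bpm'."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty tag is required")
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return ", ".join(cleaned + [key.strip(), f"{int(bpm)} bpm"])


prompt = build_prompt(
    ["hip hop", "soul", "piano", "chords", "jazz", "neo jazz"],
    key="G# minor",
    bpm=140,
)
print(prompt)  # hip hop, soul, piano, chords, jazz, neo jazz, G# minor, 140 bpm
```

The resulting string can be placed in the `descriptions` list passed to `model.generate`.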

Samples

Audio Prompt | Text Prompt | Output
[audio] | trap, synthesizer, songstarters, dark, G# minor, 140 bpm | [audio]
[audio] | acoustic, guitar, melody, trap, D minor, 90 bpm | [audio]

Training Details

For more details, see the blog post.

  • code:
    • Repo is here. It's an undocumented fork of facebookresearch/audiocraft where I rewrote the training loop with PyTorch Lightning, which worked a bit better for me.
  • data:
    • around 1700-1800 samples that I manually listened to and purchased via my personal Splice account, about 7-8 hours of audio in total.
    • Given the licensing terms, I cannot share the data.
  • hardware:
    • 8xA100 40GB instance from Lambda Labs
  • procedure:
    • trained for 10k steps, which took about 6 hours
    • reduced segment duration at train time to 15 seconds
  • hparams/logs:
    • See the wandb run, which includes training metrics, logs, hardware metrics at train time, hyperparameters, and the exact command I used when I ran the training script.
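
The figures above allow a few back-of-the-envelope estimates. The sketch below uses only the approximate numbers quoted in this section (10k steps, about 6 hours, 8 GPUs, 1700-1800 samples, 7-8 hours of audio), so the results are rough estimates rather than measured values.

```python
# Approximate figures quoted in the training details.
steps = 10_000
train_hours = 6
num_gpus = 8

# Average wall-clock time per training step, in seconds.
seconds_per_step = train_hours * 3600 / steps

# Total GPU time across the 8xA100 instance.
gpu_hours = num_gpus * train_hours

# Average sample length implied by the dataset bounds, in seconds.
shortest_avg = 7 * 3600 / 1800   # 7 hours spread over 1800 samples
longest_avg = 8 * 3600 / 1700    # 8 hours spread over 1700 samples

print(f"{seconds_per_step:.2f} s/step, {gpu_hours} GPU-hours")
print(f"average sample length: {shortest_avg:.1f}-{longest_avg:.1f} s")
```

Under these assumptions, the implied average sample length of roughly 14 to 17 seconds is close to the 15-second segment duration used at train time.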

Acknowledgements

This work would not have been possible without:

  • Lambda Labs, for subsidizing larger training runs by providing some compute credits
  • Replicate, for early development compute resources

Thank you ❤️
