Name: MusicGen
Author: Meta

Meta's AI music generation model, available in 300M, 1.5B, and 3.3B parameter sizes. It is a single-stage auto-regressive Transformer trained over a 32 kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Training data is 20K hours of licensed music: an internal dataset of 10K high-quality tracks plus the ShutterStock and Pond5 collections (390K instrument-only tracks). Efficient token interleaving lets a single model predict all 4 codebooks together, so no cascade of models is needed and generation takes only about 50 auto-regressive steps per second of audio. Supports both text-to-music and melody-guided generation. Evaluated on the MusicCaps benchmark, where the authors report better results than the baselines they compare against. Trained between April and May 2023 and released in June 2023. Part of the AudioCraft toolkit alongside AudioGen and EnCodec.
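The step count above can be checked with a little arithmetic. The sketch below is illustrative only and does not use the AudioCraft code: it assumes the interleaving is a delay-style pattern in which codebook k is shifted right by k steps, so that one decoding step emits one token per codebook. The names `delay_interleave`, `delay_deinterleave`, `decoding_steps`, and the `PAD` placeholder are hypothetical and chosen for this example; only the 50 Hz frame rate and the 4 codebooks come from the description.

```python
FRAME_RATE_HZ = 50   # EnCodec frames per second (from the description)
NUM_CODEBOOKS = 4    # codebooks per frame (from the description)
PAD = -1             # hypothetical placeholder for "no token at this position yet"


def delay_interleave(codes):
    """Shift codebook k right by k steps (assumed delay-style pattern).

    codes: list of K token streams, each of length T.
    Returns T + K - 1 decoding steps, each holding K entries.
    """
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    num_steps = num_frames + num_codebooks - 1
    steps = [[PAD] * num_codebooks for _ in range(num_steps)]
    for k, stream in enumerate(codes):
        for t, token in enumerate(stream):
            steps[t + k][k] = token
    return steps


def delay_deinterleave(steps, num_frames):
    """Undo delay_interleave and recover the K original token streams."""
    num_codebooks = len(steps[0])
    return [
        [steps[t + k][k] for t in range(num_frames)]
        for k in range(num_codebooks)
    ]


def decoding_steps(seconds, pattern):
    """Count auto-regressive steps needed for `seconds` of audio."""
    frames = int(seconds * FRAME_RATE_HZ)
    if pattern == "flatten":   # one token per step: every codebook in sequence
        return frames * NUM_CODEBOOKS
    if pattern == "delay":     # one step per frame, plus K - 1 steps of offset
        return frames + NUM_CODEBOOKS - 1
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    print(decoding_steps(1, "flatten"))  # 200
    print(decoding_steps(1, "delay"))    # 53
```

Under these assumptions, flattening all codebooks into one sequence would cost 200 steps per second of audio, while the delayed layout costs 53 steps for the first second and 50 for each additional second, which matches the roughly 50 steps per second quoted above.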
