About ACE-Step
ACE-Step is an open-source foundation model for AI music generation, developed jointly by ACE Studio and StepFun. The project combines the two teams' expertise to push the boundaries of creative AI and support a new era of music creation.
Our Vision: The Dawn of a Music AI Foundation
Our goal is to establish a foundation model for music AI and to bring about a "Stable Diffusion moment" for music, comparable to what that model did for image generation. ACE-Step is designed to be fast, general-purpose, efficient, and flexible, serving as a robust base for further innovation. Its architecture makes it easy to train additional features and sub-tasks on top of the core model, empowering creators and fostering rapid progress in the AI music landscape.
Our Approach: Innovative Integrated Architecture
ACE-Step's architecture integrates several key AI technologies to overcome the traditional trade-offs between generation speed, musical coherence, and controllability.
- Integrated Technologies: ACE-Step combines diffusion-based generation with Sana’s Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer. This innovative blend allows ACE-Step to maintain fine acoustic details while ensuring long-range structural coherence.
- Semantic Alignment: During training, we utilize MERT and m-hubert for semantic representation alignment (REPA), facilitating rapid convergence and improving the alignment between text and music.
What Sets Us Apart: Speed, Quality, and Control
- Unprecedented Speed: ACE-Step can synthesize up to 4 minutes of music in just 20 seconds on an A100 GPU, which is roughly 12x faster than real-time playback and about 15x faster than LLM-based models. It also performs efficiently on high-end consumer GPUs like the RTX 4090 and 3090. Recent memory optimizations have reduced VRAM requirements, potentially down to around 8 GB in some configurations, making local use more accessible.
- Superior Musical Coherence: The model stays coherent across melody, harmony, and rhythm, producing natural-sounding music while preserving fine-grained acoustic details.
- Advanced Controllability: ACE-Step offers remarkable control features:
  - Variations Generation: Create slightly different versions by adjusting noise mixing ratios during inference.
  - Repainting: Regenerate specific sections of audio by adding noise and applying mask constraints.
  - Lyric Editing: Modify lyrics while aiming to preserve the original melody and rhythm using flow-edit technology.
- Diverse Applications: Supports fine-tuned models like Lyric2Vocal (generating vocals from lyrics) and Text2Samples (generating instrumental samples). Future capabilities on the roadmap include RapMachine, StemGen, and Singing2Accompaniment (creating instrumental backing for vocals).
- Multilingual Support: ACE-Step supports 19 languages, making it usable by creators worldwide. The 10 best-performing languages are English, Chinese, Russian, Spanish, Japanese, German, French, Portuguese, Italian, and Korean.
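The variation and repainting controls described above can be pictured with a small sketch. It is illustrative only and is not ACE-Step's actual code: the function names (`mix_noise`, `time_mask`, `repaint_blend`), the `variance` parameter, and the square-root weighting are assumptions chosen for clarity, and plain Python lists stand in for latent tensors.

```python
import math
import random


def mix_noise(base_noise, fresh_noise, variance):
    """Blend two noise vectors (hypothetical helper, not an ACE-Step API).

    variance=0.0 returns base_noise unchanged; variance=1.0 returns fresh_noise.
    """
    if not 0.0 <= variance <= 1.0:
        raise ValueError("variance must be between 0 and 1")
    # Assumption: square-root weights, so the mix of two independent
    # unit-variance noise vectors still has unit variance.
    keep = math.sqrt(1.0 - variance)
    add = math.sqrt(variance)
    return [keep * b + add * f for b, f in zip(base_noise, fresh_noise)]


def time_mask(num_frames, start, end):
    """Return 1 for frames in [start, end) that should be regenerated, else 0."""
    return [1 if start <= i < end else 0 for i in range(num_frames)]


def repaint_blend(original, regenerated, mask):
    """Take regenerated frames where mask is 1 and keep original frames where it is 0."""
    return [m * r + (1 - m) * o for o, r, m in zip(original, regenerated, mask)]


# Toy data: 8 "frames" of one-dimensional latent values.
rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(8)]
fresh = [rng.gauss(0.0, 1.0) for _ in range(8)]

# Variation: start from noise that is mostly the original, plus a little new noise.
variation_noise = mix_noise(base, fresh, variance=0.2)

# Repainting: only frames 2, 3, and 4 are replaced by regenerated content.
mask = time_mask(len(base), start=2, end=5)
repainted = repaint_blend(base, fresh, mask)
```

A small `variance` keeps the new starting noise close to the original, which is one way to obtain a slightly different version of the same piece, while the mask limits regeneration to a chosen time span and leaves the rest untouched. In a real diffusion pipeline the mask constraint would typically be applied during the denoising steps rather than once at the end, as it is here.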
Open Source & Empowering Creators
ACE-Step is an open-source project, with code and models available on GitHub and Hugging Face. Our commitment to openness fosters a vibrant community and accelerates innovation. We encourage responsible use, including verifying originality, disclosing AI involvement, and respecting copyrights and cultural elements. ACE-Step is designed to support positive and artistic use cases across creative production, education, and entertainment.
For Artists, Producers, and Developers
ACE-Step is designed as a flexible tool to integrate into the creative workflows of music artists, producers, and content creators. Whether you're generating quick demos, creating instrumental loops, editing lyrics on existing songs, or experimenting with new sounds, ACE-Step provides the speed, quality, and control to bring your musical ideas to life.
Get Started
- Explore the Code: Find the ACE-Step code on GitHub and the model files on Hugging Face.
- Install Locally: Follow the instructions in the repositories to install and run ACE-Step on your own hardware.
- Try the Demo: Experience ACE-Step in action with the Hugging Face demo.
Note: ACE-Step is an open-source project by ACE Studio and StepFun. For the latest updates, documentation, and community discussions, please visit the official GitHub and Hugging Face pages.