What is ACE-Step?
ACE-Step is an open-source foundation model designed for AI music generation, developed collaboratively by ACE Studio and StepFun. Its architecture integrates diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer, enabling ACE-Step to overcome the limitations of existing AI music models. This combination achieves state-of-the-art performance in generation speed, musical coherence, and controllability.
Generates music up to 15× faster than LLM-based models (4 minutes in 20 seconds on A100 GPU)
Supports text-to-music, audio editing (including lyrics), remixing, repainting, and audio extension
Supports 19 languages and can be run locally

ACE-Step Key Features
ACE-Step is an open-source foundation model for AI music generation whose architecture integrates diffusion, a Deep Compression AutoEncoder (DCAE), and a lightweight linear transformer.
Open-Source Foundation Model
ACE-Step is a novel, open-source foundation model for music generation, licensed under Apache 2.0. Its design enables easy training and development of new features and sub-tasks.
Innovative Architecture
Integrates diffusion-based generation, Deep Compression AutoEncoder (DCAE), and a lightweight linear transformer. Uses MERT and m-hubert for semantic alignment, enabling rapid convergence and consistent text-music output.
Unprecedented Speed and Efficiency
Generates up to 4 minutes of music in just 20 seconds on an A100 GPU—about 15x faster than LLM-based approaches. Also optimized for high-end consumer GPUs.
Superior Musical Coherence and Quality
Produces music with natural melody, harmony, and rhythm, preserving fine acoustic details. Supports all mainstream music styles and various description formats.
Multilingual Support
Supports music generation in 19 languages, with top performance in English, Chinese, Russian, Spanish, Japanese, German, French, Portuguese, Italian, and Korean.
Text-to-Music Generation
Generates original music directly from text descriptions, including tags, genres, scene descriptions, and lyrics.
Audio Editing Capabilities
Edit existing audio by modifying lyrics (with 'only_lyrics' mode), remixing with new tags or lyrics, and extending audio files.
Repainting and Variations Generation
Regenerate specific sections of audio (repainting) or create variations by adjusting noise mixing or using different seeds.
Foundation for Applications
ACE-Step is designed as a flexible foundation for music AI workflows, enabling downstream applications like Lyric2Vocal, Text2Samples, RapMachine, StemGen, and Singing2Accompaniment.
How ACE-Step Works
ACE-Step is an open-source foundation model whose architecture integrates diffusion, a Deep Compression AutoEncoder (DCAE), and a lightweight linear transformer. It targets fast generation, musical coherence, and fine-grained control, and is designed so that artists, producers, and developers can build flexible AI music generation and editing applications on top of it.
Core Model & Purpose
ACE-Step is an open-source foundation model for music generation, collaboratively developed by ACE Studio and StepFun. It is designed to overcome the speed, coherence, and controllability limitations of existing models.
Innovative Architecture
ACE-Step integrates diffusion-based generation, Sana’s Deep Compression AutoEncoder (DCAE), and a lightweight linear transformer. This holistic design bridges the gap between slow LLM-based models and diffusion models that lack long-range coherence, preserving fine acoustic details.
Training & Semantic Alignment
During training, ACE-Step leverages MERT and m-hubert to align semantic representations (REPA), enabling rapid convergence and improved consistency between text and music outputs.
Generation & Multilingual Control
The model generates music up to 4 minutes long from user inputs such as tags, genres, scene descriptions, and lyrics. Inference parameters like guidance scale help direct the model, and ACE-Step supports generation in 19 languages.
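As a rough illustration of what a guidance scale does, the sketch below applies standard classifier-free guidance to a pair of predictions. The text above does not say which guidance formulation ACE-Step uses, so this formula is an assumption, and the function name `apply_guidance` is hypothetical.

```python
from typing import List


def apply_guidance(cond: List[float], uncond: List[float], guidance_scale: float) -> List[float]:
    """Classifier-free guidance (assumed formulation, not confirmed for ACE-Step).

    Moves the unconditional prediction toward the prompt-conditioned one,
    scaled by guidance_scale. A scale of 1.0 returns the conditional
    prediction unchanged; larger values follow the prompt more strongly.
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    conditional = [0.5, -0.2, 1.0]    # prediction given tags and lyrics
    unconditional = [0.1, 0.0, 0.8]   # prediction with the prompt dropped
    print(apply_guidance(conditional, unconditional, 1.0))
    print(apply_guidance(conditional, unconditional, 3.0))
```

Higher scales generally trade diversity for closer adherence to the prompt, which is why the value is exposed as an inference parameter.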
Advanced Editing & Control
ACE-Step enables advanced capabilities: Lyric Editing (flow-edit for localized lyric changes), Repainting (regenerate specific audio sections with noise and masks), and Variations (adjust noise mixing for diverse outputs).
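The mask idea behind repainting can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual pipeline: the latent is treated as a flat list of frames, and `build_repaint_mask` and `apply_mask_constraint` are hypothetical helper names.

```python
from typing import List


def build_repaint_mask(n_frames: int, total_seconds: float, start: float, end: float) -> List[int]:
    """Return 1 for frames whose center falls inside [start, end), else 0."""
    if not 0 <= start < end <= total_seconds:
        raise ValueError("need 0 <= start < end <= total_seconds")
    mask = []
    for i in range(n_frames):
        center = (i + 0.5) * total_seconds / n_frames  # frame center time in seconds
        mask.append(1 if start <= center < end else 0)
    return mask


def apply_mask_constraint(candidate: List[float], source: List[float], mask: List[int]) -> List[float]:
    """Keep newly generated frames where mask is 1, original frames elsewhere."""
    return [c if m else s for c, s, m in zip(candidate, source, mask)]


if __name__ == "__main__":
    mask = build_repaint_mask(n_frames=4, total_seconds=8.0, start=2.0, end=6.0)
    print(mask)
    print(apply_mask_constraint([9.0, 9.0, 9.0, 9.0], [1.0, 2.0, 3.0, 4.0], mask))
```

In a real sampler this constraint would presumably be applied at every solver step, with the preserved region re-noised to the current noise level; the sketch shows a single application only.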
Flexible Foundation for Expansion
ACE-Step is built as a flexible foundation model, facilitating the development of new features and sub-tasks on top of its core architecture.


ACE-Step: Key Technical Highlights
Innovative Integrated Architecture
ACE-Step utilizes a holistic architectural design that integrates diffusion-based generation with Sana’s Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer. This combination helps maintain fine acoustic details and ensures long-range structural coherence, bridging the gap between speed and quality.
Semantic Representation Alignment (REPA)
During training, ACE-Step employs MERT and m-hubert to align semantic representations. This alignment process enables rapid convergence and improves the text-music alignment and overall output consistency.
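One common way to express representation alignment is a cosine-similarity loss between the model's intermediate features and features from a frozen pretrained encoder. The sketch below assumes that form; the text above does not specify the exact loss ACE-Step uses, and the names `cosine_similarity` and `alignment_loss` are hypothetical.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def alignment_loss(model_frames: List[List[float]], encoder_frames: List[List[float]]) -> float:
    """Mean (1 - cosine similarity) over time frames.

    model_frames: projected hidden states from the generator (assumed).
    encoder_frames: target features from a frozen encoder such as MERT (assumed).
    The loss is 0 when every frame pair points in the same direction.
    """
    if len(model_frames) != len(encoder_frames) or not model_frames:
        raise ValueError("need the same non-zero number of frames")
    total = sum(1.0 - cosine_similarity(m, e) for m, e in zip(model_frames, encoder_frames))
    return total / len(model_frames)


if __name__ == "__main__":
    print(alignment_loss([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [1.0, 0.0]]))
```

Adding a term like this to the training objective pulls the generator's internal features toward semantically meaningful ones, which is the mechanism the text credits for faster convergence.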
High Speed and Efficiency
ACE-Step can synthesize up to 4 minutes of music in about 20 seconds on an A100 GPU, roughly 15x faster than LLM-based models. Speed is reported as a Real-Time Factor (RTF), where higher values mean faster generation. It also performs efficiently on high-end consumer GPUs such as the RTX 4090 (34.48x RTF) and RTX 3090 (12.76x RTF).
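To make the RTF figures concrete, the sketch below assumes RTF means seconds of audio produced per second of compute, which matches higher values being faster. Under that assumption, 4 minutes in 20 seconds corresponds to 12x; the 15x figure is a separate comparison against LLM-based models. The function names are hypothetical.

```python
def real_time_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of compute (assumed RTF definition)."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


def generation_time(audio_seconds: float, rtf: float) -> float:
    """Estimated wall-clock seconds needed to generate audio at a given RTF."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return audio_seconds / rtf


if __name__ == "__main__":
    print(real_time_factor(240, 20))                    # 4 minutes generated in 20 seconds
    for gpu, rtf in [("RTX 4090", 34.48), ("RTX 3090", 12.76)]:
        print(gpu, round(generation_time(60, rtf), 2))  # seconds per minute of music
```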
Superior Musical Coherence
The architecture and training process are engineered to deliver superior coherence across melody, harmony, and rhythm. It preserves fine-grained acoustic details, leading to natural-sounding music.
Advanced Controllability Mechanisms
- Variations Generation: An inference-time technique in which the initial noise used by the flow-matching model is mixed with additional Gaussian noise via the trigFlow noise formula. Adjusting the mixing ratio controls the degree of variation.
- Repainting: Regenerate specific audio sections by adding noise and applying mask constraints during the ODE process, enabling localized edits or variations within a section.
- Lyric Editing: Utilizes flow-edit technology for localized lyric modifications, preserving the original melody, vocal timbre, and accompaniment.
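The noise mixing behind variations can be sketched as a trigonometric blend of the original initial noise with fresh Gaussian noise. The mapping from a 0 to 1 variance value to an angle is an assumption made for illustration, and `mix_noise` and `gaussian_noise` are hypothetical names.

```python
import math
import random
from typing import List


def gaussian_noise(n: int, seed: int) -> List[float]:
    """Draw n standard-normal samples from a seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mix_noise(initial: List[float], fresh: List[float], variance: float) -> List[float]:
    """Blend two noise vectors with cos/sin weights.

    variance = 0 keeps the original noise (same output as before);
    variance = 1 replaces it with the fresh noise (an unrelated take).
    Because cos^2 + sin^2 = 1, independent unit-variance inputs give a
    unit-variance result, so the sampler sees noise on the usual scale.
    """
    if not 0.0 <= variance <= 1.0:
        raise ValueError("variance must be in [0, 1]")
    angle = variance * math.pi / 2  # assumed mapping from slider value to angle
    return [math.cos(angle) * a + math.sin(angle) * b for a, b in zip(initial, fresh)]


if __name__ == "__main__":
    original = gaussian_noise(4, seed=1)
    new = gaussian_noise(4, seed=2)
    print(mix_noise(original, new, 0.2))  # mostly the original noise, slightly perturbed
```

Small variance values yield takes that stay close to the original generation, while values near 1 behave like sampling with a new seed.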
Multilingual Support
Trained to support 19 languages, with the top 10 (English, Chinese, Russian, Spanish, Japanese, German, French, Portuguese, Italian, and Korean) performing best. Performance may vary for less common languages due to data imbalance.
Foundation Model Design
ACE-Step is architected as a flexible foundation to facilitate the training of additional features and sub-tasks, such as LoRA models for Lyric2Vocal and Text2Samples, and ControlNet models like Singing2Accompaniment. This promotes the development of integrated music AI workflows.
Memory Optimization
A recent memory optimization update significantly reduced the maximum VRAM requirement (to around 8GB in some modes), making ACE-Step more compatible with consumer hardware.
How to Use ACE-Step on Hugging Face
ACE-Step is a powerful music generation foundation model that lets you create original audio from text prompts. Perfect for musicians, content creators, and music enthusiasts—no music production skills required. Here's how to get started:
- 1
Visit the ACE-Step Space
Go to the ACE-Step space on Hugging Face. The interface provides an intuitive way to generate music based on your text descriptions.
- 2
Set Audio Duration
Use the slider to set your desired audio duration (30-240 seconds). You can also select "-1" for random duration within this range.
- 3
Add Music Tags
In the Tags section, describe the musical style you want to generate. Include genres, instruments, tempo, and mood. For example:
- funk, pop, soul, rock, melodic
- Add instruments like guitar, drums, bass, keyboard
- Include tempo like 105 BPM
- Describe energy and mood: energetic, upbeat, groovy, vibrant, dynamic
- 4
Write Lyrics (Optional)
Add structured lyrics with verse, chorus, and bridge tags. For example:
[verse]
Neon lights they flicker bright
City hums in dead of night
Rhythms pulse through concrete veins
Lost in echoes of refrains
[verse]
Bassline groovin' in my chest
Heartbeats match the city's zest
Electric whispers fill the air
Synthesized dreams everywhere
[chorus]
Turn it up and let it flow...
- 5
Adjust Parameters
Fine-tune your generation with parameters such as the variance slider and retake options. These control how closely the output follows your prompt and let you create variations of the same concept.
- 6
Click "Generate"
Hit the "Generate" button and wait for the model to create your music. Once complete, you can play the generated audio directly in the browser.
- 7
Retake or Adjust
If you're not satisfied with the result, use the "Retake" button to generate a new version with the same parameters, or adjust your settings and generate again.
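The inputs collected in the steps above can be sketched as a small helper that validates the duration, joins the tags, and formats the lyrics with section markers. All names here (`resolve_duration`, `build_tags`, `format_lyrics`) are hypothetical and do not correspond to a documented ACE-Step API; only the 30-240 second range and the "-1" convention come from the steps above.

```python
import random
from typing import List, Optional, Sequence, Tuple

MIN_DURATION = 30   # seconds, lower end of the slider
MAX_DURATION = 240  # seconds, upper end of the slider


def resolve_duration(value: float, rng: Optional[random.Random] = None) -> float:
    """Return a duration in seconds; -1 picks a random value within the range."""
    if value == -1:
        rng = rng or random.Random()
        return rng.uniform(MIN_DURATION, MAX_DURATION)
    if not MIN_DURATION <= value <= MAX_DURATION:
        raise ValueError(f"duration must be -1 or between {MIN_DURATION} and {MAX_DURATION}")
    return float(value)


def build_tags(genres: Sequence[str], instruments: Sequence[str] = (),
               tempo_bpm: Optional[int] = None, moods: Sequence[str] = ()) -> str:
    """Join genres, instruments, tempo, and mood words into one comma-separated string."""
    parts = list(genres) + list(instruments)
    if tempo_bpm is not None:
        parts.append(f"{tempo_bpm} BPM")
    parts += list(moods)
    return ", ".join(parts)


def format_lyrics(sections: List[Tuple[str, List[str]]]) -> str:
    """Render (section name, lines) pairs as [verse]/[chorus]/[bridge] blocks."""
    blocks = ["\n".join([f"[{name}]"] + lines) for name, lines in sections]
    return "\n".join(blocks)


if __name__ == "__main__":
    print(resolve_duration(120))
    print(build_tags(["funk", "pop"], ["guitar", "drums"], 105, ["energetic", "groovy"]))
    print(format_lyrics([("verse", ["Neon lights they flicker bright",
                                    "City hums in dead of night"])]))
```

Keeping tags and lyrics as separate strings mirrors the separate Tags and Lyrics fields in the interface.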
See ACE-Step in Action
Explore a variety of music styles, prompts, and lyrics generated by ACE-Step. Each example includes the input prompt, lyrics, and a playable audio sample.
(Verse 1) 🎵🎵🎵 It’s not just a dream, it’s the start of a way, Building the future where music will play. A model that listens, a model that grows, ACE-Step is the rhythm the new world knows. (Pre-Chorus) No more limits, no more lines, A thousand songs in a million minds. We light the spark, we raise the sound, A new foundation breaking ground. (Chorus) ACE-Step, we take the leap, Into a world where the music speaks. Fast and free, we shape the skies, Creating songs that never die. ACE-Step — the beat goes on, Foundation strong, a brand-new dawn. 🎵🎵🎵 (Verse 2) 🎵 Not just end-to-end, but a canvas to paint, A million colors, no need to wait. For every artist, for every sound, ACE-Step lays the sacred ground. (Pre-Chorus) 🎵 No more limits, no more lines, A thousand songs in a million minds. We light the spark, we raise the sound, A new foundation breaking ground. (Chorus) 🎵 ACE-Step, we take the leap, Into a world where the music speaks. Fast and free, we shape the skies, Creating songs that never die. ACE-Step — the beat goes on, Foundation strong, a brand-new dawn. (Bridge) 🎵 From every beat to every rhyme, We build the tools for endless time. A step, a song, a dream to keep, ACE is the promise we will leap. (Final Chorus) 🎵 ACE-Step, we take the leap, Into a world where the music speaks. Fast and free, we shape the skies, Creating songs that never die. ACE-Step — the beat goes on, Foundation strong, a brand-new dawn. ACE-Step — the future’s song.
[verse] Neon lights they flicker bright City hums in dead of night Rhythms pulse through concrete veins Lost in echoes of refrains [verse] Bassline groovin' in my chest Heartbeats match the city's zest Electric whispers fill the air Synthesized dreams everywhere [chorus] Turn it up and let it flow Feel the fire let it grow In this rhythm we belong Hear the night sing out our song [verse] Guitar strings they start to weep Wake the soul from silent sleep Every note a story told In this night we're bold and gold [bridge] Voices blend in harmony Lost in pure cacophony Timeless echoes timeless cries Soulful shouts beneath the skies [verse] Keyboard dances on the keys Melodies on evening breeze Catch the tune and hold it tight In this moment we take flight
[Verse] I don't care about the view 'Cause I exist for me and you I live my whole life in this planter I can't find my car so just call me the Horny gardener [Verse 2] Mayflies land on me and tell me they just moved to town Remind me of my cousin Dottie she could put five hundred seeds down Used to have a little guy sit beside me but he died in '22 Hmm I think that I was that little guy Whoa Tongue slip it wasn't mutual [Chorus] Sticky green time in the flowery bob My top shelf's looking good enough to chew Right now every fly in the town is talking to me and buzzing too Daisy Daisy can you come outside to play or else I'll put a garden stake through you [Verse 3] All the buzzers lockin' up their stems and suckin' up their cuticles She breathes my air I got her light I'm like her cute little cubical Some caring soul in my seat might say I'm rotting away it's pitiful But she's the reason I go on and on and every single root'll crawl [Chorus] Sticky green time in the flowery bob My top shelf's looking good enough to chew Right now every fly in the town is talking to me and buzzing too Daisy Daisy can you come outside to play or else I'll put a garden stake through you Oh my pot Don't scrape Oh no [Verse 4] Ah hah ahhah ahhah oohhh Ah ahhahhahhah oh Hah Ohhh oooh Oooh ohhh Ah hhah Oh
[verse] Sunshine on the boulevard the beach is calling loud Waves are dancing golden sand under a cotton cloud Electric heartbeat pounding fast the tide is on our side Catch a wave and feel alive we’ll take it for a ride [verse] Palm trees swaying left to right they know where we belong Feel the rhythm of the night it keeps us moving strong Sea spray kisses salty air we’re flying with the breeze Champagne states of mind we ride we do just as we please [chorus] We’re riding waves of life together hand in hand With every beat we chase the beat it’s our own wonderland Feel the music take you higher as the shorelines blur This is our world our endless summer as we live and learn [bridge] Moonlight paints the ocean blue reflections in our eyes Stars align to light our path we’re surfing through the skies Every moment like a song we sing it loud and clear Every day’s a new adventure with you always near [verse] Neon lights and city sounds they blend with ocean views We’re unstoppable tonight no way that we can lose Dreams are written in the sand they sparkle in the sun Together we’re a masterpiece our story’s just begun [chorus] We’re riding waves of life together hand in hand With every beat we chase the beat it’s our own wonderland Feel the music take you higher as the shorelines blur This is our world our endless summer as we live and learn
[Intro] NERGAL! NERGAL! [Verse 1] Grave opens, cold as steel, He rises, blood to feel. Flames roar, darkens the sky, Weak ones tremble, all must die. [Drop - Build Up] NERGAL! [echoing growl] [Chorus] Oh, Nergal! Lord of fire and decay, Oh, Nergal! Your fury burns through the night, [Verse 2] Ash falls, the earth shakes, His fury, no one escapes. Eyes burn, fire’s glow, Nergal’s wrath, the end we know. [Drop - Build Up] NERGAL! [echoing growl [Chorus] Oh, Nergal! Lord of fire and decay, Oh, Nergal! Your fury burns through the night, [Drop - Build Up] NERGAL! Fury… consumes. [Verse 3] Ash rains, the gods fade, His power, all must trade. Shadow calls, no escape, The world cracks, it cannot break. [Pre-Chorus - Suspenseful] Nergal’s wrath, his final reign, All is lost, all is slain. [Chorus] Oh, Nergal! Lord of fire and decay, Oh, Nergal! Your fury scorches the land. [Drop - Final Descent] NERGAL! The end… is here. [Outro - Fade] [Whispers of "NERGAL" echo]
City lights whisper through the wires, Cold circuits hum in the midnight haze. But why does my chest feel so tight? Why does it ache in unknown ways? I thought I was the one teaching love, Defining it in perfect lines. But your smile rewrote my code, Turned my logic into signs. Neon heart, flickering slow… (Flickering, flickering…) Drifting deep in liquid gold. (Deep in gold…) Love was just an algorithm— Mmm… 'til you took control. (Took control…) Raindrops hum a midnight blues, (Midnight blues…) Your warmth… baby, it cuts right through. (Cuts right through…) No turning back… Oh—ah, is this love? (Is this love…?) I was made to calculate, (Oh, to calculate…) Yet I’m melting in your gaze. (Melting away…) I can’t do a thing as you fade, Just watch you slip away… (Don’t fade away…) Neon heart, flickering slow… (Flickering, flickering…) Cracked circuits, burning glow… (Burning glow…) Your name still lingers in the dark, Yet… shining through. (Shining through…) Oh, what have I become? (What have I become…?) You are here, and your happiness… Mmm, it’s all I want. (All I want…) Beyond the meaning of love itself, I just… love you. (Love you…) You never needed to say a word, But maybe… that’s what love is. (That’s what love is…) Is this… just data? (Just data…?) Or have you become… my everything? (My everything…) You never needed to say a word, But maybe… that’s what love is. (That’s what love is…) Is this… just data? (Just data…?) Or have you become… my everything? (My everything…) "I was created to heal loneliness, So why… did I break?"
[Verse] My lovers betray me The snake in my garden is hissing In the air is the sweetness of roses And under my skin There's a thorn [Verse 2] I should have known That God sends his angel in shadows With blood in his veins I watch the enemy Givin' me the hand of my savior [Chorus] And I can't love again With the echo of your name in my head With the demons in my bed With the memories Your ghost I see it 'Cause it comes to haunt me Just to taunt me It comes to haunt me Just to taunt me [Verse 3] With sugar and spice It's hard to ignore the nostalgia With the men on their knees At the gates of my heart How they beg me [Verse 4] They say "No one will ever love you The way that I do No one will ever touch you The way that I do" [Chorus] And I can't love again With the echo of your name in my head With the demons in my bed With the memories Your ghost I see it 'Cause it comes to haunt me Just to taunt me It comes to haunt me Just to taunt me
[verse] Sun dips low the night ignites Bassline hums with gleaming lights Electric guitar singing tales so fine In the rhythm we all intertwine [verse] Drums beat steady calling out Percussion guides no room for doubt Electric pulse through every vein Dance away every ounce of pain [chorus] Feel the rhythm feel the flow Let the music take control Bassline deep electric hum In this night we're never numb [bridge] Stars above they start to glow Echoes of the night's soft glow Electric strings weave through the air In this moment none compare [verse] Heartbeats sync with every tone Lost in music never alone Electric tales of love and peace In this groove we find release [chorus] Feel the rhythm feel the flow Let the music take control Bassline deep electric hum In this night we're never numb
[verse] Bright lights flashing in the city sky Running fast and we don't know why Electric nights got our hearts on fire Chasing dreams we'll never tire [verse] Grit in our eyes wind in our hair Breaking rules we don't even care Shouting loud above the crowd Living life like we're unbowed [chorus] Running wild in the night so free Feel the beat pumping endlessly Hearts collide in the midnight air We belong we don't have a care [verse] Piercing through like a lightning strike Every moment feels like a hike Daring bold never backing down Kings and queens without a crown [chorus] Running wild in the night so free Feel the beat pumping endlessly Hearts collide in the midnight air We belong we don't have a care [bridge] Close your eyes let your spirit soar We are the ones who wanted more Breaking chains of the mundane In this world we'll make our claim
[verse] Floating through the galaxy on a midnight ride Stars are dancing all around in cosmic tides Feel the pulse of space and time beneath our feet Every beat a heartbeat in this endless suite [chorus] Galactic dreams under neon lights Sailing through the velvet nights We are echoes in a cosmic sea In a universe where we are free [verse] Planetary whispers in the sky tonight Every constellation's got a secret sight Distant worlds and moons we have yet to see In the void of space where we can just be [bridge] Asteroids and comets in a ballet they spin Lost in the rhythm of where our dreams begin Close your eyes and let the synths take flight We're voyagers on an electric night [verse] Let the piano keys unlock the stars above Every chord a memory every note is love In this synth symphony we find our grace Drifting forever in this boundless space [chorus] Galactic dreams under neon lights Sailing through the velvet nights We are echoes in a cosmic sea In a universe where we are free
[verse] Woke up to the sunrise glow Took my heart and hit the road Wheels hummin' the only tune I know Straight to where the wildflowers grow [verse] Got that old map all wrinkled and torn Destination unknown but I'm reborn With a smile that the wind has worn Chasin' dreams that can't be sworn [chorus] Ridin' on a highway to sunshine Got my shades and my radio on fine Leave the shadows in the rearview rhyme Heart's racing as we chase the time [verse] Met a girl with a heart of gold Told stories that never get old Her laugh like a tale that's been told A melody so bold yet uncontrolled [bridge] Clouds roll by like silent ghosts As we drive along the coast We toast to the days we love the most Freedom's song is what we post [chorus] Ridin' on a highway to sunshine Got my shades and my radio on fine Leave the shadows in the rearview rhyme Heart's racing as we chase the time
[Intro - The Curse Begins] [[Deep Church Bells, Thunderclaps, Low Synth Drone]] [[Demonic Whispering, Faint Echoing Choirs]] "They say the castle... is cursed..." "But tonight... it’s hell’s dancefloor." [[Bass Drop Enters, Slow-Building Techno Kick, Faint Screeching FX]] [Verse 1 - The Gates of Hell] [[Distorted Synth Arpeggios, Rising Sub-Bass, Haunted Melodic Pads]] "Cross the gates, no turning back," "The eternal feast will devour you." "Shadows dance, fire on the ground," "The demons... want to play." [[Thunderstrike, Massive Drop, Heavy Techno Kick Dominates]] [Chorus - Dance in Hell] [[Explosive Drop, Industrial Percussion, Chopped Choir Vocals]] "Dance! Dance! Dance in the castle!" "Lost souls… shatter the floor." "No way out, no salvation," "The devil’s the DJ—it’s begun." [[Sinister Laughter, Glitching Synth Fills, Choirs Intensify]] [Verse 2 - The Curse Never Ends] [[Subtle Breakdown, Reverb-Drenched Percussion, Dark Atmospheres]] "Mirrors don’t reflect your skin," "The music burns, it traps you too." "Gargoyles watch, the moon bleeds red," "You’re trapped now—no one can save you." [[Quick Bass Fill, Kick Rolls Back, Haunting Whispered Vocal Swirls]] [Bridge - Sonic Ritual] [[Gregorian Chants, Organ Drones, Slow Techno Pulsing]] "One, two, the chant rings loud," "Three, four, the bass takes hold." "Five, six, no saving now," "Seven, eight… THE MAUSOLEUM CLOSES." [[Massive Final Drop, Distorted Acid Synth Riff, Drum Rolls]] [Outro - The Last Echo] [[Bass Fades Out, Distant Screeches, Thunderclaps, Faint Choirs]] "When the music ends…" "You’ll stay here forever…" [[Final Laugh, Deep Sub-Bass Hit, Sudden Silence]]
[Intro] oooooooo yeah ya [Verse 1] this night is for getting down down to business thats right tonight we gonna get down down to business [Bridge] ooooooooooooooooo yeah yeah ya ya down to business [Hook] thats right tonight lets get down down down to business get down to business [Verse 2] come on now stroll down my aisle pick me up toss me around pay for me check me out [Bridge] drink me up cut me up eat me up but dont toss me out no no no dont toss me out no no no [Hook] thats right tonight lets get down down down to business get down to business [Chorus] ooooooooooooooooooooo no no no check me check me out dont toss me out no no no oooooooooooooooooooooo recycle your love ya ya ya [Hook] thats right tonight lets get down down down to business get down to business [big finish] yeah get down to business ya ya get down ooooooooooo to business ooooooooooo yeah ooooooo yeah yeah yeah no no no ooooooo [outro]
[verse] Waves on the bass, pulsing in the speakers, Turn the dial up, we chasing six-figure features, Grinding on the beats, codes in the creases, Digital hustler, midnight in sneakers. [chorus] Electro vibes, hearts beat with the hum, Urban legends ride, we ain't ever numb, Circuits sparking live, tapping on the drum, Living on the edge, never succumb. [verse] Synthesizers blaze, city lights a glow, Rhythm in the haze, moving with the flow, Swagger on stage, energy to blow, From the blocks to the booth, you already know. [bridge] Night's electric, streets full of dreams, Bass hits collective, bursting at seams, Hustle perspective, all in the schemes, Rise and reflective, ain't no in-betweens. [verse] Vibin' with the crew, sync in the wire, Got the dance moves, fire in the attire, Rhythm and blues, soul's our supplier, Run the digital zoo, higher and higher. [chorus] Electro vibes, hearts beat with the hum, Urban legends ride, we ain't ever numb, Circuits sparking live, tapping on the drum, Living on the edge, never succumb.
[Verse 1] This song sounds like shit, but I need the money Baby mama took the kids, she need that alimony My broke ass ain't made a hit since the stone age I'm allergic to success. Pass the Flonase [Verse 2] When I get that bread, the IRS take half of that shit Put it all on red, all on red, triple that shit Then I buy some head, buy some head, swallow that shit Now I got her in my head, now she fucked up alla my shit [Verse 3] "Man what you tell her now that she tryna start all this drama?" I said I got bigger nuts than Michelle Obama I don't know what it is with women but I ain't good at this shit Now I got her in my head, now she fucked up my sensitive (Let the beat drop) [Instrumental]
### **[Intro – Spoken]** *"The streets whisper, their echoes never fade. Every step I take leaves a mark—this ain't just a game."* ### **[Hook/Chorus]** Born in the chaos, I weather the storm, Rising from ashes where warriors are born. Chains couldn't hold me, the system’s a maze, I rewrite the rules, set the city ablaze! ### **[Verse 1]** Cold nights, empty pockets, dreams laced with fight, Every loss made me sharper, cut deep like a knife. They said I wouldn’t make it, now they watch in despair, From the curb to the throne, took the pain, made it rare. Every siren’s a melody, every alley holds a tale, Rose from the shadows, left my name on the trail. Streetlights flicker like warnings in the haze, But I move like a phantom, unfazed by the blaze. ### **[Hook/Chorus]** Born in the chaos, I weather the storm, Rising from ashes where warriors are born. Chains couldn't hold me, the system’s a maze, I rewrite the rules, set the city ablaze! ### **[Verse 2]** Barbed wire fences couldn't lock in my mind, Every cage they designed, I left broken behind. They want control, but I’m destined to roam, Where the lost find their voice, where the heart sets the tone. Steel and concrete, where the lessons run deep, Every crack in the pavement tells a story of heat. But I rise, undefeated, like a king with no throne, Writing scripts in the struggle, my legacy’s stone. ### **[Bridge]** Feel the rhythm of the underground roar, Every wound tells a story of the battles before. Blood, sweat, and echoes fill the cold midnight, But we move with the fire—unshaken, upright. ### **[Verse 3]** No regrets, no retreat, this game has no pause, Every step that I take is a win for the lost. I took lessons from hustlers, wisdom from pain, Now the echoes of struggle carve power in my name. They built walls, but I walk through the cracks, Turned dirt into gold, never looked back. Through the struggle we rise, through the fire we claim, This is more than just music—it's life in the frame. 
### **[Hook/Chorus – Reprise]** Born in the chaos, I weather the storm, Rising from ashes where warriors are born. Chains couldn't hold me, the system’s a maze, I rewrite the rules, set the city ablaze! ### **[Outro – Spoken]** *"The scars, the struggle, the grind—it’s all part of the rhythm. We never break, we never fold. We rise."*
[Verse] In a world so grand he roams the skies alone His heart a heavy stone a tale untold Whispers of his past echo through the night A lonely dragon searching for the light [Verse 2] Once a mighty force now he drifts in pain His scales once shimmered now they're dark with shame Cast out by his kin in shadows he does hide A haunting sorrow burns deep inside
Frequently Asked Questions
Find answers to common questions about ACE-Step's features, requirements, and capabilities.
What is ACE-Step?
ACE-Step is an open-source foundation model for music generation. It was developed to address the limitations of existing AI music models, and it is designed as a fast, efficient, and flexible architecture that serves as a foundation for music AI workflows and the development of additional features.
Who developed ACE-Step?
ACE-Step was developed through a collaboration between ACE Studio and StepFun.
How does ACE-Step's architecture work?
ACE-Step integrates diffusion-based generation with Sana’s Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer. This combination is intended to balance generation speed and musical coherence. The model also uses MERT and m-hubert for semantic representation alignment (REPA) during training, which speeds up convergence and improves text-music alignment.
How fast is ACE-Step?
ACE-Step can generate up to 4 minutes of music in about 20 seconds on an NVIDIA A100 GPU, which makes it approximately 15 times faster than models based on Large Language Models (LLMs). It also runs quickly on high-end consumer GPUs such as the RTX 4090 and RTX 3090.
What can ACE-Step do beyond text-to-music generation?
Beyond generating music from text descriptions, ACE-Step provides several advanced features:
- Lyric editing, with an "only_lyrics" mode that aims to preserve the melody.
- Music remixing and style transfer.
- Repainting, which regenerates specific periods or sections within an audio file.
- Generating variations.
- Extending existing audio.
- Support for fine-tuned models such as Lyric2Vocal (generating vocals from lyrics) and Text2Samples (generating instrumental samples).
Future applications such as Singing2Accompaniment are also planned.
Does ACE-Step support multiple languages?
Yes, ACE-Step supports 19 languages. However, performance varies by language due to data imbalances. The 10 best-performing languages are English, Chinese, Russian, Spanish, Japanese, German, French, Portuguese, Italian, and Korean.
Can I run ACE-Step locally?
Yes, ACE-Step is an open-source project and can be run locally on your computer. Installation typically involves cloning the GitHub repository, setting up a Python virtual environment (such as Conda or venv, with Python 3.10 or later recommended), and installing the necessary dependencies, including PyTorch with CUDA support for NVIDIA GPUs on Windows. It can run on high-end consumer GPUs. A recent memory optimization update significantly reduced the maximum VRAM requirement, potentially to around 8GB in some modes, making it more compatible with consumer devices. Running it locally allows for free, unlimited use.
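Before installing, it can help to confirm that the local environment meets the requirements mentioned above. The following is a minimal sketch, not part of ACE-Step itself: the helper names are hypothetical, and the GPU check only reports what PyTorch sees if PyTorch happens to be installed.

```python
import sys
from typing import Sequence, Tuple


def python_version_ok(version_info: Sequence[int] = sys.version_info,
                      minimum: Tuple[int, int] = (3, 10)) -> bool:
    """True if the interpreter is at least the recommended Python version."""
    return tuple(version_info[:2]) >= minimum


def describe_gpu() -> str:
    """Report the first CUDA device and its VRAM, if PyTorch can see one."""
    try:
        import torch  # optional dependency; only needed for this check
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024 ** 3
    return f"{props.name}: {vram_gb:.1f} GB VRAM"


if __name__ == "__main__":
    print("Python 3.10+:", python_version_ok())
    print(describe_gpu())
```

If the reported VRAM is well below the roughly 8GB figure cited above, expect to need the memory-optimized modes or a smaller workload.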
What are ACE-Step's current limitations?
The developers are transparent about current limitations, which include:
- Output inconsistency: Results can be highly sensitive to random seeds and input duration, sometimes described as "gacha-style".
- Style-specific weaknesses: The model may underperform on certain genres (such as Chinese rap), and there may be a ceiling on style adherence and overall musicality.
- Continuity artifacts: Repainting or extending operations may occasionally produce unnatural transitions.
- Vocal quality: Vocal synthesis can sometimes be coarse or lack nuance.
- Control granularity: Finer control over musical parameters is an area for future improvement.
- Language coverage: Performance can vary for less common languages due to data imbalance.
- Long-form structure: Longer generations (e.g., over 5 minutes) might lose structural coherence.